Processing Malaysian Indigenous Languages:
A Focus on Phonology and Grammar
Citing: Asmah Haji Omar (2019). Processing Malaysian Indigenous Languages: A Focus on Phonology and Grammar. Journal of Asian Linguistic Anthropology. 1, 1, 3-19.
Malaysian indigenous languages belong to two entirely different families: Austronesian and Austroasiatic. The former consists of Malay and all the languages of Sabah and Sarawak, while the latter comprises the aboriginal languages found only in Peninsular Malaysia. Except for Malay and a few languages in Sabah and Sarawak, most of these languages have not been put into writing. That is, no writing system has been assigned to them, even though quite a number have been described in terms of phonology, morphology and syntax. The available descriptions give a picture of their typologies and systems for processing purposes. Typologically, the two families differ little in their phonemic inventories, but they do differ in the phonological structure of the syllable and the word. In morphology, the Austronesian languages are agglutinative, while the Austroasiatic ones are isolating. The two families also differ in the syntactic status of the word: the former has two categories, the full word and the particle, while the latter has only the full word. This last difference leads to a divergence between them in the types of phrase, clause, and complex sentence. Natural language processing (NLP) is a methodology now being applied to the analysis of various aspects of language. This paper discusses the constraints that most Malaysian indigenous languages face in the application of this methodology.