Open Source Spelling Checker for Kîmîîrû Language
Computational linguistics has attracted extensive research interest in Europe, America, South Africa, and other parts of the world. However, very few human language technology projects have been undertaken in Kenya, particularly for Bantu languages, because Kiswahili and English are the dominant languages. There is therefore a need to develop tools that support electronic document preparation in resource-poor languages. This work describes the development of an open source spellchecker for the Kîmîîrû language using the Hunspell language tools. It examines the morphology of Kîmîîrû, focusing on the derivation of nouns and verbs, and provides a suggestion component that generates probable corrections for a misspelled word. The focus of the project is the creation of the two major Hunspell files: the affix file (.aff) and the dictionary file (.dic). The affix file defines the rules for deriving Kîmîîrû nouns and verbs from root words (stems). All the stems, together with the rules appended to them, are stored in the dictionary file. The developed spellchecker is the first for the Kîmîîrû language, and it classifies Kîmîîrû words with an accuracy of 80%, a precision of 100%, and a recall of 78%. The system is intended for adoption in major open source products such as OpenOffice, Mozilla Thunderbird and Firefox, and Google Chrome.
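The division of labour between the affix file and the dictionary file can be illustrated with a minimal Python sketch. It is a toy model of the idea only, not Hunspell itself and not the project's actual rule set: the flag name `A`, the stem `ntû`, the prefixes `mû` and `a`, and the helper names `expand`, `build_lexicon`, and `is_correct` are all assumptions chosen for illustration.

```python
# Toy model of stem + affix-flag expansion in the spirit of Hunspell.
# All rules and entries below are illustrative assumptions, not the
# contents of the project's .aff and .dic files.

# Stand-in for the affix file: each flag maps to the prefixes it may add.
PREFIX_RULES = {
    "A": ["mû", "a"],  # hypothetical flag for a singular/plural prefix pair
}

# Stand-in for the dictionary file: one "stem/FLAGS" entry per line.
DIC_ENTRIES = ["ntû/A"]


def expand(entry, rules):
    """Return the surface forms generated by one 'stem/FLAGS' entry."""
    stem, _, flags = entry.partition("/")
    forms = []
    for flag in flags:
        for prefix in rules.get(flag, []):
            forms.append(prefix + stem)
    return forms


def build_lexicon(entries, rules):
    """Collect every form that the dictionary entries can generate."""
    lexicon = set()
    for entry in entries:
        lexicon.update(expand(entry, rules))
    return lexicon


LEXICON = build_lexicon(DIC_ENTRIES, PREFIX_RULES)


def is_correct(word):
    """Accept a word only if some stem plus rule generates it."""
    return word in LEXICON
```

In this sketch the bare stem is not accepted on its own, which is a simplification; a real Hunspell dictionary decides that per entry. The point is that one stored stem with one flag covers several word forms, so the dictionary file stays small while the affix file carries the morphology.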