A Knowledge-Light Approach to Luo Machine Translation and Part-of-Speech Tagging
Date
2010
Author
Pauw, Guy De
Maajabu, Naomi
Wagacha, Peter Waiganjo
Type
Article
Language
en
Abstract
This paper describes the collection and exploitation of a small trilingual corpus English - Swahili - Luo (Dholuo). Taking advantage of
existing morphosyntactic annotation tools for English and Swahili and the unsupervised induction of Luo morphological segmentation
patterns, we are able to perform fairly accurate word alignment on factored data. This not only enables the development of a workable
bidirectional statistical machine translation system English - Luo, but also allows us to part-of-speech tag the Luo data using the projection
of annotation technique. The experiments described in this paper demonstrate how this knowledge-light and language-independent
approach to machine translation and part-of-speech tagging can result in the fast development of language technology components for a
resource-scarce language.
Citation
Proceedings of the Second Workshop on African Language Technology - AfLaT 2010
Publisher
School of Computing & Informatics, University of Nairobi
CLiPS - Computational Linguistics Group, University of Antwerp, Belgium