Developing an annotated corpus for Gĩkũyũ using language-independent machine learning techniques
Date
2006
Author
Wagacha, Peter W
De Pauw, Guy
Getao, Katherine W
Type
Presentation
Language
en
Abstract
Networking the development of computational resources for African languages can be greatly advanced if researchers aim to develop
tools that are to a large extent language-independent and therefore reusable for other languages. In this paper we describe a particular
case study, namely the development of an annotated corpus of Gĩkũyũ, using language-independent machine learning techniques. The
general aim of our work on Gĩkũyũ is two-fold: on the one hand we wish to digitally preserve this resource-scarce language, while
on the other hand it serves as a feasibility study of using language-independent machine learning techniques for linguistic annotation
of corpora. To this end we investigate established annotation induction techniques like unsupervised learning and knowledge transfer.
These methods can provide interesting perspectives for the linguistic description of many other resource-scarce languages.
Citation
Peter W. Wagacha, Guy De Pauw and Katherine W. Getao (2006). Developing an annotated corpus for Gĩkũyũ using language-independent machine learning techniques.
Publisher
School of Computing & Informatics; CNTS - Language Technology Group, University of Antwerp, Antwerpen, Belgium