Resource-Light Bantu Part-of-Speech Tagging

Pauw, Guy De; de Schryver, Gilles-Maurice; Loo, Janneke van de

dc.contributor.author	Pauw, Guy De
dc.contributor.author	de Schryver, Gilles-Maurice
dc.contributor.author	Loo, Janneke van de
dc.date.accessioned	2013-06-21T13:47:48Z
dc.date.available	2013-06-21T13:47:48Z
dc.date.issued	2012
dc.identifier.citation	Workshop on Language Technology for Normalisation of Less-Resourced Languages (SALTMIL8/AfLaT2012)	en
dc.identifier.uri	http://tshwanedje.com/publications/BantuPOS.pdf
dc.identifier.uri	http://hdl.handle.net/11295/37634
dc.description.abstract	Recent scientific publications on data-driven part-of-speech tagging of Sub-Saharan African languages have reported encouraging accuracy scores, using off-the-shelf tools and often fairly limited amounts of training data. Unfortunately, no research efforts exist that explore which types of linguistic features contribute to accurate part-of-speech tagging for the languages under investigation. This paper describes feature selection experiments with a memory-based tagger, as well as a resource-light alternative approach. Experimental results show that contextual information is often not strictly necessary to achieve good accuracy when tagging Bantu languages, and that decent results can be achieved using a very straightforward unigram approach based on orthographic features.	en
dc.language.iso	en	en
dc.title	Resource-Light Bantu Part-of-Speech Tagging	en
dc.type	Presentation	en
local.publisher	CLiPS - Computational Linguistics Group University of Antwerp, Belgium	en
local.publisher	School of Computing and Informatics, University of Nairobi	en

Files in this item

Name:: Resource-Light Bantu Part-of-Speech ...
Size:: 839.0Kb
Format:: PDF

This item appears in the following Collection(s)

Faculty of Science & Technology (FST) [853]
