dc.contributor.author | De Pauw, Guy | |
dc.contributor.author | Wagacha, Peter Waiganjo | |
dc.contributor.author | de Schryver, Gilles-Maurice | |
dc.date.accessioned | 2013-06-21T13:12:59Z | |
dc.date.available | 2013-06-21T13:12:59Z | |
dc.date.issued | 2009 | |
dc.identifier.citation | Proceedings of the EACL 2009 Workshop on Language Technologies for African Languages – AfLaT 2009, pages 9–16, Athens, Greece, 31 March 2009 | en |
dc.identifier.isbn | 1-932432-25-6 | |
dc.identifier.uri | http://dl.acm.org/citation.cfm?id=1564511 | |
dc.identifier.uri | http://hdl.handle.net/11295/37612 | |
dc.description.abstract | Research in data-driven methods for Machine Translation has greatly benefited from the increasing availability of parallel corpora. Processing the same text in two different languages yields useful information on how words and phrases are translated from a source language into a target language. To investigate this, a parallel corpus is typically aligned by linking linguistic tokens in the source language to the corresponding units in the target language. An aligned parallel corpus therefore facilitates the automatic development of a machine translation system and can also bootstrap annotation through projection. In this paper, we describe data collection and annotation efforts and preliminary experimental results with a parallel corpus English - Swahili. | en |
dc.language.iso | en | en |
dc.publisher | Association for Computational Linguistics | en |
dc.title | The SAWA corpus: a parallel corpus English - Swahili | en |
dc.type | Presentation | en |
local.publisher | School of Computing and Informatics, University of Nairobi, Kenya | en |
local.publisher | African Languages and Cultures, Ghent University, Belgium; Xhosa Department, University of the Western Cape, South Africa | en |
local.publisher | CNTS - Language Technology Group, University of Antwerp, Belgium | en |