A parallel corpus for Amharic–English machine translation / Andargachew Mekonnen Gezmu, Andreas Nürnberger, Tesfaye Bayu Bati (Data and Knowledge Engineering Group)
Discovery
1041211511
URN
urn:nbn:de:gbv:3:2-101264
Published
Magdeburg : Fakultät für Informatik, Otto-von-Guericke-Universität Magdeburg, [2018]
Extent
1 online resource (5 pages, 0.5 MB)
Language
eng
Summary
This paper describes the acquisition, preprocessing, segmentation and alignment of an Amharic-English parallel corpus. In doing so, we addressed language-specific issues such as normalization and end-of-sentence disambiguation. The corpus consists of 145,820 Amharic-English parallel sentences (segments) from various sources, making it larger than previously compiled corpora. It is released for research purposes and can be used to train or support Amharic-English machine translation systems.
Series
Technical report ; 2018, 04 ppn:570164265