Statistical machine translation of Croatian weather forecasts: how much data do we need? / Ljubešić, Nikola ; Bago, Petra ; Boras, Damir.
By: Ljubešić, Nikola, computer scientist.
Contributor(s): Boras, Damir [aut] | Bago, Petra [aut].
This research is the first step towards developing a system for translating Croatian weather forecasts into multiple languages. This step deals with the Croatian-English language pair. The parallel corpus is a one-year sample of weather forecasts for the Adriatic, comprising 7,893 sentence pairs. Evaluation is performed with the automatic evaluation measures BLEU, NIST and METEOR, as well as by manually evaluating a sample of 200 translations. We show that with a small training set and the state-of-the-art Moses system, decoding can be done with 96% accuracy with respect to adequacy and fluency. Additional improvement is expected from increasing the training set size. Finally, the correlation of the recorded evaluation measures is explored.
Project MZOS 130-1301679-1380
ENG