Statistical machine translation of Croatian weather forecast: How much data do we need? / Ljubešić, Nikola ; Bago, Petra ; Boras, Damir.

By: Ljubešić, Nikola, computer scientist.
Contributor(s): Boras, Damir [aut] | Bago, Petra [aut].
Material type: Article
Description: 91.
Other title: Statistical machine translation of Croatian weather forecast: How much data do we need? [Title in English].
Subject(s): 5.04 | statistical machine translation, weather forecast, automatic evaluation, human evaluation hrv | statistical machine translation, weather forecast, automatic evaluation, human evaluation eng
In: ITI 2010 32nd International Conference on Information Technology Interfaces (21-24 June 2010 ; Cavtat / Dubrovnik, Croatia). Proceedings of the ITI 2010 32nd International Conference on INFORMATION TECHNOLOGY INTERFACES, p. 91. Luzar-Stiffler, V.
Summary: This research is a first step towards a system for translating Croatian weather forecasts into multiple languages. This step deals with the Croatian-English language pair. The parallel corpus is a one-year sample of weather forecasts for the Adriatic, consisting of 7,893 sentence pairs. Evaluation is performed with the best-known automatic evaluation measures (BLEU, NIST and METEOR) and by manually evaluating a sample of 200 translations. The research shows that with a small training set and the state-of-the-art Moses system, decoding can be done with 96% accuracy concerning adequacy and fluency. Additional improvement is expected from increasing the training set size.

Project MZOS 130-1301679-1380

ENG
