From Digitisation Process to Terminological Digital Resources / Seljan, Sanja ; Dunđer, Ivan ; Gašpar, Angelina.

By: Seljan, Sanja.
Contributor(s): Dunđer, Ivan [aut] | Gašpar, Angelina [aut].
Material type: Article
Description: xx-xx pp.
Other title: From Digitisation Process to Terminological Digital Resources [Title in English].
Subject(s): 5.04 | digitization, term and collocation extraction, Multi-Word Unit (MWU), statistical and language approaches, evaluation, English, Croatian [hrv, eng]
In: International Convention on Information and Communication Technology, Electronics and Microelectronics (20-24.05.2013 ; Opatija, Croatia). Proceedings of the 36th International Convention MIPRO 2013, pp. xx-xx. Biljanović, P.
Summary: Monolingual and multilingual terminology and collocation bases are valuable additional electronic resources that can be used in further research, in written communication and in everyday communication. Building such resources can be supported by terminology extraction tools relying on statistical or language approaches, or on a hybrid model, but requires considerable human expertise in evaluation and final compilation. The paper describes the whole process: from digitisation of printed material, OCR techniques, sentence alignment and creation of translation memories, up to terminology extraction and evaluation. The performance of the tools and the applied methodology is assessed through the standard statistical measures of precision, recall and F-measure. Experimental results are presented, deficiencies of the semi-automatic statistical and linguistic system are highlighted, and recommendations for further research are suggested.
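The precision, recall and F-measure named in the summary can be illustrated with a minimal sketch. It is not the authors' evaluation procedure: the function name `evaluate_terms`, the exact-match comparison against a reference list, the balanced F-measure (F1) and the sample terms are all assumptions made for illustration.

```python
def evaluate_terms(extracted, reference):
    """Return (precision, recall, f_measure) for a set of extracted terms.

    Assumption: a candidate counts as correct only if it matches a
    reference term exactly; the F-measure is the balanced F1.
    """
    extracted = set(extracted)
    reference = set(reference)
    true_positives = len(extracted & reference)

    # Precision: share of extracted candidates that are valid terms.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of reference terms that the tool found.
    recall = true_positives / len(reference) if reference else 0.0

    if precision + recall == 0:
        f_measure = 0.0
    else:
        # Harmonic mean of precision and recall.
        f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical data, not taken from the paper.
extracted_terms = ["translation memory", "sentence alignment",
                   "the paper", "term extraction"]
reference_terms = ["translation memory", "sentence alignment",
                   "term extraction", "collocation base", "multi-word unit"]

precision, recall, f_measure = evaluate_terms(extracted_terms, reference_terms)
print(f"precision={precision:.2f} recall={recall:.2f} F={f_measure:.2f}")
```

Here three of the four candidates are valid and three of the five reference terms are found, so precision is higher than recall; the F-measure balances the two in a single figure.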
Item type: Conference paper
Current location: Knjižnica FFZG, SNZ
Status: Available
Notes: request a scan at snz@ffzg.hr
Barcode: 1305195717
Total holds: 0

MZOS project 130-1300646-0909

ENG
