
Extracting terminology by language independent methods / Sanja Seljan ; Ivan Dunđer, Hrvoje Stančić.

By: Seljan, Sanja.
Contributor(s): Stančić, Hrvoje [aut] | Dunđer, Ivan [aut].
Material type: Article
Description: pp. 141-147.
Other title: Extracting terminology by language independent methods [Title in English].
Subject(s): 5.04 | automatic terminology extraction, statistical tools, language independent methods, evaluation, indexing hrv | automatic terminology extraction, statistical tools, language independent methods, evaluation, indexing eng
In: International Translata Conference (2 ; 2014 ; Innsbruck, Austria). Translation studies and translation practice : proceedings of the 2nd International TRANSLATA Conference 2014. Part 1
Summary: Automatic extraction of corpus-based terminology can help in building terminology lists, which are a valuable resource for research, education, and practical implementation. Specific terminology lists represent an intermediate step between free text and a controlled vocabulary. Such lists can be used in information retrieval, document indexing, machine learning, and education, or extended to cross-language information access. Terminology extraction can be performed on monolingual or bilingual/multilingual texts by various methods relying on statistical or linguistic approaches, or on a hybrid model. Evaluating the extracted terminology candidates and compiling the final list requires considerable human expertise. The paper presents an automatic extraction process from monolingual text performed by three language-independent tools that rely on different principles. The research is conducted on a specific text domain: medical documentation consisting of reports, approvals and decisions on chemical and pharmaceutical documentation, and instructions for use. After digitization and OCR, the automatic extraction is performed with the three tools. The results are compared, evaluated, and discussed in terms of possible integration into a more complex information system.

MZOS and University support


MZOS project 130-1300646-0909

ENG
