
Croatian web text summarizer (CroWebSum) / Mikelić Preradović, Nives ; Ljubešić, Nikola ; Boras, Damir.

By: Mikelić Preradović, Nives.
Contributor(s): Boras, Damir [aut] | Ljubešić, Nikola, informatičar [aut].
Material type: Article
Description: 109-114.
Other title: Croatian web text summarizer (CroWebSum) [Title in English].
Subject(s): 5.04 | Newspaper text summarizer, SweSum, Croatian language, extract, inflected language (hrv) | Newspaper text summarizer, SweSum, Croatian language, extract, inflected language (eng)
In: ITI 2010 32nd International Conference on INFORMATION TECHNOLOGY INTERFACES (21-24 June 2010; Cavtat, Croatia). Proceedings of the ITI 2010 32nd International Conference on INFORMATION TECHNOLOGY INTERFACES, pp. 109-114. Luzar-Stiffler, V.
Summary: The paper describes automatic summarization of newspaper texts in the Croatian language. The goal of CroWebSum is to generate high-quality extracts that are coherent and retain the relevant information from the original text. The preliminary evaluation shows that extracts at 10% of the original text length have good coherence, while extracts at 5% still convey the most relevant information. CroWebSum also performed quite well when cutting news down to SMS size (maximum 160 characters). The research led us to the conclusion that, to improve text summarization for Croatian, we should develop a technique that uses context vectors to calculate the semantic similarity between terms in the document, as well as a pronoun resolution algorithm.
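The summary describes extracts produced under a length budget (10% or 5% of the original text, or an SMS limit of 160 characters). The following is a minimal sketch of that budgeting idea only. The word-frequency scoring, the naive sentence splitter, and the names `split_sentences` and `extract_summary` are assumptions made for illustration; they are not taken from the paper and do not represent CroWebSum's actual method.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter (assumption): break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_summary(text, ratio=0.10, max_chars=None):
    """Pick the highest-scoring sentences that fit a character budget.

    The budget is max_chars if given (e.g. 160 for SMS size), otherwise
    ratio * len(text). Scoring is plain average word frequency, a
    placeholder and not the scoring used by CroWebSum.
    """
    sentences = split_sentences(text)
    freq = Counter(re.findall(r"\w+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"\w+", sentence.lower())
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    budget = max_chars if max_chars is not None else int(len(text) * ratio)
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)

    chosen, used = [], 0
    for i in ranked:
        # Count one joining space for every sentence after the first.
        cost = len(sentences[i]) + (1 if chosen else 0)
        if used + cost <= budget:
            chosen.append(i)
            used += cost

    # Restore original order so the extract reads coherently.
    return " ".join(sentences[i] for i in sorted(chosen))
```

Calling `extract_summary(article, ratio=0.05)` would give a 5% extract, and `extract_summary(article, max_chars=160)` an SMS-sized one. Selected sentences are returned in their original order, since the summary stresses coherence of the extract.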


Project MZOS 130-1301679-1380

Project MZOS 130-1301799-1999

ENG
