Document Representation Methods for News Event Detection in Croatian / Ljubešić, Nikola ; Agić, Željko ; Bakarić, Nikola.

By: Ljubešić, Nikola, informatičar.
Contributor(s): Bakarić, Nikola [aut] | Agić, Željko [aut].
Material type: Article
Description: 79-84.
Other title: Document Representation Methods for News Event Detection in Croatian [Title in English].
Subject(s): 2.09 | 5.04 | 6.03 | document representation, document clustering, news event detection hrv | document representation, document clustering, news event detection eng
In: 6th International Conference on Formal Approaches to South Slavic and Balkan Languages (FASSBL 2008) (25-28.09.2008 ; Dubrovnik, Croatia). Proceedings of the 6th International Conference on Formal Approaches to South Slavic and Balkan Languages, pp. 79-84. Tadić, Marko ; Dimitrova-Vulchanova, Mila ; Koeva, Svetla.
Summary: The constant increase in the amount of available data demands new organizational and representational ideas and approaches. Document clustering as a method for event detection uses, supplements and upgrades existing information retrieval methods in order to improve knowledge management and representation. This article describes research into the impact of various document representation methods on cluster analysis. Several statistical and linguistic NLP morphological normalization methods of document representation are tested in an event detection scenario. Event detection was conducted on online newspaper articles published on a single day. Cluster analysis was performed using the various document representation methods and a clustering algorithm. The results were then compared against a human-evaluated gold standard. The results show that both statistical and linguistic methods reduce representational complexity and minimally improve the results, which leads to the conclusion that statistical methods should be preferred for this task.

Project MZOS 130-1300646-1776

Project MZOS 130-1301679-1380

ENG
