
A Generic Method for Multi Word Extraction from Wikipedia / Bekavac, Božo ; Tadić, Marko.

By: Bekavac, Božo.
Contributor(s): Tadić, Marko [aut].
Material type: Article
Description: 663-667.
Other title: A Generic Method for Multi Word Extraction from Wikipedia [Title in English].
Subject(s): 5.04 | 6.03 | multi word expressions, multi word extraction, Croatian, Wikipedia hrv | multi word expressions, multi word extraction, Croatian, Wikipedia eng
In: 30th International Conference on Information Technology Interfaces (ITI 2008) (23-26.06.2008 ; Cavtat / Dubrovnik, Croatia). Proceedings of the 30th International Conference on Information Technology Interfaces, pp. 663-667. Lužar-Stiffler, Vesna ; Hljuz Dobrić, Vesna ; Bekić, Zoran
Summary: This paper presents a generic method for multiword expression extraction from Wikipedia. The method uses the properties of this specific encyclopedic genre in its HTML format and relies on the intention of article authors to link to other articles. The relevant links were processed by applying local regular grammars within the NooJ development environment. We tested the method on the Croatian version of Wikipedia and present the results obtained.
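
The link-based idea in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the paper processes links with local regular grammars in NooJ, whereas this example substitutes a plain word-count filter. The names `ArticleLinkCollector` and `multiword_candidates` are hypothetical, and the assumption that article links have the form `/wiki/Title` with no colon in the path is ours, not the record's.

```python
from html.parser import HTMLParser


class ArticleLinkCollector(HTMLParser):
    """Collect the anchor text of links that point to other articles."""

    def __init__(self):
        super().__init__()
        self._in_article_link = False
        self._text_parts = []
        self.anchors = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        # Assumption: article links look like /wiki/Title; a colon marks a
        # namespace page (for example a category page), which is skipped.
        if href.startswith("/wiki/") and ":" not in href:
            self._in_article_link = True
            self._text_parts = []

    def handle_data(self, data):
        if self._in_article_link:
            self._text_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_article_link:
            self._in_article_link = False
            # Normalize whitespace inside the anchor text.
            text = " ".join("".join(self._text_parts).split())
            if text:
                self.anchors.append(text)


def multiword_candidates(html):
    """Return unique anchor texts of two or more words, in document order."""
    collector = ArticleLinkCollector()
    collector.feed(html)
    collector.close()
    candidates = []
    for anchor in collector.anchors:
        # Stand-in for the grammar-based filtering described in the summary.
        if len(anchor.split()) >= 2 and anchor not in candidates:
            candidates.append(anchor)
    return candidates
```

Calling `multiword_candidates` on the HTML of an article page yields the linked phrases that authors chose to mark up, which serve as candidate multiword expressions for further filtering.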


Project MZOS 036-1300646-1986

Project MZOS 130-1300646-0645

Project MZOS 130-1300646-1002

ENG
