Merkler, Danijela

Babel Treebank of Public Messages in Croatian / Merkler, Danijela ; Agić, Željko ; Agić, Ana. - pp. 490-497.

The paper presents the process of constructing a publicly available treebank of public messages written in Croatian. The messages were collected from various electronic sources – e-mail, blogs, Facebook and SMS – and published on the LED facade of the Zagreb Museum of Contemporary Art within the Babel art project. The project aimed to use the facade as an open-space blog or social interface enabling citizens to express their views publicly. The construction and current state of the treebank are presented along with plans for future work. A comparison of the Babel Treebank with the Croatian Dependency Treebank and the SETimes.HR treebank, regarding their differing domains and annotation schemes, is briefly sketched. The treebank is used as a test platform for introducing a new standard for the syntactic annotation of Croatian texts. An experiment with morphosyntactic tagging and dependency parsing of the treebank is conducted, providing a first insight into the computational processing of non-standard text in Croatian.



DOI: 10.1016/j.sbspro.2013.10.673

Agić, Ana ; Agić, Željko
