A framework for consolidating most important digitized Croatian dictionaries from 1595 to 1945 / Bago, Petra ; Boras, Damir.

By: Bago, Petra.
Contributor(s): Boras, Damir [aut].
Material type: Article
Description: str.
Other title: A framework for consolidating most important digitized Croatian dictionaries from 1595 to 1945 [Title in English].
Subject(s): 5.04 | information technology, dictionary heritage, Text Encoding Initiative (TEI), digitization (hrv) | information technology, dictionary heritage, Text Encoding Initiative (TEI), digitization (eng)
In: Information Technology and Journalism (28.05.-01.06.2012. ; Dubrovnik, Hrvatska) 17. International Information Technology and Journalism conference
Summary: The Text Encoding Initiative (TEI) is a consortium which collectively develops and maintains a standard for the representation of texts in digital form. Its chief deliverable is a set of Guidelines which specify encoding methods for machine-readable texts, mainly in the humanities, social sciences and linguistics. The Guidelines have been widely used for online research, teaching, and preservation. Dictionaries are considered one of the most complex text types in TEI because of their high degree of structuring and compression of information [1]. This research deals with the most important Croatian printed bilingual and multilingual dictionaries from 1595 to 1945. Currently, all of the dictionaries are in the process of digitization. All pages of the dictionaries have been digitally photographed in very high resolution (most of them in color), and each dictionary has been transformed into its own text database. The transformation was conducted in three steps. First, the photographs were processed with OCR software. Second, an automatic method was used to correct some of the mistakes. Third, everything was manually corrected to minimize errors in the documents. All of the dictionaries have gone through the first and second steps; a small number of them still have to be manually checked.
The aim of this research is to provide an eXtensible Markup Language (XML) platform for selected Croatian printed bilingual and multilingual dictionaries based on TEI P5: Guidelines for Electronic Text Encoding and Interchange. The idea is to design a universal structure that would contain, for each dictionary, all three views described in the TEI Guidelines: the typographic view, the editorial view and the lexical view.
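As a rough illustration of the lexical view mentioned in the summary, the following Python sketch builds one dictionary entry using the TEI P5 dictionary elements entry, form, orth, sense, cit and quote. It is not taken from the paper: the function name build_entry, the headword "voda" and its translation "aqua" are invented for the example, and the typographic and editorial views are not modelled here.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"          # TEI P5 namespace
XML_NS = "http://www.w3.org/XML/1998/namespace"  # namespace of xml:lang


def build_entry(headword, headword_lang, translation, translation_lang):
    """Build a minimal TEI <entry> (lexical view only).

    Hypothetical helper for illustration; the structure chosen here is an
    assumption, not the encoding scheme used by the authors.
    """
    ET.register_namespace("", TEI_NS)  # serialize TEI as the default namespace

    entry = ET.Element(f"{{{TEI_NS}}}entry")

    # Headword: <form type="lemma"><orth xml:lang="...">...</orth></form>
    form = ET.SubElement(entry, f"{{{TEI_NS}}}form", {"type": "lemma"})
    orth = ET.SubElement(form, f"{{{TEI_NS}}}orth",
                         {f"{{{XML_NS}}}lang": headword_lang})
    orth.text = headword

    # Translation equivalent: <sense><cit type="translation"><quote>...
    sense = ET.SubElement(entry, f"{{{TEI_NS}}}sense")
    cit = ET.SubElement(sense, f"{{{TEI_NS}}}cit",
                        {"type": "translation",
                         f"{{{XML_NS}}}lang": translation_lang})
    quote = ET.SubElement(cit, f"{{{TEI_NS}}}quote")
    quote.text = translation

    return entry


# Invented sample data: a Croatian headword with a Latin equivalent.
entry = build_entry("voda", "hr", "aqua", "la")
print(ET.tostring(entry, encoding="unicode"))
```

A multilingual dictionary could be handled under the same assumptions by adding one cit element per target language inside the sense element.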


ENG
