Title Strukturiranost i interoperabilnost mrežnih enciklopedičkih sadržaja iz tehničkoga područja
Title (english) Structure and interoperability of online encyclopedic content in the field of technology
Author Ivan Smolčić
Mentor Petra Bago (mentor)
Mentor Zdenko Jecić (mentor)
Committee member Nives Mikelić Preradović (committee chair)
Committee member Tomislava Lauc (committee member)
Committee member Slaven Ravlić (committee member)
Granter University of Zagreb Faculty of Humanities and Social Sciences (Department of Information and Communication Sciences) Zagreb
Defense date and country 2020-11-27, Croatia
Scientific / art field, discipline and subdiscipline SOCIAL SCIENCES Information and Communication Sciences
Universal decimal classification (UDC) 030 - General reference works; 004 - Computer science and technology. Computing. Data processing
Abstract Izvedbom enciklopedičkih djela kao suvremenih mrežnih projekata omogućena je nadogradnja tih informacijskih sustava u epistemološkome smislu što upućuje na potrebu redefiniranja leksikografske i enciklopedičke djelatnosti. Zadržavajući temeljne (tradicionalne) odrednice enciklopedičkoga koncepta, poput točnosti, objektivnosti i relevantnosti, mrežna enciklopedička izdanja nadilaze granice tradicionalne enciklopedike unaprjeđenjem epistemoloških značajki svoga sadržaja. Ovaj rad posebno obrađuje problematiku i vezu između strukturiranja i interoperabilnosti mrežnih enciklopedičkih sadržaja. Kako bi se donijeli zaključci o mogućnostima strukturiranja enciklopedičkoga sadržaja na temelju njegove sažetosti (informativnosti), provedena je kvantitativna i kvalitativna analiza sadržaja u cilju bilježenja unificiranih faktografskih podataka pogodnih za stvaranje strukture. Ispitano je ukupno 455 kategoriziranih enciklopedičkih članaka iz tehničkoga područja iz 10 zasebnih mrežnih enciklopedičkih projekata na hrvatskom, engleskom i njemačkom jeziku. Kako bi se pospješilo prepoznavanje naziva u enciklopedičkim tekstovima, ispitane su vrijednosti evaluacijskih mjera alata namijenjenih automatskom prepoznavanju nazivlja (engl. Named Entity Recognition) na enciklopedičkim tekstovima uključenima u uzorak analize sadržaja. Izračunom standardnih evaluacijskih mjera (preciznost, odziv, točnost, F mjera) na temelju parametara proizašlih iz označenoga teksta s pomoću alata doneseni su rezultati analize evaluacijskih mjera označavanja alata za pojedine kategorije i vrste naziva na mješovitom uzorku sastavljenom od članaka više enciklopedičkih izdanja. Na temelju analize sadržaja prikazan je način strukturiranja enciklopedičkoga sadržaja te korištenje te strukture u postizanju interoperabilnosti putem izrade posebnoga modela. 
U konačnici, ova doktorska disertacija donosi zaključak vezan za sve njegove sastavnice, odnosno skreće pozornost na enciklopedičke vrline kroz postavke enciklopedičkog koncepta, ulogu strukture i mogućnosti postizanja interoperabilnosti mrežnog enciklopedičkog sadržaja.
Abstract (english) Implementing encyclopedic works as contemporary online projects has enabled these information systems to be upgraded in an epistemological sense, which calls for a redefinition of lexicographic and encyclopedic activity. While retaining the basic (traditional) determinants of the encyclopedic concept, such as accuracy, objectivity and relevance, online encyclopedic editions go beyond traditional ones in content creation, the removal of scope limits (thanks to digital content), enhanced content retrieval, and user adaptability. This thesis addresses in detail the structure and interoperability of encyclopedic content, which enable that content to interact and function within the global information system. It provides an overview of the main features and applications of the structure and interoperability of encyclopedic content online. To draw conclusions about the possibilities of structuring encyclopedic content on the basis of its informative nature, a quantitative and qualitative content analysis was conducted to record unified factual data suitable for creating a structure. A total of 455 categorized encyclopedic articles in the field of technology, from 10 separate online encyclopedic projects in Croatian, English and German, were examined. Entities recorded in the content analysis of encyclopedic texts were divided into three categories of encyclopedic articles: persons, organizations and general terms. The persons category contains 33 entity types suitable for structuring, the organizations category contains 23, and the general terms category contains only seven. To facilitate automatic identification of named entities in encyclopedic texts, the effectiveness of Named Entity Recognition tools on encyclopedic texts in the field of technology was tested. 
Three tools (applications) were used for testing: CroNER and ReLDI for the Croatian samples and the Stanford CoreNLP tool for the samples in English. Based on the parameters derived from the tool-marked text, using a mixed sample of articles from multiple encyclopedic publications, standard performance measures (precision, recall, accuracy, F-measure) were calculated for each tool, providing an evaluation of the tools for individual categories and types of named entities. Based on the content analysis, a method of structuring encyclopedic content is presented, together with the use of that structure to achieve interoperability through a purpose-built model. The model is a central database that binds together the structures of multiple editions and allows that structure to be exchanged with all related online encyclopedic editions by mapping the elements of the structure. Interoperability was demonstrated experimentally by testing the model and detecting changes. Ultimately, this PhD thesis draws conclusions about all of its components, highlighting encyclopedic virtues through the premises of the encyclopedic concept, the role of structure, and the possibility of achieving interoperability of online encyclopedic content.
Keywords
enciklopedika
leksikografija
enciklopedički koncept
strukturiranost
interoperabilnost
automatsko prepoznavanje nazivlja
analiza sadržaja
model interoperabilnosti
metapodatci
Keywords (english)
encyclopaedic science
lexicography
encyclopedic concept
content structure
interoperability
Named Entity Recognition
content analysis
interoperability model
metadata
Language Croatian
DOI https://doi.org/10.17234/diss.2020.8746
URN:NBN urn:nbn:hr:131:220947
Promotion 2021
Study programme Title: Postgraduate (Doctoral) Program in Information Science Study programme type: university Study level: postgraduate Academic / professional title: doktor/doktorica znanosti, područje društvenih znanosti, polje informacijske i komunikacijske znanosti (Doctor of Science, area of social sciences, field of information and communication sciences)
Type of resource Text
Extent 204 pp.
File origin Born digital
Access conditions Open access
Created on 2021-03-04 09:16:45