Abstract (English) | Numerous entities of extra-linguistic reality are in some way categorized in the speaker’s mind, and speakers of one language share the categorization inscribed in the language. The lexicon of a language is organized into certain groups, so for example, dog, cat, etc. belong to the animal group, and tulip, rose, etc. belong to the plant group. A dog, which on the one hand belongs to the animal group, on the other hand also has a muzzle, a tail and legs. This thesis focuses on hierarchical lexical-semantic relations, i.e. hyponymy and meronymy, in the Croatian language. Hyponymy is a relation between a lexical unit that denotes a species (životinja, ‘animal’) and a lexical unit that denotes a representative of that species (pas, ‘dog’). Meronymy is a relation between a lexical unit that denotes a whole (pas, ‘dog’) and a lexical unit that denotes a part of that whole (njuška, ‘muzzle’). These relations are referred to as hierarchical because they form hierarchies in which the superordinate has its subordinates: the species has representatives or the whole has parts. In a hyponymic relation, the superordinate is a hyperonym and the subordinate is a hyponym, and in a meronymic relation, the superordinate is a holonym and the subordinate is a meronym. Lexical units that have a common immediate hyperonym are called co-hyponyms, and lexical units that have a common immediate holonym are called co-meronyms. The theoretical part of the thesis rests on the existing literature and surveys hyponymy and meronymy: their definitions, examples and classifications. Hierarchical lexical-semantic relations were of interest to some Greek philosophers (Plato, Aristotle), as well as medieval thinkers (Peter Abelard, Thomas Aquinas, William Ockham) and later philosophers such as Edmund Husserl, Lothar Ridder and especially the Polish philosopher Stanisław Leśniewski. 
He was the first to use the term mereology for the philosophical study of part-whole relations. Hierarchical relations are also of interest to anthropologists; they are important in computer science and, of course, in linguistics, especially since linguists became more focused on them in the second half of the 20th century. Linguistic pioneers in studying hyponymy and meronymy are the English structuralists John Lyons (Lyons 1977) and his student David A. Cruse (Cruse 1986), and numerous researchers followed them. The thesis adopts the classification of hyponymic relations according to Chaffin and Herrmann (1984), Wierzbicka (1984), Murphy (2003) and the classification of meronymic relations according to Winston, Chaffin and Herrmann (1987), Iris, Litowitz and Evens (1988), Chaffin, Herrmann and Winston (1988), Gerstl and Pribbenow (1995), Odell (1998), etc. Even so, when it comes to lexical-semantic relations, both foreign and Croatian linguistics have paid more attention to synonymy and antonymy. Furthermore, hyponymy and meronymy are relations whose main structural principles are the relation of dominance (on a tree diagram, the vertical relation between units) and the relation of difference (the horizontal relation) (Cruse 1986: 113). For example, cvijet (‘flower’) is in a relation of dominance with ruža (‘rose’) and tulipan (‘tulip’), while ruža (‘rose’) and tulipan (‘tulip’) are in a relation of difference: their meanings are close, but sufficiently incompatible or different (otherwise they would be synonymous). In contrast to synonymy and antonymy, hyponymy and meronymy are asymmetric relations. For example, two antonyms are antonymous to each other (crn – bijel, ‘black – white’), whereas a hyperonym and a hyponym stand in different relations to each other: one is the hyperonym of the other, which is in turn its hyponym (cvijet > ruža, ruža < cvijet; ‘flower > rose, rose < flower’). 
The thesis relies on distributional semantics, and lexical-syntactic patterns for expressing hyponymy and meronymy are extracted on the basis of a semi-automatic corpus analysis. With the help of these patterns, the desired examples can be obtained from unstructured and unannotated texts. The assumption is that certain regularities and distributional realizations indicate a type of something or a part of something, meaning that the lexical-syntactic pattern clearly points to a certain lexical-semantic relation. The research included an analysis of a dictionary corpus, a general corpus and a specialised corpus of the Croatian language. The dictionary corpus is the first Croatian online dictionary – Mrežnik (ihjj.hr/mreznik, see e.g. Hudeček, Mihaljević and Jozić 2024), the general corpus is the Croatian Web Corpus hrWaC (Ljubešić and Klubička 2014), and the specialised corpus is Hrvatski jezikoslovni korpus (Croatian Linguistic Corpus) (Mihaljević and Marković 2024), a corpus of linguistic papers in Croatian. With the patterns obtained from multi-million corpora in the multi-dimensional analysis, numerous hyperonym-hyponym and holonym-meronym pairs were identified. In the first step of the analysis, the dictionary definitions were exported from the TLex database (Joffe and de Schryver 2023), in which the online dictionary Mrežnik is being compiled, and then loaded into the Sketch Engine system (Kilgarriff et al. 2004, Kilgarriff et al. 2014) to make the corpus searchable according to its morphosyntactic features. In certain parts of the analysis of the dictionary corpus, the Python programming language was also used to extract definitions with a similar structure and thus obtain the desired examples (for instance, examples of co-hyponyms). After the dictionary corpus had been analysed, queries were created in CQL, which were used to check and supplement the results from the first step of the analysis. 
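The idea of extracting definitions with a similar structure to find co-hyponyms can be illustrated with a minimal Python sketch. It is not the procedure used in the thesis: the toy definitions below are invented for illustration (they are not taken from Mrežnik), the function names are hypothetical, and the sketch assumes that the first word of a definition names the hyperonym, which real dictionary data would only satisfy after lemmatisation and parsing.

```python
from collections import defaultdict

# Hypothetical toy data: headword -> definition (invented, not from Mrežnik).
DEFINITIONS = {
    "ruža": "cvijet s trnovitom stabljikom",
    "tulipan": "cvijet koji raste iz lukovice",
    "terijer": "pas uzgojen za lov",
    "pudl": "pas kovrčave dlake",
    "siječanj": "mjesec u godini",
}


def group_by_genus(definitions):
    """Group headwords by the first word of their definition.

    Assumption: the first word of the definition names the hyperonym
    (genus proximum). This is a simplification made for the sketch.
    """
    groups = defaultdict(list)
    for headword, definition in definitions.items():
        tokens = definition.split()
        if tokens:  # skip empty definitions
            groups[tokens[0].lower()].append(headword)
    return dict(groups)


def co_hyponym_groups(definitions):
    """Keep only genus terms shared by at least two headwords."""
    return {
        genus: sorted(words)
        for genus, words in group_by_genus(definitions).items()
        if len(words) >= 2
    }


# Candidate co-hyponym groups, keyed by the shared genus term.
CO_HYPONYMS = co_hyponym_groups(DEFINITIONS)
```

On the toy data, ruža and tulipan are grouped under cvijet and terijer and pudl under pas, while siječanj stays ungrouped because no other headword shares its genus term. Such groups are only candidates and would still need manual checking, in line with the semi-automatic approach described above.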
Based on the examples from the analysed material and in comparison with information from the available literature, the definitions of the relations, the differences between them and their overlap, as well as the specific features of both relations are presented. Hyponymy is ultimately defined as a hierarchical lexical-semantic relation between a lexical unit that denotes a sort in general and a lexical unit that denotes a specific representative of a sort. Furthermore, meronymy is a hierarchical lexical-semantic relation between a lexical unit that denotes a whole and a lexical unit that denotes a part. Within hyponymic relations, a primary distinction is drawn between direct hyponymy, which is realised without lexical units such as vrsta (‘sort’), and taxonomy, which is expressed with specific lexical units such as vrsta (‘sort’). These lexical units are in this thesis referred to as semantic links. In addition to the lexical unit vrsta (‘sort’), lexical units such as tip (‘type’), skupina (‘group’), grana (‘branch’), razred (‘class’), rod (‘gender’), kategorija (‘category’), etc. and verbs such as ubrajati se (‘count’), svrstavati se (‘classify’), spadati/pripadati (‘belong’), određivati (‘determine’) are also important in Croatian. However, the most common lexical unit mediating the hyponymic relation is vrsta (‘sort’), which in Croatian, unlike, for example, in English, refers to both species of plants and sorts of things in general. Dictionary definitions proved particularly useful in the analysis of verbs. Similar definitions were extracted using Python and subsequently classified. This process yielded groups of verb hyponyms, leading to the conclusion that verb hyponyms and hyperonyms in Croatian share the same verb aspect. Additionally, such extraction can lead to examples of other lexical-semantic relations. 
One such relation is synonymy, because synonyms have the same or similar definitions, and another is antonymy, because antonyms are distinguished by a minimal difference in meaning, often written in the second part of the definition (differentia specifica). In addition, the verb hierarchy can be multi-levelled (kretati se > hodati > šetati se ‘move > walk > stroll’). Adjectives were also observed to play an important role in structuring hyponymic hierarchies, occurring more often in hyponyms (which is why hyponyms are morphologically more complex, e.g. ulje > jestivo ulje > maslinovo ulje > djevičansko maslinovo ulje ‘oil > edible oil > olive oil > virgin olive oil’) (as stated by Cruse 1986: 146). This conclusion is confirmed with examples from the linguistic corpus, in which the role of adjectives in hierarchies is also outlined in examples such as slavenski jezici > južnoslavenski jezici > zapadnojužnoslavenski jezici (‘Slavic languages > South Slavic languages > West South Slavic languages’), where compounding with adjectives is also evident on the formative level. Based on the research results, the following taxonomy of hyponymic relations is proposed: 1. people > persons according to gender, occupations, etc. (osoba > aktivist, ženska osoba > gimnastičarka ‘person > activist, female person > female gymnast’) 2. members of communities in the natural world a. animal species > animal (pas > terijer ‘dog > terrier’) b. plant species > plant (cvijet > ruža ‘flower > rose’) 3. space a. geography term > example of such a term (država > Hrvatska ‘country > Croatia’) b. building > example of such a building (građevina > svjetionik ‘building > lighthouse’) 4. organised communities > examples of such communities (nogometni klub > Dinamo ‘football club > Dinamo Zagreb’) 5. beliefs or activities someone engages in > example of such beliefs or activities (ples > balet ‘dance > ballet’) 6. artefacts (devices, implements, appliances, etc.) 
> example of an artefact (stroj > dizalica ‘machine > crane’) 7. closed groups of agreed hyperonyms > group members (mjesec > siječanj ‘month > January’) 8. conditions > example of a condition (osjećaj > tuga ‘feeling > sadness’) 9. doing > ways of doing (glasati se > cijukati ‘produce sounds > squeak’) 10. metalinguistic groups > metalinguistic subgroups or group members (e.g. glagol > cijukati ‘verb > squeak’, quasi-hyponymy). Regarding the relation between a part and a whole, it was first noticed that syntagmatic expression plays an important role in the Croatian language, specifically in the expression of a part or a whole using the genitive, instrumental or locative case, or using prepositional-case expressions (e.g. od (‘of’) + genitive). In addition, an important feature of meronymy is that it is not realized without semantic links, the most common of which in the Croatian language is dio (‘part’). This lexical unit has many hyponyms which also form special hierarchies, and since these hyponyms appear as meronyms of many holonyms, it can be stated that they are super-meronyms (Cruse 1986: 164). In addition to the lexical unit dio (‘part’), the lexical units element (‘element’), sastavnica/komponenta (‘component’), odjel (‘section’) and verbs such as sastojati se (‘consist’), sadržavati (‘contain’), dijeliti se (‘divide (itself)’), obuhvaćati (‘encompass’), uključivati (‘include’), brojiti (‘count’), imati (‘have’) play an important role in the relation between a part and a whole. Finally, the following meronymic relations can be emphasized: 1. object > functional component (računalo > tipkovnica ‘computer > keyboard’) 2. set > individual a. set as a collection of equal or similar elements (šuma > stablo ‘forest > tree’) b. group as a united community (čopor > vuk ‘pack > wolf’, razred > učenik ‘class > student’) c. 
group as an organisation (EU > Hrvatska ‘EU > Croatia’, Sveučilište u Zagrebu > Filozofski fakultet ‘University of Zagreb > Faculty of Humanities and Social Sciences’) 3. duration or process > phase or activity (osnovna škola > mala matura ‘primary school > primary school graduation’) 4. area > part of an area (učionica > kut ‘classroom > corner’, Hrvatska > Slavonija ‘Croatia > Slavonia’) 5. mixture > ingredient (gemišt > bijelo vino ‘spritzer > white wine’) 6. mass > part as a measurable unit (meso > šnicla ‘meat > steak’, grašak > zrno (graška) ‘peas > grain (pea)’) 7. whole as a construct > part as an agreed unit (kilometar > metar ‘kilometre > metre’, sat > minuta ‘hour > minute’). The so-called quasi-relations and auto-relations were noticed in both hyponymy and meronymy: quasi-hyponymy, quasi-meronymy, auto-hyponymy, and auto-meronymy. Quasi-hyponymy is in this thesis understood as a meta-linguistic relation, i.e. a relation in which a certain word is classified into a group according to its word class (e.g. veznik > ali ‘conjunction > but’), and the linguistic corpus proved to be particularly useful for finding such examples. A relation between lexical units that do not have the same structure (e.g. the relation between a one-word lexical unit and a multi-word lexical unit) is not considered an example of quasi-hyponymy in this thesis. When it comes to auto-hyponymy and auto-meronymy, it should be noted that they are recognized in dictionary definitions containing the lexical unit istoimen (‘eponymous’). Numerous examples of auto-hyponymy were found (e.g. Beech is a family of woody plants (...) and Beech is a member of the family of the same name (...)), mostly among members of the plant and animal world. 
For example, bor (‘pine’), deva (‘camel’), fazan (‘pheasant’), gljiva (‘mushroom’), hrast (‘oak’), jelen (‘deer’), kokoš (‘chicken’), kornjača (‘turtle’), kuna (‘marten’), leptir (‘butterfly’), ljiljan (‘lily’), mačka (‘cat’), majmun (‘monkey’), medvjed (‘bear’) can be considered auto-hyponyms, i.e. lexical units whose meanings are in a hyponymic relation. Auto-meronymy has been observed in plants and their fruit, e.g. pineapple as pineapple fruit (relation ananas > ananas ‘pineapple plant > pineapple’), apple as apple fruit (relation jabuka > jabuka ‘apple tree > apple’), tomato as tomato fruit (relation rajčica > rajčica ‘tomato plant > tomato’), but there are cases in which the lexical unit denoting the plant and the lexical unit denoting the fruit differ (e.g. hrast > žir ‘oak > acorn as oak fruit’), so there is no auto-meronymy (see Table 16). During the analysis, certain examples were observed in which it was not easy to determine whether hyponymy or meronymy occurred. These examples raised the question of where the two relations meet and where the boundary between them lies. Certain lexical-syntactic patterns were noted to introduce both relations, e.g. jedan/jedna/jedno od (‘one of’), grana (‘branch’), dijeliti se (‘divide (itself)’), uključivati (‘include’), and it was particularly difficult to set the boundary when it came to collectives (Wierzbicka 1984) or sets such as clothing, furniture, accessories, etc. A comparative analysis of dictionary definitions (Table 15) is presented, showing that even within one dictionary the definitions fluctuate: for example, a T-shirt (majica) is defined either as a piece or part of clothing (komad/dio odjeće) (meronymy) or as an article of clothing (odjevni predmet) (hyponymy). It was concluded that the fluctuation between meronymy and hyponymy occurs most often with nouns from the singularia tantum group which have subordinate terms that are lexical units with a complete paradigm. Finally, the thesis was based on the following hypotheses: 1. 
The lexical-syntactic patterns through which hyponymy and meronymy are recognized and which are extracted from the dictionary are also applicable to other types of texts (general language and specialised texts). This hypothesis was confirmed for nouns, but it was only partially confirmed for verbs and other word classes. The linguistic corpus, as a corpus of texts dealing with language, gave special insight into specific and meta-linguistic relations such as quasi-hyponymy and was, therefore, particularly useful for research. The hypothesis should therefore be tested further in future research on other types of corpora. 2. Using formalized lexical-syntactic patterns and examples containing hyperonyms or holonyms, it is possible to extract hyponyms or meronyms from the text. This assumption has been fully confirmed, i.e. with the help of the adopted lexical-syntactic patterns, it is possible to extract thousands of sentences from the corpora in which hyperonym-hyponym and holonym-meronym pairs appear. 3. The traditional taxonomy of hyponymic and meronymic relations is not sufficient to describe all subtypes of these relations. Since languages have their own structure and no taxonomy of hyponymic and meronymic relations has been adopted for the Croatian language, it can be stated that the hypothesis is confirmed or partially confirmed. The Croatian language with its specificities, for example in the grammatical number of individual lexical units, outlines the peculiarities of certain types of relations. For example, no existing classification mentions the role played by case or by a prepositional-case phrase, yet for Croatian as an inflectional language this role is very important, especially in the relation between part and whole, and is relevant to numerous examples. 
In establishing a taxonomy for the Croatian language, the existing divisions in the literature were certainly useful, but in certain places, they were approached critically, whether regarding the stated relation in general or a specific aspect of Croatian. Although the adopted taxonomies of hyponymic and meronymic relations are a questionable model, they are nevertheless an incentive for further exemplification or restructuring of the division. Questioning this assumption also pointed to the importance of emphasizing the specificity of a particular language, which is very important in teaching foreign languages (e.g. how one would say in English, German, Hungarian, French, Chinese or any other language that someone likes cucumbers, strawberries, carrots or anything else). Starting from individual examples, the thesis came to the general conclusion that certain groups of linguistic phenomena in this context can be fertile ground for comparative research. In addition to comparative research, the findings from this work can also be used for conducting or stimulating other scientific research. For example, cooperation with experts from different scientific fields or the application of the obtained lexical-syntactic patterns to specialized texts can result in automatic or semi-automatic recognition of wholes and parts as well as of species/sorts and their representatives. The linguistic corpus has raised research questions which are valuable precisely for linguistic research, but corpora from fields such as medicine, biology, construction, shipbuilding, architecture, etc. could be even more informative for certain relations (examples of units in certain relations). Since wholes and parts, for example, are often highlighted in specialised professional texts of certain fields, the research would probably lead to new lexical-syntactic patterns. 
In linguistics itself, research can stimulate new morphological, syntactic, glottodidactic, lexicological and, of course, semantic analyses. The points of contact between hyponymy and meronymy could be examined with native speakers (both adults and children), for example by surveys or by observing speakers while they define a vacillating lexical unit in a game like Alias. The patterns could also be applied in computational linguistics and natural language processing (NLP), especially given the current momentum in the development of language models that serve to answer questions, provide information, etc. Thus, they can be used for information extraction, text summarisation, machine translation, improved search, data mining, text generation, discourse analysis (e.g. application for analysis of literary works), creation of ontologies and semantic networks, gamification, lexicographic work, and so on. |