Bogunović, I. (2022). Baza engleskih riječi u hrvatskome [Data set]. https://urn.nsk.hr/urn:nbn:hr:187:904042.
Bogunović, Irena. Baza engleskih riječi u hrvatskome. Pomorski fakultet, 2022. 30 Nov 2024. https://urn.nsk.hr/urn:nbn:hr:187:904042.
Bogunović, Irena. 2022. Baza engleskih riječi u hrvatskome. Pomorski fakultet. https://urn.nsk.hr/urn:nbn:hr:187:904042.
Bogunović, I. 2022. Baza engleskih riječi u hrvatskome. Pomorski fakultet. [Online]. [Accessed 30 November 2024]. Available from: https://urn.nsk.hr/urn:nbn:hr:187:904042.
Bogunović I. Baza engleskih riječi u hrvatskome. [Internet]. Pomorski fakultet: Rijeka, HR; 2022, [cited 2024 November 30] Available from: https://urn.nsk.hr/urn:nbn:hr:187:904042.
I. Bogunović, Baza engleskih riječi u hrvatskome, Pomorski fakultet, 2022. Accessed on: Nov 30, 2024. Available: https://urn.nsk.hr/urn:nbn:hr:187:904042.
Scientific / art field, discipline and subdiscipline
HUMANITIES / Philology / English Studies
Abstract (English)
To build a dataset for training and testing the model, three independent evaluators manually labelled 60,000 words by language. An n-gram feature representation was combined with a linear Support Vector Machine (SVM) classifier (Smola & Schölkopf, 2004) to extract English words from the ENGRI corpus (Bogunović & Kučić, 2021; Kučić, 2021). The classifier achieved an F1 score of 0.9669 on the test set. The database contains 9,453 English words together with their absolute and relative frequencies.
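The approach summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say whether the n-grams are character-level or word-level, so character n-grams are assumed here. The linear SVM is trained with plain subgradient descent on the L2-regularised hinge loss, a stand-in for whatever solver was actually used. The function names, hyperparameters, and the eight toy words are all hypothetical.

```python
from collections import Counter


def char_ngrams(word, n_min=1, n_max=3):
    """Count character n-grams of a word padded with boundary markers."""
    padded = f"<{word.lower()}>"
    feats = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(padded) - n + 1):
            feats[padded[i:i + n]] += 1
    return feats


def score(weights, bias, feats):
    """Linear decision function w . x + b over sparse n-gram counts."""
    return bias + sum(weights.get(g, 0.0) * v for g, v in feats.items())


def train_linear_svm(samples, epochs=50, lr=0.1, lam=0.01):
    """Subgradient descent on the L2-regularised hinge loss.

    samples: list of (features, label) pairs, label +1 (English) or -1 (other).
    """
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for feats, y in samples:
            margin = y * score(weights, bias, feats)
            # Shrink all weights (gradient of the L2 penalty).
            for g in list(weights):
                weights[g] *= 1.0 - lr * lam
            # Hinge loss is active only when the margin is below 1.
            if margin < 1:
                for g, v in feats.items():
                    weights[g] = weights.get(g, 0.0) + lr * y * v
                bias += lr * y
    return weights, bias


def predict(weights, bias, word):
    """Return +1 if the word is classified as English, otherwise -1."""
    return 1 if score(weights, bias, char_ngrams(word)) >= 0 else -1


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: +1 marks English words, -1 marks Croatian words.
toy = [("weekend", 1), ("show", 1), ("download", 1), ("cool", 1),
       ("kuća", -1), ("riječ", -1), ("jezik", -1), ("čovjek", -1)]
samples = [(char_ngrams(w), y) for w, y in toy]
weights, bias = train_linear_svm(samples)

labels = [y for _, y in toy]
preds = [predict(weights, bias, w) for w, _ in toy]
print("F1 on the toy training words:", f1_score(labels, preds))
```

The boundary markers `<` and `>` let the classifier pick up word-initial and word-final letter patterns, which often differ between languages. A real evaluation would compute the F1 score on held-out words rather than on the training words, as the abstract's separate test set implies.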
Number: UIP-2019-04-1576
Title (Croatian): Engleske riječi u hrvatskome jeziku: Identifikacija, afektivno-semantičko normiranje i ispitivanje kognitivne obrade bihevioralnim i neuroznanstvenim metodama
Acronym: ENGRI
Leader: Irena Bogunović
Funding stream: UIP
Publisher
Pomorski fakultet Faculty of Maritime Studies, Rijeka