Title Linearna diskriminantna analiza
Title (english) Linear discriminant analysis
Author Anamarija Zrinščak
Mentor Anamarija Jazbec (mentor)
Committee member Ljiljana Arambašić (committee chair)
Committee member Anamarija Jazbec (committee member)
Committee member Franka Miriam Brückler (committee member)
Committee member Andrej Dujella (committee member)
Granter University of Zagreb Faculty of Science (Department of Mathematics) Zagreb
Defense date and country 2022-03-04, Croatia
Scientific / art field, discipline and subdiscipline NATURAL SCIENCES Mathematics
Abstract U ovom radu promatramo diskriminantnu analizu kao metodu kojom možemo klasificirati objekte u odgovarajuće grupe te objasniti stupanj odnosa između grupa i određenih osobina objekata, tzv. prediktora. Koncentrirali smo se na linearnu diskriminantnu analizu (LDA), kojoj je cilj procijeniti linearnu kombinaciju varijabli prediktora koja najbolje diskriminira pripadnost individualnih objekata grupi. Prednost LDA u odnosu na ostale klasifikacijske metode jest što LDA dopušta kategorijsku zavisnu varijablu, a nezavisna varijabla ne mora biti dihotomna. U prvom poglavlju analizirali smo glavne uvjete koji moraju biti zadovoljeni za dobivanje optimalnih rezultata analize: multivarijatna normalnost prediktora, homogenost matrice varijance i kovarijance te linearni odnosi varijabli prediktora unutar svake grupe. Prikazali smo izvod i testiranje diskriminacijskih funkcija za separaciju grupa te klasifikacijskih jednadžbi za klasificiranje objekata u grupe. Opisali smo i tri glavne vrste linearne diskriminantne analize: direktnu, sekvencijalnu i stepwise analizu, te kriterije za procjenu značajnosti. Kanoničkim varijablama objasnili smo značaj diskriminacijskih funkcija i grafički prikazali dobivene separacije grupa. U drugom poglavlju primijenili smo opisanu analizu na bazu podataka o daljinskim istraživanjima pet različitih usjeva. Baza se sastoji od 36 opservacija koje predstavljaju usjeve i 4 varijable koje predstavljaju mjerenja dobivena daljinskim istraživanjem. Analiza je pokazala da samo jedna varijabla značajno utječe na klasifikaciju, ali sveukupna dobivena klasifikacija nije značajna. Transformacijom varijabli prediktora pokušali smo poboljšati rezultate klasifikacije, no bezuspješno. Izvođenjem kvadratne diskriminantne analize ustanovili smo da korišteni skup podataka nije pogodan za ilustraciju diskriminantne analize. U trećem poglavlju analizirali smo podatke o svojstvima zrna triju različitih sorti pšenice.
Baza se sastoji od 210 opservacija koje predstavljaju sorte pšenice, te 7 varijabli koje predstavljaju geometrijska svojstva zrna pšenice. Izvođenjem linearne diskriminantne analize na ovom skupu podataka, zaključili smo da sve varijable prediktora značajno doprinose separaciji grupa i klasifikaciji opservacija u grupe. Za kraj, grafovima smo prikazali separacije sorti pšenice, te smo zaključili da na razdvajanje grupa najviše utječe veličina zrna pšenice.
Abstract (english) In this paper, we considered discriminant analysis as a method for classifying objects into appropriate groups and for explaining the degree of relationship between the groups and certain properties of the objects, the so-called predictors. We concentrated on linear discriminant analysis (LDA), which aims to estimate the linear combination of predictor variables that best discriminates the group membership of individual objects. The advantage of LDA over other classification methods is that LDA allows a categorical dependent variable, and the independent variable does not have to be dichotomous. In the first chapter, we analyzed the main conditions that must be met to obtain optimal results: multivariate normality of the predictors, homogeneity of the variance-covariance matrices, and linear relationships among the predictors within each group. We presented the derivation and testing of discriminant functions for separating groups, and of classification equations for assigning objects to groups. We also described the three main types of linear discriminant analysis (direct, sequential, and stepwise) and the criteria for assessing significance. Using canonical variables, we explained the significance of the discriminant functions and presented the resulting group separations graphically. In the second chapter, we applied the described analysis to a remote sensing data set on five different crops. The data set consists of 36 observations representing crops and 4 variables representing measurements obtained by remote sensing. The analysis showed that only one variable significantly affects the classification, and that the overall classification is not significant. We tried to improve the classification results by transforming the predictor variables, but without success. After performing a quadratic discriminant analysis, we found that this data set is not suitable for illustrating discriminant analysis.
In the third chapter, we analyzed data on the kernel properties of three different wheat varieties. The data set consists of 210 observations representing wheat varieties and 7 variables representing geometric properties of the wheat kernels. By performing linear discriminant analysis on this data set, we concluded that all predictor variables contribute significantly to the separation of the groups and to the classification of observations into groups. Finally, we plotted the separations of the wheat varieties and concluded that kernel size has the greatest influence on the separation of the groups.
Keywords
prediktor
linearna diskriminantna analiza (LDA)
direktna analiza
sekvencijalna analiza
stepwise analiza
sorte pšenice
Keywords (english)
predictor
linear discriminant analysis (LDA)
direct analysis
sequential analysis
stepwise analysis
wheat cultivars
Language Croatian
URN:NBN urn:nbn:hr:217:906516
Study programme Title: Mathematical Statistics Study programme type: university Study level: graduate Academic / professional title: magistar/magistra matematike (Master of Mathematics)
Type of resource Text
File origin Born digital
Access conditions Open access
Created on 2022-04-01 10:20:08