Abstract | Air travel has become an everyday part of life. This does not mean, of course, that the technological and organizational systems that make air traffic possible and sustainable are trivial. Most attention is paid to difficulties in technical inspection, during take-off and landing, and during the flight itself. For this reason, EASA (European Aviation Safety Agency) introduced a provision requiring mandatory reporting of every unforeseen event. Reports are mostly written by pilots or by the next person responsible in the flight hierarchy, no later than 72 hours after the flight. To organize and classify these documents by risk level as quickly as possible, this paper presents experiments applying different classification algorithms, together with the text representation methods they require, to a set of just under 3000 English-language reports from the airline Adria Airways. The reports are classified, according to predetermined parameters, by a risk factor given on a non-equidistant scale from 1 to 2500. Based on this factor, there are 3 classes of events: (1) event with no accident as outcome, (2) event with a minor accident or damage, and (3) event with a catastrophic accident (includes a fatality). Because a computer cannot analyze raw text directly, the text is first represented in the form a computer understands, namely numbers. The text representations used in this paper are TF-IDF (Term Frequency-Inverse Document Frequency), which builds on the classical BoW (Bag of Words) representation, and BERT text encoding (Bidirectional Encoder Representations from Transformers), which builds, as the name suggests, on the earlier Transformer model but is structurally broader and conceptually more developed. Classification models were then applied to the resulting text representations. The models described and tested in the experiment are RandomForestClassifier from the open-source library Scikit-learn and DistilBertForSequenceClassification from the Transformers library, which is also open source. 
None of the models performed especially well: the best classification result, measured by F1, is 64% overall and 75% for the class of events with a minor accident or damage alone. Although the results are not excellent, there is a clear improvement from the RF model to the BERT model. One reason for the poor results is certainly the imbalanced data set. Most reports belong to the minor-accident risk class, followed by the class of events with no accident as outcome, and the fewest belong to the class with a catastrophic outcome. Since writing a report is mandatory only for unforeseen events, and risks of catastrophic outcomes are fortunately not that common, the imbalance is to be expected. Another possible cause of the poor results is that, even with very good models and despite the effort and considerable success in this field, text representations still cannot reliably capture concepts. Besides differences in meaning between identically spelled words, differences can arise from dialect, accent, and the like. For example, the Croatian words (Jadransko) môre ("the (Adriatic) sea"), (noćne) mòre ("nightmares"), and more (meaning "can" in the Dalmatian dialect) have three entirely different meanings, and depending on how good the vocabulary of the representation is, a computer may read the three words as identical, as merely similar, or even as different, with no guarantee of any of these. Perhaps the whole system of reading, listening to, understanding, and processing text should be built independently of numerical representations of a vocabulary. In any case, great potential for improving aviation safety lies in text processing and understanding methods, because descriptions of events, whether written or, better still, recorded as speech, give the best insight into what happens during a flight. |
Abstract (english) | Travelling by plane has become part of our everyday life. Of course, this does not mean that the technological and organizational systems that make air transport possible and sustainable are trivial. Most attention is paid to difficulties in the technical inspection, during take-off and landing, and during the flight itself. Therefore, EASA (European Aviation Safety Agency) has introduced a provision for mandatory reporting of any unforeseen event. Reports are generally written by pilots or by the next in line of responsibility in the flight hierarchy, no later than 72 hours after the flight. In order to organize and classify these documents by level of risk as quickly as possible, this paper presents experiments applying various classification algorithms, together with the text representation methods they require, to a set of just under 3000 English-language reports from Adria Airways. These reports are classified, according to predetermined parameters, by a risk factor given on a non-equidistant scale from 1 to 2500. Based on this factor, there are 3 classes of events: (1) no accident outcome, (2) minor injuries or damage, and (3) major or catastrophic accident (including death). Since a computer cannot analyze raw text directly, the text is first represented in the form a computer understands, namely numbers. The text representations used in this paper are TF-IDF (Term Frequency-Inverse Document Frequency), which builds on the classical BoW (Bag of Words) representation, and BERT text encoding (Bidirectional Encoder Representations from Transformers), which builds, as the name suggests, on the earlier Transformer model but is structurally broader and conceptually more developed. Classification models were then applied to the resulting text representations. 
The models described and used in the experiment are RandomForestClassifier from the open-source library Scikit-learn and DistilBertForSequenceClassification from the Transformers library, which is also open source. None of the models performed especially well: the best classification result, measured by F1, is 64% over all classes and 75% for the class with minor injuries or damage. However, a clear improvement is visible from the RF model to the BERT model. One of the reasons for the poor results is certainly the imbalanced data set. Most reports belong to the class with a minor accident, followed by events with no accident as outcome, and the fewest belong to the class with a catastrophic outcome. Given that writing a report is mandatory only in case of an unforeseen event, and that events with a risk of a disastrous outcome are fortunately not that common, the imbalance of the data is to be expected. Another possible cause of the poor results is that, even with very good models and despite the effort and great success in this field, text representations still cannot reliably capture concepts. In addition to differences in meaning between identically spelled words, differences can arise from dialect, accent, and the like. For example, the word "book" in "reading a book" and in "to book a flight" has completely different meanings, and depending on how well the vocabulary is trained, a computer may read the two occurrences as identical, as merely similar, or even as different, with no guarantee of any of these. Perhaps the whole system of reading, listening to, understanding, and processing text should be constructed independently of numerical representations of a vocabulary. Certainly, great potential for advances in air safety lies in text processing and comprehension methods, as descriptions of events, whether written or, even better, recorded as speech, provide the best insight into what happens during a flight. |
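The gap between the overall F1 score and the score for the largest class, and the role of class imbalance in it, can be illustrated with a minimal sketch. It assumes macro averaging (an unweighted mean of per-class F1), which the abstract does not confirm. The class labels `none`, `minor`, and `major` and the toy predictions are hypothetical and are not taken from the Adria Airways reports.

```python
from collections import Counter

# Hypothetical labels standing in for the three risk classes.
LABELS = ["none", "minor", "major"]


def per_class_f1(y_true, y_pred, labels):
    """Return {label: F1}, with F1 = 2*TP / (2*TP + FP + FN) per class."""
    pairs = list(zip(y_true, y_pred))
    scores = {}
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A class that is never present and never predicted gets 0.0 here.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


# Made-up, imbalanced toy data: the "minor" class dominates and the
# single "major" event is misclassified as "minor".
y_true = ["minor"] * 6 + ["none"] * 3 + ["major"] * 1
y_pred = ["minor"] * 6 + ["minor", "none", "none"] + ["minor"]

scores = per_class_f1(y_true, y_pred, LABELS)
macro_f1 = sum(scores.values()) / len(scores)  # unweighted mean over classes
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print("class counts:", dict(Counter(y_true)))
for label in LABELS:
    print(f"F1[{label}] = {scores[label]:.3f}")
print(f"macro F1 = {macro_f1:.3f}, accuracy = {accuracy:.3f}")
```

In this toy setting the dominant class scores well while the rare class scores zero, which pulls the macro average well below both the best per-class score and the accuracy. A weighted average would give a different overall figure, so the averaging scheme has to be known before comparing numbers.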