Abstract | The aim of this paper is to present the process of optical character recognition and its application in practice. Since optical character recognition is the conversion of an analog document into a searchable digital version, the first, theoretical part of the paper explains the concept of digitization and presents selected digitization projects and initiatives. The optical character recognition process is intertwined with several related disciplines, of which the following are briefly explained in the paper: pattern recognition, computer vision, and machine learning. The theoretical part of the paper also describes the eight main components of the optical character recognition process: optical scanning, segmentation of image regions, preprocessing, normalization, segmentation, representation, feature extraction, and training and recognition. Optical character recognition applications are then presented, and a comparative study is carried out of ABBYY FineReader 15, Free OCR, Google Drive OCR, I2OCR, and Convertio. The study compares the features and performance of these five optical character recognition applications. The results obtained are discussed, and concluding remarks are presented based on insights from the study and the theory analyzed. |
Abstract (english) | The aim of this paper is to explain what lies behind the optical character recognition process, what its components are, and how it works in practice. Optical character recognition is, in simple terms, the conversion of an analog document into a searchable digital version. The theoretical part of the paper therefore begins by explaining the concept of digitization and by presenting digitization projects and initiatives that would not exist without optical character recognition. The process is interwoven with several related disciplines, of which pattern recognition, computer vision, and machine learning are briefly explained in the paper. The theoretical part of the paper is built around a description of the eight major components of the optical character recognition process: optical scanning, segmentation of image regions, preprocessing, normalization, segmentation, representation, feature extraction, and training and recognition. Although most of the literature is mathematically based, the components of optical character recognition are illustrated mathematically only to an extent that remains understandable to all readers. Optical character recognition applications are then introduced, and a comparison study is performed on ABBYY FineReader 15, Free OCR, Google Drive OCR, I2OCR, and Convertio. The methodology used in the research is a qualitative comparison of the five optical character recognition applications and a quantitative comparison of how many errors each application made and how many languages each application supports. The results of the research are then discussed, and conclusions are drawn based on the research and the theory analyzed. |