Mishra, M M K (2014) Optical character recognition of printed Odia documents. BTech thesis.
Abstract
Optical Character Recognition (OCR) is a document image analysis method that involves the mechanical or electronic transformation of scanned or photographed images of typewritten or printed text into machine-readable text. OCR has become a widespread area of interest and research because it narrows the reading-ability gap between computers and humans and improves human-machine interaction in many applications. Example applications include cheque verification and a large variety of banking, business and data entry applications. The project involved skew correction of Odia documents, line segmentation, and finally segmentation of Odia characters. A document is first segmented into its constituent lines; each line is then treated as one entity and segmented into words. Once the words are segmented, the characters are extracted one by one. The algorithms used here hold for all the Devanagari scripts. Hence an example of Telugu word segmentation is also included as a proof of the applied algorithm.
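The abstract does not say which segmentation algorithm the thesis uses. As an illustration only, the line, word and character pipeline it describes could be sketched with projection profiles, a common technique for printed text. The sketch assumes a binarised, already deskewed page given as a list of rows (1 = ink, 0 = background). The names `runs`, `segment_lines` and `segment_columns` and the `min_gap` parameter are hypothetical and not taken from the thesis.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = ink, 0 = background
Span = Tuple[int, int]   # (start, end) with end exclusive


def runs(profile: List[int], min_gap: int = 1) -> List[Span]:
    """Find non-zero runs in a projection profile.

    Runs separated by fewer than `min_gap` empty entries are merged,
    so a larger `min_gap` groups characters into words.
    """
    spans: List[Span] = []
    start = None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i                      # a run of ink begins
        elif value == 0 and start is not None:
            spans.append((start, i))       # the run ends at the first gap
            start = None
    if start is not None:
        spans.append((start, len(profile)))

    merged: List[Span] = []
    for s, e in spans:
        if merged and s - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], e)  # gap too small: same unit
        else:
            merged.append((s, e))
    return merged


def segment_lines(img: Image) -> List[Span]:
    """Split a page into text lines using the horizontal projection."""
    return runs([sum(row) for row in img])


def segment_columns(img: Image, line: Span, min_gap: int) -> List[Span]:
    """Split one text line into column spans using the vertical projection."""
    top, bottom = line
    width = len(img[0])
    profile = [sum(img[r][c] for r in range(top, bottom)) for c in range(width)]
    return runs(profile, min_gap)


# Tiny synthetic page (assumed data, not from the thesis).
page = [
    "0000000000",
    "1101100011",
    "1101100011",
    "0000000000",
    "0110000000",
]
img = [[int(ch) for ch in row] for row in page]

lines = segment_lines(img)
words = segment_columns(img, lines[0], min_gap=2)  # wide gaps separate words
chars = segment_columns(img, lines[0], min_gap=1)  # any gap separates characters
```

Here the same vertical-projection routine serves both word and character segmentation, differing only in the gap threshold. Real printed text with touching glyphs or stacked vowel signs would need more than this simple gap test.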
Item Type: | Thesis (BTech) |
---|---|
Uncontrolled Keywords: | OCR; segmentation; word segmentation; line segmentation; character segmentation |
Subjects: | Engineering and Technology > Computer and Information Science > Image Processing |
Divisions: | Engineering and Technology > Department of Computer Science |
ID Code: | 6231 |
Deposited By: | Hemanta Biswal |
Deposited On: | 08 Sep 2014 11:09 |
Last Modified: | 08 Sep 2014 11:09 |
Supervisor(s): | Dash, R |