Ismail, Shaikh Mohammed (2013) Named-Entity Recognition in Business Card Images. BTech thesis.
Abstract
We are surrounded by text everywhere: window signs, commercial logos and phone numbers plastered on trucks, flyers, take-away menus. Yet to capture and use this information we still resort to typing phone numbers and websites manually into a phone or other computing device. We set out to change that by using the mobile phone camera and OCR to extract the relevant textual information from such images. The problem can be seen as a two-step process:
• Extract characters/words from the image by OCR
• Classify the words as Name, Email, Phone No, etc.
Our work focused mainly on the first step, with the aim of minimizing its running time so that it is usable on mobile computing devices. Computing on handheld devices, however, involves a number of challenges. Because of the non-contact nature of digital cameras attached to handheld devices, acquired images very often suffer from skew and perspective distortion. Since text has to be separated from graphics and background, segmentation/binarization algorithms play a vital role in the process; we studied, analyzed and implemented existing standard algorithms. A number of thresholding techniques, both global and local, have been proposed previously. OCR is done using Tesseract, an open-source OCR engine that was developed at HP between 1984 and 1994. The second step involves applying appropriate heuristics in order to achieve correct classification. Given a line of text, Named-Entity Recognition (NER) is in itself a separate domain of research. We have come up with heuristics to identify named entities: the output of Step 1 is given as input, and each piece of information is displayed in the relevant field.
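The classification step described in the abstract can be illustrated with a minimal sketch. The abstract does not list the heuristics actually used in the thesis, so the patterns, the function name `classify_line` and the label set below are assumptions chosen only to show the general idea of rule-based labelling of OCR output lines.

```python
import re

# Hypothetical patterns: the thesis does not publish its rules in the abstract.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
URL_RE = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def classify_line(line: str) -> str:
    """Assign one coarse label to a single line of OCR output."""
    text = line.strip()
    if not text:
        return "Empty"
    # Check the most specific patterns first.
    if EMAIL_RE.search(text):
        return "Email"
    if URL_RE.search(text):
        return "Website"
    if PHONE_RE.search(text):
        return "Phone"
    words = text.split()
    # Assumed rule: two to four capitalised alphabetic words look like a name.
    looks_like_name = all(
        w.replace(".", "").isalpha() and w[0].isupper() for w in words
    )
    if 2 <= len(words) <= 4 and looks_like_name:
        return "Name"
    return "Other"


if __name__ == "__main__":
    # Sample lines standing in for the OCR output of a business card.
    for sample in ["John A. Smith", "john.smith@example.com", "Tel: +91 661 2462000"]:
        print(sample, "->", classify_line(sample))
```

Rules this simple are easy to fool: a capitalised job title such as "Senior Software Engineer" would be labelled as a name, which is one reason the abstract treats Named-Entity Recognition as a research area in its own right.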
Item Type: | Thesis (BTech) |
---|---|
Uncontrolled Keywords: | Named-Entity Recognition; Thresholding; OCR; Tesseract |
Subjects: | Engineering and Technology > Computer and Information Science > Image Processing |
Divisions: | Engineering and Technology > Department of Computer Science |
ID Code: | 5399 |
Deposited By: | Hemanta Biswal |
Deposited On: | 19 Dec 2013 10:59 |
Last Modified: | 19 Dec 2013 10:59 |
Supervisor(s): | Majhi, B D |