|Automated form reading
Optical character recognition devices
Document imaging systems
Hong Kong Polytechnic University -- Dissertations
|Department of Electronic and Information Engineering
|xiii, 80 leaves : ill. ; 31 cm
|Forms are used extensively to collect and distribute data. The main task of an automated form reader is to locate the data filled in on forms and to encode the content into appropriate symbolic descriptions. In this thesis, we aim to develop efficient algorithms for an automated form reading system. The key problems tackled in this thesis are image preprocessing, script determination, fast keyword matching, and printed character recognition.
Preprocessing of digital images is a very important step in document analysis. We discuss in this thesis a combined intensity histogram and local contrast feature for image binarization; a horizontal Run Length Smoothing Algorithm (RLSA) followed by an 8-neighbour connection method for page segmentation; the use of simple criteria for text and line extraction; and fast skew estimation and correction using the extracted lines, with an interline cross-correlation method as a backup for forms without lines. Our approach to skew estimation is efficient and effective for forms containing many lines. The skew angle can be detected with an error of less than 1°.
It is very common for documents to contain more than one script. In this thesis, a robust script determination approach is proposed which can cope with different fonts, sizes, styles and darkness of text in document images. Two neural networks are employed. The first neural network is trained to derive a set of 15 masks, which are used for extracting 15 features. The coefficients of the masks are then quantized to reduce computational complexity. The second neural network is trained with the 15 extracted features to perform the script separation. Experimental results show that 97% of the images can be correctly classified.
A Dynamic Recognition Neural Network (DRNN) is proposed in this thesis to perform fast keyword matching. Different sets of features are used to deal with different scripts: for English, projection profiles (x and y) are used, while for Chinese, contour features are utilized. Testing on 29 name cards shows that a 90% correct matching rate can be achieved.
An algorithm based on the vertical projection and a peak-to-valley function is adopted for segmenting characters. Applying the algorithm to form images scanned at 100 dpi, about 86% of the characters can be correctly segmented. A neural network is then employed to classify the segmented characters into 50 groups. Both intensity features and structure-based features extracted from the skeleton image are utilized. An accuracy of 85% to 87.5% can be achieved when testing on images scanned at 100 dpi, and a higher accuracy of 94% to 96.6% can be achieved if the scanning resolution is 150 dpi.
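As an illustration of two of the steps named in the abstract, the following is a minimal sketch, not the thesis's implementation. It assumes a binarized image stored as rows of 0 (background) and 1 (ink). The function names, the `threshold` and `valley_max` parameters, and the sample data are hypothetical. The character splitter cuts wherever the vertical projection falls to a fixed level, which is a simpler stand-in for the peak-to-valley function used in the thesis.

```python
def horizontal_rlsa(row, threshold):
    """Horizontal run-length smoothing on one binary row.

    Background runs of at most `threshold` pixels that lie between two
    ink pixels are filled with ink; longer runs are left untouched.
    """
    out = list(row)
    last_ink = None  # index of the most recent ink pixel seen
    for x, pixel in enumerate(row):
        if pixel == 1:
            if last_ink is not None and 0 < x - last_ink - 1 <= threshold:
                for k in range(last_ink + 1, x):
                    out[k] = 1
            last_ink = x
    return out


def vertical_projection(image):
    """Count the ink pixels in each column of a binary image."""
    return [sum(column) for column in zip(*image)]


def split_at_valleys(profile, valley_max=0):
    """Return (start, end) column ranges, end exclusive, whose projection
    value stays above `valley_max`. Assumption: a fixed threshold replaces
    the thesis's peak-to-valley function.
    """
    segments = []
    start = None
    for x, value in enumerate(profile):
        if value > valley_max and start is None:
            start = x
        elif value <= valley_max and start is not None:
            segments.append((start, x))
            start = None
    if start is not None:
        segments.append((start, len(profile)))
    return segments


# Hypothetical sample data: a gap of 2 is bridged, a gap of 4 is not.
smoothed = horizontal_rlsa([1, 0, 0, 1, 0, 0, 0, 0, 1], threshold=2)

# Two small "characters" separated by two empty columns.
image = [
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 0, 1, 1],
]
profile = vertical_projection(image)
segments = split_at_valleys(profile)
```

In practice the smoothing threshold would depend on the scanning resolution and the character spacing on the form, and touching characters would need a more selective cut criterion than an empty column.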
|All rights reserved