AP 186 Activity 8: Applications of Morphological Operations 1 of 3: Preprocessing Text

In this activity, we were tasked with handwriting recognition: extracting the individual letters of handwritten text from a scanned document with lines. The challenge is that we may rely only on the image processing techniques learned in the previous activities of this course.

First, we download Untitled_0001.jpg and choose a portion containing text with lines, whether handwritten or printed. The figure below shows the Untitled_0001.jpg image.

I chose the portion containing the word Cable. The image is tilted, so it has to be rectified first. I used Picasa 3.9.0 by Google, Inc. to crop and straighten the image. With the help of its grid lines, I believe I have straightened the image properly. The processing of the image is shown below.

The next step is to remove the lines using our image processing techniques. I used fft() and filtering in the frequency domain to remove them. The figure below shows (a) the inverted, grayscale, cropped image, (b) its FFT, (c) the mask/filter and (d) the filtered image binarized with a threshold value of 0.5.
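To make the filtering idea concrete, here is a minimal sketch in Python rather than Scilab. It is not the code used for the figures. It assumes an inverted image (text and lines are 1, background is 0) and perfectly horizontal lines, so that the line energy sits in the zero-horizontal-frequency column of the spectrum. The function names and the naive DFT are illustrative choices only, suitable for a tiny toy image.

```python
import cmath

def dft(seq, inverse=False):
    """Naive 1-D discrete Fourier transform (only sensible for tiny inputs)."""
    n = len(seq)
    sign = 1 if inverse else -1
    out = [sum(seq[k] * cmath.exp(sign * 2j * cmath.pi * m * k / n)
               for k in range(n))
           for m in range(n)]
    return [v / n for v in out] if inverse else out

def dft2(img, inverse=False):
    """2-D DFT: transform every row, then every column. Result is [ky][kx]."""
    rows = [dft(row, inverse) for row in img]
    cols = [dft(list(col), inverse) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

def remove_horizontal_lines(img, threshold=0.5):
    """Zero the kx = 0 column of the spectrum (except DC), invert, binarize."""
    spectrum = dft2(img)
    for ky in range(1, len(spectrum)):
        spectrum[ky][0] = 0          # a horizontal line only has kx = 0 terms
    filtered = [[v.real for v in row] for row in dft2(spectrum, inverse=True)]
    lo = min(min(row) for row in filtered)
    hi = max(max(row) for row in filtered)
    span = (hi - lo) or 1.0          # avoid dividing by zero on a flat image
    return [[1 if (v - lo) / span > threshold else 0 for v in row]
            for row in filtered]

# Toy image: a vertical stroke (column 3) crossed by a horizontal line (row 2).
toy = [[1 if (y == 2 or x == 3) else 0 for x in range(8)] for y in range(8)]
for row in remove_horizontal_lines(toy):
    print("".join("#" if v else "." for v in row))
```

On this toy image the line disappears, but so does the stroke pixel where the line crossed it. A real scan needs a mask shaped around the bright spots actually seen in its FFT, since scanned lines are never perfectly horizontal.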

Now that the lines are gone, we need to clean the image and process it so that the letters are only 1 pixel thick. We also have to account for the parts of the letters that were lost when the lines were removed. This is done by applying morphological operations. The figure below shows the results for various operations in Scilab.
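One common way to bridge small breaks in a stroke is a morphological closing, that is, a dilation followed by an erosion. The sketch below is a Python illustration on a toy binary image and not one of the Scilab operations shown in the figure. The 3-pixel vertical structuring element, the function names and the border handling (pixels outside the image are ignored) are all assumptions made for the example.

```python
# Offsets (dy, dx) of a 3-pixel vertical structuring element.
VERTICAL_3 = [(-1, 0), (0, 0), (1, 0)]

def _morph(img, offsets, combine):
    """Apply `combine` (any or all) over the in-bounds neighbours of each pixel."""
    h, w = len(img), len(img[0])
    return [[int(combine(img[y + dy][x + dx]
                         for dy, dx in offsets
                         if 0 <= y + dy < h and 0 <= x + dx < w))
             for x in range(w)]
            for y in range(h)]

def dilate(img, offsets=VERTICAL_3):
    """A pixel turns on if any neighbour under the element is on."""
    return _morph(img, offsets, any)

def erode(img, offsets=VERTICAL_3):
    """A pixel stays on only if every neighbour under the element is on."""
    return _morph(img, offsets, all)

def close_gaps(img, offsets=VERTICAL_3):
    """Morphological closing: dilation followed by erosion."""
    return erode(dilate(img, offsets), offsets)

# Toy image: a vertical stroke in column 3 with a one-pixel break at row 2.
broken = [[1 if (x == 3 and y != 2) else 0 for x in range(8)] for y in range(8)]
for row in close_gaps(broken):
    print("".join("#" if v else "." for v in row))
```

The element is vertical because a removed horizontal line leaves a gap that runs across the stroke from top to bottom. With this 3-pixel element, gaps of three or more rows stay open, so thick lines on a real scan would need a taller element, which in turn risks merging neighbouring strokes.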

The skel() function produced 1-pixel-thick letters. However, the only readable text is the “-uctions” part of the word “instructions”; the other words are not readable. The bwdist() function kept the text readable, but the strokes are not 1 pixel thick. The thin() function produced 1-pixel-thick characters, but they are indistinct. The edilate() function was the least effective operation for this case. I had planned to use it to reconnect the letters cut by the horizontal lines, but this did not work. I also tried combining these operations, but edilate() and bwdist() dominate whenever they are used, the effect of thin() is negligible, and a black image is produced when skel() is combined with other operations.
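For readers who want to see what reducing strokes to 1-pixel thickness involves, here is a minimal Python sketch of the Zhang-Suen thinning algorithm. This is a swapped-in, well-known technique chosen for illustration; I am not claiming that Scilab's thin() or skel() work this way. The sketch assumes a binary image with foreground 1 and a background border at least one pixel wide.

```python
def neighbours(img, y, x):
    """The 8 neighbours P2..P9, clockwise starting from the pixel above."""
    return [img[y - 1][x], img[y - 1][x + 1], img[y][x + 1], img[y + 1][x + 1],
            img[y + 1][x], img[y + 1][x - 1], img[y][x - 1], img[y - 1][x - 1]]

def zhang_suen(img):
    """Peel boundary pixels in two alternating sub-steps until nothing changes."""
    img = [row[:] for row in img]            # work on a copy
    h, w = len(img), len(img[0])
    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for y in range(1, h - 1):
                for x in range(1, w - 1):
                    if img[y][x] != 1:
                        continue
                    p = neighbours(img, y, x)
                    b = sum(p)               # number of foreground neighbours
                    # number of 0 -> 1 transitions around the pixel
                    a = sum(p[i] == 0 and p[(i + 1) % 8] == 1 for i in range(8))
                    p2, p4, p6, p8 = p[0], p[2], p[4], p[6]
                    if step == 0:
                        cond = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        cond = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if 2 <= b <= 6 and a == 1 and cond:
                        to_clear.append((y, x))
            for y, x in to_clear:            # delete in parallel after the scan
                img[y][x] = 0
            changed = changed or bool(to_clear)
    return img

# Toy image: a 3-pixel-thick horizontal bar surrounded by background.
bar = [[1 if (2 <= y <= 4 and 1 <= x <= 7) else 0 for x in range(9)]
       for y in range(7)]
for row in zhang_suen(bar):
    print("".join("#" if v else "." for v in row))
```

On the toy bar the result is a single row of pixels, slightly shorter than the original bar because the ends are eroded too. Thinning only preserves the connectivity that is already there, so strokes broken by line removal stay broken.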

I could not produce a single clean and clear image. Therefore, I give myself a grade of 9/10 for this activity.

Reference:

1. Maricor Soriano, “A8 – Applications of Morphological Operations 1 of 3: Preprocessing Text”, 2012.
