Offline Handwritten Gujarati Word Segmentation
Written by Shubham Patel and Yash Patel
Introduction
Most existing document analysis systems have been developed for printed text. Little work has been done on word segmentation for handwritten documents, and most of it has been applied to special kinds of pages, for example addresses or “clean” pages written specifically for testing document analysis systems. Historical manuscripts suffer from problems including noise, shine-through and other artifacts due to degradation, and no techniques exist to segment words from such handwritten manuscripts. Further, scale space techniques have not been applied to this problem before. We outline the various steps in the segmentation algorithm below.
The input to the system is a grey level document image. The image is processed to remove horizontal and vertical line segments likely to interfere with later operations. The page is then dissected into lines using projection analysis techniques modified for grey level images. The projection function is smoothed with a Gaussian filter (low pass filtering) to eliminate false alarms, and the positions of the local maxima (i.e. the white space between the lines) are detected. Line segmentation, though not essential, is useful in breaking up connected ascenders and descenders and also in driving an automatic scale selection mechanism. The line images are smoothed and then convolved with second order anisotropic Gaussian derivative filters to create a scale space, and the blob-like features which arise from this representation give us the focus of attention regions (i.e. words in the original document image). The problem of automatic scale selection for filtering the document is also addressed. We have come up with an efficient heuristic for scale selection whereby the correct scale for blob extraction is obtained by finding the scale maxima of the blob extent. A connected component analysis of the blob image followed by a reverse mapping of bounding boxes allows us to extract the words. The box is then extended vertically to include the ascenders and descenders. Our approach to word segmentation is novel as it is the first algorithm which utilizes the inherent scale space behavior of words in grey level document images. This gives a brief description of the techniques used.
Why We Chose the Gujarati Language for the Project
Most of the work done on segmentation is for the English language only, so we chose to try to implement the same for our mother tongue, Gujarati.
Characteristics of the Gujarati Language
Gujarati is written from left to right and from top to bottom. The Gujarati script uses 94 symbols altogether, which can be categorized into different groups. The Gujarati character set provides 34 consonants (plus 2 compound consonants, ksha and gna), 14 vowels, each represented by a single symbol, and 10 numerals, as shown in the figure below.
There are 3 other symbols used for representing fractions. These are called “pa” (one fourth), “adadho” (half) and “poNo” (three fourths). Gujarati also has a special symbol called a maatra corresponding to each vowel, which is attached to a consonant to modify its sound. A character is said to be simple if it is a consonant alone or a consonant with a maatra. A character is said to be conjunct if it is a half consonant combined with another consonant. There are many possible conjunct consonants, which increases the difficulty of segmenting and identifying the characters. The vowels (modifiers) can be placed at the left, right, top or bottom of the consonant (or at both top and bottom). A Gujarati word is divided into three regions: the upper region, the middle region and the lower region. The upper and lower regions contain vowels and the middle region contains consonants.
Methodology
This is an implementation of the scale space technique for word segmentation proposed by R. Manmatha and N. Srimal. Even though the paper is from 1999, the method still achieves good results, is fast, and is easy to implement. The algorithm takes an image of a line as input and outputs the segmented words.
Input :- image to be segmented
First Convert :- words are separated by rectangular borders
Second Convert :- multiple segmented images are generated
Output :- Segmenting the words of sample sample2.PNG: segmented into 3 words
An anisotropic filter kernel is applied to the input image to create blobs corresponding to words. After thresholding the blob-image, connected components are extracted which correspond to words.
Parameters :-
Most of the parameters of the function Line segmentation deal with the shape of the filter kernel:
- img: grayscale uint8 image of the text-line to be segmented.
- kernelSize: size of filter kernel, must be an odd integer.
- sigma: standard deviation of Gaussian function used for filter kernel.
- theta: approximated width/height ratio of words, filter function is distorted by this factor.
- minArea: ignore word candidates smaller than specified area.
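The role of kernelSize, sigma and theta can be illustrated with a minimal NumPy sketch of an anisotropic filter kernel. This is not the project's code: the function name create_kernel is hypothetical, and the sketch assumes that theta stretches the Gaussian horizontally (sigma_x = sigma * theta, sigma_y = sigma) and that the kernel is the sum of the two second-order Gaussian derivatives.

```python
import numpy as np


def create_kernel(kernel_size: int, sigma: float, theta: float) -> np.ndarray:
    """Sum of second-order derivatives of an anisotropic 2D Gaussian."""
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be an odd integer")
    half = kernel_size // 2

    # Assumption: words are wider than tall, so theta widens the
    # Gaussian along the horizontal (x) axis only.
    sigma_x = sigma * theta
    sigma_y = sigma

    # Coordinate grids centred on the middle of the kernel.
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)

    gauss = np.exp(-x**2 / (2 * sigma_x**2) - y**2 / (2 * sigma_y**2))
    gauss /= 2 * np.pi * sigma_x * sigma_y

    # Analytic second derivatives of the Gaussian along x and y.
    g_xx = (x**2 - sigma_x**2) / sigma_x**4 * gauss
    g_yy = (y**2 - sigma_y**2) / sigma_y**4 * gauss
    return g_xx + g_yy
```

Calling create_kernel(25, 11, 7), with values chosen only for illustration, returns a 25 x 25 array whose response is elongated along the writing direction, so that neighbouring characters merge into one blob while the gaps between words stay visible.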
The function Prepare Image can be used to convert the input image to grayscale and to resize it to a fixed height:
- img: input image.
- height: image will be resized to fit specified height.
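A rough sketch of what such a preparation step might look like with NumPy alone is given below. The name prepare_image, the assumed RGB channel order, the BT.601 luma weights and the nearest-neighbour resize are all assumptions made for illustration; the project itself may rely on an image library for these operations.

```python
import numpy as np


def prepare_image(img: np.ndarray, height: int) -> np.ndarray:
    """Convert to grayscale and resize to a fixed height, keeping the aspect ratio."""
    if img.ndim == 3:
        # Assumption: channels are ordered R, G, B; weights are the BT.601 luma weights.
        gray = img[..., :3].astype(float) @ np.array([0.299, 0.587, 0.114])
    else:
        gray = img.astype(float)

    scale = height / gray.shape[0]
    width = max(1, int(round(gray.shape[1] * scale)))

    # Nearest-neighbour sampling: map every target row/column back to a source index.
    rows = np.minimum((np.arange(height) / scale).astype(int), gray.shape[0] - 1)
    cols = np.minimum((np.arange(width) / scale).astype(int), gray.shape[1] - 1)
    resized = gray[np.ix_(rows, cols)]

    return np.clip(np.rint(resized), 0, 255).astype(np.uint8)
```

Resizing every line image to the same height means that one choice of filter parameters can be reused across inputs of different resolutions.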
Algorithm :-
The illustration below shows how the algorithm works:
- top left: input image.
- top right: filter kernel is applied.
- bottom left: blob image after thresholding.
- bottom right: bounding boxes around words in original image.
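The last two stages, going from the thresholded blob image to bounding boxes, can be sketched as a plain connected-component search. The function name word_boxes is hypothetical, the sketch uses 4-connectivity, and it interprets minArea as the number of pixels in a blob; these are assumptions rather than details taken from the project.

```python
from collections import deque

import numpy as np


def word_boxes(mask: np.ndarray, min_area: int = 0) -> list:
    """Return (x, y, w, h) boxes of 4-connected True regions, sorted left to right."""
    mask = np.asarray(mask, dtype=bool)
    height, width = mask.shape
    seen = np.zeros_like(mask)
    boxes = []

    for r0, c0 in zip(*np.nonzero(mask)):
        if seen[r0, c0]:
            continue
        # Breadth-first flood fill over one blob, tracking its extent and size.
        seen[r0, c0] = True
        queue = deque([(r0, c0)])
        top, bottom, left, right = r0, r0, c0, c0
        area = 0
        while queue:
            r, c = queue.popleft()
            area += 1
            top, bottom = min(top, r), max(bottom, r)
            left, right = min(left, c), max(right, c)
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if 0 <= nr < height and 0 <= nc < width and mask[nr, nc] and not seen[nr, nc]:
                    seen[nr, nc] = True
                    queue.append((nr, nc))

        # Assumption: "area" means the pixel count of the blob.
        if area >= min_area:
            boxes.append((int(left), int(top), int(right - left + 1), int(bottom - top + 1)))

    return sorted(boxes)
```

Each returned box can then be used to crop the corresponding word from the original line image.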
Line segmentation allows the ascenders and descenders of consecutive lines to be separated. In the manuscripts it is observed that the lines consist of a series of horizontal components from left to right. Projection profile techniques have been widely used in line and word segmentation for machine-printed documents [5]. In this technique a 1D function of the pixel values is obtained by projecting the binary image onto the horizontal or vertical axis. We use a modified version of the same algorithm, extended to gray scale images. Let f(x, y) be the intensity value of a pixel (x, y) in a gray scale image. Then we define the vertical projection profile as
$$P(y) = \sum_{x=1}^{W} f(x, y)$$
where W is the width of the image. Fig. 1 shows a section of an image in (a) and its projection profile in (b). The distinct local peaks in the profile correspond to the white space between the lines, and the distinct local minima correspond to the text (black ink). Line segmentation therefore involves detecting the positions of the local maxima. However, the projection profile has a number of false local maxima and minima. The projection function P(y) is therefore smoothed with a Gaussian (low pass) filter to eliminate false alarms and reduce sensitivity to noise. A smoothed profile is shown in (c). The local maxima are then obtained from the first derivative of the smoothed projection function by solving for y such that:
$$P'(y) = \frac{dP(y)}{dy} = 0$$
The line segmentation technique is robust to variations in the size of the lines and has been tested on a wide range of handwritten pages. The next step after line segmentation is to create a scale space of the line images for blob analysis.
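The projection-profile step described above can be sketched in a few lines of NumPy. The function names and the default sigma are assumptions made for illustration, and the zero of the first derivative is approximated by a sign change of the discrete difference of the smoothed profile.

```python
import numpy as np


def projection_profile(gray: np.ndarray) -> np.ndarray:
    """P(y): sum of the pixel intensities along each image row."""
    return gray.astype(float).sum(axis=1)


def gaussian_smooth(profile: np.ndarray, sigma: float) -> np.ndarray:
    """Low-pass filter the profile with a normalised 1D Gaussian."""
    radius = max(1, int(3 * sigma))
    t = np.arange(-radius, radius + 1)
    kernel = np.exp(-t**2 / (2 * sigma**2))
    kernel /= kernel.sum()
    # Edge padding keeps the output the same length as the input.
    padded = np.pad(profile, radius, mode="edge")
    return np.convolve(padded, kernel, mode="valid")


def line_gaps(gray: np.ndarray, sigma: float = 3.0) -> list:
    """Rows where the smoothed profile has a local maximum (white space between lines)."""
    smooth = gaussian_smooth(projection_profile(gray), sigma)
    d = np.diff(smooth)
    # A maximum lies where the discrete derivative turns from positive to non-positive.
    return [i + 1 for i in range(len(d) - 1) if d[i] > 0 and d[i + 1] <= 0]
```

On a page with dark text on a light background, the returned row indices mark candidate cut positions between consecutive text lines; a larger sigma suppresses more false maxima at the cost of merging closely spaced lines.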
Segmentation Problems
- The lower modifiers of one line overlap with the upper modifiers of the line below.
- Zigzag lines of text, and zigzag words within the same line.
- Unusual spacing between lines.
Working
step1:- Put your image in the data folder
step2:- After completing step1, execute the code. The output gives the number of words present in the line
step3:- Segmented images are stored in the folder named OUT
For reference
Github:-https://github.com/009geniuS/WordSegmentation
Conclusion
In this work, the segmentation procedure for characters in a Gujarati text has been discussed. These segmented characters are used in the recognition step of the scale space technique development. The Gujarati language has a complex set of characters, and sophisticated algorithms are needed to recognize them. The segmentation of characters with modifiers has not been discussed in this work, which may be extended by segmenting the characters with modifiers.