Handwriting Character Recognition using Vector Quantization Technique

ABSTRACT
This paper explores the Learning Vector Quantization (LVQ) processing stages for recognizing the Buginese Lontara script from Makassar and reports the resulting accuracy. The LVQ tests obtained an accuracy of 66.66 %. The best network configuration in the recognition process used a learning rate of 0.02, a maximum of 5000 epochs, and a hidden layer of 90 neurons, applied to feature 8. With this configuration, the obtained performance was a mean square error (MSE) of 0.0306, and the learning process was fairly short at 6 minutes and 38 seconds. Based on the test results, the LVQ method has not yet provided good recognition results and requires further development to generate better ones.

I. Introduction
Indonesia possesses a rich cultural heritage. One substantial part of this heritage is its local languages and their native scripts. The Lontara script of the Buginese language in Makassar is one instance of a native script from South Sulawesi. As a substantial cultural heritage of the Buginese people in Makassar, the Lontara script deserves serious attention from the public, particularly from the local Buginese community in South Sulawesi. Otherwise, the Lontara script may disappear as a result of rapid modernization. At present, the Lontara script is endangered and requires careful attention, since we found only limited data and information about it.
Several attempts have been made to assist the local community and the public in recognizing the Buginese Lontara script pattern from Makassar. With the rapid development of technology, some of these attempts employed technology-based approaches. One such approach uses artificial intelligence (AI) to recognize script patterns. Essentially, AI is concerned with building systems that enable a computer to do human work. To perform human work, the computer must be equipped with the knowledge a human possesses so that it can reproduce human behavior.
This paper implements Learning Vector Quantization (LVQ) to recognize handwritten patterns of the Buginese Lontara script from Makassar. In the near future, this work aims to assist the local community in learning the Buginese Lontara script. This paper is divided into four parts. First, it discusses the research motivation, that is, why the researchers were interested in conducting this study. Second, it presents the methodology and techniques employed by the researchers. Third, it discusses the LVQ method assessment results. Finally, it summarizes the assessment results and provides suggestions and recommendations for future research.

II. Materials and Methods
This part briefly discusses the Buginese Lontara script pattern, the LVQ method, and the image processing employed in this research.

A. Lontara Script
Lontara is a traditional script originating from the Buginese community in Makassar. Its letterforms are based on Sulapa Eppa Wala Suji. Wala Suji is derived from the words Wala and Suji; Wala means divider/fence/keeper and Suji means princess. Wala Suji is a kind of trapezoid bamboo fence used in traditional rituals. Sulapa Eppa (four sides) is a classic mystical belief of the Buginese in Makassar. It represents the four elements of the universe: fire, water, air, and earth. In the past, the Lontara script was used to write governmental regulations and rules as well as the social norms applied within the community. The manuscripts were written using a stick [1].
The data used in this study were primary data, that is, data obtained from the original or first source. Such data are not available in compiled form or as files. They must be collected from sources, commonly referred to as respondents, who are the persons from whom the data are obtained [2]. Samples of the 23 Buginese script characters can be seen in Table 1.

B. Pattern Recognition
Pattern recognition is a branch of artificial intelligence. Several scholars have defined it since the earliest research in the field: pattern recognition deals with recognizing a physical object or event and classifying it into one or more categories [3][4][5]. Likewise, it is described as a science that emphasizes the description and classification of measurements [6][7][8][9].

C. Handwriting Recognition
According to Plamondon and Srihari, handwriting recognition is the process of transforming a language represented in its spatial form into a symbolic representation [10]. In principle, the handwriting recognition stages include data acquisition, pre-processing, feature extraction, and classification with machine learning algorithms such as BPNN, RBFNN, SVM, LVQ, and so forth [11][12][13][14].

D. Image Processing
Image processing comprises a number of processes that improve image quality so that the image meets the user's needs and is easy for humans and/or machines to interpret. Image processing is signal processing whose input is an image. It generates either an image or a set of characteristics or parameters related to the image. The overall system is intended to classify objects into categories and classes based on either a priori knowledge or statistical information extracted from the object pattern [

1) Image Acquisition
Image acquisition is the process of capturing or scanning an analog image and converting it to digital form. This stage commonly begins by capturing the image of the object with a scanner as the primary medium. In this study, handwritten Buginese Lontara script from Makassar was collected from 16 different respondents on A4-sized paper.

2) Enhancing Image Quality
First, cropping. Object cropping is the process of cutting the image so that only the area containing the object remains. It was done to remove unnecessary empty areas around the object and to avoid learning errors caused by the object (the written letter) appearing at different positions. Cropping was performed by cutting 23 regions of the image simultaneously, producing 23 character images ready for the next stage. The cropping step was performed separately from the other parts of the image processing. Next, converting the RGB image to grayscale. A grayscale image is an image whose pixel intensity values are expressed as degrees of gray. At this stage, the RGB image was converted to a grayscale image with only one color channel [16][17].
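The cropping and grayscale steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: the paper crops 23 predefined regions, whereas this sketch crops one character to its bounding box, and the helper names `crop_to_object` and `rgb_to_gray` as well as the BT.601 luma weights are assumptions.

```python
def crop_to_object(image, background):
    """Keep only the bounding box of pixels that differ from the background."""
    rows = [y for y, row in enumerate(image) if any(p != background for p in row)]
    cols = [x for x in range(len(image[0]))
            if any(row[x] != background for row in image)]
    if not rows:  # blank image: nothing to crop
        return image
    return [row[cols[0]:cols[-1] + 1] for row in image[rows[0]:rows[-1] + 1]]


def rgb_to_gray(rgb_image):
    """Convert rows of (R, G, B) tuples to rows of 0-255 gray levels."""
    # ITU-R BT.601 luma weights (assumed; the paper does not state its formula).
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb_image]


# Hypothetical 5 x 5 scan: white page with three black "ink" pixels.
white, black = (255, 255, 255), (0, 0, 0)
page = [[white] * 5 for _ in range(5)]
page[1][2] = black
page[2][2] = black
page[2][3] = black

# Crop first, then convert to grayscale, following the order described above.
cropped = rgb_to_gray(crop_to_object(page, background=white))
```

Cropping to the bounding box removes the dependence on where the letter sits on the page, which is the motivation given for this step.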

3) Image Quality Enhancement
Image sharpening, also commonly referred to as image transformation, is one form of image quality enhancement. It is commonly employed to increase the color contrast and brightness of an image in order to simplify its interpretation and analysis. Contrast sharpening corrects the image display by maximizing the contrast between the light and dark parts of the image [17][18].
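One common way to maximize the contrast between light and dark regions is linear contrast stretching. The sketch below is an assumption for illustration only; the paper does not specify which enhancement formula was used, and `stretch_contrast` is a hypothetical helper name.

```python
def stretch_contrast(gray):
    """Linearly rescale gray levels so the darkest pixel becomes 0 and the brightest 255."""
    lo = min(min(row) for row in gray)
    hi = max(max(row) for row in gray)
    if hi == lo:  # flat image: no contrast to stretch
        return [row[:] for row in gray]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in gray]


# Hypothetical low-contrast 2 x 2 image.
stretched = stretch_contrast([[50, 100], [150, 200]])
```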

4) Image Segmentation
Thresholding is an image segmentation method that separates the object from the background of an image based on brightness level. It converts a color or grayscale image into a binary (black-and-white) image. The thresholding process takes the value of each image pixel and compares it with the threshold value. A pixel is set to white if its color or grayscale value is above the threshold and to black if its value is below the threshold [19].
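The thresholding rule can be sketched in a few lines. This is a minimal illustration: the default threshold of 128 and the function name `to_binary` are assumptions, and a pixel exactly equal to the threshold is treated as black here because the text does not say how ties are handled.

```python
def to_binary(gray, threshold=128):
    """Map each gray level to white (255) if above the threshold, else black (0)."""
    return [[255 if p > threshold else 0 for p in row] for row in gray]


# Hypothetical one-row image covering values below, at, and above the threshold.
binary = to_binary([[0, 128, 129, 255]])
```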

5) Size Feature Extraction
First, image resizing. Image resizing is the process of reducing the size of an image in terms of its number of pixels, for example from 186 x 186 pixels to 42 x 42 pixels. Resizing is needed to reduce the number of pixels while preserving the shape of the object so that it does not change significantly. Reducing the number of pixels limits the number of nodes or neurons in the input layer of the artificial neural network, since a larger input layer requires a longer processing time. Then, image thinning. Image thinning is an iterative process that removes black pixels (turning them into white pixels) at the edges of the pattern. The purpose of thinning is to remove the redundant part of the pattern so that only the important information remains [20].
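The resizing step can be sketched with nearest-neighbor sampling. This is an assumed interpolation method chosen for simplicity; the paper does not state which one was used, and `resize_nearest` is a hypothetical helper name. Thinning is not shown.

```python
def resize_nearest(image, new_h, new_w):
    """Resize a 2-D image by picking the nearest source pixel for each target pixel."""
    old_h, old_w = len(image), len(image[0])
    return [[image[y * old_h // new_h][x * old_w // new_w] for x in range(new_w)]
            for y in range(new_h)]


# Hypothetical 186 x 186 checker pattern reduced to the 42 x 42 size used in the paper.
large = [[(x + y) % 2 * 255 for x in range(186)] for y in range(186)]
small = resize_nearest(large, 42, 42)
```

A 42 x 42 image still contains 1764 pixels, which is why the later stages summarize the image by segment features instead of feeding raw pixels to the network.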

E. Learning Vector Quantization (LVQ)
The LVQ method is a classification algorithm that uses connected weight vectors and works competitively, but under supervision. The LVQ network is composed of a competitive layer (made up of several neurons) that classifies input vectors on the principle of competition: the class is determined by the winner of the competition, and the winner is the reference vector closest to the input vector. LVQ is widely used in pattern recognition and data classification. The method is quite simple but very effective [21][22].
The LVQ architecture consists of an input layer, a competitive layer (in which inputs compete to enter a class based on the proximity of their distance), and an output layer. The input layer is connected to the competitive layer by weights. In the competitive layer, the learning process is supervised, and each input competes to enter a class. The stages of LVQ are as follows:
1. Enter the training data. The input data are the result of image processing, so the training data are digital images that are ready to be processed by the pattern recognition system.
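The competitive, supervised behavior described above can be sketched with the basic LVQ1 update rule: the nearest reference vector is pulled toward an input of the same class and pushed away from an input of a different class. This is a minimal sketch under stated assumptions, not the network used in the study: the function names, the fixed learning rate schedule, the toy two-dimensional data, and the labels "ka" and "ga" are all illustrative.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(prototypes, x):
    """Index of the reference vector closest to x (the competition winner)."""
    return min(range(len(prototypes)),
               key=lambda i: squared_distance(prototypes[i][0], x))


def train_lvq1(samples, prototypes, learning_rate=0.02, epochs=100):
    """samples: list of (vector, label); prototypes: list of [vector, label], updated in place."""
    for _ in range(epochs):
        for x, label in samples:
            w, w_label = prototypes[nearest(prototypes, x)]
            # Move the winner toward x if the classes match, away otherwise.
            sign = 1.0 if w_label == label else -1.0
            for j in range(len(w)):
                w[j] += sign * learning_rate * (x[j] - w[j])
    return prototypes


def classify(prototypes, x):
    """Return the class label of the winning reference vector."""
    return prototypes[nearest(prototypes, x)][1]


# Toy data with two hypothetical classes.
samples = [([0.0, 0.1], "ka"), ([0.1, 0.0], "ka"),
           ([1.0, 0.9], "ga"), ([0.9, 1.0], "ga")]
prototypes = [[[0.2, 0.2], "ka"], [[0.8, 0.8], "ga"]]
train_lvq1(samples, prototypes)
```

After training, `classify(prototypes, [0.05, 0.05])` returns "ka" because the first reference vector has moved toward the "ka" samples.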

F. Performance Accuracy
Statistical measures of the accuracy of an algorithm include mean absolute error (MAE), mean square error (MSE), root mean squared error (RMSE), and mean absolute percentage error (MAPE). These measures are used to find the best-performing configuration [23][24][25][26]. In this study, the MSE was chosen to measure accuracy, computed as in (2):

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(t_i - y_i\right)^2 \tag{2}$$

where $t_i$ is the data (target) value, $y_i$ is the result (output) value, and $n$ is the number of patterns.
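In its standard form, the MSE is the mean of the squared differences between target and output values. The sketch below assumes that standard definition; the function name and the sample values are illustrative and do not come from the study.

```python
def mean_squared_error(targets, outputs):
    """Mean of squared differences between target values and network outputs."""
    if not targets or len(targets) != len(outputs):
        raise ValueError("targets and outputs must be non-empty and equal in length")
    return sum((t - y) ** 2 for t, y in zip(targets, outputs)) / len(targets)


# Hypothetical targets and outputs for four patterns.
mse = mean_squared_error([1.0, 0.0, 1.0, 0.0], [0.5, 0.0, 1.0, 0.5])
```

An MSE of 0 means every output matches its target, which is why values closer to 0 indicate better performance.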

III. Results and Discussions
This part discusses the results of image processing and feature extraction, followed by the assessment using Learning Vector Quantization (LVQ).

A. Image Processing Results
Image processing aims to improve image quality and produce an image that meets the user's needs and is easy for humans or machines to interpret. Following the image processing stages, the first stage comprised image acquisition, cropping, conversion of the original images to grayscale, and image quality enhancement (sharpening). Second, in the image segmentation process, the grayscale image was converted into a binary image using the threshold method, and the size features were then extracted by resizing the image to 42 x 42 pixels and thinning it. Third, feature extraction was performed to obtain the required data. The cropping results are presented in Figure 1.
Next, the fourth stage was the segmentation process. This study used the Chessboard method, which divides an image into square segments of a fixed size. Each image was divided into nine segments, which were then used by IoC to count the number of black pixels in each segment. The segmentation results can be seen in Figure 2.
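Dividing an image into nine square segments and counting the black pixels in each can be sketched as follows. This is an illustration under assumptions: a 3 x 3 grid, black encoded as 0, and the hypothetical helper name `count_black_per_segment`; the paper does not give its exact implementation.

```python
def count_black_per_segment(binary, grid=3, black=0):
    """Split the image into grid x grid segments and count black pixels in each."""
    h, w = len(binary), len(binary[0])
    counts = [0] * (grid * grid)
    for y, row in enumerate(binary):
        for x, p in enumerate(row):
            if p == black:
                # Segment index in row-major order, e.g. 0..8 for a 3 x 3 grid.
                counts[(y * grid // h) * grid + (x * grid // w)] += 1
    return counts


# Hypothetical 6 x 6 binary image with one black pixel on each diagonal segment.
image = [[255] * 6 for _ in range(6)]
image[0][0] = 0
image[3][3] = 0
image[5][5] = 0
counts = count_black_per_segment(image)
```

For a 42 x 42 image, each of the nine segments covers 14 x 14 pixels, so one image yields nine counts.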
Last, features were extracted from each segment using Mark Direction, which calculates vertical (vert), horizontal (horz), left diagonal (dig1), and right diagonal (dig2) masking values. The feature extraction process was carried out on 16 x 23 letters of data, forming 8 feature variations for each Buginese Lontara character. These features were used as input to the network. The amount of data used can be seen in Table 2.
In this experiment, the results of the feature extraction process were saved as a dataset in an .xlsx file. The feature data sets from the 1st to the 8th feature were then used as training and testing data for the LVQ method.

B. Image Calculation with Learning Vector Quantization (LVQ) Method
The LVQ network architecture used was similar to the previous BPNN network architecture, consisting of an input layer, a hidden layer, and an output layer. The number of hidden layers can be changed as required. The input layer consists of 9, 18, or 45 neurons, depending on which of the 8 feature combinations obtained from the feature extraction process is used. The output layer consists of 23 neurons, since the target output is the 23 letters of the Lontara alphabet. Network training with the LVQ method used 230 training data and 138 test data. As with the BPNN method, training was influenced by the learning rate, the number of epochs, and the number of neurons in the hidden layer. The simulations for the LVQ method mirrored those for the BPNN method so that both methods could be compared under the same parameters. Six simulations were performed, varying three parameters: the learning rate, the maximum number of epochs, and the number of hidden-layer neurons. The variation of each parameter can be seen in Table 3.
Of the six simulations, the 5th gave the best results with the LVQ method, using one hidden layer with 90 neurons, a maximum of 5000 epochs, and a target error of 0. The learning function used was learnlv1, and the output layer had 23 neurons. The learning rate in this simulation was 0.02. This network configuration was applied to feature combinations 1 to 8. The results of the 5th simulation can be seen in Table 4.
Based on Table 4, 92 of the 138 letters were recognized correctly, an accuracy of 66.66 %, using feature combination 8. The resulting performance (MSE) of 0.0306 was the second closest to 0, after feature 5 with an MSE of 0.0276. The best epoch was 184, reached with a learning time of 6 minutes 38 seconds.
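The reported accuracy follows directly from the counts in the text, as the short check below shows. The function name is illustrative; the numbers (16 respondents, 23 letters, 230 training data, 138 test data, 92 correct) are taken from the paper.

```python
def accuracy_percent(correct, total):
    """Share of correctly recognized samples, in percent."""
    return 100.0 * correct / total


# 16 respondents x 23 letters give the full data set, split into training and test data.
total_samples = 16 * 23
train_count, test_count = 230, 138

# 92 of the 138 test letters were recognized correctly.
accuracy = accuracy_percent(92, test_count)
```

The exact ratio is 66.666... %, which the paper reports truncated to two decimals as 66.66 %.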

C. Learning Vector Quantization (LVQ) Method Analysis
The analysis covers three variations: the learning rate, the maximum number of epochs, and the number of neurons in the hidden layer. Varying the learning rate in the first and second simulations changed the accuracy inconsistently; the accuracy of each feature combination could either increase or decrease. Overall, the results of this variation were poor, since the accuracy obtained was only around 42.02 %. The results were also influenced by the maximum epoch.
Varying the maximum epoch in the third and fourth simulations had an effect, but one similar to that of the learning rate. The accuracy increased for some feature combinations and decreased for others, so it did not improve consistently across feature combinations.
Changes in accuracy also affected the resulting MSE. All training runs in this variation stopped at a best epoch equal to the maximum epoch given, namely 10 epochs and 50 epochs. However, the overall time spent in the learning process also increased: the larger the maximum epoch, the longer the learning time. The maximum epoch therefore has a significant influence on learning time in this study. Changing the number of hidden-layer neurons, however, did not raise the accuracy above that of the reference point, the 5th simulation, which had the highest accuracy of 66.66 %.
The best results can be seen in Table 4. Taking into account the number of letters read and the learning time, the best result was found in the 5th simulation: 92 of 138 letters recognized, an accuracy of 66.66 %, an MSE of 0.0306, and a best epoch of 184 with a learning time of 6 minutes 38 seconds.

IV. Conclusion
An analysis of the recognition of the Buginese Lontara script from Makassar using the LVQ method has been carried out. Based on the experimental results, the LVQ method reached its highest accuracy of 66.66 % in the 5th simulation with feature 8, recognizing 92 of the 138 input data. The learning time needed was 6 minutes 38 seconds. Parameters such as the learning rate, the number of neurons in the hidden layer, and the maximum number of epochs greatly affect the amount of data recognized and the accuracy of the recognition results. In this study, the best parameters of the LVQ method were a learning rate of 0.02, 90 neurons in the hidden layer, and 5000 epochs. With these parameters, the LVQ method achieved a performance (MSE) of 0.0306.