Handwritten Tamil Word Pre-Processing and Segmentation Based on NLP Using Deep Learning Techniques
Abstract
Tamil is a traditional Indian language spoken mostly by South Indians, Sri Lankans, and Malaysians. This paper proposes a novel technique for pre-processing and segmenting handwritten Tamil words through NLP. The RGB input image is first converted to a grayscale image using a threshold value. The image is then segmented by line boundary detection with an AlexNet-based convolutional neural network (AlexNet-CNN) deep learning architecture. In the proposed system, every text image is scaled to the required pixel size before training; that is, every scaled word contains a fixed pixel count, and these fixed-size words are used to train the network. The findings indicate that the proposed method achieves better detection accuracy on handwritten words than comparable feature-extraction techniques. A descriptive analysis was performed over numerous images in terms of effectiveness, accuracy, recall, and F1 measure.
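The pre-processing steps named in the abstract (grayscale conversion, thresholding, line boundary detection, and scaling to a fixed pixel count) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the abstract gives no threshold value or target size, so `threshold=128` and `size=(64, 64)` are assumptions, all function names are hypothetical, and a plain horizontal projection profile stands in for the AlexNet-CNN line boundary detector.

```python
import numpy as np


def rgb_to_gray(rgb):
    """Convert an H x W x 3 uint8 RGB image to grayscale using BT.601 luma weights."""
    weights = np.array([0.299, 0.587, 0.114])
    return (rgb[..., :3].astype(np.float64) @ weights).round().astype(np.uint8)


def binarize(gray, threshold=128):
    """Return a boolean ink mask; the threshold of 128 is an assumed value."""
    return gray < threshold


def line_boundaries(ink):
    """Return (start, end) row ranges that contain ink; end is exclusive.

    Uses a horizontal projection profile, a simple stand-in for the
    learned line boundary detector described in the abstract.
    """
    has_ink = ink.any(axis=1)
    padded = np.concatenate(([False], has_ink, [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    return [(int(s), int(e)) for s, e in zip(changes[::2], changes[1::2])]


def resize_nearest(img, size=(64, 64)):
    """Scale a 2-D image to a fixed size by nearest-neighbour sampling.

    The 64 x 64 target is an assumed value, not taken from the paper.
    """
    rows = np.arange(size[0]) * img.shape[0] // size[0]
    cols = np.arange(size[1]) * img.shape[1] // size[1]
    return img[rows][:, cols]


# Usage on a synthetic white page with two dark bands standing in for text lines.
page = np.full((10, 8, 3), 255, dtype=np.uint8)
page[2:4, 1:7] = 0
page[6:9, 2:6] = 0

ink = binarize(rgb_to_gray(page))
lines = line_boundaries(ink)
crops = [resize_nearest(ink[start:end]) for start, end in lines]
```

Each entry of `crops` then has the same pixel count, which is the property the abstract relies on when feeding scaled words to the network. A real pipeline would further split each line into words before scaling.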