GRNTI 50.07 Theoretical foundations of computer engineering
BBK 3297 Computer engineering
The paper proposes a mathematical model of the grapheme. The model has two main purposes: to give a strict definition of the concept of "grapheme" and to capture the structure shared by images of the same character. The grapheme is constructed with a continuous skeletal approach: the skeleton of a binary image of the symbol is built and then regularized. We also apply the model to the problem of recognizing text in a digital image. For this purpose, features based on vertex positions in the grapheme model are extracted, and a classifier is trained on these features to determine which class the grapheme extracted from the binary image of a single symbol belongs to. We also consider a method of preprocessing the input text image to better isolate characters, lines, and words. Experiments demonstrate the performance of the proposed grapheme model; the classification algorithm shows results comparable with modern text recognition methods.
optical character recognition, digital text image, digital font, grapheme, mathematical model, medial representation, aggregated skeleton graph