Bridging classical and neural methods for improved segmentation in mathematical text-based images
Abstract
The recognition of mathematical expressions remains a challenging task, largely because of the segmentation sub-stage, which plays a critical role in the overall recognition process. Despite significant advances in mathematical expression recognition, existing research has focused primarily on the recognition phase and has often overlooked the segmentation issues that also arise in related domains such as computer vision and image processing. This study addresses the segmentation problem in handwritten mathematical text and expression recognition through a comprehensive analysis of existing methods and the development of a neural network-based solution. Classical segmentation methods were surveyed, classified, and tested on several datasets of mathematical expressions, and multiple comparative case analyses were conducted. Based on these findings, a neural network-based segmentation approach was proposed. The model achieved competitive mean Intersection over Union (IoU) scores of 79.4%, 83.5%, 81.3%, 74.6%, and 79.6% on the CROHME 2014, CROHME 2016, CROHME 2019, Aidapearson, and HasyV datasets, respectively. These results indicate that the proposed neural network-based solution overcomes key limitations of traditional segmentation methods and has the potential to improve mathematical expression recognition systems.
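For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a mean IoU score can be computed. It assumes binary segmentation masks represented as sets of foreground pixel coordinates and averages the per-image IoU; the abstract does not specify the exact averaging protocol used in the study, so the function names (`iou`, `mean_iou`), the empty-mask convention, and the toy data are all illustrative assumptions rather than the authors' evaluation code.

```python
from typing import Sequence, Set, Tuple

Pixel = Tuple[int, int]  # (row, col) of a foreground pixel


def iou(pred: Set[Pixel], truth: Set[Pixel]) -> float:
    """Intersection over Union of two sets of foreground pixels."""
    union = pred | truth
    if not union:
        # Assumption: two empty masks count as a perfect match.
        return 1.0
    return len(pred & truth) / len(union)


def mean_iou(pairs: Sequence[Tuple[Set[Pixel], Set[Pixel]]]) -> float:
    """Average the IoU over (prediction, ground truth) mask pairs."""
    if not pairs:
        raise ValueError("mean_iou needs at least one pair")
    return sum(iou(p, t) for p, t in pairs) / len(pairs)


# Toy data (hypothetical): two small hand-made mask pairs.
pred_a = {(0, 0), (0, 1), (1, 0), (1, 1)}
true_a = {(0, 1), (1, 1), (0, 2), (1, 2)}  # overlaps pred_a in 2 of 6 pixels
pred_b = {(5, 5), (5, 6)}
true_b = {(5, 5), (5, 6)}                  # identical masks

score = mean_iou([(pred_a, true_a), (pred_b, true_b)])
print(f"mean IoU = {score:.3f}")  # prints "mean IoU = 0.667"
```

In this toy case the first pair scores 2/6 and the second scores 1.0, giving a mean of about 0.667. A score such as 79.4% would correspond to predicted and reference regions overlapping in roughly four fifths of their combined area on average, under the per-image averaging assumed here.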