Handwritten character segmentation remains one of the most challenging and essential phases of Optical Character Recognition (OCR) and handwritten document analysis. Unconstrained handwriting, varying writing styles, touching and overlapping characters, inconsistent spacing, and noise all hinder accurate segmentation and recognition. Traditional segmentation approaches operate primarily on uncompressed images; however, recent studies demonstrate that performing segmentation directly on run-length encoded (RLE) compressed handwritten documents improves computational efficiency and reduces memory usage. This paper presents a consolidated review and analysis of segmentation methodologies, including explicit segmentation, implicit segmentation, projection-based analysis, connected component analysis, graph-based techniques, clustering approaches, and hybrid recognition-based methods. It also examines segmentation strategies for applications such as postal address recognition, content-based image retrieval, number plate detection, and cursive word recognition. Reported experimental results indicate that hybrid approaches based on min-cut graphs, dynamic programming, and hidden Markov models (HMMs) outperform purely classical dissection for cursive scripts. The paper identifies deep learning-based models and combined compressed-domain OCR systems as future directions for attaining higher segmentation and recognition accuracy. In summary, this work gives a detailed overview of segmentation-related challenges, techniques, and trends that can help both researchers and practitioners achieve robust handwritten OCR performance.
Kumar et al. (Wed,) studied this question.