A Two-Stage End-to-End Framework for Robust Scene Text Spotting with Self-Calibrated Detection and Contextual Recognition | Synapse