Enhancing Real-World Active Speaker Detection with Multi-Modal Extraction Pre-Training