Pre-Trained Acoustic-and-Textual Modeling for End-To-End Speech-To-Text Translation