Self-Supervised and Explainable Transformer-Based Architectures for Robust End-to-End Speech and Language Understanding | Synapse