Causality-based Cross-Modal Representation Learning for Vision-and-Language Navigation