A Transformer-based multi-modal fusion network for semantic segmentation of high-resolution remote sensing imagery | Synapse