When CNNs Meet Vision Transformer: A Joint Framework for Remote Sensing Scene Classification