Key points are not available for this paper at this time.
We present a method for automatic feature extraction and cross-modal mappingusing deep learning. Our system uses stacked autoencoders to learn a layeredfeature representation of the data. Feature vectors from two (or more)different domains are mapped to each other, effectively creating a cross-modalmapping. Our system can either run fully unsupervised, or it can use high-levellabeling to fine-tune the mapping according a user's needs. We show severalapplications for our method, mapping sound to or from images or gestures. Weevaluate system performance both in standalone inference tasks and incross-modal mappings.
Fried et al. (Sat,) studied this question.
Synapse has enriched 5 closely related papers on similar clinical questions. Consider them for comparative context: