Towards textually describing complex video contents with audio-visual concept classifiers