Decode, move and speak! Self-supervised learning of speech units, gestures, and sounds relationships using vocal imitation