Learning long-term dependencies with gradient descent is difficult | Synapse