NALA: a Nesterov accelerated look-ahead optimizer for deep learning | Synapse