Comparing exploration strategies for Q-learning in random stochastic mazes | Synapse