Sample mean based index policies byO(logn) regret for the multi-armed bandit problem | Synapse