We introduce a computationally efficient algorithm for zeroth-order bandit convex optimisation and prove that in the adversarial setting its regret is at most $d^{3.5} \sqrt{n} \, \mathrm{polylog}(n, d)$ with high probability, where $d$ is the dimension and $n$ is the time horizon. In the stochastic setting the bound improves to $M d^{2} \sqrt{n} \, \mathrm{polylog}(n, d)$, where $M \in [d^{-1/2}, d^{-1/4}]$ is a constant that depends on the geometry of the constraint set and the desired computational properties.
Fokkema et al. studied this question.