This paper studies stochastic power control over a fading channel, where the transmitter randomly harvests renewable energy from the environment and stores it in a battery for future data transmissions. Data packets are assumed to arrive at the transmitter's data queue at a constant rate μ. To incorporate delay quality-of-service guarantees, two delay constraint models are considered separately: an average delay model with a maximum average delay constraint, and a statistical delay model with a maximum delay outage probability constraint. Under each delay constraint model, the stochastic power control problem aims to maximize μ while accounting for the randomness of the channel fading and energy harvesting (EH) processes. The resulting optimization problems can be formulated as infinite-horizon Markov decision processes. Under the average delay model, the optimal power control policy needs to keep track of the current data queue-length state in addition to the battery state. Under the statistical delay model, by contrast, a sufficiently large queue-length region is assumed, so the optimal policy does not depend on the data queue-length state. We study various structural properties of the optimal control policies and develop online power control algorithms that converge to the optimal solutions without requiring statistical knowledge of the channel fading and EH processes. By defining and learning the so-called post-decision state-value functions, the proposed learning algorithms have lower complexity and converge faster than conventional reinforcement learning algorithms. Numerical results demonstrate the effectiveness of the online learning algorithms for different delay constraint models and EH settings.
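To make the idea of learning a post-decision state-value function concrete, the following is a minimal Python sketch on a toy model and is not the paper's algorithm. Everything in it is an assumption made for illustration: the two-level i.i.d. channel gains, the i.i.d. harvest sizes, the discounted-throughput reward (used here in place of the paper's delay-constrained maximization of μ), the step-size schedule, and all function names. The post-decision state is taken to be the battery level after transmitting but before the next harvest. Because the sampled harvest and gain do not depend on the battery level in this toy model, each observed sample is reused to update every post-decision level, so the learner never needs the distributions of the two processes.

```python
import math
import random

GAINS = (0.5, 2.0)    # assumed two-level i.i.d. fading gains (hypothetical)
HARVEST = (0, 1, 2)   # assumed i.i.d. harvested energy units per slot (hypothetical)


def rate(gain, energy):
    """Toy reward: bits sent when spending `energy` units at channel gain `gain`."""
    return math.log2(1.0 + gain * energy)


def best_action(battery, gain, v_post, gamma):
    """Greedy energy choice and its value, given post-decision values v_post."""
    scored = [(rate(gain, a) + gamma * v_post[battery - a], a)
              for a in range(battery + 1)]
    value, action = max(scored)
    return action, value


def learn_post_decision_values(slots=5000, b_max=5, gamma=0.9, seed=0):
    """Stochastic-approximation estimate of V(battery level after transmitting)."""
    rng = random.Random(seed)
    v_post = [0.0] * (b_max + 1)
    for t in range(1, slots + 1):
        # Observe one sample of the unknown processes; no distributions are used.
        harvested = rng.choice(HARVEST)
        gain = rng.choice(GAINS)
        alpha = 1.0 / t ** 0.7  # diminishing step size (an assumed schedule)
        new_v = []
        for b_post in range(b_max + 1):
            # Next pre-decision battery level reached from this post-decision level.
            b_next = min(b_post + harvested, b_max)
            _, target = best_action(b_next, gain, v_post, gamma)
            new_v.append((1.0 - alpha) * v_post[b_post] + alpha * target)
        v_post = new_v
    return v_post


if __name__ == "__main__":
    values = learn_post_decision_values()
    for gain in GAINS:
        plan = [best_action(b, gain, values, 0.9)[0] for b in range(len(values))]
        print(f"gain={gain}: energy spent per battery level = {plan}")
```

In this sketch the learned table has one entry per battery level rather than one per state-action pair, which is the kind of reduction in learning effort that the post-decision formulation is meant to provide.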
Ahmed et al. (Mon,) studied this question.