Because recommender systems are central to web domains such as e-commerce and content sharing, providing equitable item exposure regardless of popularity has become imperative. Traditional fairness-aware approaches, however, typically aim for a better trade-off between recommendation accuracy and fairness. They focus on raising the exposure rate of long-tail items in static settings and evaluate fairness on one-shot recommendation decisions using logged data. Such methods overlook the dynamic nature of user preferences in real-world interactive environments. In contrast, our work seeks a win-win solution that improves recommendation accuracy and fairness together over the long term, rather than trading one off against the other. To this end, we empirically demonstrate and analyze the spatiotemporal heterogeneity of user popularity preference. Our findings reveal complementary characteristics that, when fully exploited, can guide personalized strategies for long-term fairness. Building on this insight, we propose HER4IF, a novel hierarchical reinforcement learning framework for interactive recommendation. HER4IF decomposes the recommendation process into two tasks: dynamic fairness control and item recommendation. The high-level agent continuously learns adaptive fairness constraints from evolving user popularity preferences, while the low-level agent refines its recommendation policy under these personalized constraints. Extensive experiments on three real-world datasets and the interactive recommendation platform KuaiSim demonstrate that HER4IF significantly outperforms state-of-the-art methods, achieving substantial improvements in both fairness and recommendation accuracy. Our code is available at: https://github.com/1163710212/HER4IF.
Xia et al. (Thu,) studied this question.
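
To make the two-level decomposition described above concrete, the following is a minimal illustrative sketch and not the HER4IF implementation. It assumes that a fairness constraint can be expressed as a minimum number of long-tail slots per slate, and it replaces both learned agents with fixed rules. The names `Item`, `high_level_quota`, and `low_level_slate` are hypothetical and do not come from the released code.

```python
import math
from dataclasses import dataclass


@dataclass
class Item:
    item_id: int
    score: float        # relevance score from some base recommender (assumed given)
    is_long_tail: bool  # True if the item is below an assumed popularity threshold


def high_level_quota(recent_long_tail_ratio: float, slate_size: int) -> int:
    """Hypothetical stand-in for the high-level agent.

    Maps a user's recent share of long-tail interactions to a minimum
    number of long-tail slots. A learned policy would replace this rule.
    """
    ratio = min(max(recent_long_tail_ratio, 0.0), 1.0)  # clamp to [0, 1]
    return math.floor(ratio * slate_size)


def low_level_slate(candidates: list[Item], slate_size: int, min_long_tail: int) -> list[Item]:
    """Hypothetical stand-in for the low-level agent.

    Reserves up to `min_long_tail` slots for the best-scoring long-tail
    items, fills the remaining slots by score, and returns the slate
    ordered by score.
    """
    ranked = sorted(candidates, key=lambda it: it.score, reverse=True)
    tail = [it for it in ranked if it.is_long_tail][:min_long_tail]
    chosen = {it.item_id for it in tail}
    rest = [it for it in ranked if it.item_id not in chosen][: slate_size - len(tail)]
    return sorted(tail + rest, key=lambda it: it.score, reverse=True)


# Toy candidate pool: items 4 and 6 are long-tail.
candidates = [
    Item(1, 0.9, False), Item(2, 0.8, False), Item(3, 0.7, False),
    Item(4, 0.6, True), Item(5, 0.5, False), Item(6, 0.4, True),
]

quota = high_level_quota(recent_long_tail_ratio=0.5, slate_size=4)
slate = low_level_slate(candidates, slate_size=4, min_long_tail=quota)
print([it.item_id for it in slate])  # [1, 2, 4, 6]
```

In this sketch the quota is recomputed for each user at each interaction step, so the constraint follows the user's changing popularity preference. In a reinforcement learning setting, both functions would be learned policies rather than fixed rules.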