September 20, 2023

Bandit Algorithms Applied in Online Advertisement to Evaluate Click-Through Rates


Abstract

Reinforcement learning approaches are increasingly used to model complex decision problems. The multi-armed bandit problem is a classical reinforcement learning instance that requires balancing exploration against exploitation, a trade-off that is fundamental to many reinforcement learning applications. Multi-armed bandit algorithms are useful in industry domains such as computer games, clinical trials, telecommunications, and recommender systems. This paper studies the multi-armed bandit problem and contextualizes the algorithms to provide a framework for optimizing click-through rates in online advertising, thereby improving customer loyalty. To that end, parameterized bandit algorithms, namely upper confidence bound (UCB), epsilon-greedy (ε-greedy), and SoftMax, were implemented and tuned to maximize performance on an advertising platform. The results indicate strong performance in choosing the best adverts. The UCB approach achieves the highest cumulative mean reward for arm selection over the iterations. Experiments indicate that the proposed system outperforms conventional techniques when ε and τ are set to 0.1, as it does not rely on the availability of data over varying cycles.
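The three algorithms named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, which the abstract does not show. It assumes each advert is an arm with a fixed, simulated Bernoulli click-through rate, uses the standard UCB1 bonus as one possible UCB variant, and sets ε and τ to 0.1 as the abstract reports. All function names and the example click-through rates are hypothetical.

```python
import math
import random


def make_epsilon_greedy(epsilon):
    """Return a policy that explores uniformly with probability epsilon."""
    def select(counts, values, rng):
        if rng.random() < epsilon:
            return rng.randrange(len(values))
        # Exploit: arm with the highest estimated click-through rate.
        return max(range(len(values)), key=values.__getitem__)
    return select


def make_softmax(tau):
    """Return a policy that samples arms in proportion to exp(value / tau)."""
    def select(counts, values, rng):
        top = max(values)  # subtract the max for numerical stability
        weights = [math.exp((v - top) / tau) for v in values]
        return rng.choices(range(len(values)), weights=weights)[0]
    return select


def ucb1(counts, values, rng):
    """UCB1 policy (assumed variant): estimated value plus a confidence bonus."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm  # play every arm once before using the bonus
    total = sum(counts)
    return max(
        range(len(values)),
        key=lambda a: values[a] + math.sqrt(2 * math.log(total) / counts[a]),
    )


def simulate(select, ctrs, rounds, seed=0):
    """Run one policy against simulated adverts with fixed click probabilities.

    Returns the per-arm pull counts and the mean reward (observed CTR).
    """
    rng = random.Random(seed)
    counts = [0] * len(ctrs)
    values = [0.0] * len(ctrs)
    total_reward = 0
    for _ in range(rounds):
        arm = select(counts, values, rng)
        reward = 1 if rng.random() < ctrs[arm] else 0  # simulated click
        counts[arm] += 1
        # Incremental update of the running mean reward for this arm.
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward
    return counts, total_reward / rounds


if __name__ == "__main__":
    example_ctrs = [0.02, 0.05, 0.20]  # hypothetical click-through rates
    policies = {
        "epsilon-greedy": make_epsilon_greedy(0.1),
        "softmax": make_softmax(0.1),
        "ucb1": ucb1,
    }
    for name, policy in policies.items():
        counts, mean_reward = simulate(policy, example_ctrs, rounds=5000)
        print(name, counts, round(mean_reward, 4))
```

All three policies share the same `(counts, values, rng)` signature, so they can be compared under an identical simulation loop. A real advertising platform would replace the simulated click with logged user feedback.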


Cite This Study

Mambou et al. (Wed,) studied this question.

synapsesocial.com/papers/6a17a0c356b3e2ada41297a6 https://doi.org/10.1109/africon55910.2023.10293356
