Abstract—Neural network performance is highly sensitive to hyperparameter settings, yet exhaustive search over the configuration space is computationally prohibitive. We present DP-HPO, a framework that formulates hyperparameter optimisation (HPO) as a finite-horizon Markov Decision Process (MDP) and solves it via approximate dynamic programming (ADP). When hyperparameter dimensions are conditionally independent, a condition empirically satisfied on standard MLP search spaces, DP-HPO implements exact dynamic programming, committing each dimension optimally via Bellman's backward induction. When independence is violated, we derive an optimality gap bound of $f^* - f_{\text{DP-HPO}} \le (d-1)\,\varepsilon$, where $\varepsilon$ is the maximum pairwise interaction strength and $d$ is the number of hyperparameter dimensions (Theorem 1). An evaluation cache eliminates redundant model training, yielding exactly 10 evaluations for a standard 4-dimensional MLP space versus 108 for exhaustive grid search, a 90.7% reduction. We benchmark DP-HPO against eight baselines (Grid Search, Random Search at two budgets, Bayesian Optimisation, Optuna/TPE, Hyperband, BOHB, and SMAC) across four datasets, using 25 independent seeds and Wilcoxon signed-rank tests with Bonferroni correction. Results show that DP-HPO performs within 0.5% of exhaustive grid search on all four datasets at this reduced evaluation budget.
Index Terms—Hyperparameter optimisation, dynamic programming, Markov decision process, neural network, approximate dynamic programming, evaluation caching, Bayesian optimisation.
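The evaluation counts quoted in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a hypothetical search space of sizes 3 × 3 × 3 × 4 (one factorisation that gives 108 grid points), a synthetic additively separable score in place of real model training, and a simplified forward commit of one dimension at a time rather than the paper's backward induction. The names `SPACE`, `PARTIAL_SCORES`, `CachedEvaluator`, and `sequential_commit` are all illustrative.

```python
from itertools import product

# Hypothetical 4-dimensional MLP search space (assumed sizes: 3 * 3 * 3 * 4 = 108).
SPACE = {
    "hidden_units": [32, 64, 128],
    "learning_rate": [1e-3, 1e-2, 1e-1],
    "alpha": [1e-5, 1e-4, 1e-3],
    "activation": ["identity", "logistic", "tanh", "relu"],
}

# Synthetic per-dimension contributions; the total score is their sum, so the
# dimensions are independent by construction (an assumption of this sketch).
PARTIAL_SCORES = {
    "hidden_units": {32: 0.10, 64: 0.20, 128: 0.15},
    "learning_rate": {1e-3: 0.05, 1e-2: 0.25, 1e-1: 0.10},
    "alpha": {1e-5: 0.02, 1e-4: 0.08, 1e-3: 0.04},
    "activation": {"identity": 0.01, "logistic": 0.05, "tanh": 0.12, "relu": 0.18},
}


def synthetic_score(config):
    """Stand-in for training a model and returning validation accuracy."""
    return sum(PARTIAL_SCORES[name][value] for name, value in config.items())


class CachedEvaluator:
    """Memoises scores so each distinct configuration is evaluated only once."""

    def __init__(self, score_fn):
        self.score_fn = score_fn
        self.cache = {}

    def __call__(self, config):
        key = tuple(config[name] for name in SPACE)
        if key not in self.cache:
            self.cache[key] = self.score_fn(config)
        return self.cache[key]


def sequential_commit(space, evaluate):
    """Fix one dimension at a time, keeping the others at their current values."""
    config = {name: values[0] for name, values in space.items()}
    for name, values in space.items():
        best_value, best_score = None, float("-inf")
        for value in values:
            score = evaluate({**config, name: value})
            if score > best_score:
                best_value, best_score = value, score
        config[name] = best_value  # commit this dimension before moving on
    return config, evaluate(config)


evaluator = CachedEvaluator(synthetic_score)
best_config, best_score = sequential_commit(SPACE, evaluator)
grid_size = len(list(product(*SPACE.values())))
print(f"distinct evaluations: {len(evaluator.cache)} of {grid_size} grid points")
print(f"selected configuration: {best_config}")
```

Under these assumed sizes, the first dimension costs 3 evaluations and each later dimension reuses the cached incumbent, so the total is 1 + Σ(nᵢ − 1) = 1 + 2 + 2 + 2 + 3 = 10. Because the synthetic score is separable, the committed configuration coincides with the grid optimum; with interacting dimensions that would no longer be guaranteed.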
Kartik Sonawane (Fri,) studied this question.