A neural architecture search (NAS) predictor trained on 200 labeled architectures must rank hundreds of thousands of candidates, yet expressive GNN predictors overfit at this scale, while simple ones plateau as data grows. We introduce FORGE (Factorized Operation-aware Regime-adaptive Graph Encoding), the first GNN predictor whose gate complexity adapts automatically to training set size and whose hyperparameters are derived entirely from data and graph properties. Each message is gated by both its source and destination operation types before aggregation, which subsumes prior destination-only approaches as a special case. A classical Wiener–Tikhonov shrinkage coefficient, learned per layer, controls the transition from simple to expressive gating. Combined with topology-aware layer selection and data-proportional training, this yields a single configuration that requires no benchmark-specific tuning. On NAS-Bench-101, FORGE achieves Kendall's τ = 0.702 across 30 seeds (+0.026 over GATES, p < 10⁻⁷), outperforming all tested graph-based predictors at every training size. On NAS-Bench-201, advantages reach +14.7% τ in the data-scarce regime, where each labeled architecture costs a full training run. Across six datasets and training sizes spanning two orders of magnitude, FORGE matches or exceeds individually tuned baselines on three of four benchmarks.

Preprint v1.0. Not yet peer-reviewed. Code and reproducibility artifacts are linked from the manuscript.
Jose Santiago Echevarria studied this question.
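The abstract's gating idea can be illustrated with a minimal sketch. This is not the FORGE implementation. The function `gated_message_pass`, the parameter tables `dst_gate` and `src_gate`, the additive combination of gate logits, and the use of a fixed scalar `lam` in place of the learned per-layer shrinkage coefficient are all assumptions made for illustration. The sketch only shows how a gate that depends on both source and destination operation types can reduce to a destination-only gate when the shrinkage coefficient is zero.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gated_message_pass(adj, h, ops, dst_gate, src_gate, lam):
    """One round of message passing with operation-aware gating (illustrative).

    adj[i, j] != 0 means an edge from node i (source) to node j (destination).
    h:        (num_nodes, dim) node features.
    ops:      (num_nodes,) integer operation-type id per node.
    dst_gate: (num_op_types, dim) gate logits indexed by destination op (hypothetical).
    src_gate: (num_op_types, dim) gate logits indexed by source op (hypothetical).
    lam:      scalar in [0, 1]; lam = 0 recovers destination-only gating.
    """
    out = np.zeros(h.shape, dtype=float)
    sources, destinations = np.nonzero(adj)
    for i, j in zip(sources, destinations):
        # Assumed form: the source-dependent term is shrunk toward zero by lam.
        logits = dst_gate[ops[j]] + lam * src_gate[ops[i]]
        # Gate the message before it is summed into the destination node.
        out[j] += sigmoid(logits) * h[i]
    return out


# Toy architecture graph with edges 0 -> 1, 0 -> 2, 1 -> 2 and three op types.
adj = np.array([[0, 1, 1],
                [0, 0, 1],
                [0, 0, 0]])
ops = np.array([0, 1, 2])

rng = np.random.default_rng(0)
h = rng.normal(size=(3, 4))
dst_gate = rng.normal(size=(3, 4))
src_gate = rng.normal(size=(3, 4))

out_simple = gated_message_pass(adj, h, ops, dst_gate, src_gate, lam=0.0)
out_full = gated_message_pass(adj, h, ops, dst_gate, src_gate, lam=1.0)
```

With `lam=0.0` the source term vanishes and every message into a node is scaled by the same destination-dependent gate. With `lam=1.0` the gate also varies with the operation type of the sending node. In this sketch `lam` is supplied by hand, whereas the abstract describes the coefficient as learned per layer.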