Abstract
Large language models (LLMs) require considerable computation and energy resources during training and deployment. While scaling laws for training have guided much recent progress, inference costs represent a significant and growing component of the overall resource burden, particularly for reasoning models. Existing compute-optimality characterizations that consider model size, dataset size and inference tokens in isolation or in fixed combinations may overlook more efficient operating points. We introduce directed stochastic skill search (DS3), a general framework that represents inference as stochastic traversal over a learnt skill graph. From a simplified yet expressive instantiation, we derive closed-form expressions for task success and compute cost across a wide range of inference strategies, including chain-of-thought (CoT) and tree-of-thought (ToT), enabling comparisons by task difficulty and model capability. We extend a prior graph framework of LLM training to include inference and bridge DS3 with empirical scaling laws. We theoretically recover observed patterns, including linear accuracy scaling with log-compute, variation in preferred inference strategies by task and capability, emergent behaviour elicited by reasoning despite parameter plateaus, and the behaviour of both best-of-N and majority voting (MV) within one framework. By characterizing training-inference interdependencies, our framework deepens theoretical understanding and supports principled algorithmic design and resource allocation. This article is part of the discussion meeting issue ‘Bits, neurons and qubits for sustainable AI’.
Austin Ellis‐Mohr
Anuj K. Nayak
Nishant Garg
Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
University of Illinois Urbana-Champaign
Stony Brook University
www.synapsesocial.com/papers/69a528b3f1e85e5c73bf02a1 — DOI: https://doi.org/10.1098/rsta.2024.0510