Large language models (LLMs) have shown remarkable ability to generate code, yet their outputs often violate syntactic or semantic constraints when guided only through natural language prompts. We introduce TreeCoder, the most general and flexible framework to date for exploring decoding strategies, constraints, and hyperparameters in LLMs, and use it in code generation to enforce correctness and structure during decoding rather than relying on prompt engineering. TreeCoder represents decoding as a tree search over candidate programs, where both decoding strategies and constraint functions (such as style, syntax, and execution checks) are treated as first-class, optimisable components. This design enables systematic exploration and automatic tuning of decoding configurations using standard optimisation techniques. Experiments on Python, SQL and Rust show that TreeCoder consistently improves accuracy across open-source models such as CodeLlama, Mistral and DeepSeek, often significantly outperforming their unconstrained baselines.
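To make the idea of decoding as a constrained tree search concrete, the following is a minimal sketch, not TreeCoder's actual API. It assumes a toy stand-in for the language model (`toy_model`), a beam search as the decoding strategy, and a single syntax constraint on parentheses; every name in it is hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

Candidate = Tuple[float, str]  # (cumulative log-probability, partial program)


def toy_model(prefix: str) -> Dict[str, float]:
    """Hypothetical stand-in for an LLM: fixed next-token log-probabilities."""
    return {"(": -0.5, ")": -0.4, "x": -1.5, "<eos>": -2.0}


def balanced_prefix(text: str) -> bool:
    """Syntax constraint on partial programs: never close an unopened paren."""
    depth = 0
    for ch in text:
        depth += (ch == "(") - (ch == ")")
        if depth < 0:
            return False
    return True


def is_complete(text: str) -> bool:
    """Acceptance check for finished programs: non-empty, all parens closed."""
    return bool(text) and sum((c == "(") - (c == ")") for c in text) == 0


def constrained_beam_search(
    next_token_logprobs: Callable[[str], Dict[str, float]],
    constraints: List[Callable[[str], bool]],
    accept: Callable[[str], bool],
    beam_width: int = 2,
    max_steps: int = 4,
) -> List[Candidate]:
    """Expand a tree of partial programs, pruning children that fail a constraint."""
    beam: List[Candidate] = [(0.0, "")]
    finished: List[Candidate] = []
    for _ in range(max_steps):
        expanded: List[Candidate] = []
        for score, text in beam:
            for token, logp in next_token_logprobs(text).items():
                if token == "<eos>":
                    # A branch may only terminate if the program is acceptable.
                    if accept(text):
                        finished.append((score + logp, text))
                    continue
                child = text + token
                # Constraints are plain functions, so they can be swapped or combined.
                if all(check(child) for check in constraints):
                    expanded.append((score + logp, child))
        # Keep only the highest-scoring surviving children.
        beam = sorted(expanded, key=lambda c: c[0], reverse=True)[:beam_width]
        if not beam:
            break
    return sorted(finished, key=lambda c: c[0], reverse=True)


results = constrained_beam_search(toy_model, [balanced_prefix], is_complete)
```

In this sketch the decoding strategy (beam width, number of steps) and the constraint list are ordinary arguments, which is what would let an outer optimisation loop search over them; how TreeCoder itself parameterises these components is not specified in the abstract.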
Princis et al. (Fri,) studied this question.