Large Language Models (LLMs) excel at diverse text generation tasks but still suffer from limited controllability, opaque decision processes, and frequent hallucinations. This paper presents a structural causal intervention framework that represents input–hidden–output dependencies as a structural causal model and performs targeted interventions on hidden representations. By combining counterfactual sample construction with contrastive training, the method enables precise control of style, sentiment, and factual consistency while providing explicit causal explanations for output changes. Experiments on three representative tasks show consistent and substantial improvements: style transfer accuracy reaches 92.3% (+7–14 percentage points over strong baselines), sentiment-controlled generation achieves 90.1% accuracy (+1.3–10.9 points), and the multi-attribute conflict rate drops to 3.7% (a 40–60% relative reduction). The method also raises causal attribution scores to 0.83–0.85 and human agreement rates to 87–88%, while reducing training and inference latency by 25–30% through sparse masking that modifies ≤10% of hidden units per attribute. These results indicate that integrating structural causal intervention with counterfactual training advances controllability, interpretability, and efficiency in LLM-based generation, offering a robust foundation for deployment in reliability-critical and resource-constrained applications.
Qiu et al. (Mon,) studied this question.
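The sparse masked intervention mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the mask is chosen or how the selected units are modified, so everything below is an assumption made for illustration: the per-unit relevance scores, the top-k selection rule, the hard replacement of masked units with counterfactual target values, and the helper names `top_k_mask` and `intervene` are all hypothetical. The only detail taken from the text is the budget of at most 10% of hidden units per attribute.

```python
import random


def top_k_mask(scores, max_percent=10):
    """Return a 0/1 mask over hidden units that selects the highest-scoring
    units, never more than max_percent of them (assumed selection rule)."""
    k = len(scores) * max_percent // 100  # integer floor keeps us within budget
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    chosen = set(ranked[:k])
    return [1 if i in chosen else 0 for i in range(len(scores))]


def intervene(hidden, mask, target):
    """Apply a do()-style intervention: masked units take the target value,
    all other units keep their original activation."""
    return [t if m else h for h, m, t in zip(hidden, mask, target)]


random.seed(0)
dim = 50
hidden = [random.gauss(0.0, 1.0) for _ in range(dim)]  # stand-in hidden state
scores = [random.random() for _ in range(dim)]         # hypothetical attribute relevance
target = [1.0] * dim                                   # hypothetical counterfactual values

mask = top_k_mask(scores)
edited = intervene(hidden, mask, target)
changed = sum(1 for h, e in zip(hidden, edited) if h != e)
print(changed, "of", dim, "units modified")
```

Because the mask is fixed per attribute and touches only a small fraction of units, the rest of the hidden state passes through unchanged, which is one plausible reading of why such masking could reduce cost.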