Amid political and financial market turmoil, accurate forecasts of financial market trends are crucial for investment decisions. Prior research has applied large language models (LLMs) to predict market trends, analyze investor sentiment, and interpret financial news, all in support of investment decision-making. However, LLMs face limitations due to training data heterogeneity, which restricts multidimensional perspectives and hinders the comparative analysis needed for optimization. This study proposes a “Dual-Agent LLM Debate Mechanism” framework that pairs a Proponent (LLM1: Gemini Pro 3) with an Opponent (LLM2: ChatGPT 5.2) to address these single-LLM forecasting gaps. The Proponent generates a baseline forecast (F1) from an Integrated Context; the Opponent then validates that forecast and resolves conflicts with the Proponent through up to three rounds of cross-debate, producing a consensus forecast (F2). In a controlled experiment covering 75 financial market indicators (FMIs) across five asset categories, F2 outperformed F1 in accuracy and directional stability, particularly for highly volatile assets such as Cryptocurrencies and 10-Year Government Bonds. Paired-sample t-tests confirmed that the difference was statistically significant, supporting the mechanism’s effectiveness. These results demonstrate how cross-debate between LLMs can improve forecasting accuracy through structured optimization.
Chang et al. studied this question.
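The control flow described in the abstract (a baseline forecast F1, up to three rounds of cross-debate, a consensus forecast F2, and a paired comparison of the two) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the study: the `Critique` type, the callable signatures, the stub agents that stand in for real LLM calls, the rule that the last revised forecast serves as F2 when no agreement is reached, and the use of absolute forecast errors as the paired quantity.

```python
import math
import statistics
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass
class Critique:
    """Hypothetical Opponent response to a forecast."""
    agrees: bool             # True if the Opponent accepts the forecast
    counter_forecast: float  # the Opponent's own estimate


# Hypothetical agent signatures; real agents would wrap LLM API calls.
Proponent = Callable[[str, Optional[float], Optional[Critique]], float]
Opponent = Callable[[str, float], Critique]


def debate(context: str, proponent: Proponent, opponent: Opponent,
           max_rounds: int = 3) -> Tuple[float, float]:
    """Return (F1, F2): the baseline and the post-debate forecast."""
    f1 = proponent(context, None, None)  # baseline from the context alone
    current = f1
    for _ in range(max_rounds):
        critique = opponent(context, current)
        if critique.agrees:
            break  # consensus reached, stop early
        # Assumption: the Proponent revises in response to each critique.
        current = proponent(context, current, critique)
    # Assumption: without agreement, the last revision is used as F2.
    return f1, current


def paired_t_statistic(errors_f1: Sequence[float],
                       errors_f2: Sequence[float]) -> float:
    """Paired-sample t statistic on per-indicator error differences."""
    if len(errors_f1) != len(errors_f2):
        raise ValueError("error lists must have the same length")
    diffs = [a - b for a, b in zip(errors_f1, errors_f2)]
    if len(diffs) < 2:
        raise ValueError("need at least two paired observations")
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd == 0:
        raise ValueError("differences have zero variance")
    return statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))


# Stub agents with made-up behaviour, used only to exercise the loop.
def stub_proponent(context, previous, critique):
    if critique is None:
        return 1.0  # arbitrary baseline value
    return (previous + critique.counter_forecast) / 2  # move halfway


def stub_opponent(context, forecast):
    target = 2.0  # arbitrary value the stub Opponent argues for
    return Critique(agrees=abs(forecast - target) <= 0.2,
                    counter_forecast=target)


if __name__ == "__main__":
    f1, f2 = debate("integrated context", stub_proponent, stub_opponent)
    print(f1, f2)
    print(paired_t_statistic([3.0, 2.0, 4.0, 3.0], [1.0, 1.0, 2.0, 2.0]))
```

A positive t statistic here means the F1 errors exceed the F2 errors on average. Turning it into a p-value requires the t distribution with n - 1 degrees of freedom, which a statistics library would normally supply.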