This study investigates whether corporate financial performance prediction can be improved by adding non-financial measures, namely narrative disclosure tone and corporate governance indicators, to advanced machine learning models. The research addresses a gap in the emerging-markets literature by examining how non-financial variables enhance prediction accuracy beyond traditional financial metrics in the Tehran Stock Exchange context, where information asymmetry and governance challenges are pronounced. The main objective is to evaluate how much the prediction accuracy of financial performance improves for companies listed on the Tehran Stock Exchange when non-financial variables are incorporated alongside traditional financial metrics. This quantitative study employs a post-event (ex post facto) panel data approach, analyzing 140 companies listed on the Tehran Stock Exchange during 2014-2023. Data were extracted systematically from audited financial statements, accompanying notes, corporate governance documents, and board reports. Narrative disclosure tone was quantified using VADER sentiment analysis adapted for Persian-translated annual reports (Cronbach's alpha = 0.85), with the Loughran-McDonald financial dictionary used for tone measurement. Machine learning algorithms, including support vector machines, random forest (n_estimators=100, max_depth=10), and gradient boosting (XGBoost with learning_rate=0.1, n_estimators=200), were implemented with 5-fold cross-validation. Model performance was evaluated using regression-appropriate metrics, including R² (coefficient of determination), adjusted R², mean squared error (MSE), and root mean squared error (RMSE), with ordinary least squares (OLS) regression as the baseline. Results showed that including narrative disclosure tone improved prediction accuracy by an average of 4.1 percent (95% CI: 3.2%-5.0%, Cohen's d = 0.72).
Corporate governance indicators provided a more substantial improvement of 8.2 percent (95% CI: 7.1%-9.3%, Cohen's d = 1.32). A critical finding is that narrative disclosure tone has a significantly greater impact on market-based measures (Tobin's Q, price-to-book ratio) than on accounting measures (ROA, ROE) (p = 0.019), whereas corporate governance indicators show a uniform impact across all performance measures (p = 0.893). The gradient boosting algorithm (XGBoost) achieved the highest prediction accuracy (R² = 0.973 for ROE), followed by random forest (R² = 0.967 for ROA). This study provides empirical evidence that integrating non-financial variables significantly enhances the predictability of financial performance in emerging markets, with practical implications for investors, analysts, and regulators.
Lavashloui et al. studied this question.
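The evaluation metrics named in the abstract (MSE, RMSE, R², adjusted R²) and the 5-fold split can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names, the toy values, and the contiguous, unshuffled fold layout are assumptions made for illustration, since the abstract does not say how the folds were formed or which software computed the metrics.

```python
import math
from typing import Iterator, List, Sequence, Tuple


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error: square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def adjusted_r2(r2_value: float, n_obs: int, n_features: int) -> float:
    """Adjusted R², penalizing R² for the number of predictors."""
    return 1.0 - (1.0 - r2_value) * (n_obs - 1) / (n_obs - n_features - 1)


def kfold_indices(n_obs: int, k: int = 5) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k contiguous folds (assumed layout)."""
    sizes = [n_obs // k + (1 if i < n_obs % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_obs))
        yield train, test
        start += size


# Toy values only (hypothetical, not data from the study).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
score = r2(y_true, y_pred)
print(mse(y_true, y_pred), rmse(y_true, y_pred), score)
print(adjusted_r2(score, n_obs=len(y_true), n_features=1))
print([len(test) for _, test in kfold_indices(10, k=5)])
```

In a setup like the one described, each model (OLS baseline, support vector machine, random forest, XGBoost) would be fitted on the training indices of every fold and scored on the held-out indices, with the fold scores then averaged. For panel data, splitting by firm or by year rather than by contiguous rows may be preferable to limit leakage between folds; the abstract does not specify which was used.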