What question did this study set out to answer?

This study asks whether Semantic State Abstraction Interfaces (SSAI), which map sparse news text into a small set of auditable, named coordinates, can separate representation effects from optimisation variance in portfolio decisions.

May 8, 2026 · Open Access

Semantic State Abstraction Interfaces for LLM-Augmented Portfolio Decisions: Multi-Axis News Decomposition and RL Diagnostics

Key Points

The study introduces and evaluates SSAI as a framework and evaluation protocol; it does not claim that SSAI outperforms denser alternatives.
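SSAI is instantiated with K=4 axes: sentiment, risk, confidence, and volatility forecast.
Evaluated on a US-equity panel of 30 NASDAQ-100 names with FNSPID news over a 2019–2023 test period, using three parallel estimators: direct factor portfolios (SFP/SRF/SCW), supervised ridge forecasters, and RL agents (DP-PPO, SAC).
Performance is compared against buy-and-hold and lexical baselines (VADER, TF-IDF/SVD), with Wilcoxon tests across 21 DP-PPO seeds.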
The four-factor SFP portfolio achieved a 307.2% cumulative return (Sharpe 1.067) against 243.6% for buy-and-hold, but the advantage did not survive controls: SFP underperformed buy-and-hold in all coverage terciles.
PC1-SFP outperformed SSAI by 126 percentage points in cumulative return.
Reinforcement learning results depended on algorithm choice rather than representation: DP-PPO trailed buy-and-hold, while SAC with identical SSAI observations improved Sharpe (1.128 vs. 1.032).
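
The cumulative return (CR), Sharpe, and percentage-point figures quoted in the key points can be computed from a series of periodic returns. The following is a minimal sketch, not the study's code: it assumes simple daily returns, 252 trading days per year, and a zero risk-free rate, none of which are stated in the source. All function names are hypothetical.

```python
import math
from statistics import mean, stdev


def cumulative_return(returns):
    """Compound simple periodic returns into a total return (0.5 means 50%)."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


def annualised_sharpe(returns, periods_per_year=252, risk_free=0.0):
    """Mean excess return over its sample standard deviation, annualised.

    Assumption: `risk_free` is a per-period rate and 252 periods make a year.
    Returns NaN when the returns have no variation.
    """
    excess = [r - risk_free for r in returns]
    sd = stdev(excess)
    if sd == 0:
        return float("nan")
    return mean(excess) / sd * math.sqrt(periods_per_year)


def gap_in_percentage_points(cr_a, cr_b):
    """Difference between two cumulative returns, expressed in percentage points."""
    return (cr_a - cr_b) * 100.0
```

For instance, `gap_in_percentage_points(3.072, 2.436)` gives roughly 63.6, the raw gap between the 307.2% SFP return and the 243.6% buy-and-hold return before any stratum-matched controls are applied.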

Abstract

We introduce Semantic State Abstraction Interfaces (SSAI): a methodological template for mapping sparse unstructured text into K auditable, named coordinates with neutral defaults on no-news days, designed to separate representation hypotheses from optimisation variance in sequential decision systems. Our contribution is the framework and its evaluation protocol, not a claim that SSAI outperforms denser alternatives.

We instantiate SSAI with K=4 axes (sentiment, risk, confidence, volatility forecast) on a US-equity panel (30 NASDAQ-100 names, FNSPID news, 2019–2023 test), and evaluate it across three parallel estimators: direct factor portfolios (SFP/SRF/SCW), supervised ridge forecasters, and RL agents (DP-PPO, SAC). All three share the same fixed φ so that signal and optimiser effects can be read off separately.

What the experiments show. The four-factor SFP reaches 307.2% CR / Sharpe 1.067 over 2019–2023. However, this apparent advantage over buy-and-hold (243.6%) does not survive its own controls: SFP underperforms stratum-matched B […] high-conviction semantic tilts and lexical baselines (VADER, TF-IDF/SVD) recover price-only Sharpe. The RL block is a diagnostic: DP-PPO trails buy-and-hold; SAC with identical SSAI observations improves Sharpe (1.128 vs. 1.032), isolating algorithm dependence rather than representation advantage. Seed-mean Sharpe differences across 21 DP-PPO seeds are non-significant (Wilcoxon p≈0.25–0.31).

Contribution framing. We present SSAI as an interpretability-performance frontier instrument and a cautionary diagnostic: it characterises the cost of maintaining auditable axes (126pp CR vs. PC1 over five years), shows that RL performance is dominated by algorithm choice rather than representation, and provides a reusable template for separating semantic signal from optimisation noise in other sparse-text decision settings.
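
The core idea of a fixed map φ from sparse news into K=4 named coordinates, with neutral defaults on no-news days, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the neutral default values (0.0 is assumed here), the score ranges, or how the per-axis scores are produced, and the names `phi`, `build_panel`, `AXES`, and `NEUTRAL` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# The four named axes from the abstract, in a fixed order so every
# downstream estimator sees the same coordinate layout.
AXES: Tuple[str, ...] = ("sentiment", "risk", "confidence", "volatility_forecast")

# Assumption: 0.0 is the neutral value on every axis. The source only says
# that neutral defaults exist on no-news days, not what they are.
NEUTRAL: Dict[str, float] = {axis: 0.0 for axis in AXES}


def phi(scores: Optional[Dict[str, float]]) -> List[float]:
    """Map one asset-day's axis scores to a fixed-length K=4 vector.

    `scores` is None on a no-news day; axes missing from a partial
    score dict also fall back to their neutral default.
    """
    if scores is None:
        return [NEUTRAL[axis] for axis in AXES]
    unknown = set(scores) - set(AXES)
    if unknown:
        raise KeyError(f"unknown axes: {sorted(unknown)}")
    return [float(scores.get(axis, NEUTRAL[axis])) for axis in AXES]


def build_panel(days, tickers, news_scores):
    """Build a dense day -> ticker -> vector panel from sparse news scores.

    `news_scores` is keyed by (day, ticker) and only contains asset-days
    that actually had news; everything else receives the neutral vector.
    """
    return {
        day: {ticker: phi(news_scores.get((day, ticker))) for ticker in tickers}
        for day in days
    }
```

Because every estimator would consume the output of the same `phi`, any performance difference between them can be attributed to the estimator rather than to the representation, which is the separation the abstract describes.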
