Text-attributed graphs (TAGs) require models to jointly exploit node text and graph structure, yet doing so effectively remains difficult when node text is sparse and the structural context is large. Here, we propose STAGE (Semantic and Topological Augmented Graph Embedding), a two-stage framework for representation learning on TAGs. In Stage I, a frozen large language model is used offline to generate explanatory text that enriches compressed node attributes without introducing online LLM training cost. In Stage II, STAGE performs structure-aware representation learning under a fixed global token budget by combining random-walk-based structural context with graph-conditioned token reduction before PLM encoding. This design preserves informative semantic content while preventing unconstrained sequence expansion. Experiments on seven benchmark datasets show that STAGE consistently outperforms strong baselines under the same evaluation setting and maintains favorable efficiency under bounded input-length constraints.
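As an illustration of the Stage II idea of gathering random-walk context under a fixed token budget, a minimal sketch follows. It is not the authors' implementation. The abstract does not specify the walk strategy, the tokenizer, or how the budget is allocated, so the uniform walk, the pre-tokenized `tokens` dictionary, the even per-node split, and the function names `random_walk` and `build_input` are all assumptions made here. Plain truncation stands in for the graph-conditioned token reduction, whose details the abstract does not give.

```python
import random


def random_walk(adj, start, length, rng):
    """Take up to `length` uniform random steps from `start`.

    Returns the distinct nodes visited, in first-visit order, excluding `start`.
    """
    visited, current = [], start
    for _ in range(length):
        neighbors = adj.get(current, [])
        if not neighbors:
            break  # dead end: stop the walk early
        current = rng.choice(neighbors)
        if current != start and current not in visited:
            visited.append(current)
    return visited


def build_input(adj, tokens, node, budget, walk_length=4, seed=0):
    """Build a token sequence for `node` whose length never exceeds `budget`.

    Assumption: the target node's own tokens come first, and the remaining
    budget is split evenly across the context nodes found by the walk.
    """
    rng = random.Random(seed)
    context = random_walk(adj, node, walk_length, rng)

    sequence = list(tokens[node][:budget])
    remaining = budget - len(sequence)
    if context and remaining > 0:
        per_node = max(1, remaining // len(context))
        for ctx in context:
            take = min(per_node, budget - len(sequence))
            if take <= 0:
                break  # budget exhausted
            # Simple truncation; a stand-in for graph-conditioned reduction.
            sequence.extend(tokens[ctx][:take])
    return sequence


# Toy graph with pre-tokenized node text (hypothetical data).
adj = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
tokens = {
    "a": ["graph", "learning"],
    "b": ["language", "model", "text"],
    "c": ["random", "walk"],
}
seq = build_input(adj, tokens, "a", budget=5)
```

Whatever the walk visits, the resulting sequence stays within the budget, which is the bounded-input-length property the abstract emphasizes. The sequence would then be passed to the PLM encoder.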
Huang et al. (Wed,) studied this question.