Out-of-Distribution (OOD) generalization is a promising yet challenging goal that aims to preserve the test performance of GNNs in open-world settings. However, because graph-structured data has an intricate internal topology, redundant information from spurious topologies can mislead GNNs into predictions that deviate from the labels. Extracting concise, label-relevant subgraphs from the original graphs can alleviate this problem. Unfortunately, existing methods either overlook the global structural distribution or rely heavily on manually predefined assumptions. As a result, they fail to adequately capture the changes in structural distribution between the input graph and the extracted subgraph, which compromises the adaptability of the extracted invariant subgraphs to diverse OOD scenarios. This motivates us to propose a framework called Structural Entropy guided Information Bottleneck (OOD-SEIB), which aims to measure the inherent information changes more traceably for better and more flexible OOD generalization. The core of OOD-SEIB is its concise topology extraction module, in which we measure the mutual information flow between the input graph and the extracted subgraph based on structural entropy, termed the Compression Index (CI). Specifically, the CI is a quantifiable metric that calculates the codeword length required to describe the entire graph structure via a biased random walk. Under this guidance, OOD-SEIB launches a structural information bottleneck compression module that jointly optimizes the CI and the label-relevance of the subgraph topology by iteratively balancing informativeness against compression. To further improve the GNN's ability to identify invariant subgraphs, OOD-SEIB generates multiple augmented environments and distills the invariant subgraphs into the GNN as knowledge in an inside-out manner. By iterating in the manner prescribed above, OOD-SEIB progressively reinforces invariant subgraph extraction, thereby enhancing its generalization capability.
Extensive experiments on synthetic and three real-world graph-level OOD benchmarks demonstrate that our proposed OOD-SEIB improves classification accuracy by \(4.85\%\) to \(38.03\%\) on average compared to state-of-the-art baselines. Additionally, we extend OOD-SEIB to two node-level benchmarks, achieving average classification accuracy improvements of \(14.52\%\) and \(13.15\%\).
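To make the notion of structural entropy as a codeword length concrete, the following is a minimal sketch and not the CI itself. It computes the standard one-dimensional structural entropy of an undirected, unweighted graph, \(-\sum_i \frac{d_i}{2m} \log_2 \frac{d_i}{2m}\), which is the average number of bits needed to encode the node visited by an unbiased random walk at stationarity. The biased random walk used by the CI is not specified in this section, so an unbiased walk is assumed here; the function names, the toy graph, and the `entropy_gap` quantity are hypothetical and purely illustrative.

```python
import math
from collections import defaultdict


def degrees(edges):
    """Return a dict mapping each node to its degree in an undirected edge list."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


def one_dim_structural_entropy(edges):
    """One-dimensional structural entropy (in bits) of an undirected graph.

    Assumption: an unbiased random walk, whose stationary distribution
    visits node i with probability d_i / (2 * |E|).
    """
    deg = degrees(edges)
    volume = sum(deg.values())  # equals 2 * |E|
    if volume == 0:
        return 0.0
    return -sum((d / volume) * math.log2(d / volume) for d in deg.values())


# Toy input graph (hypothetical): two triangles joined by one bridge edge.
graph_edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
# Toy extracted subgraph (hypothetical): the first triangle only.
subgraph_edges = [(0, 1), (1, 2), (0, 2)]

h_graph = one_dim_structural_entropy(graph_edges)
h_sub = one_dim_structural_entropy(subgraph_edges)

# Illustrative only: the drop in description length after extraction.
entropy_gap = h_graph - h_sub
print(f"H(G) = {h_graph:.3f} bits, H(G_sub) = {h_sub:.3f} bits, gap = {entropy_gap:.3f}")
```

In this toy case the subgraph is a single triangle with uniform degrees, so its entropy is exactly \(\log_2 3\) bits, while the full graph needs more bits per step because the walk ranges over six nodes.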
Di et al. (Fri,) studied this question.