We adapt Neural DNA (NDNA), a compact developmental genome previously demonstrated on weight matrices in MLPs through GPT-2, to per-edge masking on human protein-protein interaction (PPI) graphs. A 290-parameter genome trained on TCGA breast cancer (BRCA) gene expression learns to select 25% of edges in a 5,000-gene STRING subgraph for tumor-versus-normal classification. We report two findings. First, the selected edges are biologically coherent. Wnt signaling is enriched 18× over a density-matched random control (Fisher's exact p = 0.013), cell cycle is enriched 1.4× (p = 0.013), and estrogen-receptor edges are actively excluded. Second, the BRCA-trained genome transfers to lung (LUAD), colon (COAD), and prostate (PRAD) cancer when frozen, beating density-matched random selection on test AUC in every cancer (LUAD +0.007, COAD +0.005, PRAD +0.004 across n = 5 seeds) and exhibiting an order of magnitude lower variance across seeds (LUAD frozen ±0.002 vs random ±0.040). The novel-edge candidates the genome ranks highest include CEACAM5–KLK3 (the protein products are CEA and PSA, two clinically deployed tumor markers in different cancers), FZD9–WNT3 (a canonical Wnt receptor–ligand pair), and LAMB3–LAMC3 (laminin subunits implicated in tumor invasion). The result extends NDNA's "compact program encodes useful structure" thesis from artificial neural networks into biological networks, and provides a practical instrument for data-scarce cancers: a structural prior trained on a large cohort transfers reliably to cancers where per-cohort training is unstable. Code: https://github.com/tejassudsfp/protein_study
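The enrichment comparison described above can be illustrated with a minimal sketch. It is not taken from the authors' repository: the function names, the 2x2 table layout (selected vs. control rows, pathway vs. non-pathway columns), and the choice of a one-sided test are all assumptions made for illustration, since the abstract does not specify them.

```python
import random
from math import comb


def density_matched_random(edges, k, seed=0):
    """Hypothetical control: draw k edges uniformly at random, so the
    control mask keeps the same number of edges as the learned mask."""
    rng = random.Random(seed)
    return set(rng.sample(sorted(edges), k))


def fold_enrichment(selected, control, pathway_edges):
    """Ratio of pathway-edge counts in the learned selection vs. the control."""
    a = len(selected & pathway_edges)
    c = len(control & pathway_edges)
    return a / c if c else float("inf")


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Assumed layout: row 1 = learned selection, row 2 = control;
    column 1 = pathway edges, column 2 = all other edges.
    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(col1, x) * comb(n - col1, row1 - x)
               for x in range(a, upper + 1))
    return tail / comb(n, row1)


# Toy usage with made-up edges (not data from the paper).
all_edges = {("G%d" % i, "G%d" % (i + 1)) for i in range(40)}
control = density_matched_random(all_edges, k=10, seed=1)
```

A two-sided test or a different table layout would give different p-values, so the numbers reported in the abstract should not be expected to reproduce from this sketch alone.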
Tejas Parthasarathi Sudarshan (Mon,) studied this question.