What question did this study set out to answer?

This research aims to develop a novel transformer-based model for the design of aptamers targeting specific proteins, enhancing the discovery process.

May 29, 2026

AI‐Driven Protein‐to‐Aptamer Design Using a Transformer Architecture With Cross‐Model and Structural Validation

Key Points

Developed a transformer-based sequence-to-sequence framework for aptamer generation.
Utilized self-supervised pretraining on large-scale protein and RNA datasets to learn sequence and structural representations.
Conducted a two-stage in silico validation on 100 randomly selected PDB proteins, comparing the model against AptaTrans.
Achieved an accuracy of 0.902 and an AUC of 0.918 in aptamer–protein interaction (API) prediction, comparable to AptaTrans.
Significantly outperformed AptaTrans in cross-model evaluation, with a mean binding score of 0.898 vs. 0.836 (p < 2.22 × 10⁻¹⁶).
Experimental validation with an ALISA assay confirmed concentration-dependent binding of the predicted aptamer to CCL4, with a strong linear correlation (R² = 0.95).

Abstract

Aptamers are promising molecular recognition elements with broad applications in diagnostics, therapeutics, and biosensing; however, their discovery remains labor‐intensive and time‐consuming due to limitations in traditional SELEX‐based workflows. In this study, we propose a transformer‐based sequence‐to‐sequence framework that directly generates aptamer sequences conditioned on target protein sequences, enabling a generative approach to protein–aptamer design. The model incorporates k‐mer tokenization (3‐mer for proteins and 6‐mer for aptamers) and leverages self‐supervised pretraining on large‐scale protein and RNA datasets to learn sequence and structural representations. We first evaluated the model on aptamer–protein interaction (API) prediction, where it achieved performance comparable to AptaTrans (ACC = 0.902, AUC = 0.918), while maintaining a simpler architecture and improved suitability for generative tasks. To further assess its design capability, we conducted a two‐stage in silico validation using 100 randomly selected proteins from the PDB dataset. In cross‐model evaluation, the proposed model significantly outperformed AptaTrans (mean binding score: 0.898 vs. 0.836, p < 2.22 × 10⁻¹⁶), indicating improved generalization and reduced self‐consistency bias. In structure‐based validation using HDOCK, both models showed comparable docking performance under a unified scoring framework, suggesting that the generated sequences maintain structural binding feasibility. Finally, experimental validation using an ALISA assay demonstrated that the predicted aptamer exhibits concentration‐dependent binding to CCL4, with a strong linear correlation (R² = 0.95), confirming its target‐specific binding capability.
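
The k‐mer tokenization mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract gives only the k‐mer sizes (3 for proteins, 6 for aptamers); the overlapping sliding window with a stride of 1, the function name `kmer_tokenize`, and the example sequences are all assumptions made for illustration and are not taken from the study.

```python
def kmer_tokenize(sequence: str, k: int, stride: int = 1) -> list[str]:
    """Split a sequence into k-mers, taking one window every `stride` characters.

    Assumption: overlapping windows (stride=1). The study states only the
    k-mer sizes, not how the windows are laid out.
    """
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    seq = sequence.strip().upper()
    # A sequence shorter than k yields an empty token list.
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


# Hypothetical toy sequences, chosen only to show the token shapes.
protein_tokens = kmer_tokenize("MKLSTTA", k=3)     # 3-mers for a protein
aptamer_tokens = kmer_tokenize("GGAUCCGAUA", k=6)  # 6-mers for an RNA aptamer

print(protein_tokens)
print(aptamer_tokens)
```

Setting `stride=k` would give non‐overlapping tokens instead; which variant the authors used is not stated in the abstract.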