Target-aware de novo drug generation remains challenging because protein information is difficult to incorporate without sacrificing molecular diversity or syntactic validity. Most current models are either target-agnostic or rely on oversimplified conditioning (e.g., discrete labels), limiting their target relevance. We propose ctDrug, an autoregressive framework that conditions molecule generation on continuous protein embeddings via cross-attention, eliminating the need for 3D structures. A key contribution is a systematic comparison of sequence-based (ProtT5) and molecule-centric (Chem-BERTa) target representations within the same architecture. On DRD2 and HTR1A, ctDrug outperforms label-based baselines in distributional fidelity (FCD/SNN) while achieving high validity, novelty, and drug-like properties. These results suggest that continuous embeddings can provide more flexible target conditioning than discrete label-based approaches.
Ferdosi et al. (Mon,) studied this question.
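To make the conditioning mechanism described in the abstract concrete, the following is a minimal sketch of single-head scaled dot-product cross-attention, in which the hidden states of the molecule tokens generated so far act as queries and the continuous target embedding vectors act as keys and values. It is an illustration under stated assumptions, not the ctDrug implementation: the function names, the toy dimensions, and the omission of learned query/key/value projections are all hypothetical, and the abstract does not say whether the target embedding is a sequence of vectors or a single pooled vector.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(token_states, target_embeddings):
    """Single-head scaled dot-product cross-attention (illustrative only).

    token_states: list of query vectors, one per molecule token generated so far.
    target_embeddings: list of key/value vectors describing the target
        (assumed here to be a short sequence of continuous vectors).
    Learned Q/K/V projections are omitted (treated as identity) for brevity.
    Returns (contexts, weights): one context vector and one weight row per token.
    """
    d = len(target_embeddings[0])
    contexts, weights = [], []
    for q in token_states:
        # Similarity of this token's state to every target vector.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in target_embeddings
        ]
        w = softmax(scores)
        # Context = attention-weighted average of the target vectors.
        ctx = [
            sum(wj * v[i] for wj, v in zip(w, target_embeddings))
            for i in range(d)
        ]
        contexts.append(ctx)
        weights.append(w)
    return contexts, weights


# Toy example: two token states attending over three target vectors.
token_states = [[1.0, 0.0], [0.0, 1.0]]
target_embeddings = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
contexts, weights = cross_attention(token_states, target_embeddings)
```

In a full decoder, each context vector would typically be added to the corresponding token state before predicting the next token. Because the target enters as continuous vectors rather than a class index, a target not seen as a label during training can in principle still be supplied at inference time through its embedding.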