Artificial intelligence (AI) agents extend large language models from single-turn text generation to systems that pursue goals through planning, retrieval, tool use, code execution, memory, feedback, and role coordination. In medicine and biomedical research, this shift is producing early systems for clinical calculations, risk prediction, oncology decision support, omics analysis, hypothesis development, laboratory automation, and research writing. The evidence, however, remains uneven. Clinical examples are most defensible when agents use validated calculators, curated clinical tools, or guideline-grounded modules under human oversight. Biomedical discovery systems show broader workflow capabilities, but many claims still rest on preprints, narrow benchmarks, simulated settings, or domain-specific demonstrations. For clinicians and biomedical researchers, the immediate challenge is not to decide whether agents will replace experts but to understand which tasks can be delegated, what evidence is needed, and what human judgment must be preserved. This narrative review explains what makes an AI system agentic, summarizes representative clinical and discovery applications, and outlines safeguards for evaluation, reproducibility, and oversight. Biomedical readers should expect AI agents to enter medicine and research first as constrained, auditable workflow infrastructure. Such infrastructure may reorganize biomedical work, but accountability should remain with clinicians and investigators.
Sangzin Ahn studied this question.