Los puntos clave no están disponibles para este artículo en este momento.
Reinforcement learning is a powerful tool for developing personalized treatment regimens from healthcare data. Yet training reinforcement learning agents through direct interactions with patients is often impractical for ethical reasons. One solution is to train reinforcement learning agents using an 'environment model,' which is learned from retrospective patient data, and can simulate realistic patient trajectories. In this study, we propose transitional variational autoencoders (tVAE), a generative neural network architecture that learns a direct mapping between distributions over clinical measurements at adjacent time points. Unlike other models, the tVAE requires few distributional assumptions, and benefits from identical training, and testing architectures. This model produces more realistic patient trajectories than state-of-the-art sequential decision-making models, and generative neural networks, and can be used to learn effective treatment policies.
Baucum et al. (Tue,) studied this question.