A tandem reinforcement learning framework for localized prostate cancer treatment planning and machine parameter optimization