Learning Optimal Dynamic Treatment Regime from Observational Clinical Data through Reinforcement Learning | Synapse