Life for the clinical investigator was much simpler 30 years ago. With only a handful of available cytotoxic agents, often obtained from cell line screens of plant-derived extracts, the progression of trials from phase I to II, and then ultimately to phase III, was well accepted, practical, and intuitive. The single arm phase II study was the principal mechanism for deciding whether to proceed to a randomized phase III trial. These phase II trials often used response rate as the primary end point and were powered to yield a reasonably low false-negative rate (type II error often in the range of 10% or lower) to capture the majority of potentially active regimens. Due to the small number of patients typically enrolled in single arm phase II trials and the reliance on historical controls for an estimation of expected response rate, it was recognized that this design is associated with a fairly high false-positive rate (type I error often in the range of 10% or higher). This was considered to be an acceptable compromise, realizing that the true activity of a new drug would eventually need to be clarified in a phase III trial.
Over the past decade, there has been an explosion of new drugs designed to target specific pathways relevant to cancer growth, apoptosis, or angiogenesis. As more of these drugs show some measure of activity in single arm phase II trials, it has become clear that the ability of the standard phase II platform to accurately predict for phase III success is surprisingly low. Specifically, approximately 60% of oncology regimens that have apparently promising activity in single arm phase II trials fail to demonstrate superiority when tested in the phase III setting.
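The type I and type II error rates quoted above for single arm phase II trials can be made concrete with an exact binomial calculation. The sketch below assumes a simple single-stage decision rule (declare the regimen promising if at least r of n patients respond). The sample size, cutoff, and response rates are hypothetical values chosen for illustration and are not taken from this editorial.

```python
from math import comb


def upper_tail(n, r, p):
    """Return P(X >= r) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(r, n + 1))


def error_rates(n, r, p0, p1):
    """Error rates of a single-stage rule: 'promising' if >= r of n respond.

    p0 is the assumed historical (uninteresting) response rate,
    p1 is the target response rate for the new regimen.
    """
    type_1 = upper_tail(n, r, p0)        # false positive: inactive drug passes
    type_2 = 1.0 - upper_tail(n, r, p1)  # false negative: active drug fails
    return type_1, type_2


# Hypothetical design, not taken from the editorial:
# 40 patients, historical rate 20%, target rate 40%, cutoff of 12 responses.
alpha, beta = error_rates(n=40, r=12, p0=0.20, p1=0.40)
print(f"type I error  = {alpha:.3f}")
print(f"type II error = {beta:.3f}")
```

Under these assumed inputs, both error rates come out below 20%. The type I figure is only as trustworthy as the historical rate p0 supplied to the calculation.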
Although this level of false-positive results might have been acceptable in the 1980s when there was a relative paucity of new compounds, at present there are several concerns over this high attrition rate. The cost of conducting large phase III trials, the ethics of asking patients to participate in a study that is likely to be negative (and which might expose them to unnecessary toxicity), and the burden on clinical investigators to open trials and enroll patients require that we increase the rigor with which we conduct phase II trials. These concerns not only have implications for clinical investigation but also have an important influence on the kinds of phase II trials the editors consider appropriate for publication in Journal of Clinical Oncology (JCO).
There are several reasons why the single-arm phase II design might not be predictive of benefit when a new agent or combination is tested in the phase III setting. Typically such designs require a prespecified improvement in response rate, compared to historical control data, as an indication of phase III promise. However, it is well-known that historical control data are moving targets. What was representative 10 years ago might not be representative today, depending on shifts in disease presentation and patient referral patterns (population drift), improvements in radiographic and surgical staging techniques (stage migration), and improvements in the ability to assess response. In addition, the use of response rate as a measure of important clinical activity is not always straightforward; there are now several examples of cytotoxic regimens capable of increasing response rate without translating into improved overall survival. Conversely, there are other examples of drugs that yield an important prolongation of survival due to cytostatic mechanisms, without appreciable improvement in response rate.
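The "moving target" problem with historical controls can be illustrated numerically. The sketch below assumes a hypothetical single arm design whose cutoff was fixed against a historical response rate of 20%, then asks how often a regimen that is no better than standard therapy would still pass if the true contemporary control rate had drifted upward. All numbers are illustrative assumptions, not data from this editorial.

```python
from math import comb


def false_positive_rate(n, r, true_control_rate):
    """P(at least r of n respond) when the regimen is no better than
    standard therapy, whose true response rate is true_control_rate."""
    p = true_control_rate
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(r, n + 1))


# Hypothetical design fixed in advance against a historical rate of 20%:
# 40 patients, regimen declared promising if 12 or more respond.
N_PATIENTS, CUTOFF = 40, 12

# Assumed contemporary control rates; 0.20 means no drift has occurred.
drift_table = {
    rate: false_positive_rate(N_PATIENTS, CUTOFF, rate)
    for rate in (0.20, 0.25, 0.30)
}

for rate, fpr in drift_table.items():
    print(f"true control rate {rate:.0%}: false-positive rate {fpr:.3f}")
```

In this sketch the false-positive rate rises steadily as the assumed control rate drifts above the historical value, although the design itself has not changed.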
These and other issues related to the inadequacy of the single arm phase II design have been well described in several recent reviews. Despite these concerns, the single arm phase II design continues to have a role in disease settings in which the behavior of historical controls has remained stable over time, the likelihood of response to standard therapeutic options is low, the desired effect size of the novel agent is large, and the drug’s mechanism of action is expected to be cytotoxic as opposed to cytostatic (permitting use of response rate as an end point). Some of these features are characteristic of relapsed, platinum-resistant ovarian cancer, where the single arm phase II trial still has a place in identifying single agents with promising activity. Conversely, adding an experimental agent to an active cytotoxic platform (eg, adding a third drug to paclitaxel and carboplatin in newly diagnosed ovarian cancer) may yield uninterpretable results in a single arm phase II trial, because the baseline activity of the regimen is already high, and the ability of this design to reliably distinguish between response rates is low.
The randomized phase II trial is a well-known platform for testing the efficacy of novel agents in oncology, with the potential of minimizing some of the pitfalls inherent in the single arm phase II design. Such studies fall into three main groups, including randomized selection design (“pick the winner”), in which the best of two or more arms is chosen for further evaluation; randomized comparison design, in which a formal statistical comparison is made between the experimental and control groups; and randomized discontinuation design. It is not the goal of this editorial to discuss any of these designs in depth, although a few points are worth noting.
The randomized selection design typically does not involve a standard therapy control but instead randomly enrolls patients onto two or more experimental arms, often evaluating different drug doses or schedules. The “best” arm is usually chosen based on predetermined response criteria but is still subject to the possibility of false-positive results (type I error in the range of 10% to 20%). In the randomized selection design, no formal attempt is made to compare any of the experimental arms with another.
Journal of Clinical Oncology, Editorial, Volume 27, Number 19, July 1, 2009
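The behavior of the randomized selection ("pick the winner") design described above can be sketched with an exact calculation. The example assumes two experimental arms of equal size and a simple rule that selects the arm with more responders, breaking ties with a coin flip. The arm size and response rates are hypothetical, and the selection rule is a simplification of the predetermined response criteria used in practice.

```python
from math import comb


def pmf(n, k, p):
    """Binomial probability of exactly k responders among n patients."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_select_arm_b(n, p_a, p_b):
    """Probability that arm B is picked over arm A.

    Each arm enrolls n patients; the arm with more responders is picked,
    and a tie is split 50/50 (an assumed tie-breaking rule).
    """
    total = 0.0
    for a in range(n + 1):
        prob_a = pmf(n, a, p_a)
        for b in range(n + 1):
            prob_b = pmf(n, b, p_b)
            if b > a:
                total += prob_a * prob_b
            elif b == a:
                total += 0.5 * prob_a * prob_b
    return total


# Hypothetical scenarios with 30 patients per arm.
truly_better = prob_select_arm_b(30, p_a=0.20, p_b=0.35)
no_difference = prob_select_arm_b(30, p_a=0.20, p_b=0.20)
print(f"B truly better: P(select B) = {truly_better:.3f}")
print(f"arms identical: P(select B) = {no_difference:.3f}")
```

When the two arms are truly identical, the rule still names a winner, with each arm chosen half the time. This is one way to see why the selected arm remains subject to false-positive results.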
Stephen A. Cannistra studied this question.