Study Design and Statistical Methods
The primary endpoint for AALL02P2 was EFS, compared with historical outcomes on POG 9412 and POG 9061, which had 3-year EFS rates for late iCNS-R of approximately 75%. AALL02P2 was designed to accrue 143 patients with late iCNS-R over 5.72 years. The number of events at 3 years was modeled with a binomial distribution: if at least 41 events (induction failure, relapse, second malignancy, or death) occurred among the 143 patients, it would be concluded that the 3-year EFS was less than 75%. With this design, there was a 17.9% chance of erroneously concluding that EFS had decreased when the true 3-year EFS was 75%, and a 95.5% probability of concluding that EFS had decreased when the true 3-year EFS was 65%. Interim monitoring was conducted to protect against poor EFS. Under the exponential assumption, a 3-year EFS of 75% translates to a hazard rate of relapse of 0.096 per year, and interim analyses were based on the estimated hazard rate. The αt² spending function was used to maintain an overall one-sided Type I error rate of 18%.
At the third protocol-specified interim analysis for EFS, 113 of the expected 143 eligible patients with iCNS-R had been accrued. The 3-year EFS rate for these patients was 60.27 ± 7.18%, with a 3-year overall survival rate of 75.20 ± 6.24%. Thirty of the expected 41 events had occurred. Under the αt² spending function, a p-value less than or equal to 0.084 was required to cross the monitoring boundary; the observed p-value was 0.02. The boundary was therefore crossed, indicating that outcomes on this study were inferior to those seen on POG 9412, and the COG Data Monitoring Committee permanently closed AALL02P2.
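The design quantities quoted above can be checked with a short calculation. The following is a minimal sketch, not the protocol's own software: it assumes the number of events among 143 patients is exactly binomial, that the exponential hazard is expressed per year, and that the information fraction at an interim look is the ratio of observed to expected events (30/41). The function and variable names are illustrative only.

```python
from math import comb, log

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

n_patients, min_events = 143, 41

# Chance of declaring a decrease when the true 3-year EFS is 75%
# (event probability 0.25), i.e. the one-sided Type I error.
type_one_error = binom_tail(n_patients, min_events, 0.25)

# Chance of declaring a decrease when the true 3-year EFS is 65%
# (event probability 0.35), i.e. the power.
power = binom_tail(n_patients, min_events, 0.35)

# Exponential model: S(3) = exp(-3 * hazard) = 0.75.
hazard_per_year = -log(0.75) / 3

# Cumulative alpha spent by an alpha * t^2 spending function,
# with t assumed to be observed events / expected events.
info_fraction = 30 / 41
alpha_spent = 0.18 * info_fraction**2

print(f"Type I error:    {type_one_error:.3f}")
print(f"Power:           {power:.3f}")
print(f"Hazard per year: {hazard_per_year:.3f}")
print(f"Alpha spent:     {alpha_spent:.3f}")
```

The two tail probabilities should land close to the 17.9% and 95.5% quoted in the design, and the hazard rounds to 0.096. The cumulative alpha spent is not the same quantity as the nominal boundary of 0.084: the boundary at a given look also depends on the alpha already spent at earlier looks, whose timing is not given here.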
EFS was calculated as the time from enrollment to first event (induction death, induction failure, relapse at any site, second malignant neoplasm, or remission death from any cause) or last contact, and overall survival (OS) was defined as the time from enrollment to death from any cause or last contact. Survival rates were estimated using the Kaplan-Meier method, and the corresponding standard errors were based on the method of Peto et al.38,39 The two-sided log-rank test was used to compare survival curves between groups. P-values < 0.05 were considered statistically significant. Data frozen as of September 30, 2016 are included in this report.
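As an illustration of the estimation step, the sketch below computes a Kaplan-Meier estimate at a fixed horizon together with a Peto-type standard error, taken here in its commonly cited form SE = S(t) × sqrt((1 − S(t)) / n), where n is the number of patients still under follow-up beyond t. The follow-up times and event flags are invented toy values, not study data, and the function name is hypothetical.

```python
from math import sqrt

def kaplan_meier_peto(times, events, horizon):
    """Return (S(horizon), Peto SE) from follow-up times and 0/1 event flags."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    i = 0
    while i < len(data) and data[i][0] <= horizon:
        t = data[i][0]
        n_events = n_leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            surv *= 1 - n_events / at_risk  # product-limit step
        at_risk -= n_leaving  # events and censored both leave the risk set
    # at_risk now counts patients followed beyond the horizon.
    se = surv * sqrt((1 - surv) / at_risk) if at_risk > 0 else float("nan")
    return surv, se

# Toy data (years from enrollment; 1 = event, 0 = censored at last contact).
toy_times = [0.5, 1.0, 1.5, 2.0, 2.5, 3.5, 4.0, 4.5, 5.0, 6.0]
toy_events = [1, 0, 1, 0, 1, 0, 0, 1, 0, 0]

surv, se = kaplan_meier_peto(toy_times, toy_events, horizon=3.0)
print(f"3-year estimate: {surv:.4f} +/- {se:.4f}")
```

Because the Peto standard error depends on the number still at risk at the horizon, it widens as follow-up thins out, which makes it more conservative than the Greenwood formula in the tail of the curve.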