2 | METHODS
| Cancer registry and patient information
Construction of this single-institution childhood cancer survivorship
cohort was based on cancer registry data and integration of EHR data
elements linked by the medical record number (MRN). As part of the
accreditation by the American College of Surgeons Commission on Cancer,
centers are required to report all newly diagnosed cases to the
NCDB.33 Centers are also required to report cases to
respective state registries regardless of their accreditation status.
Inclusion criteria for the construction of this childhood cancer
survivorship cohort (Figure 1) included patients ≤18 years of age with a
diagnosis of a malignancy reported to the cancer registry. The cohort
was limited to patients seen in the pediatric oncology (PHO) or
neuro-oncology (PNO) clinics between January 1, 1994 and November 30,
2012 to ensure a seven-year follow-up period for all survivors.
Patients who died or had a documented relapse during the seven-year
follow-up window after the date of diagnosis were excluded. In order to
exclude referrals for refractory or relapsed cases that were not
diagnosed and treated at this institution, only analytic cases were
included. Analytic cases are defined by the Facility Oncology Registry
Data Standards (FORDS) Manual as cases that were diagnosed at, and/or
received all or part of the first course of treatment at, the reporting
facility (Duke University Medical Center).
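
The inclusion and exclusion rules above can be expressed as a single eligibility check. The following is a minimal sketch only, not the authors' implementation: the record layout, field names, and the `is_eligible` helper are hypothetical, and treating the window boundaries as inclusive is an assumption.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

# Study window for PHO/PNO clinic visits, as stated in the text.
WINDOW_START = date(1994, 1, 1)
WINDOW_END = date(2012, 11, 30)
FOLLOW_UP_YEARS = 7


@dataclass
class RegistryCase:
    """Hypothetical record layout; field names are illustrative only."""
    age_at_diagnosis: float                      # years
    diagnosis_date: date
    is_analytic: bool                            # analytic case per FORDS
    clinic_visits: List[date] = field(default_factory=list)  # PHO/PNO visits
    death_date: Optional[date] = None
    relapse_date: Optional[date] = None


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years, mapping Feb 29 to Feb 28 when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def is_eligible(case: RegistryCase) -> bool:
    """Apply the cohort criteria described in the text to one registry case."""
    if case.age_at_diagnosis > 18:
        return False
    if not case.is_analytic:
        return False
    # At least one PHO/PNO visit must fall inside the study window.
    if not any(WINDOW_START <= v <= WINDOW_END for v in case.clinic_visits):
        return False
    # Exclude death or documented relapse within seven years of diagnosis.
    cutoff = add_years(case.diagnosis_date, FOLLOW_UP_YEARS)
    for event in (case.death_date, case.relapse_date):
        if event is not None and event <= cutoff:
            return False
    return True
```

Keeping each criterion as its own early return makes it straightforward to count how many cases are dropped at each step, which is the information a cohort flow diagram reports.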
| Disease classification
Cancer diagnoses were grouped according to the International
Classification of Childhood Cancer, third revision
(ICCC-3),34 by using the International Classification
of Diseases for Oncology, third revision (ICD-O-3), as reported in the
cancer registry (Table 1). The ICD-O-3 codes were then used to further
classify diagnoses and group patients into malignancy categories
outlined by the BCCSS.6 Additionally, for brain tumor
patients, ICD-O-3 topography codes for central nervous system (CNS)
locations were used to mitigate misclassification based on primary
pathologic diagnosis (e.g., intracranial mixed germ cell tumors).
| Risk stratification
The cancer registry captures the first course of treatment based on
chart review by a certified cancer registrar in accordance with the
FORDS Manual.35 Exposures are reported as dichotomous
(Yes/No) for surgery, diagnostic biopsy, radiation, chemotherapy,
hormonal therapy, immunotherapy, other, palliative, and transplant.
Based on these exposures and the primary diagnosis classification, risk
strata were constructed from the BCCSS system (Table
1).6
| Follow-up definitions
The institutional cancer registry provided the base cohort for all
childhood cancer diagnoses. These registry data were merged, using MRN
and a durable key unique patient identifier, with EHR data through the
Duke Enterprise Data Unified Content Explorer (DEDUCE) to extract all
PHO and PNO clinic encounters and identify eligible
patients. To determine appropriate follow-up, all visits in the PBMT
clinic and the Duke Cancer Institute were also extracted. Inadequate
follow-up was defined as a survivor not being seen during the five- to
seven-year window after the date of initial diagnosis.
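
The follow-up definition can be illustrated with a short sketch. This is not the authors' code: the function name and arguments are hypothetical, and it assumes that the window boundaries (exactly five and exactly seven years after diagnosis) are inclusive, which the text does not specify.

```python
from datetime import date
from typing import Iterable


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years, mapping Feb 29 to Feb 28 when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def had_adequate_follow_up(diagnosis_date: date,
                           visit_dates: Iterable[date],
                           start_years: int = 5,
                           end_years: int = 7) -> bool:
    """Return True if any visit falls five to seven years after diagnosis."""
    window_start = add_years(diagnosis_date, start_years)
    window_end = add_years(diagnosis_date, end_years)
    return any(window_start <= v <= window_end for v in visit_dates)


# Hypothetical survivor diagnosed on 2000-03-15 with two later visits.
seen = had_adequate_follow_up(date(2000, 3, 15),
                              [date(2004, 1, 1), date(2006, 3, 15)])
```

A survivor for whom this check returns False would be classified as having inadequate follow-up under the definition above.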
| Spatial variables
DEDUCE was also used to export the longitude and latitude coordinates of
the home address, zip code, and the census block group Federal
Information Processing Standards (FIPS) code for each survivor. Using
ArcGIS 10.5.1 (ESRI, Redlands, CA), we calculated the Euclidean
(straight line) distance from the address of each survivor to the
nearest COG-affiliate site36 in North Carolina (NC),
South Carolina (SC), and Virginia (VA). Analysis was limited to
survivors whose coordinates were in NC, SC, or VA. Additionally, using
spatial point-in-polygon joining operations, we identified the zip
code-level Rural-Urban Commuting Area (RUCA) codes and the block
group-level Area Deprivation Index (ADI) for each survivor. RUCA is a
categorical classification for rural vs urban areas that takes into
account population density and distance to nearest urban centers. ADI is
an indexed composite of seventeen variables related to social
determinants of health from the United States Census and American
Community Survey that captures socioeconomic disadvantage at the census
block group level.37,38 A high ADI percentile, which
represents greater disadvantage, has been shown to correlate with a
number of adverse health outcomes.39,40
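
The nearest-site distance calculation was performed in ArcGIS; the sketch below only illustrates the underlying idea in plain Python. It assumes that home and site locations have already been projected to a planar coordinate system in meters (raw longitude and latitude should not be passed in), and the function name and sample coordinates are hypothetical.

```python
import math
from typing import Iterable, Tuple

Point = Tuple[float, float]  # (x, y) in meters, in a projected coordinate system


def nearest_site_distance(home: Point, sites: Iterable[Point]) -> float:
    """Return the Euclidean (straight-line) distance to the closest site."""
    distances = [math.hypot(home[0] - sx, home[1] - sy) for sx, sy in sites]
    if not distances:
        raise ValueError("at least one site is required")
    return min(distances)


# Hypothetical projected coordinates: one home address and two candidate sites.
distance_m = nearest_site_distance((0.0, 0.0),
                                   [(3000.0, 4000.0), (10000.0, 0.0)])
```

Straight-line distance is a simple proxy for access; it does not account for road networks or travel time.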
| Statistical analyses
Patients were grouped according to whether or not they were seen in a
Duke Cancer Clinic five to seven years after initial date of diagnosis.
Patient characteristics were compared between those who were seen in
this window and those who were not. Using the Cancer Registry,
we utilized the last known date of contact to ensure that patients
survived through the five to seven year window after their initial date
of diagnosis before including them for analysis. Continuous variables
are presented as medians (standard deviations), and differences were
compared using the t-test. Categorical variables are presented as counts
(proportions), and differences were compared using the χ² test. For all
analyses, risk strata were categorized as a three-level
categorical variable (low, intermediate, and high risk).
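
To make the categorical comparison concrete, the sketch below computes the Pearson χ² statistic and its degrees of freedom for a contingency table of counts. It is an illustration in Python rather than the R code used for the analyses, the function name is hypothetical, and it assumes every expected cell count is positive.

```python
from typing import List, Sequence, Tuple


def chi_square_statistic(table: Sequence[Sequence[float]]) -> Tuple[float, int]:
    """Return the Pearson chi-square statistic and degrees of freedom."""
    row_totals: List[float] = [sum(row) for row in table]
    col_totals: List[float] = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df
```

For a table with three risk strata as rows and the two follow-up groups as columns, the test has (3 - 1) x (2 - 1) = 2 degrees of freedom.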
Logistic regression was used to estimate the association between
follow-up and risk stratification both in bivariate analyses and after
adjusting for known covariates including ALL indicator, gender, age at
diagnosis, race, and indicator of local state of residence. Local state
of residence was defined as residing in NC, SC or VA to minimize
potential confounding associations between risk strata, distance from
medical center, and follow-up care. Because our primary variable of
interest (risk stratification) had three levels, we used a
multiple-degree-of-freedom lack-of-fit test to compare a baseline model
that excluded risk stratification with a model that included it.
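
One common way to compare nested logistic models of this kind is a likelihood ratio test; whether this exact statistic was used here is an assumption, since the text names only a multiple degree of freedom lack of fit test. The sketch below takes the maximized log-likelihoods of the two models as inputs (the function name and example values are hypothetical). Adding a three-level factor contributes two parameters, and for two degrees of freedom the χ² upper tail probability has the closed form exp(-x/2).

```python
import math
from typing import Tuple


def likelihood_ratio_test_2df(loglik_reduced: float,
                              loglik_full: float) -> Tuple[float, float]:
    """Likelihood ratio test for nested models that differ by 2 parameters.

    Returns the test statistic and its p-value from the chi-square
    distribution with 2 degrees of freedom, whose survival function
    is exp(-x / 2).
    """
    statistic = 2.0 * (loglik_full - loglik_reduced)
    if statistic < 0:
        raise ValueError("the full model cannot fit worse than the reduced model")
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value


# Hypothetical log-likelihoods for the baseline and risk-strata models.
lr_stat, lr_p = likelihood_ratio_test_2df(-105.0, -100.0)
```

A single two-degree-of-freedom test assesses the risk variable as a whole, rather than testing the intermediate and high strata separately against the low stratum.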
Subsequent models that included risk stratification and an indicator of
local state of residence were used to determine whether the association
between risk stratification and follow-up varied by this broad
geographic indicator. Bonferroni adjustments were made
for multiple comparisons. Only patients with complete data for all
covariates were included for each analysis, and effective sample sizes
are included for all tables and figures. In reviewing the correlation
among all predictors in our models, we found no evidence that suggested
multicollinearity might be an issue. All statistical analyses were
conducted in R version 3.6.1.
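
The Bonferroni adjustment mentioned above multiplies each p-value by the number of comparisons and caps the result at 1. The sketch below shows that rule in Python for illustration; the function name is hypothetical, and which family of comparisons was adjusted together is not stated in the text.

```python
from typing import List, Sequence


def bonferroni_adjust(p_values: Sequence[float]) -> List[float]:
    """Return Bonferroni-adjusted p-values, capped at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values from three pairwise comparisons.
adjusted = bonferroni_adjust([0.01, 0.04, 0.5])
```

Comparing the adjusted values with the nominal significance level is equivalent to comparing the raw values with that level divided by the number of comparisons.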
| RESULTS