XIS-PM2.5: A daily spatiotemporal machine-learning model for PM2.5 in
the contiguous United States
Abstract
Air-pollution monitoring is sparse across most of the United States, so
geostatistical models are important for reconstructing concentrations of
fine particulate air pollution (PM2.5) for use in health studies. We
present XGBoost-IDW Synthesis (XIS), a daily high-resolution PM2.5
machine-learning model covering the contiguous US from 2003 through
2021. XIS uses aerosol optical depth from satellites and a parsimonious
set of additional predictors to make predictions at arbitrary points,
capturing near-roadway gradients and allowing the estimation of
address-level exposures. We built XIS with a computationally tractable
workflow for extensibility to future years, and we used weighted
evaluation to fairly assess performance in sparsely monitored regions.
Averaged across all years in site-level cross-validation, the weighted
mean absolute error (MAE) of predictions was 2.13 μg/m³, roughly half
the mean absolute deviation from the median, which was
4.23 μg/m³. Relative to a leading product from the US Environmental
Protection Agency, the Fused Air Quality Surface Using Downscaling
(FAQSD), XIS reduced MAE by 22%. We also found a stronger
relationship between PM2.5 and social vulnerability with XIS than with
the FAQSD. Thus, XIS has potential for reconstructing environmental
exposures, and its predictions have applications in environmental
justice and human health.
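
The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes the weighted MAE is a weight-normalized average of absolute errors, and that the baseline is the error of predicting the median everywhere. The abstract does not describe the weighting scheme, so the weights, the use of an unweighted median, the function names, and the sample values below are all hypothetical and are not taken from the XIS workflow.

```python
from statistics import median


def weighted_mae(observed, predicted, weights):
    """Weighted mean absolute error: sum(w * |y - p|) / sum(w)."""
    if not (len(observed) == len(predicted) == len(weights)):
        raise ValueError("inputs must have equal length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(
        w * abs(y - p) for y, p, w in zip(observed, predicted, weights)
    ) / total


def baseline_mad(observed, weights):
    """Error of a constant predictor equal to the median of the observations.

    Assumption: the median is unweighted; a weighted median is another option.
    """
    m = median(observed)
    return weighted_mae(observed, [m] * len(observed), weights)


def relative_reduction(new_error, old_error):
    """Fractional reduction in error of a new model relative to a reference."""
    return 1.0 - new_error / old_error


# Made-up illustrative values, not data from the study.
observed = [5.0, 8.0, 12.0, 20.0]
predicted = [6.0, 7.0, 10.0, 17.0]
weights = [1.0, 1.0, 2.0, 4.0]  # hypothetical: larger for sparsely monitored sites

mae = weighted_mae(observed, predicted, weights)  # 2.25
mad = baseline_mad(observed, weights)             # 6.375
print(mae, mad, relative_reduction(mae, mad))
```

Applied to the figures reported in the abstract, `relative_reduction(2.13, 4.23)` gives about 0.50, meaning the model's error is roughly half that of the median baseline.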