Published February 2017
Atmospheric Environment

Big spatial data and machine learning methods are winning the battle over the tiniest of moving targets: the elemental components of particulate matter that pollute the air we breathe.

An emerging approach to assessing pollutant exposures, called a land use random forest (LURF) model, proved more effective than the current standard, the land use regression (LUR) model, at estimating personal pollution exposure. The study suggests that LURF could more accurately connect pollution risk levels with chronic health outcomes in children.

“These models could someday help guide patient care by supplying clinicians with a bigger picture of the environment in which a patient spends their time,” says Cole Brokamp, a research fellow and first author. The paper was his PhD dissertation. Patrick Ryan, PhD, MS, was senior author. Both are with the Division of Biostatistics and Epidemiology.

Using ambient air sampling data from 24 sampling stations in urban Cincinnati, the team compared how well the LURF and LUR models estimated 11 elemental components of pollution, ranging from aluminum to zinc. The models factored in more than 50 predictors associated with transportation, physical features, community socioeconomic characteristics, greenspace, land cover, and emission point sources.

The LURF model proved more effective at capturing complex interactions and nonlinear relationships between land use predictors and pollutant concentrations.
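
For readers who want a concrete feel for that difference, the following is a minimal sketch on synthetic data and is not the study's code, data, or predictors. It assumes a single hypothetical predictor with a V-shaped relationship to concentration, and compares a straight-line fit with bagged regression trees, the building block of a random forest. With only one predictor there is no feature subsampling and no interaction to capture, so this illustrates only the nonlinearity part of the claim. All function and variable names are illustrative.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns a prediction function."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return lambda x: my + slope * (x - mx)


def fit_tree(points, min_leaf=2):
    """Grow a regression tree on (x, y) pairs by minimizing squared error."""
    points = sorted(points)
    ys = [y for _, y in points]
    best = None
    if len(points) >= 2 * min_leaf:
        for i in range(min_leaf, len(points) - min_leaf + 1):
            if points[i - 1][0] == points[i][0]:
                continue  # cannot split between identical x values
            left, right = ys[:i], ys[i:]
            ml, mr = mean(left), mean(right)
            sse = (sum((y - ml) ** 2 for y in left)
                   + sum((y - mr) ** 2 for y in right))
            if best is None or sse < best[0]:
                best = (sse, i)
    if best is None:
        leaf = mean(ys)  # no valid split: predict the leaf average
        return lambda x: leaf
    i = best[1]
    threshold = (points[i - 1][0] + points[i][0]) / 2
    lower = fit_tree(points[:i], min_leaf)
    upper = fit_tree(points[i:], min_leaf)
    return lambda x: lower(x) if x <= threshold else upper(x)


def fit_forest(xs, ys, n_trees=50, seed=0):
    """Bagging: average trees grown on bootstrap resamples of the data."""
    rng = random.Random(seed)
    data = list(zip(xs, ys))
    trees = [fit_tree(rng.choices(data, k=len(data))) for _ in range(n_trees)]
    return lambda x: mean([tree(x) for tree in trees])


def mse(model, xs, ys):
    """Mean squared prediction error of a fitted model."""
    return mean([(model(x) - y) ** 2 for x, y in zip(xs, ys)])


def synthetic_concentration(x):
    """Assumed V-shaped relationship, chosen only for illustration."""
    return 10 * abs(x - 0.5)


# Synthetic "sites": every other point is held out for testing.
xs = [i / 39 for i in range(40)]
train_x, test_x = xs[0::2], xs[1::2]
train_y = [synthetic_concentration(x) for x in train_x]
test_y = [synthetic_concentration(x) for x in test_x]

line_mse = mse(fit_line(train_x, train_y), test_x, test_y)
forest_mse = mse(fit_forest(train_x, train_y), test_x, test_y)
print(f"linear MSE: {line_mse:.3f}  bagged-tree MSE: {forest_mse:.3f}")
```

On this synthetic example the straight line is nearly flat because the two arms of the V cancel out, while the averaged trees follow the shape piece by piece, so the held-out error of the tree ensemble is lower. Real land use data are noisier and involve many more predictors, so the size of any advantage there is an empirical question.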

“Advancing exposure science methodology is very exciting,” Brokamp says, “but the most rewarding aspect is probably seeing the exposure models implemented in health studies.”