ABSTRACT
Background: Clinical trial simulations and pharmacometric
modeling of biomarker profiles for under-represented groups are
challenging because the underlying studies frequently do not have
sufficient participants from these groups.
Objectives: To investigate generative adversarial networks
(GANs), an artificial intelligence (AI) technology that enables
realistic simulations of complex patterns, for modeling clinical
biomarker profiles of under-represented groups.
Methods: GANs consist of generator and discriminator neural
networks that operate in tandem. GAN architectures were developed for
modeling univariate and joint distributions of a panel of 16
diabetes-relevant biomarkers from the National Health and Nutrition
Examination Survey (NHANES), which contains laboratory and clinical
biomarker data from a population-based sample of individuals of all
ages, racial groups, and ethnicities. Conditional GANs were used to
model biomarker profiles for race/ethnicity categories. GAN performance
was assessed by comparing GAN outputs to test data.
Results: The biomarkers exhibited non-normal distributions and
varied in their bivariate correlation patterns. Univariate distributions
were modeled with generator and discriminator neural networks consisting
of two dense layers with rectified linear unit (ReLU) activation. The
distributions of GAN-generated biomarkers were similar to the test data
distributions. The joint distributions of the biomarker panel in the
GAN-generated data were dispersed and overlapped with the joint
distribution of the test data, as assessed by three multi-dimensional
projection methods. Conditional GANs satisfactorily modeled the joint
distribution of the biomarker panel in the Black, Hispanic, White, and
“Other” race/ethnicity categories.
Conclusions: GANs are a promising AI approach for generating
virtual patient data with realistic biomarker distributions for
under-represented race/ethnicity groups.
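
Illustrative sketch: To make the univariate architecture described above concrete, the following minimal Python example shows one forward pass through a generator and a discriminator, each built from two dense layers with ReLU activation on the hidden layer. It is a sketch under stated assumptions, not the study's implementation: the layer widths, the noise dimension, the linear generator output, the sigmoid discriminator output, the cross-entropy losses, and the placeholder "real" value are all hypothetical choices made for illustration, and no training step is shown.

```python
import math
import random


def dense(inputs, weights, biases):
    """Affine layer: one output per row of `weights`."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_in, n_out, rng):
    """Small uniform random weights and zero biases (assumed initialization)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def make_network(n_in, n_hidden, n_out, rng):
    """Two dense layers: input -> hidden -> output."""
    return [init_layer(n_in, n_hidden, rng), init_layer(n_hidden, n_out, rng)]


def generator(noise, params):
    """Map a noise vector to a synthetic biomarker value (linear output, assumed)."""
    (w1, b1), (w2, b2) = params
    hidden = relu(dense(noise, w1, b1))
    return dense(hidden, w2, b2)


def discriminator(sample, params):
    """Return the estimated probability that `sample` is real (sigmoid output, assumed)."""
    (w1, b1), (w2, b2) = params
    hidden = relu(dense(sample, w1, b1))
    return sigmoid(dense(hidden, w2, b2)[0])


def discriminator_loss(p_real, p_fake, eps=1e-12):
    """Binary cross-entropy: reward high p_real and low p_fake."""
    return -(math.log(p_real + eps) + math.log(1.0 - p_fake + eps))


def generator_loss(p_fake, eps=1e-12):
    """Non-saturating generator loss: reward fooling the discriminator."""
    return -math.log(p_fake + eps)


rng = random.Random(0)
NOISE_DIM, HIDDEN, DATA_DIM = 4, 8, 1  # hypothetical sizes

g_params = make_network(NOISE_DIM, HIDDEN, DATA_DIM, rng)
d_params = make_network(DATA_DIM, HIDDEN, 1, rng)

noise = [rng.gauss(0.0, 1.0) for _ in range(NOISE_DIM)]
fake = generator(noise, g_params)
real = [0.3]  # placeholder standing in for one standardized biomarker value

p_real = discriminator(real, d_params)
p_fake = discriminator(fake, d_params)
d_loss = discriminator_loss(p_real, p_fake)
g_loss = generator_loss(p_fake)
```

In a full training loop, the two losses would be minimized alternately so that the generator and discriminator operate in tandem. A conditional variant would, under the same assumptions, append an encoded race/ethnicity category to the inputs of both networks.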