SUPPLEMENTARY FIGURE LEGENDS
Supplementary Figure 1. Supplementary Figure 1 compares the
probability density histogram of a representative generated data set
from the generative adversarial network (teal bars) to the probability
density histogram of the test data (salmon bars) from the univariate
analyses of 8 diabetes-associated biomarkers. The dark gray bars
correspond to the regions of overlap between the two probability density
histograms. The eight biomarkers not shown in Figures 2A-H are shown
here: alanine aminotransferase (ALT) (Supplementary Figure 1I),
aspartate aminotransferase (AST) (Supplementary Figure 1J), gamma
glutamyl transferase (GGT) (Supplementary Figure 1K), uric acid
(Supplementary Figure 1L), high sensitivity C-reactive protein
(Supplementary Figure 1M), direct HDL-cholesterol (Supplementary Figure
1N), average systolic blood pressure (Supplementary Figure 1O), and
ferritin (Supplementary Figure 1P). The x-axes on all graphs are
biomarker levels that are log-transformed and scaled to lie between -1
and 1. The p-values from the Kolmogorov-Smirnov test are shown at
the top left of each panel.
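The preprocessing and the test named in this legend can be sketched as follows. This is a minimal illustration, not the authors' code: the natural logarithm, the use of each sample's own minimum and maximum for scaling, and the function names `log_scale` and `ks_statistic` are all assumptions. The sketch returns only the Kolmogorov-Smirnov statistic; a p-value would normally come from a statistics library.

```python
import math
from bisect import bisect_right


def log_scale(values):
    """Log-transform positive values, then min-max scale them to [-1, 1].

    Assumption: natural log and the sample's own min/max are used.
    """
    logs = [math.log(v) for v in values]
    lo, hi = min(logs), max(logs)
    if hi == lo:
        # A constant sample has no spread; map every value to the midpoint.
        return [0.0 for _ in logs]
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in logs]


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    This is the largest absolute gap between the two empirical CDFs,
    which is always attained at one of the observed data points.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d
```

A statistic near 0 means the generated and test distributions are close; a statistic near 1 means they barely overlap.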
Supplementary Figure 2. Supplementary Figure 2 shows
quantile-quantile plots that compare the distribution of a
representative generated data set from the generative adversarial
network to the distribution of the test data from the univariate
analyses of the 16 diabetes-associated biomarkers. The circles
correspond to the data and the salmon line is the line of identity. The
biomarkers shown are: urine albumin (Supplementary Figure 2A), urine
creatinine (Supplementary Figure 2B), fasting glucose (Supplementary
Figure 2C), insulin (Supplementary Figure 2D), body mass index
(Supplementary Figure 2E), glycohemoglobin (Supplementary Figure 2F),
triglyceride (Supplementary Figure 2G), total cholesterol
(Supplementary Figure 2H), alanine aminotransferase
(ALT) (Supplementary Figure 2I), aspartate aminotransferase (AST)
(Supplementary Figure 2J), gamma glutamyl transferase (GGT)
(Supplementary Figure 2K),
uric acid (Supplementary Figure 2L), high sensitivity C-reactive protein
(Supplementary Figure 2M), direct HDL-cholesterol (Supplementary Figure
2N), average systolic blood pressure (Supplementary Figure 2O), and
ferritin (Supplementary Figure 2P).
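The points in a quantile-quantile plot such as those described above can be computed as in the sketch below. It is illustrative only: the function name `qq_points`, the choice of 99 cut points, the "inclusive" interpolation method, and the placement of the test data on the x-axis are assumptions that the legend does not specify.

```python
from statistics import quantiles


def qq_points(generated, test, n=100):
    """Return (test quantile, generated quantile) pairs at n - 1 cut points.

    If both samples follow the same distribution, the pairs fall close to
    the line of identity y = x.
    """
    q_gen = quantiles(generated, n=n, method="inclusive")
    q_test = quantiles(test, n=n, method="inclusive")
    return list(zip(q_test, q_gen))
```

Each returned pair is one circle on the plot. A systematic departure from the line of identity indicates a shift or a difference in shape between the generated and test distributions.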
Supplementary Figure 3. The uniform manifold approximation and
projection (UMAP) two-dimensional projections of the 14-dimensional,
diabetes-associated biomarkers data for the Black, Hispanic, Other and
White race categories. The test data results are shown in salmon, and
the GAN-generated results are in teal. The x-axis (UMAP X) and
y-axis (UMAP Y) correspond to the UMAP projections into two
dimensions of the input of 14-dimensional biomarker levels that are
log-transformed and scaled to lie between -1 and 1.
Supplementary Figure 4. Box plots of the univariate results
from 14-dimensional, diabetes-associated biomarkers data for the Black,
Hispanic, Other and White race categories. The test data are shown in
salmon, and the GAN-generated results are in teal. The six biomarkers
not shown in Figures 5A-H are shown here: alanine aminotransferase (ALT)
(Supplementary Figure 4I), aspartate aminotransferase (AST)
(Supplementary Figure 4J), gamma glutamyl transferase (GGT)
(Supplementary Figure 4K), uric acid (Supplementary Figure 4L), direct
HDL-cholesterol (Supplementary Figure 4M), and average systolic blood
pressure (Supplementary Figure 4N). The x-axes on all graphs are
biomarker levels that are log-transformed and scaled to lie between -1
and 1. The lines on the box correspond to the 25th percentile, median
and 75th percentile; the error bars correspond to the median ± 1.5
inter-quartile range; and the outliers are shown as black circles.
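The box plot quantities described in this legend can be sketched as follows. This is an illustration under stated assumptions, not the authors' plotting code: the function name `box_stats` and the "inclusive" quartile method are hypothetical choices, and the error-bar limits follow the legend's wording (median ± 1.5 inter-quartile range). Many plotting libraries instead measure 1.5 inter-quartile ranges from the quartiles.

```python
from statistics import quantiles


def box_stats(values):
    """Summarize one box: quartiles, error-bar limits, and outliers."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Limits as worded in the legend: median +/- 1.5 * IQR (assumption).
    lower = med - 1.5 * iqr
    upper = med + 1.5 * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return {
        "q1": q1,
        "median": med,
        "q3": q3,
        "lower": lower,
        "upper": upper,
        "outliers": outliers,
    }
```

Applying `box_stats` separately to the test and generated values for each race category would give the numbers behind each pair of salmon and teal boxes.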