SUPPLEMENTARY FIGURE LEGENDS
Supplementary Figure 1. Comparison of the probability density histogram of a representative data set generated by the generative adversarial network (teal bars) with the probability density histogram of the test data (salmon bars) from the univariate analyses for 8 of the 16 diabetes-associated biomarkers. The dark gray bars correspond to the regions of overlap between the two probability density histograms. The eight biomarkers not shown in Figures 2A-H are shown here: alanine aminotransferase (ALT) (Supplementary Figure 1I), aspartate aminotransferase (AST) (Supplementary Figure 1J), gamma glutamyl transferase (GGT) (Supplementary Figure 1K), uric acid (Supplementary Figure 1L), high sensitivity C-reactive protein (Supplementary Figure 1M), direct HDL-cholesterol (Supplementary Figure 1N), average systolic blood pressure (Supplementary Figure 1O), and ferritin (Supplementary Figure 1P). The x-axes on all graphs are biomarker levels that are log-transformed and scaled to lie between -1 and 1. The p-values from the Kolmogorov-Smirnov test are shown at the top left.
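The preprocessing and test named in this legend can be illustrated with a minimal sketch. It rests on assumptions that the legend does not confirm: a natural logarithm, min-max scaling to [-1, 1], and an asymptotic approximation to the two-sample Kolmogorov-Smirnov p-value. The function names (log_scale, ks_statistic, ks_pvalue) and the sample values are hypothetical, and this is not necessarily the procedure used to produce the figure.

```python
import math

def log_scale(values):
    """Log-transform positive values, then min-max scale them to [-1, 1].
    Assumption: natural log and min-max scaling; the legend names neither."""
    logs = [math.log(v) for v in values]
    lo, hi = min(logs), max(logs)
    if hi == lo:
        return [0.0 for _ in logs]
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in logs]

def ks_statistic(a, b):
    """Two-sample KS statistic: largest absolute gap between the two
    empirical cumulative distribution functions."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past every observation <= x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def ks_pvalue(d, n, m, terms=100):
    """Asymptotic p-value from the Kolmogorov distribution.
    Assumption: large-sample approximation, not an exact small-sample test."""
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        return 1.0  # the series is numerically indistinguishable from 1 here
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))

# Hypothetical biomarker values, for illustration only.
generated = log_scale([12.0, 18.0, 25.0, 40.0, 75.0])
test = log_scale([11.0, 20.0, 27.0, 38.0, 80.0])
d = ks_statistic(generated, test)
p = ks_pvalue(d, len(generated), len(test))
```

A large p-value here would mean that the test does not detect a difference between the generated and test distributions. It does not prove that they are identical.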
Supplementary Figure 2. Quantile-quantile plots that compare the distribution of a representative data set generated by the generative adversarial network with the distribution of the test data from the univariate analyses of the 16 diabetes-associated biomarkers. The circles correspond to the data and the salmon line is the line of identity. The biomarkers shown are: urine albumin (Supplementary Figure 2A), urine creatinine (Supplementary Figure 2B), fasting glucose (Supplementary Figure 2C), insulin (Supplementary Figure 2D), body mass index (Supplementary Figure 2E), glycohemoglobin (Supplementary Figure 2F), triglyceride (Supplementary Figure 2G), total cholesterol (Supplementary Figure 2H), alanine aminotransferase (ALT) (Supplementary Figure 2I), aspartate aminotransferase (AST) (Supplementary Figure 2J), gamma glutamyl transferase (GGT) (Supplementary Figure 2K), uric acid (Supplementary Figure 2L), high sensitivity C-reactive protein (Supplementary Figure 2M), direct HDL-cholesterol (Supplementary Figure 2N), average systolic blood pressure (Supplementary Figure 2O), and ferritin (Supplementary Figure 2P).
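As a rough illustration of how the points in a quantile-quantile plot can be obtained, the sketch below pairs the quantiles of two samples at the same probabilities. Points that fall on the line of identity indicate matching quantiles. The helper names (quantile, qq_points), the linear-interpolation rule, the number of probabilities, and the assignment of test data to the first coordinate are all assumptions made for illustration and are not taken from the figure.

```python
def quantile(sorted_vals, p):
    """Linearly interpolated quantile of an already sorted sample, 0 <= p <= 1."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac

def qq_points(generated, test, n_points=99):
    """Return (test quantile, generated quantile) pairs at evenly spaced
    probabilities. Assumption: test data on the first coordinate."""
    g, t = sorted(generated), sorted(test)
    probs = [k / (n_points + 1) for k in range(1, n_points + 1)]
    return [(quantile(t, p), quantile(g, p)) for p in probs]

# Identical samples give points that lie exactly on the line of identity.
points = qq_points([1, 2, 3, 4, 5], [1, 2, 3, 4, 5], n_points=3)
on_identity = all(x == y for x, y in points)
```

If one sample is shifted by a constant, every point moves off the identity line by that constant. This is the kind of systematic departure the plots are meant to reveal.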
Supplementary Figure 3. Two-dimensional uniform manifold approximation and projection (UMAP) projections of the 14-dimensional, diabetes-associated biomarker data for the Black, Hispanic, Other and White race categories. The test data results are shown in salmon, and the GAN-generated results are in teal. The x-axis (UMAP X) and y-axis (UMAP Y) correspond to the UMAP projections into two dimensions of the 14-dimensional input of biomarker levels that are log-transformed and scaled to lie between -1 and 1.
Supplementary Figure 4. Box plots of the univariate results from the 14-dimensional, diabetes-associated biomarker data for the Black, Hispanic, Other and White race categories. The test data are shown in salmon, and the GAN-generated results are in teal. The six biomarkers not shown in Figures 5A-H are shown here: alanine aminotransferase (ALT) (Supplementary Figure 4I), aspartate aminotransferase (AST) (Supplementary Figure 4J), gamma glutamyl transferase (GGT) (Supplementary Figure 4K), uric acid (Supplementary Figure 4L), direct HDL-cholesterol (Supplementary Figure 4M), and average systolic blood pressure (Supplementary Figure 4N). The x-axes on all graphs are biomarker levels that are log-transformed and scaled to lie between -1 and 1. The lines on each box correspond to the 25th percentile, median and 75th percentile, the error bars correspond to the median ± 1.5 inter-quartile range, and the outliers are shown as black circles.