Figure legends
Figure 1. Item response model for MDS-UPDRS Part III. Left:
Item scores relate to underlying severity, which mirrors the sum of
scores, through Item Characteristic Curves (ICCs). Upper right: The
position and steepness of ICCs reflect an item’s difficulty and ability
to differentiate patient severity, respectively. The blue, pink, green
and red curves describe the probabilities of having a score of at
least 1, 2, 3 and 4, respectively. Lower right: The blue, pink, green,
red and yellow Category Characteristic Curves (CCCs) describe the
probabilities of having a score of 0, 1, 2, 3 and 4, respectively.
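The relation between the two sets of curves in Figure 1 can be sketched in a few lines of Python. This is a minimal illustration that assumes a logistic graded response form, which the legend itself does not specify; the function names and parameter values are hypothetical and are not estimates from the model.

```python
import math

def cumulative_probs(severity, discrimination, thresholds):
    """P(score >= k) for k = 1..K: one logistic curve per threshold (assumed form)."""
    return [
        1.0 / (1.0 + math.exp(-discrimination * (severity - b)))
        for b in thresholds
    ]

def category_probs(severity, discrimination, thresholds):
    """P(score == k) for k = 0..K: differences of adjacent cumulative curves."""
    # Pad with P(score >= 0) = 1 and P(score >= K + 1) = 0.
    cum = [1.0] + cumulative_probs(severity, discrimination, thresholds) + [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]

# Hypothetical item with four increasing thresholds, hence scores 0 to 4.
probs = category_probs(severity=0.5, discrimination=1.5,
                       thresholds=[-1.0, 0.0, 1.0, 2.0])
print(probs)
```

Under this assumed form, the threshold values set the position of each curve and the discrimination sets its steepness, and the five category probabilities sum to one at any severity.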
Figure 2. Data pattern and model evaluation. The pattern of the
observed Sum-of-Scores data (upper left) was reproduced by modeled
Symptom Severity (upper right). Model-estimated Category Characteristic
Curves (lines) reflected the distribution of observed categories
(circles) for each item over the range of symptom severity (lower left).
The proportions of simulated scores were compared with those of the
observed scores (lower right).
Figure 3. Item informativeness. Item information over the whole
spectrum of symptom severity shows some items are far more informative
than others. The color-coded areas represent the items, stacked from
bottom to top in order of decreasing information.
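How the information of a single item can vary over severity may be sketched as follows. The sketch assumes a logistic graded response form and the standard information formula for that model (the sum over categories of the squared derivative of the category probability divided by the category probability); neither is stated in the legend, and all names and parameter values are hypothetical.

```python
import math

def item_information(severity, discrimination, thresholds):
    """Information of one ordered-category item at a given severity (assumed GRM)."""
    # Cumulative curves P(score >= k), padded with 1 (k = 0) and 0 (k = K + 1).
    cum = [1.0] + [
        1.0 / (1.0 + math.exp(-discrimination * (severity - b)))
        for b in thresholds
    ] + [0.0]
    info = 0.0
    for k in range(len(cum) - 1):
        p = cum[k] - cum[k + 1]  # probability of scoring exactly k
        # Derivative of that probability with respect to severity.
        dp = discrimination * (
            cum[k] * (1.0 - cum[k]) - cum[k + 1] * (1.0 - cum[k + 1])
        )
        if p > 0.0:
            info += dp * dp / p
    return info

# Two hypothetical items that differ only in discrimination.
grid = [x / 2.0 for x in range(-6, 7)]
steep = [item_information(s, 2.0, [-1.0, 0.0, 1.0, 2.0]) for s in grid]
flat = [item_information(s, 0.5, [-1.0, 0.0, 1.0, 2.0]) for s in grid]
print(max(steep), max(flat))
```

In this sketch the more discriminating item carries more information over the range spanned by its thresholds, which is the kind of between-item difference that Figure 3 displays.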
Figure 4. Trial probability of success. Upper: Probability of
trial success for detecting a hypothetical drug’s ability to slow down
disease progression was higher when data were analyzed using Symptom
Severity (brown) than using the Sum of Scores (green), where solid and
dashed lines reflect analyses including all items and only non-tremor
items, respectively. Lower: Comparison of power for detecting drug
effect (green: 0.1; blue: 0.5) and overall probability of trial success
(brown) for detecting a range of potential drug effects in a one-year
trial.
Figure 5. Visual predictive check for the longitudinal
item-response model. The time course of the distribution of the
observed sum of scores was well reproduced by the longitudinal IRT model
(dots: observations; green lines: 5%, 50% and 95% quantiles of the
observations; red line: predicted time course of the sum of scores for a
typical patient; bands: 95% confidence intervals of the corresponding
model-simulated quantiles).
Figure 6. The model accurately simulated the time course of the
observed proportion of each score for each item. The lines are the
observed proportions of scores 0 to 4. The bands are the 95%
confidence intervals of the model simulations.