“Prevalence is to the diagnostic process as gravity is to the solar system – it has the power of a physical law.” - Clifton K. Meador, A Little Book of Doctors’ Rules
Diagnosis is the apotheosis of medical skill1. Without an accurate diagnosis, the patient’s foremost questions cannot be answered: what is wrong, what is likely to happen, and what can be done about it? Central to the diagnostic process is the compilation of a list of possible causes of the clinical presentation – a differential diagnosis. Traditionally, the diseases comprising the differential diagnosis were rank-ordered from highest to lowest likelihood2,3. This method, which is inconsistently used, is loosely probabilistic but can obscure large differences in probability between components of the differential; an ideal method would assign to each possibility a probability, with the probabilities summing to 100%4. Practical difficulties limit the ability to accurately estimate the probabilities of components of the differential5-7, yet probabilistic reasoning is nonetheless an essential part of expert forecasting and diagnosis8-16.
Several well-known clinical axioms pay homage to the primacy of probability in diagnosis. The most popular, “common things are common” (CTC), has been part of medical folklore for more than a century1,17. It is often expressed as a metaphor: “when you hear hoofbeats, look for horses not zebras”18-20. The CTC axiom is rooted in base rates and Bayes’ theorem and is the foundation of other epigrams such as “uncommon presentations of common diseases are more common than common presentations of uncommon diseases.” The CTC axiom and its variants arose to combat what are now known as cognitive biases such as base rate neglect21,22 and the representativeness heuristic23,24 but are themselves heuristics – rudimentary rules of thumb that provide general rather than specific guidance for the consideration of probability in diagnosis1.
Ironically, the axiom and extant literature are silent on how to determine what is common, what is rare, and related questions. Is sarcoidosis common? Is commonness related to epidemiological metrics such as incidence and prevalence, or is it an intuitive or experiential determination, or both? Should diseases be dichotomized as either common or rare, or rated on a frequency continuum? Can the notion of commonness be operationalized as a practicable aid to estimating the probabilities of components of a differential diagnosis?
At first blush, discriminating between common and rare diseases is a simple task: pneumonia is common, and lymphangioleiomyomatosis (LAM) is rare; a physician will encounter many cases of the former and few (if any) of the latter in any given time interval. This self-evident truth is a distillation of the physician’s experience into the coarse dichotomy of common and rare; it obfuscates the magnitude of the difference in the frequency of the two diseases. How much more common is pneumonia than LAM? The dichotomy does not permit an answer to this question. Ideally, we would like to know the relative likelihoods of diseases so that, ceteris paribus, their weight in the differential diagnosis could be made proportional to their observed frequencies.
Fortunately, dichotomization is unnecessary. The actual frequencies of diseases can be used as the metric for comparison of their commonness. First, it is necessary to determine which measure of epidemiological disease frequency - incidence or prevalence - should be used. Incidence is the number of new cases diagnosed per person per year, whereas prevalence is the number of existing cases already diagnosed per person at a given time point. (Incidence is customarily expressed as cases per 100,000 person-years, and prevalence as cases per 100,000 persons; prevalence is, roughly, the product of incidence and the average duration of the disease.) Incidence and prevalence are similar when a disease has a high recovery or short-term mortality rate (e.g., pneumonia), since death or recovery removes cases from the numerator. Prevalence is higher than incidence when the short-term mortality and recovery rates are low (e.g., emphysema) because chronic cases accumulate in the population, growing the numerator. Because incidence relates to new or previously undiagnosed cases and prevalence to existing or already diagnosed cases, it is incidence that is germane to the diagnostician. Therefore, the epigraph from A Little Book of Doctors’ Rules requires modification – incidence, not prevalence, has the power of a physical law25.
This distinction, until now neglected in the vast literature on diagnosis and clinical reasoning, is paramount because the prevalence of many diseases is higher than their incidence, sometimes markedly so. The probability that a patient seen tomorrow will present with previously undiagnosed symptomatic hypothyroidism is related to the incidence of hypothyroidism, and most practitioners will go a month or longer without diagnosing a new case of (incident) hypothyroidism. By contrast, the probability that a patient with established (prevalent) hypothyroidism will be seen on an average day is high. The physician does not “diagnose” these cases of prevalent disease – they have already been diagnosed. For already diagnosed chronic diseases prone to complications or flare-ups, such as systemic lupus erythematosus (SLE), the prevalence of the disease will affect the probability of seeing its complications, as the latter are conditional upon the former26. However, when diagnosing complications of SLE, it is already known that SLE is present; therefore, the prevalence of SLE is immaterial. It is the incidence of the complication that relates to the probability of the complication, given SLE.
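The gap between prevalence and incidence follows from the rough relation noted earlier: prevalence is approximately incidence multiplied by average disease duration. The following minimal sketch illustrates that arithmetic. The function name, the two-week duration assigned to the acute illness, and every number for the chronic disease are assumptions chosen purely for illustration, not epidemiological estimates; the acute-illness incidence of 650 per 100,000 person-years is the approximate figure the article quotes for pneumonia.

```python
def steady_state_prevalence(incidence_per_100k_py: float,
                            mean_duration_years: float) -> float:
    """Approximate point prevalence (cases per 100,000 persons).

    Uses the rough steady-state relation: prevalence ~ incidence x duration.
    """
    return incidence_per_100k_py * mean_duration_years


# Acute illness: incidence of about 650 per 100,000 person-years;
# the two-week mean duration is an ASSUMPTION made only for illustration.
acute_prevalence = steady_state_prevalence(650, 2 / 52)

# Hypothetical chronic disease: both numbers are INVENTED for illustration.
chronic_prevalence = steady_state_prevalence(100, 10)

print(f"acute:   {acute_prevalence:.0f} per 100,000 persons")
print(f"chronic: {chronic_prevalence:.0f} per 100,000 persons")
```

Under these assumed inputs the acute illness has a prevalence far below its annual incidence, whereas the chronic disease has a prevalence ten times its incidence. A clinician who counts every patient encountered with the chronic disease would therefore overestimate how often it presents as a new diagnosis.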
If, as seems likely, notions of disease commonness are based on how frequently patients with the disease are encountered without regard to whether they represent new or existing diagnoses, the resulting amalgam of incident and prevalent cases will bias intuitions about what is common, making many diseases appear to be more common than they are. Referral bias and clinicopathological conferences may similarly skew intuitions about disease incidence, since patients with rare diseases are concentrated in these samples compared to unselected patients15,27,28. Indeed, clinicopathological conferences and grand rounds customarily select the rarest diseases for presentation, turning the natural order of disease frequency topsy-turvy28.
Fortunately, commonness need not be based upon intuitions: because of the proliferation of epidemiological cohort data in recent decades, the incidence of most diseases can now be readily found in published cohort series. Similarly, online resources such as www.uptodate.com commonly report disease incidences under a subsection on epidemiology. Estimates from these sources are not always in agreement, but the problems posed by variability are not as serious as they may seem; precise incidences are not necessary. Worthwhile comparisons can be made based on order-of-magnitude differences in incidence estimates for different diseases.
For example, Table 1 shows that the incidence of pneumonia is approximately 650 cases per 100,000 person-years. By comparison, that of segmental pulmonary embolism is on the order of 60 per 100,000 person-years. (UpToDate was used as the default source for incidence data for the sake of simplicity and ease of use and to limit the size of the bibliography.) Suppose we were to make a differential diagnosis for dyspnea in a patient presenting to the emergency department before any individuating information about the illness was known that would allow us to differentiate between pneumonia and pulmonary embolism. (A scenario such as this occurs countless times each day as emergency room physicians approach a patient’s room with nothing more than age, gender, and chief complaint recorded by the triage nurse on the intake sheet.) Based on incidence alone, we could say that pneumonia is an order of magnitude more likely than pulmonary embolism; it is more than five orders of magnitude more likely than LAM. Indeed, diseases with incidences of less than 1/100,000 person-years are so rare that most clinicians outside of specialized referral centers will diagnose new (incident) cases on average no more than once or twice during their entire career14.
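The comparison above can be sketched in a few lines of code. This is an illustration under stated assumptions, not a validated clinical tool: the function names are hypothetical, and the differential is artificially restricted to the two diseases whose incidences are quoted in the text (about 650 and 60 per 100,000 person-years). Renormalizing over only those two is a simplification, since a real differential for dyspnea would contain many more entries.

```python
import math


def orders_of_magnitude(incidence_a: float, incidence_b: float) -> float:
    """Base-10 log of the incidence ratio (how many powers of ten apart)."""
    return math.log10(incidence_a / incidence_b)


def incidence_weights(incidences: dict[str, float]) -> dict[str, float]:
    """Weights proportional to incidence, normalized to sum to 1.

    These are starting weights before any individuating clinical
    information is considered; they are not post-test probabilities.
    """
    total = sum(incidences.values())
    return {disease: value / total for disease, value in incidences.items()}


# Approximate incidences (cases per 100,000 person-years) quoted in the text.
incidences = {"pneumonia": 650.0, "pulmonary embolism": 60.0}

gap = orders_of_magnitude(incidences["pneumonia"],
                          incidences["pulmonary embolism"])
weights = incidence_weights(incidences)

print(f"pneumonia vs. pulmonary embolism: {gap:.1f} orders of magnitude")
for disease, weight in weights.items():
    print(f"{disease}: {weight:.1%}")
```

With these two inputs the gap is about one order of magnitude, and the incidence-proportional weights are roughly 92% for pneumonia and 8% for pulmonary embolism. Because only the ratio matters, modest disagreement between sources about the exact incidences changes these weights very little.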