Why do many textbooks on Bayes' Theorem include the frequency of the disease in examples on the reliability of medical tests?

I believe it's commonly included because the result is counterintuitive. You would expect a positive result from a highly accurate test to be right most of the time, but when the disease is rare this isn't actually the case, and more evidence is needed. I think of it as the "error of one sample" fallacy: you can't run an experiment once and draw strong conclusions, even if the experiment is well designed.


Further to the explanation by user856 (formerly: Rahul) in the comments, here's a complementary answer:

The way to frame/interpret medical tests in general is to understand them as updating/refining one's level of certainty that the patient has the disease:

  • Without a medical-test result, the disease prevalence (a measure of disease frequency) can be taken as the patient's likelihood of having the disease.

  • But given a test result, the aforementioned likelihood has changed, and its updated value now depends on the test's sensitivity (true positive rate) and specificity (true negative rate) and the disease prevalence (the latter as before).

    [Image: probability tree diagram]

    The relevant probabilities are

    1. ($\color{purple}{\textrm{in purple}}$) the likelihood that the patient is indeed Unhealthy given a positive test result,
    2. ($\color{red}{\textrm{in red}}$) the likelihood that the patient is actually Unhealthy given a negative test result;

    each is a function of $3$ variables (test sensitivity $v$, test specificity $p$, disease prevalence $d$), obtained by adding and dividing the probabilities of the corresponding outcomes.
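
For reference, here is how I would write those two probabilities out explicitly. This is a sketch, assuming the usual tree whose first branch is Unhealthy/Healthy (probabilities $d$ and $1-d$) and whose second branch is the test result, so that $P(+|U)=v$ and $P(-|H)=p$; the formulas are my spelling-out of Bayes' theorem, not copied from the diagram:

$$P(U|+)=\frac{dv}{dv+(1-d)(1-p)},\qquad P(U|-)=\frac{d(1-v)}{d(1-v)+(1-d)p}.$$

In each fraction, the numerator is the probability of the single path ending in Unhealthy, and the denominator adds the two paths that end in the given test result. The negative predictive value is then $P(H|-)=1-P(U|-).$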

A screening test's predictive values $P(U|+)$ & $P(H|-)$ and overall accuracy $\left(dv+(1-d)p\right)$ depend on both its technical characteristics and the population that it is being used on. The interplay among $v, p$ and $d$ is apparent from the tree diagram:

  • Unless the test has 100% sensitivity $(v),$ its number of false-negative results is proportional to the disease prevalence $(d).$
  • Similarly, unless the test has 100% specificity $(p),$ its number of false-positive results is proportional to $(1-d),$ the complement of the disease prevalence (that is, the proportion of the population without the disease).
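
To make the interplay among $v, p$ and $d$ concrete, here is a minimal Python sketch that counts the four outcomes in a population and derives the predictive values and overall accuracy from them. The function names (`outcome_counts`, `summarize`), the population size and the input values are all hypothetical choices of mine for illustration; they are not taken from any real test.

```python
def outcome_counts(v, p, d, n=100_000):
    """Expected counts of the four test outcomes among n people.

    v: sensitivity, p: specificity, d: disease prevalence (all in [0, 1]).
    """
    unhealthy = n * d
    healthy = n * (1 - d)
    return {
        "TP": unhealthy * v,        # unhealthy, test positive
        "FN": unhealthy * (1 - v),  # unhealthy, test negative (scales with d)
        "FP": healthy * (1 - p),    # healthy, test positive (scales with 1 - d)
        "TN": healthy * p,          # healthy, test negative
    }


def summarize(v, p, d):
    """Predictive values and overall accuracy implied by v, p and d."""
    c = outcome_counts(v, p, d)
    total = sum(c.values())
    return {
        "ppv": c["TP"] / (c["TP"] + c["FP"]),     # P(U | +)
        "npv": c["TN"] / (c["TN"] + c["FN"]),     # P(H | -)
        "accuracy": (c["TP"] + c["TN"]) / total,  # equals d*v + (1 - d)*p
    }


# Hypothetical inputs: 90% sensitivity, 95% specificity, 1% prevalence.
print(summarize(v=0.90, p=0.95, d=0.01))
```

With these made-up inputs, the false positives (4950 per 100,000) far outnumber the true positives (900 per 100,000), which is why the positive predictive value is low even though the accuracy is high. The sketch does not guard against the degenerate cases where a denominator is zero (for example, $d=0$ together with $p=1$).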

N.B. The OP mentions “test reliability”, but that’s a separate issue, since reliability typically refers to how consistent a test’s results are across retakes.

Finally, here is a concrete extended example (based on actual data) to put all this in context:

[Image: worked example for the PCR and rapid tests]

Due to the low disease prevalence,

  • the PCR and rapid tests have a positive predictive value of only $4\%$ and $17\%$ respectively,
  • whereas their negative predictive values are both almost $100\%;$

the tests' overall accuracies $\left(dv+(1-d)p\right)$ are $95\%$ and $99\%$ respectively.
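
To see how strongly the positive predictive value depends on prevalence, here is a small Python sketch that holds a test's characteristics fixed and varies only $d$. The helper `ppv` and the numbers used (90% sensitivity, 95% specificity) are hypothetical; they are not the figures behind the PCR and rapid-test example above.

```python
def ppv(v, p, d):
    """P(U | +): probability of disease given a positive result."""
    true_pos = d * v               # unhealthy and test positive
    false_pos = (1 - d) * (1 - p)  # healthy and test positive
    return true_pos / (true_pos + false_pos)


# Hypothetical test: 90% sensitivity, 95% specificity.
for d in (0.001, 0.01, 0.1, 0.5):
    print(f"prevalence {d:>5.1%}  ->  PPV {ppv(0.90, 0.95, d):.1%}")
```

For this made-up test, the positive predictive value climbs from roughly 2% at a prevalence of 0.1% to roughly 95% at a prevalence of 50%, even though the test itself never changes. That is the reason textbook examples state the disease frequency.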