Every question has a bias and variance.

Each question is a measurement. Two ways a measurement can fail.

MSE Bias² Variance

misalignment

readers agree on the wrong thing

→fix: rewrite definitions

disagreement

readers diverge paper-by-paper

→fix: add scaffolding; sharpen examples

A useful decomposition of error, not a literal MSE. Both quantities depend on the corpus you run it on.

A clinical-review field that breaks both ways: allocation_concealment_adequate

high biasCodebook says look for "double-blind." Readers agree — on the wrong construct. Blinding is not allocation concealment.

high varianceCodebook names the right construct. Papers describe it with scattered cues ("central randomization", "pharmacy-controlled sequence", appendix-only details). Readers diverge case-by-case.

cf.Cohen (1960) · Cronbach & Meehl (1955) · Krippendorff (2004) · Shrout & Fleiss (1979) · Cronbach et al. (1972) G-theory · Lundberg & Stewart (2021)

Inter-reader disagreement is variance, made visible.

@karlrohe