AI: bad at judgment,
great at reading.
For your research
You want to extract data from many documents: screen abstracts, code findings, assess bias. You want AI to help. But almost every one of these tasks is a judgment. Even simple extractions often hide expert judgment, and AI is bad at judgment.
Your task
To make judgments systematically, decompose each one into a sequence of reading comprehension tasks, where each reading task is one key step in the judgment. Decomposing well takes skill. A codebook is where you write the decomposition down, and it is also the prompt you give to the reader.
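As a rough illustration, the idea can be sketched in Python. Everything here is an assumption made for the example: the three screening criteria, the names `ReadingTask`, `CODEBOOK`, `render_prompt`, `read_all`, and `decide`, the made-up abstract, and above all `keyword_reader`, a naive stand-in for whatever reader (a model or a human coder) you would actually use. The point is only the shape: the reader answers narrow questions about the text, and an explicit rule turns those answers into the judgment.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class ReadingTask:
    """One step of the judgment, phrased as a question the text can answer."""
    key: str
    question: str
    allowed: Tuple[str, ...] = ("yes", "no", "unclear")


# Hypothetical codebook for screening abstracts; the criteria are illustrative.
CODEBOOK: List[ReadingTask] = [
    ReadingTask("randomized", "Does the abstract say participants were randomly assigned?"),
    ReadingTask("adults", "Does the abstract say the participants were adults?"),
    ReadingTask("outcome", "Does the abstract report a measured outcome?"),
]

# A reader is anything that answers one reading task about one text.
Reader = Callable[[str, ReadingTask], str]


def render_prompt(text: str, task: ReadingTask) -> str:
    """Turn one codebook entry into the prompt handed to the reader."""
    options = ", ".join(task.allowed)
    return f"{task.question}\nAnswer with one of: {options}.\n\nText:\n{text}"


def keyword_reader(text: str, task: ReadingTask) -> str:
    """Stand-in reader using naive keyword cues; it never answers 'no'."""
    cues = {
        "randomized": ("randomly assigned", "randomized", "randomised"),
        "adults": ("adults",),
        "outcome": ("measured", "outcome"),
    }
    lowered = text.lower()
    return "yes" if any(cue in lowered for cue in cues[task.key]) else "unclear"


def read_all(text: str, reader: Reader) -> Dict[str, str]:
    """Run every reading task and reject answers outside the codebook."""
    answers: Dict[str, str] = {}
    for task in CODEBOOK:
        answer = reader(text, task)
        if answer not in task.allowed:
            raise ValueError(f"{task.key}: unexpected answer {answer!r}")
        answers[task.key] = answer
    return answers


def decide(answers: Dict[str, str]) -> str:
    """Combine the readings with an explicit rule, not a holistic impression."""
    if "no" in answers.values():
        return "exclude"
    if all(answer == "yes" for answer in answers.values()):
        return "include"
    return "needs human review"


# Made-up abstract, used only to exercise the sketch.
abstract = "We randomly assigned 120 adults to two diets and measured blood pressure."
readings = read_all(abstract, keyword_reader)
print(readings, decide(readings))
```

To use a real reader, you would pass a function with the same signature that sends `render_prompt(text, task)` to your model and returns its answer. The decision rule stays in plain code, where it can be inspected and audited.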
The good news
This is not new. It is not about AI. Complex judgments become more reliable when they are decomposed into simpler, observable parts.
JUDGING: AI collapses
READING: AI thrives
a pattern across disciplines, long before AI
- Meehl (1954) → clinical diagnosis, decomposed into weighted rules on observable traits.
- Bueno de Mesquita (1981) → conflict outcomes, decomposed into four estimated variables.
- Altman (1994) → medical research quality, reframed as checkable technical obligations.
- Gawande (2009) → surgical expertise, decomposed into pre-op checklist items.
- Sterne et al. (2019) → RoB 2: risk of bias, decomposed into five domains of signaling questions.
- Kahneman et al. (2021) → noisy judgment, decomposed via decision hygiene.
The codebook is the bridge from reading to judgment.
cf. Descartes (1637) · Polya (1945) · Meehl (1954) · Simon (1962) · Alexander (1977) · Bueno de Mesquita (1981) · Altman (1994) · Richardson (1995) · Begg et al. (1996) · Kahneman & Frederick (2002) · Gawande (2009) · Tetlock (2015) · Sterne et al. (2019) · Kahneman et al. (2021)