One example of what codebook refinement can look like.
Its scaffolding changed across three drafts. So did the papers we tested it on.
Screen abstracts submitted to a medical journal. Flag the ones that report a randomized trial — the papers where CONSORT reporting guidelines apply.
A codebook. A way for two readers — a human or an AI — to look at the same abstract and reach the same answer.
No codebook yet. v1 was drafted in about an hour, from what we thought mattered.
codebook
v1
6 fields. Top of funnel: study_type, one of eleven categories.
v2
5 fields. Top of funnel: manuscript_type, one of four.
v3
6 fields. Top of funnel: two yes/no gates, one after the other.
papers we tested it on
first batch
~50 recently-published abstracts. Not chosen for difficulty.
second batch
~75 abstracts, searched from PubMed for confusing types of studies.
Six fields in v1. Six fields in v3. Almost none of them the same.