highlight · part vi · 2

Validation is trust, built in layers.

No single layer is sufficient. Seven we build on, at Data Mint.


A kappa score doesn't make you trust the data. The trust is in the codebook behind it — and in the readers you chose for this codebook, on documents drawn from a corpus you know.

  1. 01
    The codebook.

    Readable, inspectable. The decision rules, written down.

  2. 02
    Evidence trails for every extracted value.

    Quote, reasoning, assumptions — auditable per cell.

  3. 03
    Multiple independent readers.

    Agreement earned, not assumed. Disagreement exposes ambiguity.

  4. 04
    Codebook refinement.

    Run. See where readers diverge. Fix what broke. Repeat — disagreements thin out with each pass.

  5. 05
    The mint grade.

    A signal built from process traces throughout minting, calibrated against reader agreement — and, soon, against human-labelled ground truth.

  6. 06
    Shared codebooks.

    Replication through reuse. A layer you can't produce alone.

  7. 07
    Accumulated experience.

    You learn what right looks like. And wrong. Over time, you know when to trust.

No statistic validates a codebook. Use does.

Data Mint is where that trust gets built — layer by layer.

@karlrohe