
The fundamental problem we face.

We want to fit our documents into a clean spreadsheet. Some cells resist.


Each cell answers a question. Most fill cleanly. Some don't.

A cell resists when the question is ambiguous for that document. A few ways that can happen:

  • The codebook asked something the document didn't answer plainly.
  • The AI reader didn't understand the question the way you did.
  • The document is a weird edge case you didn't imagine.

Ambiguity takes many shapes. It is the fundamental problem of this work.

Most projects that fail, fail here. The ambiguities were there. Nobody saw them.

The good news: Data Mint makes these ambiguities much faster to find.

Resolving them is craft. The reward is real: you understand your question better than you did before. Your measurement gets sharper. Every downstream answer gets stronger. What follows is mostly about this craft.

@karlrohe