Confidence is a reliability signal (high, moderate, or low) that says how much to trust a score. It reflects how densely the training distribution surrounds the scored content, not how extreme the score is.
Definition
Confidence is derived from where a score falls relative to the two training clusters:

| Level | Where the score sits |
|---|---|
| high | Inside a cluster: a tail, or the body of a well-separated cluster |
| moderate | In the gap between two well-separated clusters |
| low | In the region where the clusters overlap |
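
The table can be read as a small decision rule. The sketch below is a toy one-dimensional illustration, not the actual computation: it assumes each cluster is summarized by a hypothetical `(min, max)` score interval and that `lower` sits below `upper`. Neither assumption comes from the source, and `confidence_level` is a made-up name.

```python
from typing import Tuple

# Hypothetical summary of a cluster: the (min, max) range of its scores.
Interval = Tuple[float, float]


def confidence_level(score: float, lower: Interval, upper: Interval) -> str:
    """Toy rule mirroring the table; the interval summary is an assumption."""
    overlap_start, overlap_end = upper[0], lower[1]
    if overlap_start <= overlap_end:
        # The clusters overlap: only the shared region reads low.
        if overlap_start <= score <= overlap_end:
            return "low"
        return "high"  # a tail outside the overlap
    # The clusters are well separated: the gap between them reads moderate.
    if lower[1] < score < upper[0]:
        return "moderate"
    return "high"  # body or tail of a cluster


print(confidence_level(0.5, (0.0, 0.4), (0.6, 1.0)))  # gap between separated clusters
print(confidence_level(0.5, (0.0, 0.6), (0.4, 1.0)))  # inside the overlap
```

The first call lands in the gap and the second in the overlap, so they illustrate the moderate and low rows respectively.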
Each trait's confidence is reported at detail.<trait>.confidence; the card's top-level confidence is the weakest (lowest) value across all traits.
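
The weakest-link rule for the card's top-level confidence can be sketched as follows. This is a minimal illustration, assuming `detail` is a mapping from trait name to an object with a `confidence` key; the function name and the trait names are hypothetical.

```python
# Ordering from least to most trustworthy, using the three documented levels.
CONFIDENCE_ORDER = {"low": 0, "moderate": 1, "high": 2}


def card_confidence(detail: dict) -> str:
    """Return the weakest per-trait confidence in a card's detail mapping."""
    levels = [trait["confidence"] for trait in detail.values()]
    # min() raises ValueError on an empty mapping; a real card is assumed
    # to carry at least one trait.
    return min(levels, key=CONFIDENCE_ORDER.__getitem__)


# Hypothetical card fragment; the trait names are made up for illustration.
detail = {
    "clarity": {"confidence": "high"},
    "depth": {"confidence": "moderate"},
}
print(card_confidence(detail))  # moderate
```

Mapping the levels to integers keeps the comparison explicit; comparing the strings directly would sort them alphabetically, which does not match the trust ordering.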
Mechanism
Confidence gates the grade. When confidence is low, both the tier label and the headroom are null: the model withholds judgment rather than emit an unreliable grade. For a well-separated trait, confidence tracks the tier: Strong, Solid, and Weak read high, Developing reads moderate, and the overlap reads low.
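
A minimal sketch of the gating behavior described above. The `Grade` shape, the `gate_grade` helper, and the numeric type of `headroom` are assumptions made for illustration; only the rule itself (low confidence nulls the tier and the headroom) comes from the text.

```python
from typing import Optional, TypedDict


class Grade(TypedDict):
    tier: Optional[str]
    headroom: Optional[float]  # numeric headroom is an assumption
    confidence: str


# Tier-to-confidence reading for a well-separated trait, as stated above.
TIER_CONFIDENCE = {
    "Strong": "high",
    "Solid": "high",
    "Weak": "high",
    "Developing": "moderate",
}


def gate_grade(tier: str, headroom: float, confidence: str) -> Grade:
    """Withhold the tier and headroom when confidence is low."""
    if confidence == "low":
        return {"tier": None, "headroom": None, "confidence": confidence}
    return {"tier": tier, "headroom": headroom, "confidence": confidence}


print(gate_grade("Solid", 0.2, TIER_CONFIDENCE["Solid"]))
print(gate_grade("Solid", 0.2, "low"))
```

In the second call the score itself would still be reported; only the grade fields are nulled.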
Low confidence does not mean a random or unstable score. Scoring is deterministic: the same content and the same model version always produce the same score and the same confidence. Low confidence means the training data is sparse or ambiguous near this content, so it is the grade, not the number, that becomes less trustworthy.
Interpretation
- A high score with low confidence is provisional: it sits in a region the training data does not cover well. Treat it as a hint, not a verdict.
- Gate automated actions on high confidence. Because Developing reads moderate rather than high, such a gate skips transitional content, which is the intent.
- The card's weakest-link confidence means a single untrustworthy trait flags the whole result.
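
One way to apply the gating advice above is to route cards by their top-level confidence. This sketch assumes the card is a dictionary with a top-level `confidence` key; the `route_card` helper and the route names are hypothetical.

```python
def route_card(card: dict) -> str:
    """Send only high-confidence cards to automation; everything else to review."""
    if card.get("confidence") == "high":
        return "automate"
    # Moderate (for example, Developing content) and low both fall through,
    # as does a card with no confidence field at all.
    return "review"


# Hypothetical cards for illustration.
cards = [
    {"id": 1, "confidence": "high"},
    {"id": 2, "confidence": "moderate"},
    {"id": 3, "confidence": "low"},
]
for card in cards:
    print(card["id"], route_card(card))
```

Because the top-level value is the weakest trait confidence, a single moderate or low trait is enough to send the whole card to review.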
Edge cases
- Confidence is a per-score heuristic from cluster geometry, not a calibrated probability — it signals relative trust, not a guarantee of correctness.
- A sparse training set tends to produce overlapping clusters, which lowers confidence — small data shows up here rather than as a separate signal.
Next
- Tiers: The labels confidence gates.
- Score card: Where confidence sits in the result.