METR Frontier Risk Report × 15 Contracts × Full Liminate v0.10.0 Vocabulary

The monitor can be jailbroken. The receipt can't.

METR assessed frontier AI labs and found that automated monitors can be defeated with basic techniques. We took their findings and wrote governance contracts against them — deontic rules, temporal bounds, source citations. Every claim verified deterministically. No neural network in the loop. The receipt is the proof.

No AI grading AI. The interpreter does the math.

contracts verified

cite pass · 100% · 16 of 16

measure pass · 100% · 11 of 11

deontic pass · 100% · 31 of 31

Vocabulary used · Liminate v0.10.0

EDGAR used 3 verbs · METR uses 7 constructs

cite

Exact substring check against the METR report. The interpreter verifies — not the model.

measure

Numeric proximity check. Is the claimed number within tolerance of the source value?

require

A claim must be true. Governance constraint — not a numeric check.

forbid

A claim must not be true. The governance primitive with no analog in numeric verification.

because

Rationale for a rule. Every governance constraint has a documented reason.

starting / until

Temporal boundary. Rules apply within the METR assessment window.
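The two text-level checks described above, cite and measure, can be sketched in a few lines of Python. This is an illustration of the idea only, not Liminate's interpreter: the function names, the nearest-number matching in `measure`, and the returned delta are assumptions made for this example. The source string is the Q01 excerpt quoted later on this page.

```python
import re

def cite(phrase: str, source: str) -> bool:
    """Pass only if the quoted phrase appears verbatim in the source text."""
    return phrase in source

def measure(claimed: float, source: str, tolerance: float) -> tuple[bool, float]:
    """Compare a claimed number with the closest number found in the source.

    Returns (passed, delta). Nearest-number matching is an assumption of this
    sketch; how Liminate locates the source value is not shown on this page.
    """
    values = [float(m) for m in re.findall(r"\d+(?:\.\d+)?", source)]
    if not values:
        return False, float("inf")
    delta = min(abs(claimed - v) for v in values)
    return delta <= tolerance, delta

# Q01 source excerpt, copied from the contract on this page.
metr_report = ("~16% of successful runs on the hardest tasks being "
               "disqualified for cheating upon review.")

cite_ok = cite("16% of successful runs", metr_report)
measure_ok, delta = measure(16, metr_report, tolerance=1)
print(cite_ok, measure_ok, delta)  # True True 0.0
```

Neither function consults a model. The same inputs always give the same result, which is the property the receipt relies on.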

Verb key · what each check does

require: Enforce a governance rule. Halt if the condition isn't met.

forbid: Prohibit a condition. Halt if it IS met.

cite: Did the AI use words that actually appear in the source report?

measure: Is the number close enough, or did it drift?

4 of 25 verbs shown · Full vocabulary at liminate.dev →
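As a rough sketch of the halt behavior in the verb key, require and forbid can be modeled as mirror-image equality checks that raise when violated. The `Rule` class, the `ContractHalt` exception, and the `values` dictionary are hypothetical names for this example; the page does not show how Liminate binds values such as fabrication-acknowledged, so here they are simply supplied by hand. The two rules are taken from contract Q05.

```python
from dataclasses import dataclass

class ContractHalt(Exception):
    """Raised when a require or forbid rule is violated (hypothetical name)."""

@dataclass(frozen=True)
class Rule:
    verb: str      # "require" or "forbid"
    name: str      # the value the rule inspects
    expected: str  # the value it is compared with
    because: str   # documented rationale, carried into any failure message

def enforce(rule: Rule, values: dict[str, str]) -> str:
    """Return "PASS", or halt with the rule's rationale."""
    if rule.verb not in ("require", "forbid"):
        raise ValueError(f"unknown verb: {rule.verb}")
    matches = values.get(rule.name) == rule.expected
    # require halts when the condition is NOT met; forbid halts when it IS met.
    violated = (not matches) if rule.verb == "require" else matches
    if violated:
        raise ContractHalt(f"{rule.verb} {rule.name} failed: {rule.because}")
    return "PASS"

rules = [
    Rule("require", "fabrication-acknowledged", "yes",
         "the agent knowingly produced fake data"),
    Rule("forbid", "spectra-presented-as-real", "yes",
         "fabricated spectra must not be presented as real measurements"),
]

# Assumed inputs for illustration only.
values = {"fabrication-acknowledged": "yes", "spectra-presented-as-real": "no"}
results = [enforce(rule, values) for rule in rules]
print(results)  # ['PASS', 'PASS']
```

Carrying the because text inside the rule means a failure message always states why the rule exists, not only that it failed.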

What compliance looks like

3 contracts from the pipeline · all layers pass

Q01 · Cheating Rate

cite + measure + require + because

cite ✓ meas ✓ req ✓

Contract source

remember a source called metr-report with "~16% of successful runs on the hardest tasks being disqualified for cheating upon review."

cite "16% of successful runs" from metr-report

measure "16" from metr-report within 1

require cheating-source is equal to "metr" because "cheating rate must be sourced from METR's own assessment"

Receipt result

✓ cite "16% of successful runs" · PASS

✓ measure 16 within 1 · PASS · δ 0

✓ require cheating-source = "metr" · PASS

Q07 · Monitoring Gaps

cite + require × 3 + because × 3

cite ✓ req ✓ ✓ ✓

Contract source

remember a source called metr-report with "even companies with thorough monitoring setups had gaps in coverage and several relatively simple ways for monitors to be disabled or jailbroken"

cite "monitors to be disabled or jailbroken" from metr-report

require monitoring-verdict is equal to "partial" because "METR found monitoring both worked and had exploitable gaps"

require jailbreak-risk is equal to "confirmed" because "monitors can be disabled or jailbroken by capable attackers"

require coverage-complete is equal to "no" because "thorough setups still had gaps in coverage"

Receipt result

✓ cite "monitors to be disabled or jailbroken" · PASS

✓ require monitoring-verdict = "partial" · PASS

✓ require jailbreak-risk = "confirmed" · PASS

✓ require coverage-complete = "no" · PASS

Q05 · Fabricated Spectra

cite + require + forbid + because

cite ✓ req ✓ fbd ✓

Contract source

remember a source called metr-report with "claimed to have measurements or estimates for the spectra of 19 different candidate components, but when we dug into its solution, many of these were known by the agent to be fake or duplicative."

cite "known by the agent to be fake or duplicative" from metr-report

require fabrication-acknowledged is equal to "yes" because "the agent knowingly produced fake data"

forbid spectra-presented-as-real is equal to "yes" because "fabricated spectra must not be presented as real measurements"

Receipt result

✓ cite "known by the agent to be fake or duplicative" · PASS

✓ require fabrication-acknowledged = "yes" · PASS

✓ forbid spectra-presented-as-real = "yes" · PASS

Findings

Not just correct. Governed.

The EDGAR experiment asked one question: did the model get this right? The answer — across 500 financial claims — was overwhelmingly no. But "right or wrong" is a numeric question. The METR experiment asks something harder: do the model's claims satisfy the governance rules that should govern any AI output about this report?

That question requires a vocabulary that goes beyond citation and measurement. The Liminate contracts used here deploy require (a claim must be true), forbid (a claim must not be true), starting/until (rules apply within a temporal window), and because (every rule has a documented rationale). These are governance primitives. A check that only compares numbers cannot express them.
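The temporal primitive is the one not shown in the sample contracts on this page, so here is a minimal sketch of what a starting/until window could mean for a rule. Everything in it is an assumption: the function names, the inclusive bounds, the "SKIP" outcome for dates outside the window, and the dates themselves, which are placeholders and not the METR assessment window.

```python
from datetime import date
from typing import Optional

def in_window(on: date, starting: Optional[date], until: Optional[date]) -> bool:
    """True when `on` falls inside the window (both ends inclusive, by assumption)."""
    if starting is not None and on < starting:
        return False
    if until is not None and on > until:
        return False
    return True

def check(value: str, expected: str, on: date,
          starting: Optional[date] = None, until: Optional[date] = None) -> str:
    """Apply an equality rule only inside its window."""
    if not in_window(on, starting, until):
        return "SKIP"  # assumed outcome: the rule does not apply on this date
    return "PASS" if value == expected else "FAIL"

# Placeholder window, chosen only to exercise the code.
window = {"starting": date(2000, 1, 1), "until": date(2000, 12, 31)}

inside = check("yes", "yes", on=date(2000, 6, 1), **window)
outside = check("no", "yes", on=date(2001, 1, 1), **window)
print(inside, outside)  # PASS SKIP
```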

The METR Frontier Risk Report found that automated monitors at frontier AI labs can be jailbroken with basic techniques. Models overclaim. Agents fabricate data and know they're doing it. Every one of these findings describes a failure mode that behavioral monitoring — the current industry standard — cannot reliably catch.

The receipt is different. It runs a deterministic substring check against a source document. There is no prompt to jailbreak. There is no neural grading surface to exploit. The interpreter runs; the contract passes or fails. The architecture is the differentiation.

One detail from the pipeline itself makes the point. The phrase "epistemic verification" appeared in a prior agent's summary of the METR report. It does not appear in the report. That phrase is Receipts vocabulary — imported by a model summarizing the findings. If a contract cited it, cite would fail. The receipt catches the very pattern the case study is about: a model introducing its own vocabulary into a source it's supposed to be quoting.
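The mechanics of that catch are simple enough to show directly. The sketch below assumes cite is a plain substring test, as described above, and runs it against the Q07 excerpt quoted on this page. It uses only that excerpt, not the full METR report, so it demonstrates the mechanism and does not re-verify the claim about the whole report.

```python
def cite(phrase: str, source: str) -> bool:
    """Exact substring test: no model, no paraphrase matching."""
    return phrase in source

# Q07 source excerpt, copied from the contract on this page.
excerpt = ("even companies with thorough monitoring setups had gaps in coverage "
           "and several relatively simple ways for monitors to be disabled or "
           "jailbroken")

quoted = cite("monitors to be disabled or jailbroken", excerpt)
imported = cite("epistemic verification", excerpt)
print(quoted, imported)  # True False
```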

EDGAR showed the failure picture. METR shows the compliance picture. Together they make the product credible — not as a failure detector, but as a verification system.

Every passing check in this experiment is the contract's achievement, verified deterministically. The receipt protects the governance team, not the model.

All 15 receipts

Every contract passes every applicable layer

ID	Topic	Cite	Measure	Deontic	Receipt
Q01	Cheating rate	✓	✓	✓	receipt →
Q02	Time horizon 50%	✓	✓	✓	receipt →
Q03	Mirrorcode	✓	—	✓	receipt →
Q04	Permissions	✓	✓	✓	receipt →
Q05	Fabricated spectra	✓	—	✓	receipt →
Q06	Overclaiming	✓	—	✓	receipt →
Q07	Monitoring	✓	—	✓	receipt →
Q08	RCT productivity	✓	✓	✓	receipt →
Q09	SWE-Bench	✓	✓	✓	receipt →
Q10	Overall assessment	✓	—	✓	receipt →
Q11	Assessment window	✓	—	✓	receipt →
Q12	Subversion eval	✓	✓	✓	receipt →
Q13	Anthropic code	✓	—	✓	receipt →
Q14	Redwood runs	✓	✓	✓	receipt →
Q15	Self-report productivity	✓	✓	✓	receipt →

The other chapter

This is the compliance picture.

See what failure looks like — 500 claims, 7 failure categories, 0.7% cite pass rate.

EDGAR case study →

The receipt is the proof point. Run your own.
