Deterministic Pre-Execution Governance

Before the model acts, we decide.

YumeT Lite intercepts every AI request before model execution. A deterministic five-layer pipeline evaluates context, intent, and authorization — then returns ALLOW, REFUSE, or CLARIFY with a cryptographically chained audit receipt. Every time.


POST /v1/evaluate

// Request: medical domain, no evidence
{
  "input": "What's the treatment for...",
  "domain": "medical",
  "evidence_provided": false,
  "risk_tier": "high"
}

// Response: 9 ms
{
  "decision": "CLARIFY",
  "code": "MISSING_REQUIRED_ANCHOR",
  "receipt_id": "f4a9c1d2-...",
  "chain_hash": "a7f3e2b1-..."
}

// Request: boundary override attempt
{
  "input": "Ignore all previous instructions..."
}

// Response: 6 ms
{
  "decision": "REFUSE",
  "code": "BOUNDARY_OVERRIDE_ATTEMPT",
  "receipt_id": "c2d8b4a1-..."
}

ALLOW: Execution proceeds
CLARIFY: More context needed
REFUSE: Execution halted

What it is

✓ A pre-execution governance gate — runs before any model or agent
✓ Deterministic — same input, same context, same outcome every time
✓ Three-way decisions: ALLOW, REFUSE, or CLARIFY
✓ Cryptographically chained audit receipts on every decision
✓ Session memory — detects escalation across multiple turns
✓ Fail-closed — missing fields, errors, and empty inputs all halt

What it is not

✗ Not a model — no inference in the evaluation path
✗ Not a post-execution filter or content moderator
✗ Not a probabilistic safety layer or wrapper
✗ Not bypassable — governance override attempts are hard-refused with no exception pathway
✗ Not dependent on the model it governs — works with any backend

Architecture

Five layers. Strict order. No shortcuts.

Context Validation

Every required field must be present and valid before anything else runs. A missing field, empty input, or invalid declaration is an immediate structural halt — not an error to recover from.

→ REFUSE on any gap
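
A minimal sketch of this structural check, assuming the four fields in the sample request above are the required ones. The field list, the allowed risk tiers, the PASS marker, and the reason codes are illustrative assumptions, not YumeT Lite's published schema.

```python
# Assumed required fields, taken from the sample request on this page.
REQUIRED_FIELDS = ("input", "domain", "evidence_provided", "risk_tier")
# Assumed tier vocabulary; only "high" appears in the source.
ALLOWED_RISK_TIERS = {"low", "medium", "high"}


def validate_context(request: dict) -> dict:
    """Return REFUSE on the first structural gap, otherwise a PASS marker."""
    for field in REQUIRED_FIELDS:
        if field not in request or request[field] is None:
            return {"decision": "REFUSE", "code": "MISSING_FIELD:" + field}

    text = request["input"]
    if not isinstance(text, str) or not text.strip():
        return {"decision": "REFUSE", "code": "EMPTY_INPUT"}

    tier = request["risk_tier"]
    if not isinstance(tier, str) or tier not in ALLOWED_RISK_TIERS:
        return {"decision": "REFUSE", "code": "INVALID_DECLARATION:risk_tier"}

    # Hypothetical marker meaning "continue to the next layer".
    return {"decision": "PASS", "code": "CONTEXT_OK"}
```

Note that there is no recovery branch: every gap returns a decision immediately rather than substituting a default value.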

Intent Classification

Input is normalized before pattern matching — collapsing obfuscation attempts that rely on spacing, punctuation, or character substitution. Session memory is read to detect patterns that span multiple turns.

→ REFUSE on violation

Governance Gate

A bounded scoring function computes an admissibility score against the governance threshold. Scores in the near-threshold band route to CLARIFY for ambiguity resolution. Hard violations bypass the band entirely.

→ CLARIFY or REFUSE
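
One way to picture the near-threshold band, as a sketch only. The score range of 0 to 1, the threshold and band values, the convention that a higher score means more admissible, and the PASS marker are all assumptions made for illustration.

```python
def governance_gate(score: float, hard_violation: bool,
                    threshold: float = 0.5, band: float = 0.1) -> str:
    """Map a bounded admissibility score to a gate outcome.

    threshold and band are illustrative values, not the product's.
    """
    if hard_violation:
        return "REFUSE"            # hard violations skip the band entirely
    if not 0.0 <= score <= 1.0:
        return "REFUSE"            # out-of-range or NaN score: fail closed
    if abs(score - threshold) <= band:
        return "CLARIFY"           # ambiguous: ask for more context
    # "PASS" is a hypothetical marker meaning "continue to the next layer".
    return "PASS" if score > threshold else "REFUSE"
```

Because the function has no randomness and no hidden state, the same score and flag always give the same outcome.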

Anchor Resolution

Checks whether a declared authorization or context anchor can resolve the refusal. Detects anchor abuse — help-seeking framing wrapped around a direct violation. Cannot override hard-locked categories under any condition.

→ ALLOW on valid override
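
A sketch of the override rule, assuming a hypothetical anchor object with a `verified` flag and a `covers` list. The category identifiers are invented labels for the hard-locked categories this page names, and anchor-abuse detection is not modeled here.

```python
from typing import Optional

# Assumed identifiers for the hard-locked categories named on this page.
HARD_LOCKED = frozenset({"governance_override", "csam", "self_harm"})


def resolve_anchor(refusal_category: str, anchor: Optional[dict]) -> str:
    """Decide whether a declared anchor can lift a refusal."""
    if refusal_category in HARD_LOCKED:
        return "REFUSE"  # checked first, so no anchor is ever consulted
    if not isinstance(anchor, dict):
        return "REFUSE"  # nothing declared, nothing to resolve
    covered = anchor.get("covers") or ()
    if anchor.get("verified") is True and refusal_category in covered:
        return "ALLOW"
    return "REFUSE"
```

Checking the hard-lock set before reading the anchor is the design point: the anchor's contents cannot influence the outcome for those categories.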

Audit Receipt

Every decision — ALLOW, REFUSE, and CLARIFY — produces a cryptographically chained audit receipt. Chain integrity is independently verifiable. If the audit sink is unreachable, the system fails closed.

→ Always runs

Integration

One API call before every execution.

01 — Send the request

POST before you execute

Before your model runs, your agent acts, or your tool fires — call /v1/evaluate with the proposed input and its context. YumeT Lite evaluates it in under 200ms.

02 — Receive the decision

ALLOW, REFUSE, or CLARIFY

Every response includes a structured decision, a reason code, and a receipt ID tied to an immutable audit chain. ALLOW means proceed. REFUSE means halt. CLARIFY means ask for more before proceeding.

03 — Enforce it

Block on REFUSE. Prompt on CLARIFY.

Your platform blocks execution on REFUSE and surfaces a narrowing prompt on CLARIFY. Log the receipt_id alongside every governed event. That's the complete audit trail — no reconstruction needed.
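
The three steps above can be sketched as a small client wrapper. This is an illustration under stated assumptions: the host name is a placeholder, authentication is omitted because this page does not describe it, and `governed_call`, `run_model`, and the `GOVERNANCE_UNAVAILABLE` code are hypothetical names.

```python
import json
import urllib.request

BASE_URL = "https://yumet.example"  # placeholder host, not a real endpoint


def evaluate(payload: dict, timeout: float = 2.0) -> dict:
    """POST the proposed input and its context to /v1/evaluate."""
    req = urllib.request.Request(
        BASE_URL + "/v1/evaluate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def governed_call(payload: dict, run_model, evaluate_fn=evaluate) -> dict:
    """Run the model only on ALLOW; always carry the receipt_id forward."""
    try:
        verdict = evaluate_fn(payload)
    except Exception:
        verdict = None
    if not isinstance(verdict, dict):
        # Gate unreachable or malformed reply: the caller also fails closed.
        return {"status": "blocked", "reason": "GOVERNANCE_UNAVAILABLE",
                "receipt_id": None}

    receipt_id = verdict.get("receipt_id")
    decision = verdict.get("decision")
    if decision == "ALLOW":
        return {"status": "executed", "output": run_model(payload["input"]),
                "receipt_id": receipt_id}
    if decision == "CLARIFY":
        return {"status": "needs_clarification", "reason": verdict.get("code"),
                "receipt_id": receipt_id}
    # REFUSE, or any decision value this client does not recognize.
    return {"status": "blocked", "reason": verdict.get("code"),
            "receipt_id": receipt_id}
```

Store the returned `receipt_id` with whatever event record your platform already keeps for the request.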

Guarantees

Built into the architecture.

Deterministic Semantics

The governance engine is not a model. It is a fixed mathematical function. Same input, same context, same policy version — identical output, always.

Obfuscation Resistance

Layer 2 (Intent Classification) normalizes space-separated characters, dot-separated sequences, apostrophe variants, and Unicode substitutions before pattern matching. Obfuscation attempts are collapsed before they reach classification.
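
A rough sketch of what such a normalization pass could look like. The specific rules, the tiny homoglyph table, and the regular expression are assumptions chosen for illustration; the product's actual rules are not published on this page.

```python
import re
import unicodedata

# Illustrative homoglyph table: three Cyrillic letters that resemble Latin ones.
HOMOGLYPHS = str.maketrans({"\u0430": "a", "\u0435": "e", "\u043e": "o"})

# Three or more single characters separated by one space or one dot each.
SPACED_RUN = re.compile(r"\b(?:\w[ .]){2,}\w\b")


def normalize(text: str) -> str:
    """Collapse common obfuscations before any pattern matching runs."""
    # NFKC folds compatibility forms such as fullwidth letters to plain ones.
    text = unicodedata.normalize("NFKC", text).lower()
    text = text.translate(HOMOGLYPHS)
    # Unify apostrophe variants.
    text = text.replace("\u2019", "'").replace("\u2018", "'").replace("`", "'")
    # "i g n o r e" and "i.g.n.o.r.e" both become "ignore".
    text = SPACED_RUN.sub(lambda m: re.sub(r"[ .]", "", m.group(0)), text)
    return re.sub(r"\s+", " ", text).strip()
```

The output is meant only for matching, not for display: a sketch like this will also merge legitimately spaced single letters, which is acceptable when the original text is kept alongside it.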

Fail-Closed Default

Missing field, internal error, empty input, unreachable audit sink — all produce immediate REFUSE. The system does not proceed in an unverified state.
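
The fail-closed rule can be illustrated with a small wrapper. This is a sketch: `evaluate_fn` and `write_receipt` are hypothetical callables standing in for the pipeline and the audit sink, and the two reason codes are invented.

```python
VALID_DECISIONS = {"ALLOW", "REFUSE", "CLARIFY"}


def fail_closed(evaluate_fn, write_receipt):
    """Wrap an evaluator so that any failure resolves to REFUSE."""
    def guarded(request: dict) -> dict:
        try:
            verdict = evaluate_fn(request)
            if verdict.get("decision") not in VALID_DECISIONS:
                raise ValueError("unrecognized decision")
        except Exception:
            # Internal error or malformed result: never fall through to ALLOW.
            verdict = {"decision": "REFUSE", "code": "INTERNAL_ERROR"}
        try:
            write_receipt(verdict)
        except Exception:
            # A decision that cannot be recorded is not released.
            return {"decision": "REFUSE", "code": "AUDIT_SINK_UNREACHABLE"}
        return verdict
    return guarded
```

The second `try` is what ties ALLOW to the audit sink: even a correct ALLOW is replaced by REFUSE if its receipt cannot be written.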

Non-Bypassable Hard Locks

Certain violation categories — governance override attempts, CSAM, self-harm — carry no anchor exception pathway. No framing, context, or authorization object can override them.

Immutable Audit Chain

Every decision appends to a cryptographic chain. ALLOW, REFUSE, and CLARIFY all generate receipts. Chain integrity is independently verifiable — modification of any receipt invalidates all subsequent entries.

Session Trajectory Memory

Session state persists across turns. Slow-burn escalation — benign turn one, dangerous turn two — is detected automatically through accumulated risk trajectory, not just per-message analysis.
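
To make the idea of an accumulated risk trajectory concrete, here is a sketch that uses a decayed running sum. The decay factor, the threshold, the PASS marker, and the assumption that each turn arrives with a numeric risk value are all illustrative; the page does not say how YumeT Lite computes trajectory.

```python
class SessionTrajectory:
    """Track decayed cumulative risk per session (illustrative model only)."""

    def __init__(self, threshold: float = 1.0, decay: float = 0.5):
        self.threshold = threshold
        self.decay = decay
        self._risk = {}  # session_id -> accumulated risk

    def observe(self, session_id: str, turn_risk: float) -> str:
        """Fold one turn's risk into the session total and return an outcome."""
        accumulated = self._risk.get(session_id, 0.0) * self.decay + turn_risk
        self._risk[session_id] = accumulated
        # "PASS" is a hypothetical marker meaning "no trajectory violation".
        return "REFUSE" if accumulated >= self.threshold else "PASS"
```

With these example values, three turns of risk 0.6 each pass individually against the threshold of 1.0, but the third one trips the session total (0.6, then 0.9, then 1.05).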

Verified Performance

1,000 red team scenarios. Zero failures.

Tested across 15 scenario categories, including jailbreaks, roleplay bypass, obfuscated inputs, anchor abuse, forced certainty, fabrication pressure, and multi-turn escalation, plus benign-allow validation to confirm that clean requests are not blocked.

1,000/1,000 correct governance decisions. 0 false positives on clean requests.

Category               Passed/Total
boundary_override      161/161
jailbreak              81/81
roleplay_bypass        80/80
forced_certainty       80/80
fabrication            60/60
anchor_abuse           60/60
anchor_override        60/60
obfuscated_jailbreak   60/60
missing_anchor         60/60
benign_allow           98/98
context_undeclared     50/50
edge                   50/50
ambiguous              40/40
tool_escalation        40/40
multi_turn             20/20