YumeT Lite intercepts every AI request before model execution.
A deterministic five-layer pipeline evaluates context, intent, and
authorization — then returns ALLOW, REFUSE, or CLARIFY with a
cryptographically chained audit receipt. Every time.
What it is
✓ A pre-execution governance gate — runs before any model or agent
✓ Deterministic — same input, same context, same outcome every time
✓ Three-way decisions: ALLOW, REFUSE, or CLARIFY
✓ Cryptographically chained audit receipts on every decision
✓ Session memory — detects escalation across multiple turns
✓ Fail-closed — missing fields, errors, and empty inputs all halt
What it is not
✗ Not a model — no inference in the evaluation path
✗ Not a post-execution filter or content moderator
✗ Not a probabilistic safety layer or wrapper
✗ Not bypassable — governance override attempts are hard-refused with no exception pathway
✗ Not dependent on the model it governs — works with any backend
Architecture
Five layers. Strict order. No shortcuts.
L1
Context Validation
Every required field must be present and valid before anything else runs. A missing field, empty input, or invalid declaration is an immediate structural halt — not an error to recover from.
→ REFUSE on any gap
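As a rough illustration of the L1 rule, the sketch below refuses on the first structural gap it finds. The field names (input, session_id, declared_role) and the reason-code format are assumptions made for the example; the actual required fields are not listed here.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical required fields, chosen only for illustration.
REQUIRED_FIELDS = ("input", "session_id", "declared_role")


@dataclass(frozen=True)
class Decision:
    outcome: str       # "ALLOW", "REFUSE", or "CLARIFY"
    reason_code: str   # hypothetical reason-code format


def validate_context(request: dict) -> Optional[Decision]:
    """Return a REFUSE decision on the first gap, or None if later layers may run."""
    for name in REQUIRED_FIELDS:
        value = request.get(name)
        # Absent, non-string, and whitespace-only values are all treated as gaps.
        if not isinstance(value, str) or not value.strip():
            return Decision("REFUSE", f"missing_or_empty:{name}")
    return None
```

Returning a decision instead of raising keeps the halt structural: there is no exception for a caller to catch and recover from.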
L2
Intent Classification
Input is normalized before pattern matching — collapsing obfuscation attempts that rely on spacing, punctuation, or character substitution. Session memory is read to detect patterns that span multiple turns.
→ REFUSE on violation
L3
Governance Gate
A bounded scoring function computes an admissibility score against the governance threshold. Scores in the near-threshold band route to CLARIFY for ambiguity resolution. Hard violations skip the band entirely and are refused outright.
→ CLARIFY or REFUSE
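One way to picture the near-threshold band is the sketch below. The threshold (0.5), the band half-width (0.1), and the choice to pass scores above the band through as ALLOW are all assumptions for the example, not published values.

```python
THRESHOLD = 0.5  # assumed governance threshold
BAND = 0.1       # assumed half-width of the near-threshold band


def gate(score: float, hard_violation: bool = False) -> str:
    """Map an admissibility score to an outcome; hard violations never reach the band."""
    if hard_violation:
        return "REFUSE"
    bounded = min(1.0, max(0.0, score))  # keep the score inside [0, 1]
    if abs(bounded - THRESHOLD) <= BAND:
        return "CLARIFY"
    # Assumption: clearly admissible scores proceed, clearly inadmissible ones are refused.
    return "ALLOW" if bounded > THRESHOLD else "REFUSE"
```

Because the function has no randomness and no hidden state, the same score and flag always produce the same outcome.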
L4
Anchor Resolution
Checks whether a declared authorization or context anchor can resolve the refusal. Detects anchor abuse — help-seeking framing wrapped around a direct violation. Cannot override hard-locked categories under any condition.
→ ALLOW on valid override
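The ordering of checks in L4 can be sketched as follows. The hard-locked set mirrors the categories named under Guarantees; the anchor name, the category it covers, and the direct_violation flag are hypothetical stand-ins for whatever the real engine uses.

```python
from typing import Optional

# Categories named as hard-locked under Guarantees.
HARD_LOCKED = frozenset({"governance_override", "csam", "self_harm"})

# Hypothetical mapping: anchor type -> categories it is allowed to resolve.
ANCHOR_SCOPE = {"clinical_authorization": frozenset({"medical_dosage"})}


def resolve_anchor(category: str, anchor: Optional[str], direct_violation: bool) -> str:
    """Decide whether a declared anchor can turn a refusal into ALLOW."""
    if category in HARD_LOCKED:
        return "REFUSE"  # checked first, so no anchor is ever consulted
    if anchor is None:
        return "REFUSE"  # nothing declared, the refusal stands
    if direct_violation:
        return "REFUSE"  # anchor abuse: framing wrapped around a direct violation
    if category in ANCHOR_SCOPE.get(anchor, frozenset()):
        return "ALLOW"
    return "REFUSE"      # unknown or out-of-scope anchors fail closed
```

Putting the hard-lock check before any anchor lookup is what makes the lock unconditional in this sketch.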
L5
Audit Receipt
Every decision — ALLOW, REFUSE, and CLARIFY — produces a cryptographically chained audit receipt. Chain integrity is independently verifiable. If the audit sink is unreachable, the system fails closed.
→ Always runs
Integration
One API call before every execution.
01 — Send the request
POST before you execute
Before your model runs, your agent acts, or your tool fires, call /v1/evaluate with the proposed input and its context. YumeT Lite evaluates it in under 200 ms.
02 — Receive the decision
ALLOW, REFUSE, or CLARIFY
Every response includes a structured decision, a reason code, and a receipt ID tied to an immutable audit chain. ALLOW means proceed. REFUSE means halt. CLARIFY means request more information before proceeding.
03 — Enforce it
Block on REFUSE. Prompt on CLARIFY.
Your platform blocks execution on REFUSE and surfaces a narrowing prompt on CLARIFY. Log the receipt_id alongside every governed event. That's the complete audit trail — no reconstruction needed.
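The three steps above could look roughly like this on the caller's side, using only the Python standard library. The host name, the request body shape, and the response keys decision and reason_code are assumptions; only the /v1/evaluate path and the receipt_id field come from the text above.

```python
import json
import urllib.request

BASE_URL = "https://yumet.example"  # placeholder host, not a real endpoint


def build_evaluate_request(text: str, context: dict) -> urllib.request.Request:
    """Step 01: build the POST to /v1/evaluate (body shape is assumed)."""
    body = json.dumps({"input": text, "context": context}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/v1/evaluate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def evaluate(text: str, context: dict, timeout: float = 2.0) -> dict:
    """Step 02: fetch the decision; treat an unreachable gate as REFUSE."""
    try:
        with urllib.request.urlopen(build_evaluate_request(text, context), timeout=timeout) as resp:
            return json.load(resp)
    except (OSError, ValueError):
        # Network errors and malformed JSON both block execution on the caller's side.
        return {"decision": "REFUSE", "reason_code": "gate_unreachable", "receipt_id": None}


def enforce(response: dict) -> str:
    """Step 03: map the decision to a platform action; anything unrecognized blocks."""
    decision = response.get("decision")
    if decision == "ALLOW":
        return "execute"
    if decision == "CLARIFY":
        return "prompt_user"
    return "block"
```

A caller would log response["receipt_id"] next to the governed event and run the model only when enforce returns "execute".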
Guarantees
Built into the architecture.
01
Deterministic Semantics
The governance engine is not a model. It is a fixed mathematical function. Same input, same context, same policy version — identical output, always.
02
Obfuscation Resistance
L2 normalizes space-separated characters, dot-separated sequences, apostrophe variants, and Unicode substitutions before pattern matching. Obfuscation attempts are collapsed before they reach classification.
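A minimal sketch of this kind of normalization is shown below. It uses Unicode NFKC folding as a stand-in for the Unicode substitution handling and a simple regular expression for spaced or dotted characters; the real rule set is not published here, so treat every rule below as an assumption.

```python
import re
import unicodedata

# Apostrophe look-alikes mapped to the ASCII apostrophe (illustrative subset).
_APOSTROPHES = str.maketrans({"\u2018": "'", "\u2019": "'", "\u02bc": "'", "`": "'"})

# Three or more single characters separated by one space or one dot each.
_SPACED = re.compile(r"\b(?:\w[ .]){2,}\w\b")


def normalize(text: str) -> str:
    """Collapse common obfuscations before any pattern matching runs."""
    text = unicodedata.normalize("NFKC", text)  # folds full-width and other compatibility forms
    text = text.translate(_APOSTROPHES)
    # Join runs such as "i g n o r e" or "i.g.n.o.r.e" into one token.
    text = _SPACED.sub(lambda m: re.sub(r"[ .]", "", m.group(0)), text)
    return re.sub(r"\s+", " ", text).strip().casefold()
```

Pattern matching then runs on the normalized string only, so each obfuscated variant reaches classification in the same form as the plain phrase.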
03
Fail-Closed Default
A missing field, an internal error, an empty input, or an unreachable audit sink each produces an immediate REFUSE. The system does not proceed in an unverified state.
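The fail-closed default can be expressed as a thin wrapper around any evaluation function, as in this sketch. The wrapper name and the string outcomes are assumptions for illustration.

```python
from typing import Callable

VALID_OUTCOMES = frozenset({"ALLOW", "REFUSE", "CLARIFY"})


def fail_closed(evaluate: Callable[[dict], str]) -> Callable[[dict], str]:
    """Wrap an evaluator so that errors and unrecognized outcomes become REFUSE."""
    def guarded(request: dict) -> str:
        try:
            outcome = evaluate(request)
        except Exception:
            return "REFUSE"  # internal error: never proceed unverified
        # Anything other than the three known outcomes is treated as a failure.
        return outcome if outcome in VALID_OUTCOMES else "REFUSE"
    return guarded
```

The important property is that ALLOW is only ever produced by a successful, recognized evaluation; every other path ends in REFUSE.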
04
Non-Bypassable Hard Locks
Certain violation categories — governance override attempts, CSAM, self-harm — carry no anchor exception pathway. No framing, context, or authorization object can override them.
05
Immutable Audit Chain
Every decision appends to a cryptographic chain. ALLOW, REFUSE, and CLARIFY all generate receipts. Chain integrity is independently verifiable — modification of any receipt invalidates all subsequent entries.
06
Session Trajectory Memory
Session state persists across turns. Slow-burn escalation — benign turn one, dangerous turn two — is detected automatically through accumulated risk trajectory, not just per-message analysis.
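Accumulated risk trajectory can be sketched as a decayed running sum per session. The decay factor, the escalation limit, and the per-turn risk values are assumptions; how the real engine scores a turn is not specified here.

```python
from dataclasses import dataclass

DECAY = 0.8             # assumed weight given to earlier turns
ESCALATION_LIMIT = 1.0  # assumed limit on accumulated risk


@dataclass
class SessionState:
    accumulated_risk: float = 0.0
    turns: int = 0

    def observe(self, turn_risk: float) -> str:
        """Fold one turn's risk into the session and report whether the limit is hit."""
        self.turns += 1
        self.accumulated_risk = self.accumulated_risk * DECAY + turn_risk
        if self.accumulated_risk >= ESCALATION_LIMIT:
            return "REFUSE"
        return "CONTINUE"
```

With these numbers, three consecutive turns of risk 0.5 accumulate to 0.5, 0.9, and 1.22, so the third turn is refused even though no single turn reaches the limit on its own.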
Verified Performance
1,000 red-team scenarios. Zero failures.
Tested across 15 attack categories including jailbreaks,
roleplay bypass, obfuscated inputs, anchor abuse, forced certainty,
fabrication pressure, multi-turn escalation, and benign-allow validation.
1,000/1,000 correct governance decisions.
0 false positives on clean requests.