The three-layer stack

Senkani compresses at three places: Layer 1 (MCP tools), Layer 2 (smart hooks), and Layer 3 (preemptive interception). Each layer catches waste the others can't.

Layer 1 — MCP tools

The agent calls senkani_read instead of Read, senkani_exec instead of Bash, senkani_search instead of Grep. Each tool is cached, compressed, and secret-redacted by construction. This is the happy path.

Layer 2 — Smart hooks

Not every agent cooperates. Sometimes an agent calls built-in Read directly instead of senkani_read. Layer 2 intercepts the built-in and rewrites the response through the filter pipeline on the way out. Same compression, worse ergonomics (the agent's tool sheet doesn't show the senkani alternative), but the savings still land.
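The interception step can be pictured as a post-tool hook that passes a built-in tool's output through a list of filters before the agent sees it. The sketch below is written under assumptions: post_tool_hook, the INTERCEPTED set, and the two filters are hypothetical names, and the real filter pipeline may do something quite different.

```python
import re
from typing import Callable, List

Filter = Callable[[str], str]

ANSI_RE = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")


def strip_ansi(text: str) -> str:
    """Remove terminal color and cursor escape sequences."""
    return ANSI_RE.sub("", text)


def elide_middle(text: str, max_lines: int = 200) -> str:
    """Keep the head and tail of long output; assumes max_lines >= 2."""
    lines = text.splitlines()
    if len(lines) <= max_lines:
        return text
    head = max_lines // 2
    tail = max_lines - head
    omitted = len(lines) - max_lines
    marker = f"[... {omitted} lines omitted ...]"
    return "\n".join(lines[:head] + [marker] + lines[-tail:])


# Hypothetical configuration: which built-ins to intercept, and in what filter order.
INTERCEPTED = {"Read", "Bash", "Grep"}
PIPELINE: List[Filter] = [strip_ansi, elide_middle]


def post_tool_hook(tool_name: str, response: str) -> str:
    """Rewrite a built-in tool's response on the way out; pass others through."""
    if tool_name not in INTERCEPTED:
        return response
    for apply_filter in PIPELINE:
        response = apply_filter(response)
    return response
```

The agent still believes it called the built-in tool; only the payload it receives is smaller.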

Layer 3 — Preemptive interception (smart denials)

Some calls shouldn't run at all. Layer 3 is five pattern matchers that deny the call and return a cached answer or a hint instead:

Re-read suppression. The agent just read src/auth.ts; it didn't change; the next Read src/auth.ts is suppressed. The deny reason says "read 42s ago, unchanged."
Command replay. Deterministic commands (npm test, swift build, pytest) with no file changes since the last invocation → deny with the cached result.
Trivial routing. pwd, ls, echo $HOME → answer in the deny reason. Saves a round-trip.
Search upgrade. Three sequential Reads on small related files → deny the third with a hint to use senkani_search.
Redundant validation. If a build/lint command already ran with no file changes since, subsequent re-runs are covered by command replay.
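Three of the matchers above (re-read suppression, command replay, and trivial routing) can be sketched as checks that run before the tool does. Everything here is an assumption made for illustration: the Decision and SessionState types, the function names, the use of file mtime as the change signal, and the hard-coded command list.

```python
import os
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Decision:
    allow: bool
    reason: str = ""


@dataclass
class SessionState:
    # path -> (mtime at last read, time of last read)
    reads: Dict[str, Tuple[float, float]] = field(default_factory=dict)
    # command -> (time it ran, captured output)
    commands: Dict[str, Tuple[float, str]] = field(default_factory=dict)
    last_file_change: float = 0.0


# Hypothetical allow-list of commands treated as deterministic.
DETERMINISTIC = {"npm test", "swift build", "pytest"}


def check_read(state: SessionState, path: str, now: float) -> Decision:
    """Re-read suppression: deny if the file's mtime matches the last read."""
    mtime = os.path.getmtime(path)
    prev = state.reads.get(path)
    if prev is not None and prev[0] == mtime:
        return Decision(False, f"read {int(now - prev[1])}s ago, unchanged")
    state.reads[path] = (mtime, now)
    return Decision(True)


def check_command(state: SessionState, command: str) -> Decision:
    """Trivial routing for pwd, then command replay for deterministic commands."""
    if command == "pwd":
        return Decision(False, os.getcwd())  # the answer rides in the deny reason
    prev = state.commands.get(command)
    if command in DETERMINISTIC and prev is not None and prev[0] >= state.last_file_change:
        return Decision(False, f"no file changes since last run; cached result:\n{prev[1]}")
    return Decision(True)


def record_command(state: SessionState, command: str, output: str, now: float) -> None:
    """Store a command's output so a later identical call can be replayed."""
    state.commands[command] = (now, output)


def record_file_change(state: SessionState, now: float) -> None:
    """Invalidate replay for every command that ran before this change."""
    state.last_file_change = now
```

In this sketch, redundant validation needs no matcher of its own: a repeated build or lint command falls into the same replay branch as any other deterministic command.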

Why three layers, not one

Layer 1 depends on the agent cooperating. Layer 2 handles the common case of a slightly-uncooperative agent. Layer 3 handles pathological waste patterns that neither of the first two layers would catch: re-reads, re-runs, and trivia that would otherwise cost a round-trip. On cooperative sessions, Layers 2 and 3 fire rarely. On pathological sessions, they rescue 40–60% of tokens.

How denials feel to the model

A denial is not a failure — it's a structured response. The model sees the reason (e.g., "read 42s ago, unchanged"), often with the cached content inline. It continues with the real answer, not a retry. No training signal needed; the agent just reads the deny reason like any other tool result.
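To make "structured response" concrete, the sketch below builds a denial payload and renders it the way a tool result might appear to the model. The field names (decision, reason, cached_content) and the [senkani] prefix are hypothetical; the document does not specify the actual format.

```python
from typing import Optional


def build_denial(reason: str, cached: Optional[str] = None) -> dict:
    """Assemble a denial; cached content is attached only when available."""
    payload = {"decision": "deny", "reason": reason}
    if cached is not None:
        payload["cached_content"] = cached
    return payload


def render_for_model(payload: dict) -> str:
    """Flatten the payload into the text the model reads as a tool result."""
    text = f"[senkani] {payload['reason']}"
    if "cached_content" in payload:
        text += "\n--- cached content ---\n" + payload["cached_content"]
    return text


denial = build_denial("read 42s ago, unchanged", cached="export const login = () => {}")
print(render_for_model(denial))
```

Because the cached content arrives inline, the model has what it asked for and has no reason to retry the call.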