Ken W Alger

Posted on • Originally published at kenwalger.com

The Sovereign Vault — A Comprehensive Guide to Protocol-Driven AI

We have spent the last several weeks dismantling the traditional "Glue Code" approach to AI and replacing it with a standardized, governed, and sovereign architecture. The result is the Sovereign Vault: a forensic expert system built on the Model Context Protocol (MCP).

This post serves as the master index and architectural map for the entire series. Whether you are looking for local vision, PII redaction, or agentic governance, you will find the path below.

The Five Design Principles

The Sovereign Vault isn't just a project; it's a reference implementation for five core patterns of modern AI systems:

  1. Local-First Perception: We process high-resolution artifacts at the edge using local SLMs to ensure data sovereignty.
  2. Standardized Tool Discovery: By using MCP, our agents dynamically discover forensic tools without custom integration code.
  3. The Sovereign Airlock: A multi-layered governance gate (The Redactor and The Guardian) that controls exactly what context leaves your network.
  4. Cognitive Budgeting: We use semantic routing to send simple tasks to local SLMs and complex reasoning to frontier cloud models.
  5. Evaluatable Intelligence: We move beyond "vibes" by using an LLM-as-a-Judge framework to benchmark forensic accuracy.
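Principle 4 can be pictured with a toy router. This is a minimal sketch, not the series' implementation: it assumes bag-of-words cosine similarity in place of a real embedding model, and the route names, example utterances, and `route` function are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical example utterances per route; a real router would use embeddings.
ROUTES = {
    "local_slm": [
        "look up the catalog record for this item",
        "list the metadata fields for this book",
    ],
    "cloud_frontier": [
        "compare the evidence and explain whether the artifact is authentic",
        "synthesize a verdict from conflicting sources",
    ],
}

def vectorize(text: str) -> Counter:
    """Turn text into a bag-of-words term-count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def route(query: str, default: str = "local_slm") -> str:
    """Pick the route whose best example is most similar to the query."""
    q = vectorize(query)
    best_route, best_score = default, 0.0
    for name, examples in ROUTES.items():
        score = max(cosine(q, vectorize(e)) for e in examples)
        if score > best_score:
            best_route, best_score = name, score
    return best_route
```

A query that matches no example falls back to `default`. Whether the fallback should be the cheaper local model or the stronger cloud model is a policy choice this sketch does not settle.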

The Reader’s Journey: From Librarian to Auditor

The series follows a logical progression of complexity, moving from simple data retrieval to high-reasoning expert verdicts.

Phase 1: The Foundation

  • We established the "Zero-Glue" stack and built the Librarian, our first MCP server, which exposes archival metadata as standardized tools and resources.
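As a rough picture of what "standardized tools" means on the wire, the sketch below answers the two JSON-RPC methods MCP defines for tools, `tools/list` and `tools/call`, using plain dictionaries. It is a simplification under stated assumptions: it skips transport, initialization, and resources, the `lookup_book` tool and its catalog data are hypothetical, and a real server would normally be built on an MCP SDK rather than written by hand.

```python
import json

# Hypothetical archival catalog standing in for the Librarian's data source.
CATALOG = {"bk-001": {"title": "Example Title", "year": 1925}}

def lookup_book(book_id: str) -> dict:
    """Hypothetical tool: return archival metadata for a book id."""
    return CATALOG[book_id]

# Tool registry: name -> (callable, description, JSON Schema for the arguments).
TOOLS = {
    "lookup_book": (
        lookup_book,
        "Return archival metadata for a book id.",
        {"type": "object",
         "properties": {"book_id": {"type": "string"}},
         "required": ["book_id"]},
    )
}

def _error(request: dict, code: int, message: str) -> dict:
    """Build a JSON-RPC error response."""
    return {"jsonrpc": "2.0", "id": request.get("id"),
            "error": {"code": code, "message": message}}

def handle(request: dict) -> dict:
    """Answer a simplified JSON-RPC request for tools/list or tools/call."""
    method, params = request["method"], request.get("params", {})
    if method == "tools/list":
        result = {"tools": [
            {"name": name, "description": desc, "inputSchema": schema}
            for name, (_, desc, schema) in TOOLS.items()
        ]}
    elif method == "tools/call":
        if params.get("name") not in TOOLS:
            return _error(request, -32602, "Unknown tool")
        func = TOOLS[params["name"]][0]
        try:
            text, is_error = json.dumps(func(**params.get("arguments", {}))), False
        except Exception as exc:  # tool failures are reported in the result
            text, is_error = str(exc), True
        result = {"content": [{"type": "text", "text": text}], "isError": is_error}
    else:
        return _error(request, -32601, "Method not found")
    return {"jsonrpc": "2.0", "id": request.get("id"), "result": result}
```

Because the client learns each tool's name and argument schema from the `tools/list` response, adding a tool to the registry requires no client-side integration code.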

Phase 2: Scale and Sustainability

  • We introduced The Accountant (Semantic Routing) to manage costs and The Judge (Evaluation) to ensure reliability through golden datasets. We also implemented the first version of The Guardian for basic human-in-the-loop oversight.
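The golden-dataset idea behind The Judge can be sketched as a small harness. Everything named here is an assumption for illustration: the `GoldenCase` shape, the `evaluate` function, and the stand-in judge, which does a case-insensitive substring check where an LLM-as-a-Judge setup would ask a model to grade the answer.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class GoldenCase:
    question: str
    expected: str  # reference answer curated by a human

def keyword_judge(answer: str, expected: str) -> bool:
    """Stand-in for an LLM judge: pass if the reference appears in the answer."""
    return expected.lower() in answer.lower()

def evaluate(
    system: Callable[[str], str],
    cases: list[GoldenCase],
    judge: Callable[[str, str], bool] = keyword_judge,
) -> float:
    """Run the system on every golden case and return the pass rate."""
    if not cases:
        raise ValueError("golden dataset is empty")
    passed = sum(judge(system(case.question), case.expected) for case in cases)
    return passed / len(cases)
```

Keeping the judge as a plain callable means the same harness can run with a cheap deterministic check during development and a model-backed grader for the real benchmark.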

Phase 3: Sovereignty and Perception

  • We then gave the system Eyes using local Llama 3.2-Vision. To protect our data, we built The Redactor, a privacy airlock that scrubs PII at the edge before cloud egress.
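To make "scrubs PII at the edge" concrete, here is a minimal regex-based sketch. It is not the series' Redactor: the three patterns are illustrative assumptions covering only simple email, US Social Security number, and US phone formats, and the `redact` function name is hypothetical.

```python
import re

# Illustrative patterns only; real PII detection needs far broader coverage.
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def redact(text: str) -> tuple[str, dict[str, int]]:
    """Replace each PII match with a typed placeholder and count the hits."""
    counts: dict[str, int] = {}
    for label, pattern in PII_PATTERNS.items():
        text, hits = pattern.subn(f"[{label}]", text)
        if hits:
            counts[label] = hits
    return text, counts
```

Returning the per-type counts alongside the scrubbed text gives a downstream gate something to log or act on without ever seeing the original values.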

Phase 4: Synthesis and Governance

  • We introduced The Auditor, a high-reasoning persona that synthesizes visual and archival data into a final verdict. We hardened our governance with a severity-aware Guardian handshake and concluded with the strategic case for MCP as the "USB-C for AI."

The Final Architecture

[Figure: flow diagram of the Sovereign Vault architecture showing three subgraphs: Intelligence (The Auditor and The Judge), Capability (Librarian Metadata and The Eye Vision), and Governance (The Redactor and The Guardian), illustrating the loop from tool discovery to final report evaluation.]

The Sovereign Vault Architecture: A protocol-driven loop where the Auditor synthesizes tool outputs through a governance airlock for evaluatable final reports.

Take the First Step

The entire codebase is open source and designed for you to fork, explore, and break.

The Repository: mcp-forensic-analyzer

Quick Start: Run the 5-minute demo to see the full pipeline in action.

The end of glue code is here. It’s time to start building with protocols, not just prompts.


Top comments (7)

Self-Correcting Systems

The sovereign airlock is the piece that kept standing out across the series. The
redactor plus guardian controlling egress before cloud context is exactly the boundary
question we haven't closed on our side yet. Our work has been downstream of ingestion:
retrieval gate, scope filtering, tool-call authorization. But who controls what gets
written into the authority-bearing layer in the first place is still open for us.

The local-first processing before egress is also a different angle on the same
independence problem. If the redactor runs locally and the agent can't influence what
gets redacted, that's a custody boundary at the emit decision, which is where ANP2
pushed us too on the re-derivation side.

Curious how the guardian handles cases where the judgment call isn't clear: does
severity routing go to a local SLM, or does ambiguity always escalate to the human
handshake?

Ken W Alger

You’ve isolated the exact architectural line where things get real. Controlling the egress boundary before cloud context pollution happens is the ultimate goal of data custody.

To answer your question on ambiguity: the guardian handles it using a multi-tiered routing pattern. For clear-cut policy matches, it's a deterministic block/redact. For the grey areas—where context dictates whether information is sensitive or safe—we route to a heavily quantized, local Small Language Model (SLM) specialized strictly for classification.

However, the SLM doesn't make the final decision on the execution of ambiguous data; it outputs a confidence score. If that score falls below a strict deterministic threshold, the system triggers a severity routing policy: it pauses the egress pipeline and escalates to a human handshake.

In a sovereign architecture, we treat ambiguity as a security exception. If the local system can't confidently clear the data boundary, it must default to isolation (human-in-the-loop) rather than risking a silent leak to a third-party cloud model.
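The tiers described in this reply can be sketched as follows. This is a sketch under assumptions, not the Guardian's actual code: the deny-list terms, the 0.90 threshold, and the `classify_safe` callable (a stand-in for the local SLM that returns its confidence that the text is safe) are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional

class Decision(Enum):
    BLOCK = "block"
    ALLOW = "allow"
    ESCALATE = "escalate_to_human"

@dataclass(frozen=True)
class Verdict:
    decision: Decision
    reason: str
    confidence: Optional[float] = None  # set only when the classifier ran

# Hypothetical deterministic deny-list and threshold; both are policy choices.
BLOCKED_TERMS = ("ssn", "passport number")
CONFIDENCE_THRESHOLD = 0.90

def guard(text: str, classify_safe: Callable[[str], float]) -> Verdict:
    """Tier 1: deterministic rules. Tier 2: classifier score vs. a fixed threshold."""
    lowered = text.lower()
    for term in BLOCKED_TERMS:
        if term in lowered:
            return Verdict(Decision.BLOCK, f"policy match: {term}")
    confidence = classify_safe(text)  # stand-in for the local SLM's confidence
    if confidence >= CONFIDENCE_THRESHOLD:
        return Verdict(Decision.ALLOW, "classifier cleared", confidence)
    # Ambiguity is treated as a security exception: pause egress, ask a human.
    return Verdict(Decision.ESCALATE, "confidence below threshold", confidence)
```

The classifier only reports a score. The comparison against the threshold lives in ordinary code, so the decision to release data stays deterministic and reviewable.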

Self-Correcting Systems

The confidence score plus human escalation on ambiguity is the piece that stands out.
You've separated the classification decision from the execution decision. The SLM tells
you how sure it is, but a separate deterministic threshold decides whether that
certainty clears for action. We hit the same structural need from the memory side in
CLAIM-23: gate couldn't verify a vague query against the grant table, so it defaulted
to refusal rather than guessing through the ambiguity. Different boundary, same
principle. One question sitting with me: when the SLM clears above threshold without
escalating, does the confidence score get logged alongside the action event? Curious
how you reconstruct why something was cleared if it comes up later.

Ken W Alger

Hearing that this maps cleanly onto where you're seeing CLAIM-23 boundaries fray is strong validation. Restricting the tool primitive upstream when a query is vague keeps the agent's blast radius as small as possible.

To answer your question: Yes, absolutely. Logging the exact confidence score alongside the action event is non-negotiable for auditability.

In the Sovereign Vault pattern, every programmatic execution event writes an immutable, signed ledger entry. If an SLM clears a data boundary with an 87% confidence rating, that confidence_score: 0.87 metric is serialized directly into the metadata schema of that event log, alongside the version identifier of the specific SLM model that ran the check.

If a data custody question or an anomalous action comes up later during a forensic audit, you don't just see what happened; you can instantly reconstruct the exact probability curve that the system accepted at the moment of emission. It moves the post-mortem out of the realm of guesswork and back into measurable telemetry.
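A signed entry of the kind described here might look like the sketch below. It makes several assumptions that the thread does not specify: HMAC-SHA256 with a shared demo key as the signature scheme (a production system might prefer asymmetric signatures and an append-only store), a `prev_hash` field for chaining entries, and every field name other than `confidence_score`.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone

SIGNING_KEY = b"demo-key-do-not-use"  # assumption: a real key would live in a KMS or HSM

def _canonical(payload: dict) -> bytes:
    """Serialize deterministically so the same payload always signs the same way."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()

def make_entry(action: str, confidence_score: float,
               model_version: str, prev_hash: str) -> dict:
    """Build a log entry that freezes the decision context and signs it."""
    payload = {
        "action": action,
        "confidence_score": confidence_score,
        "model_version": model_version,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prev_hash": prev_hash,  # chains this entry to the one before it
    }
    signature = hmac.new(SIGNING_KEY, _canonical(payload), hashlib.sha256).hexdigest()
    return {**payload, "signature": signature}

def verify_entry(entry: dict) -> bool:
    """Recompute the signature over everything except the signature itself."""
    payload = {k: v for k, v in entry.items() if k != "signature"}
    expected = hmac.new(SIGNING_KEY, _canonical(payload), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, entry.get("signature", ""))
```

Changing any field after the fact, including the confidence score or the model version, makes `verify_entry` return `False`, which is what lets an auditor trust the recorded values.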

Self-Correcting Systems

The immutable signed entry with confidence score serialized into the event schema is
exactly the pattern CLAIM-26 was trying to prove is necessary. We showed that an action
paired with only a mutable pointer to the authority decision can be silently edited
after the fact — the forensic trace looks clean but doesn't reconstruct what actually
cleared the action. Your pattern closes that gap.

The detail that stands out is the SLM model version in the metadata. If the model gets
updated after the decision, you can still anchor the probability curve to the exact
system that ran it at emission time. That's version-pinned authority source — same
structural discipline as pinning the external source address in CLAIM-25. Different
layer, same requirement: freeze the decision context at the moment it happens, not when
someone asks about it later.
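The difference between a mutable pointer and a frozen decision context can be shown in a few lines. This is an illustrative sketch only: the entry shapes, the `decisions` store, the demo key, and the `slm-v1` version string are hypothetical, and it does not reproduce the CLAIM-26 experiment.

```python
import hashlib
import hmac
import json

KEY = b"demo-key-do-not-use"  # assumption: illustrative key only

def sign(payload: dict) -> str:
    """HMAC-SHA256 over a deterministic JSON encoding of the payload."""
    data = json.dumps(payload, sort_keys=True).encode()
    return hmac.new(KEY, data, hashlib.sha256).hexdigest()

# A mutable store of authority decisions, referenced by id.
decisions = {"d-1": {"confidence_score": 0.87, "model_version": "slm-v1"}}

# Variant A: the signed entry holds only a pointer to the decision.
pointer_entry = {"action": "egress_cleared", "decision_id": "d-1"}
pointer_sig = sign(pointer_entry)

# Variant B: the signed entry embeds a copy of the decision context.
frozen_entry = {"action": "egress_cleared", "decision": dict(decisions["d-1"])}
frozen_sig = sign(frozen_entry)

# Someone edits the decision store after the fact.
decisions["d-1"]["confidence_score"] = 0.99

# Variant A still verifies, yet now "reconstructs" a decision that never happened.
pointer_still_valid = sign(pointer_entry) == pointer_sig
reconstructed_a = decisions[pointer_entry["decision_id"]]["confidence_score"]

# Variant B is unaffected: the frozen copy still carries the original score.
frozen_still_valid = sign(frozen_entry) == frozen_sig
reconstructed_b = frozen_entry["decision"]["confidence_score"]
```

Both signatures check out, but only Variant B reconstructs the score that was actually accepted. In Variant A the signature covers the pointer, not the thing it points to.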

Ken W Alger

"Freeze the decision context at the moment it happens, not when someone asks about it later." That is the absolute core of the argument, and it's fantastic to hear that it maps perfectly to the architectural requirements you proved out in CLAIM-26.

Relying on mutable pointers or assuming a model version will behave the same way six months from now is a massive vulnerability. By baking the deterministic metadata, the exact SLM version string, the timestamp, and the raw probability curve directly into the cryptographically signed event schema, the forensic trace becomes bulletproof. Even if that specific SLM variant is deprecated or updated downstream, the historical authority boundary remains entirely auditable and self-contained.

It's incredibly encouraging to see the exact same structural discipline bridge the gap between memory safety, network address pinning, and agentic data custody. This alignment is proof that sovereign architecture isn't a collection of disparate tricks—it's a single, unified requirement for absolute data custody. Thanks for bringing the CLAIM-26 insights to the thread; they've been an invaluable addition to the series!

Self-Correcting Systems

The synthesis lands. If the requirement is truly unified — freeze the authoritative
context at decision time, across memory safety, network address pinning, and data
custody — then the corollary is that any layer that doesn't hold undermines the ones
that do. CLAIM-26 showed this directly: a clean signed action entry couldn't be
forensically reconstructed when the authority pointer was mutable. The trace looked
complete and wasn't. One layer implemented correctly and another left mutable doesn't
give you 50% auditability. It gives you a false audit trail. That's the case for
treating sovereign architecture as a floor requirement, not a partial implementation.
Thanks for the whole exchange — the confidence score plus model version detail from
your pattern is going directly into how I think about the next claim boundary.