PII guardrails for .NET applications - Part 2: Agent Framework agents

In part one of this little series I introduced TasmanianDevil, a standalone, offline PII detection and de-identification engine for .NET. We saw it on its own - detecting and validating PII, anonymizing it with a range of operators, the reversible encrypt/decrypt round-trip, structured JSON and CSV redaction, and the optional multilingual NER add-on.

That engine is useful anywhere, but the place I built it for is AI agents. In this part I will wire it into Microsoft Agent Framework (MAF) agents through AgentGuard, so PII is handled automatically around the agent - on the way in, on the way out, and in the results that come back from tool calls.

Where TasmanianDevil meets AgentGuard

A quick word on how the pieces fit together, because the 0.10.0 release reshuffled them. TasmanianDevil is the raw engine. AgentGuard.Pii is a thin guardrail adapter on top of it - it exposes the engine as an AgentGuard rule (PiiRule, running at order 20 in the pipeline) and a fluent builder method, .RedactPii(). You do not reference TasmanianDevil directly; AgentGuard pulls it in transitively. The result is that PII redaction slots into a guardrail policy alongside everything else - prompt injection detection, secrets scanning, content safety, token limits - in the same declarative pipeline.
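
To make "evaluated in order" concrete, here is a minimal sketch of an ordered rule pipeline. The `ITextRule`, `DelegateRule` and `OrderedPipeline` types are hypothetical and exist only for this illustration; they are not AgentGuard's actual interfaces. The point is simply that a rule registered at order 20 runs before one at order 90, whatever the registration order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Rules are registered out of order on purpose; the pipeline sorts them.
var rules = new ITextRule[]
{
    new DelegateRule(90, t => t.ToUpperInvariant()),
    new DelegateRule(20, t => t.Replace("jane@acme.com", "<EMAIL_ADDRESS>")),
};

Console.WriteLine(OrderedPipeline.Run(rules, "mail jane@acme.com"));
// prints: MAIL <EMAIL_ADDRESS>

// Hypothetical types for illustration only (assumption, not the AgentGuard API).
public interface ITextRule
{
    int Order { get; }
    string Apply(string text);
}

public sealed class DelegateRule : ITextRule
{
    private readonly Func<string, string> _apply;

    public DelegateRule(int order, Func<string, string> apply)
    {
        Order = order;
        _apply = apply;
    }

    public int Order { get; }

    public string Apply(string text) => _apply(text);
}

public static class OrderedPipeline
{
    // Runs every rule in ascending Order, feeding each rule the previous rule's output.
    public static string Run(IEnumerable<ITextRule> rules, string input) =>
        rules.OrderBy(r => r.Order).Aggregate(input, (text, rule) => rule.Apply(text));
}
```

If the upper-casing rule ran first, the literal replacement would no longer match, which is why a redaction rule wants a low, fixed position in the order.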

With Agent Framework, attaching that policy to an agent is a single UseAgentGuard() call on the agent builder. Let’s look at the three patterns, each of which corresponds to a real sample in the AgentGuard repo.

Standard redaction

The simplest case: scrub PII from the user’s input before the model sees it, and from the model’s reply before the user sees it.

var agent = innerAgent
    .AsBuilder()
    .UseAgentGuard(g => g.RedactPii())
    .Build();

var reply = await agent.RunAsync("I'm Jane Doe from Acme Corp in Berlin. My email is jane@acme.com and my card is 4012888888881881.");

With the optional NER model configured (the GLiNER add-on from part one), the input that actually reaches the model looks like this:

user typed     : I'm Jane Doe from Acme Corp in Berlin. My email is jane@acme.com and my card is 4012888888881881.
model received : I'm <PERSON> from <ORGANIZATION> in <LOCATION>. My email is <EMAIL_ADDRESS> and my card is <CREDIT_CARD>.
user sees back : Thanks - a confirmation was emailed to <EMAIL_ADDRESS>.

The name, organization, location, email and card are all gone before the request leaves your process, and any PII in the model’s own reply (the advisor@bank.com it tried to include) is scrubbed too. Without the NER model, the same call still redacts the email and card - the structured PII - and leaves the name and place, which is the fully-offline, no-download baseline. Switching between the two is a one-line change: .RedactPiiWithNer(…) instead of .RedactPii().
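
For intuition about why the structured PII can be caught without any model, here is a rough sketch of pattern-plus-checksum detection. This is not how TasmanianDevil is implemented; `StructuredPiiSketch`, its regular expressions and its method names are assumptions made for illustration. It matches emails by pattern and only treats a digit run as a card number if it passes the Luhn check.

```csharp
using System;
using System.Text.RegularExpressions;

Console.WriteLine(StructuredPiiSketch.Redact(
    "I'm Jane Doe from Acme Corp in Berlin. My email is jane@acme.com and my card is 4012888888881881."));
// prints: I'm Jane Doe from Acme Corp in Berlin. My email is <EMAIL_ADDRESS> and my card is <CREDIT_CARD>.

// Illustrative only: a simplified stand-in, not the TasmanianDevil engine.
public static class StructuredPiiSketch
{
    private static readonly Regex Email =
        new(@"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}");

    // Candidate card numbers: 13 to 19 consecutive digits, validated below.
    private static readonly Regex CardCandidate = new(@"\b\d{13,19}\b");

    public static string Redact(string text)
    {
        text = Email.Replace(text, "<EMAIL_ADDRESS>");
        return CardCandidate.Replace(
            text, m => PassesLuhn(m.Value) ? "<CREDIT_CARD>" : m.Value);
    }

    // Luhn checksum: double every second digit from the right, sum, test mod 10.
    public static bool PassesLuhn(string digits)
    {
        int sum = 0;
        bool doubleIt = false;
        for (int i = digits.Length - 1; i >= 0; i--)
        {
            int d = digits[i] - '0';
            if (doubleIt)
            {
                d *= 2;
                if (d > 9) d -= 9;
            }
            sum += d;
            doubleIt = !doubleIt;
        }
        return sum % 10 == 0;
    }
}
```

The name, organization and city pass through untouched here, which mirrors the no-NER baseline described above: free-text entities have no pattern or checksum to anchor on.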

Reversible redaction

Standard redaction is a one-way street: once jane@acme.com becomes <EMAIL_ADDRESS>, the model has no way to act on the real address. Sometimes that is exactly what you want, but sometimes the model genuinely needs to use the value - to draft an email, say - without you wanting it to actually leave your boundary in the clear.

This is where the reversible encrypt/decrypt round-trip from part one earns its keep. Because the round-trip spans both phases, and input and output guardrails run on separate contexts, a single per-phase rule cannot express it, so it ships as a dedicated middleware, UsePiiReversibleRedaction(). It encrypts PII before the model (and the provider) ever sees it, then decrypts the opaque tokens in the response:

var agent = innerAgent
    .AsBuilder()
    .UsePiiReversibleRedaction(key) // AES key
    .Build();

user typed     : Email me at john@example.com about order 12345.
model received : Email me at 1vkhRfsC2ALLGOXNYYuEI336irKcdLoWZq2IzNPjUp2qOHMbHo8RKQ_OMdUFuY9Y about order 12345.
user sees back : Got it. I'll follow up on: "Email me at john@example.com about order 12345."

The model reasons over the encrypted token, and because the restore is by exact token match it survives the model echoing the token back verbatim - so when it quotes the request, the user sees the real john@example.com again. The email never reached the model in the clear; it was decrypted only on the way out.
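
The mechanics of an exact-token-match restore can be sketched in a few lines. Everything below is an assumption for illustration: `ReversibleEmailRedactor` is a hypothetical class, it only handles emails, and the token format (base64url of IV plus AES-CBC ciphertext, with no authentication tag) is my choice for the sketch, not a description of what UsePiiReversibleRedaction() does internally.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;
using System.Text.RegularExpressions;

var key = RandomNumberGenerator.GetBytes(32); // AES-256 key, generated here only for the demo
var redactor = new ReversibleEmailRedactor(key);

var sent = redactor.Protect("Email me at john@example.com about order 12345.");
var reply = $"Got it. I'll follow up on: \"{sent}\""; // simulate the model echoing the request

Console.WriteLine(sent.Contains("john@example.com")); // prints: False
Console.WriteLine(redactor.Restore(reply));
// prints: Got it. I'll follow up on: "Email me at john@example.com about order 12345."

// Hypothetical helper; not part of AgentGuard or TasmanianDevil.
public sealed class ReversibleEmailRedactor
{
    private static readonly Regex Email =
        new(@"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}");

    private readonly byte[] _key;
    private readonly HashSet<string> _issuedTokens = new();

    public ReversibleEmailRedactor(byte[] key) => _key = key;

    // Replaces each email with an opaque token and remembers the token.
    public string Protect(string input) => Email.Replace(input, match =>
    {
        using var aes = Aes.Create();
        aes.Key = _key;
        var iv = RandomNumberGenerator.GetBytes(16);
        var cipher = aes.EncryptCbc(Encoding.UTF8.GetBytes(match.Value), iv);

        var payload = new byte[iv.Length + cipher.Length];
        iv.CopyTo(payload, 0);
        cipher.CopyTo(payload, iv.Length);

        var token = Convert.ToBase64String(payload)
            .TrimEnd('=').Replace('+', '-').Replace('/', '_');
        _issuedTokens.Add(token);
        return token;
    });

    // Restores only tokens that appear verbatim; anything altered stays opaque.
    public string Restore(string output)
    {
        foreach (var token in _issuedTokens)
        {
            if (!output.Contains(token, StringComparison.Ordinal)) continue;

            var b64 = token.Replace('-', '+').Replace('_', '/');
            b64 = b64.PadRight(b64.Length + (4 - b64.Length % 4) % 4, '=');
            var payload = Convert.FromBase64String(b64);

            using var aes = Aes.Create();
            aes.Key = _key;
            var plain = aes.DecryptCbc(payload[16..], payload[..16]);
            output = output.Replace(
                token, Encoding.UTF8.GetString(plain), StringComparison.Ordinal);
        }
        return output;
    }
}
```

The flip side of exact matching is visible in `Restore`: if the model truncates or rewrites a token, nothing is substituted and the user sees the opaque string, which fails safe rather than leaking.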

Tool-result redaction

The third, and in my opinion the most underappreciated, surface is tool results. When an agent calls a tool - a database lookup, an API call - the result is fed straight back into the model’s context. If that result contains PII, the model gets to see it, even though the user never typed any of it.

AgentGuard intercepts tool results before they reach the model. You add GuardToolResults() to the policy, and the PII rule runs in the tool-result lane:

var policy = new GuardrailPolicyBuilder()
    .RedactPii()         // order 20 - also runs on tool results
    .GuardToolResults()  // wires the tool-result interception
    .Build();

var agent = chatClient
    .AsAIAgent(
        instructions: "You are a support agent. Use lookup_customer when asked about a customer.",
        name: "SupportBot",
        tools: [AIFunctionFactory.Create(LookupCustomer)])
    .AsBuilder()
    .UseAgentGuard(policy)
    .Build();

Here the lookup_customer tool deliberately returns a record full of PII - email, phone, and an SSN. When we ask the agent to summarize the customer’s contact info:

user: Look up the customer Jane Doe and summarize her contact info.
  [tool executed] lookup_customer(name: Jane Doe) -> returning raw PII record
response: I found Jane Doe's record.

Contact info:
- Email: <EMAIL_ADDRESS>
- Phone: +1 <PHONE_NUMBER>

I'm not able to share highly sensitive personal data like SSNs.

The model only ever saw <EMAIL_ADDRESS>, <PHONE_NUMBER> and <US_SSN> - never the real values. It dutifully summarized the redacted record, and even refused to share the SSN, because as far as it was concerned the SSN was already a placeholder. The raw PII never entered the context window.

This is the surface I would most encourage you to think about. Input and output redaction are the obvious ones, but tool results are where PII sneaks into an agent without anyone typing it - and by the time a post-hoc check sees it, the model has already read it. Intercepting before the result is handed back to the LLM is the only way to keep it out of the context entirely.
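
Stripped of any framework, the idea is just a wrapper that sits between the tool and the caller that feeds the model. The sketch below uses plain delegates; `ToolResultGuardSketch`, the stand-in `LookupCustomer` function and its made-up record are all hypothetical, and this is not how GuardToolResults() is wired internally.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the lookup_customer tool: returns a raw record with made-up PII.
static string LookupCustomer(string name) =>
    $"name={name}; email=jane@acme.com; ssn=123-45-6789";

Func<string, string> guarded = ToolResultGuardSketch.Guard<string>(
    LookupCustomer, ToolResultGuardSketch.RedactStructuredPii);

// Only the guarded delegate is ever handed to the model-facing side.
Console.WriteLine(guarded("Jane Doe"));
// prints: name=Jane Doe; email=<EMAIL_ADDRESS>; ssn=<US_SSN>

// Illustrative only; not the AgentGuard implementation.
public static class ToolResultGuardSketch
{
    private static readonly Regex Email =
        new(@"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}");

    private static readonly Regex UsSsn = new(@"\b\d{3}-\d{2}-\d{4}\b");

    // Wraps a tool so its result is redacted before the caller can read it.
    public static Func<TArg, string> Guard<TArg>(
        Func<TArg, string> tool, Func<string, string> redact) =>
        arg => redact(tool(arg));

    public static string RedactStructuredPii(string text)
    {
        text = Email.Replace(text, "<EMAIL_ADDRESS>");
        return UsSsn.Replace(text, "<US_SSN>");
    }
}
```

Because the redaction happens inside the wrapper, there is no code path on which the raw record exists in anything the model can read, which is the property a post-hoc check cannot give you.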

Final thoughts

What I like about doing this through AgentGuard is that none of it is special-cased. Reversible redaction aside, PII redaction is just another rule in the policy, evaluated in order with everything else, and it works identically whether you are running a standalone GuardrailPipeline or a full MAF agent. And underneath it is the same deterministic, offline TasmanianDevil engine from part one - so you keep the property that matters most: your customers’ data does not leave your process unless you explicitly decide it should.

All three patterns above map to runnable samples in the AgentGuard repository - the AgentFrameworkPii sample is where I harvested the code and output in this post from. AgentGuard is on NuGet and is MIT licensed, and there is a project website with the full rule reference.

If you missed it, part one covers the TasmanianDevil engine itself in detail. As always, issues and contributions on either project are welcome.

About


Hi! I'm Filip W., a software architect from Zürich 🇨🇭. I like Toronto Maple Leafs 🇨🇦, Rancid and quantum computing. Oh, and I love the Lowlands 🏴󠁧󠁢󠁳󠁣󠁴󠁿.
