# PrivacyLens
Transparent PII masking for LLM clients — keep sensitive data out of your AI prompts.
## The Problem
Every time you send a prompt to an LLM, you risk leaking PII — names, emails, phone numbers, SSNs. PrivacyLens fixes this by automatically detecting and replacing sensitive data with anonymous tokens before the prompt leaves your app, then restoring the original values when the response comes back.
```
"Email john@example.com" → "Email [EMAIL_1]" → LLM → "[EMAIL_1] notified" → "john@example.com notified"
```
Your LLM never sees real PII. Your app gets back the original values. Zero code changes needed.
## Supported SDKs

| Package | Install | Adapters |
|---|---|---|
| Python SDK | `pip install privacylens` | OpenAI, Anthropic, LangChain, CrewAI, Strands |
| TypeScript SDK | `npm install privacylens` | OpenAI, Vercel AI SDK |
## Quick Start

### Python — one line to shield any client
```python
from privacylens import shield
import openai

# Wrap your client — that's it
client = shield(openai.OpenAI())

# Use it exactly as before. PII is masked/unmasked automatically.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "My name is John Doe, email: john@example.com"}],
)
print(response.choices[0].message.content)  # Original PII restored
```
Works the same way with Anthropic, LangChain, CrewAI, and Strands:
```python
client = shield(anthropic.Anthropic())  # Anthropic
handler = shield(my_langchain_model)    # LangChain
client = shield(my_crewai_agent)        # CrewAI
```
Use `inspect()` to preview what would be masked without actually masking it — handy for testing:
```python
from privacylens import inspect

results = inspect("Call me at 555-123-4567 or email john@example.com")
# [EntitySpan(type='PHONE', value='555-123-4567', ...), EntitySpan(type='EMAIL', value='john@example.com', ...)]
```
### TypeScript — drop-in OpenAI wrapper
```typescript
import OpenAI from "openai";
import { shieldOpenAI } from "privacylens/adapters/openai";

const client = shieldOpenAI(new OpenAI());

// Use normally — PII is masked before sending, restored in the response
const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Contact john@example.com about the project" }],
});
```
Works with the Vercel AI SDK too:

```typescript
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";
import { shield } from "privacylens";

const { text } = await generateText({
  model: shield(openai("gpt-4o")),
  prompt: "Summarise the contract for john@example.com",
});
```
## What Gets Detected
Out of the box (regex-based, extensible):
| Entity | Example |
|---|---|
| Email | `john@example.com` → `[EMAIL_1]` |
| Phone | `555-123-4567` → `[PHONE_1]` |
| SSN | `123-45-6789` → `[SSN_1]` |
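The built-in detection is plain regex, so its behavior is easy to reason about. Here is a minimal standalone sketch of regex-based entity scanning; the email and phone patterns mirror the defaults shown in the Configuration section, while the SSN pattern is an illustrative assumption, not necessarily the one PrivacyLens ships:

```python
import re

# Illustrative patterns only. EMAIL and PHONE mirror the defaults from the
# Configuration section; SSN is an assumed pattern for this sketch.
PATTERNS = {
    "EMAIL": r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}",
    "PHONE": r"\b\d{3}[-.]?\d{3}[-.]?\d{4}\b",
    "SSN": r"\b\d{3}-\d{2}-\d{4}\b",
}

text = "Reach John at john@example.com or 555-123-4567."
for entity, pattern in PATTERNS.items():
    for match in re.finditer(pattern, text):
        print(entity, match.group())
```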
With optional detectors:
| Detector | Install | Entities |
|---|---|---|
| Presidio | `pip install privacylens[pii]` | Names, addresses, credit cards, 50+ types |
| GLiNER (semantic) | `pip install privacylens[semantic]` | ML-based entity detection |
## Configuration
Create a `privacylens.yaml` in your project root to customize detection:
```yaml
detectors:
  regex:
    patterns:
      - entity_type: EMAIL
        pattern: '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'
      - entity_type: PHONE
        pattern: '\b\d{3}[-.]?\d{3}[-.]?\d{4}\b'
      - entity_type: CUSTOM_ID
        pattern: 'PROJ-\d{4,}'

vault: memory  # or "sqlite" or "redis"
```
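Custom patterns are standard regular expressions, so you can sanity-check one on sample text before adding it to the config. A quick standalone check of the `CUSTOM_ID` pattern above, using only Python's `re` module:

```python
import re

# The CUSTOM_ID pattern from the config above: a "PROJ-" prefix
# followed by at least four digits.
CUSTOM_ID = re.compile(r"PROJ-\d{4,}")

text = "Ticket PROJ-12345 relates to PROJ-99 and PROJ-0001."
matches = CUSTOM_ID.findall(text)
print(matches)  # ['PROJ-12345', 'PROJ-0001'] — PROJ-99 has too few digits
```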
## How It Works
```
┌──────────┐     ┌───────────┐     ┌─────────┐     ┌──────────────┐
│ Your App │ ──▶ │ Tokenizer │ ──▶ │ LLM API │ ──▶ │ Detokenizer  │ ──▶ Response
│          │     │           │     │         │     │              │     (PII restored)
│ "Email   │     │ "Email    │     │         │     │ "[EMAIL_1]   │
│ john@.." │     │ [EMAIL_1]"│     │         │     │ confirmed"   │
└──────────┘     └───────────┘     └─────────┘     └──────────────┘
                       │                                  ▲
                       ▼                                  │
                 ┌───────────┐                            │
                 │   Vault   │ ───────────────────────────┘
                 │ [EMAIL_1] │
                 │ =john@..  │
                 └───────────┘
```
1. **Analyze** — Detectors scan the prompt for PII entities
2. **Tokenize** — Each PII value is replaced with a deterministic token (`[EMAIL_1]`, `[PHONE_1]`)
3. **Store** — Token↔value mappings are stored in a session vault (memory, SQLite, or Redis)
4. **Send** — The sanitized prompt goes to the LLM
5. **Detokenize** — Tokens in the LLM response are replaced with original values
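The steps above can be sketched as a toy end-to-end pass in plain Python. This is an illustrative sketch, not the actual PrivacyLens pipeline: the regex, token format, and dict "vault" are simplified stand-ins for the real detectors, tokenizer, and vault backends.

```python
import re

# Toy email detector standing in for the Analyze step.
EMAIL_RE = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

def tokenize(prompt: str, vault: dict) -> str:
    """Analyze + Tokenize + Store: replace each email with a token."""
    def repl(match: re.Match) -> str:
        value = match.group()
        for token, stored in vault.items():
            if stored == value:
                return token               # deterministic: reuse existing token
        token = f"[EMAIL_{len(vault) + 1}]"
        vault[token] = value               # Store the token↔value mapping
        return token
    return EMAIL_RE.sub(repl, prompt)

def detokenize(response: str, vault: dict) -> str:
    """Detokenize: restore original values in the LLM response."""
    for token, value in vault.items():
        response = response.replace(token, value)
    return response

vault: dict[str, str] = {}                 # the "memory" vault
masked = tokenize("Email john@example.com", vault)
print(masked)                              # Email [EMAIL_1]
# ... the masked prompt goes to the LLM; suppose it replies:
reply = "[EMAIL_1] notified"
print(detokenize(reply, vault))            # john@example.com notified
```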
## Repository Structure
```
privacylens/
├── packages/
│   ├── core-py/                   # Python SDK
│   │   ├── src/privacylens/
│   │   │   ├── adapters/          # OpenAI, Anthropic, LangChain, CrewAI, Strands
│   │   │   ├── core/              # Pipeline, Analyzer, Tokenizer, Vault
│   │   │   └── detectors/         # Regex, Presidio, GLiNER
│   │   └── tests/
│   └── core-ts/                   # TypeScript SDK
│       ├── src/
│       │   ├── adapters/          # OpenAI, Vercel AI SDK
│       │   ├── core/              # Pipeline, Analyzer, Tokenizer, Vault
│       │   └── detectors/         # Regex
│       └── tests/
└── privacylens.schema.json        # Config schema
```
## Contributing

Contributions are welcome! Please read `CONTRIBUTING.md` first.
## Documentation
## Releases
Both packages are published automatically on GitHub release:
- Python SDK → PyPI via `publish-pypi.yml`
- TypeScript SDK → npm via `publish-npm.yml`
## License
MIT © 2026 Madan Gopal