Beta: raipii is in early access. Not all features are released yet.
v1, REST, HTTPS

raipii API

raipii is a REST API that detects and sanitizes PII in text before it reaches your LLM. Two calls wrap your existing LLM workflow — one before (sanitize), one after (restore). No infrastructure changes required.

Base URL for all requests:

text
https://api.raipii.com

All requests use HTTPS. Request and response bodies are JSON. All endpoints require authentication.

Authentication

Pass your API key as a Bearer token in every request.

bash
curl -X POST https://api.raipii.com/v1/sanitize \
  -H "Authorization: Bearer ps_live_..." \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello John Smith"}'

Get a free API key at raipii.com: 2M characters/month, no credit card required. Keys are prefixed with ps_live_.

Status | Meaning
401 | Missing or invalid API key
403 | Session belongs to a different account
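
As a rough illustration of the header described above, the sketch below builds an authenticated request with Python's standard library only. The helper name `build_request` is hypothetical; the base URL, the Bearer scheme, and the JSON content type are the only parts taken from this page.

```python
import json
import urllib.request

BASE_URL = "https://api.raipii.com"  # base URL from this page


def build_request(path: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated JSON POST request."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request("/v1/sanitize", {"text": "Hello John Smith"}, "ps_live_...")
# Sending is left to the caller, e.g. urllib.request.urlopen(req, timeout=30).
```

Keeping request construction separate from sending makes it easy to inspect the headers before any text leaves your process.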

Quick start

Three calls in total: raipii sanitize, your LLM, raipii restore.

python
import raipii, openai

ps = raipii.Raipii(api_key="ps_live_...")
oai = openai.OpenAI()

prompt = "Help John Smith (john@acme.com, SSN 392-45-7810) with his claim."

# 1. Sanitize — strip PII before sending to your LLM
result = ps.sanitize(prompt, mode="fake_substitute")
# → "Help Michael Torres (m.torres@email.net, SSN 847-23-1956)..."

# 2. Call your LLM with clean text
reply = oai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": result.sanitized_text}],
).choices[0].message.content

# 3. Restore original values in the response
final = ps.restore(reply, result.session_id)
print(final.restored_text)  # John Smith's real data is back

POST /v1/sanitize

Detects PII in text and replaces it according to the chosen mode. Returns a session_id for later restoration.

Request body

Field | Type | Required | Description
text | string | required | The text to sanitize. Max 100,000 characters.
mode | string | optional | token (default), fake_substitute, or redact
entities | string[] | optional | Limit detection to specific entity types. Detects all if omitted.
session_ttl | integer | optional | Session expiry in seconds. Default 3600 (1 hr). Max 86400.
conversation_id | string | optional | Link to a multi-turn conversation session for consistent substitutions.
confidence_threshold | float | optional | Override the detection confidence threshold (0.0–1.0). Default 0.85.
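
The fields listed above could be assembled and checked client-side before a request is sent. This is a minimal sketch: the function name `sanitize_payload` and the decision to validate locally are assumptions, and the API may apply further checks of its own.

```python
VALID_MODES = {"token", "fake_substitute", "redact"}
MAX_TEXT_CHARS = 100_000  # limit stated for the text field


def sanitize_payload(text, mode="token", entities=None,
                     session_ttl=None, conversation_id=None,
                     confidence_threshold=None):
    """Return a request body for POST /v1/sanitize, omitting unset optional fields."""
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError("text exceeds 100,000 characters")
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if confidence_threshold is not None and not 0.0 <= confidence_threshold <= 1.0:
        raise ValueError("confidence_threshold must be between 0.0 and 1.0")
    body = {"text": text, "mode": mode}
    optional = {
        "entities": entities,
        "session_ttl": session_ttl,
        "conversation_id": conversation_id,
        "confidence_threshold": confidence_threshold,
    }
    # Only send optional fields the caller actually set.
    body.update({k: v for k, v in optional.items() if v is not None})
    return body


payload = sanitize_payload(
    "Call John Smith at john@acme.com",
    mode="fake_substitute",
    entities=["PERSON", "EMAIL"],
)
```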

Response

json
{
  "session_id": "ps_sess_abc123...",
  "conversation_id": null,
  "sanitized_text": "Call [PERSON_1] at [EMAIL_1]",
  "entities_found": [
    {
      "type": "PERSON",
      "original": "John Smith",
      "replacement": "[PERSON_1]",
      "position": [5, 15],
      "confidence": 0.99
    },
    {
      "type": "EMAIL",
      "original": "john@acme.com",
      "replacement": "[EMAIL_1]",
      "position": [19, 32],
      "confidence": 1.0
    }
  ],
  "char_count": 32,
  "usage": { "chars_billed": 32 }
}

Response fields

Field | Type | Description
session_id | string | Pass to /v1/restore to reverse substitutions.
sanitized_text | string | Text with PII replaced.
entities_found | array | Each detected entity: type, original, replacement, position, confidence.
char_count | integer | Characters in the input text.
usage.chars_billed | integer | Characters billed against your monthly limit.
conversation_id | string or null | Echo of the conversation_id passed in, or null.

Store the session_id immediately: sessions expire after session_ttl seconds (default 1 hr). Calling /v1/restore after expiry returns 404.
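
To make the entities_found shape concrete, the sketch below parses the sample response from this section and checks each position against the input. It assumes position is a [start, end) character offset into the original text, which is what the sample values suggest but is not stated explicitly; `spans_match` is a hypothetical helper.

```python
import json

original_text = "Call John Smith at john@acme.com"

# Sample response body copied from this page (trimmed to the fields used here).
response = json.loads("""
{
  "session_id": "ps_sess_abc123...",
  "sanitized_text": "Call [PERSON_1] at [EMAIL_1]",
  "entities_found": [
    {"type": "PERSON", "original": "John Smith",
     "replacement": "[PERSON_1]", "position": [5, 15], "confidence": 0.99},
    {"type": "EMAIL", "original": "john@acme.com",
     "replacement": "[EMAIL_1]", "position": [19, 32], "confidence": 1.0}
  ],
  "char_count": 32
}
""")


def spans_match(text: str, entities: list) -> bool:
    """Check that each position slices the input to the entity's original value."""
    return all(
        text[e["position"][0]:e["position"][1]] == e["original"]
        for e in entities
    )


types_found = [e["type"] for e in response["entities_found"]]
ok = spans_match(original_text, response["entities_found"])
```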

POST /v1/restore

Reverses the substitutions made by /v1/sanitize, replacing tokens or synthetic values in the LLM response with their original PII.

Request body

Field | Type | Required | Description
text | string | required | The LLM response text containing tokens or synthetic values.
session_id | string | required | The session_id returned by the corresponding sanitize call.

Response

json
{
  "restored_text": "Call John Smith at john@acme.com",
  "substitutions_reversed": 2,
  "usage": { "chars_billed": 32 }
}

redact mode sessions have nothing to restore because the original values were discarded. Calling restore on a redact session returns the text unchanged.
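
Conceptually, restore is a reverse lookup from each replacement back to its original value. The sketch below illustrates that idea locally with a plain dictionary. It is not raipii's implementation: the service presumably keeps the mapping under the session_id, and `restore_locally` is a hypothetical name.

```python
def restore_locally(text: str, mapping: dict) -> str:
    """Replace each placeholder or synthetic value with its original.

    Longer replacements are applied first so that a short synthetic value
    contained in a longer one cannot clobber it.
    """
    for replacement in sorted(mapping, key=len, reverse=True):
        text = text.replace(replacement, mapping[replacement])
    return text


mapping = {"[PERSON_1]": "John Smith", "[EMAIL_1]": "john@acme.com"}
llm_reply = "I emailed [PERSON_1] at [EMAIL_1]."

restored = restore_locally(llm_reply, mapping)
# Count occurrences in the reply, analogous to substitutions_reversed.
reversed_count = sum(llm_reply.count(r) for r in mapping)
```

With an empty mapping, as in a redact session, the text comes back unchanged, which matches the behavior described above.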

POST /v1/detect

Scans text for PII and returns detected entities with types, positions, and confidence scores. Does not modify the text. Useful for auditing and risk assessment.

Request body

Field | Type | Required | Description
text | string | required | Text to scan for PII.
entities | string[] | optional | Limit detection to specific entity types.
confidence_threshold | float | optional | Override the detection confidence threshold.

Response

json
{
  "entities_found": [
    {
      "type": "US_SSN",
      "value": "392-45-7810",
      "confidence": 1.0,
      "position": [10, 21],
      "detection_method": "structured"
    }
  ],
  "pii_detected": true,
  "risk_level": "HIGH",
  "usage": { "chars_billed": 21 }
}

Risk levels

Level | Triggered by
HIGH | SSN, credit card, medical record number, bank account, tax ID
MEDIUM | Person name, email, date of birth, address
LOW | Any other detected entity type
NONE | No PII detected
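
The risk levels above can be read as a simple precedence rule. The sketch below is one interpretation: mapping the listed categories to entity type identifiers (for example "SSN" to US_SSN) and letting the highest level win when several types are present are both assumptions.

```python
# Entity type identifiers assumed to correspond to the categories listed above.
HIGH = {"US_SSN", "CREDIT_CARD", "MEDICAL_RECORD_NUMBER", "BANK_ACCOUNT", "TAX_ID"}
MEDIUM = {"PERSON", "EMAIL", "DATE_OF_BIRTH", "ADDRESS"}


def risk_level(entity_types) -> str:
    """Return the highest risk level triggered by any detected entity type."""
    types = set(entity_types)
    if not types:
        return "NONE"
    if types & HIGH:
        return "HIGH"
    if types & MEDIUM:
        return "MEDIUM"
    return "LOW"


level = risk_level(["EMAIL", "US_SSN"])
```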

POST /v1/conversations

Creates a multi-turn conversation session. Pass the returned conversation_id to /v1/sanitize calls so the same real entity always maps to the same synthetic value across all turns.

Request body

Field | Type | Required | Description
ttl | integer | optional | Session lifetime in seconds. Default 86400 (24 hr).
metadata | object | optional | Arbitrary key-value pairs stored with the conversation.

Response

json
{
  "conversation_id": "ps_conv_xyz...",
  "expires_at": "2026-04-11T12:00:00"
}

Sanitize modes

token (default)

Replaces each entity with a labelled placeholder. The LLM sees the type but not the value. Fully reversible.

Input

Call John Smith at john@acme.com

Output

Call [PERSON_1] at [EMAIL_1]

fake_substitute (best quality)

Replaces each entity with a realistic synthetic value. The LLM sees natural data and produces higher-quality output. Fully reversible.

Input

Call John Smith at john@acme.com

Output

Call Michael Torres at m.torres@email.net

redact (one-way)

Replaces each entity with [REDACTED]. No restore possible — use when the LLM response must never reference PII.

Input

Call John Smith at john@acme.com

Output

Call [REDACTED] at [REDACTED]
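
The three modes differ only in what is written over each detected span. The sketch below reproduces the Input/Output pairs above locally. It is an illustration, not the service's code: `apply_mode` is hypothetical, the synthetic values are supplied by the caller, and the per-type numbering scheme is inferred from the `[PERSON_1]` and `[EMAIL_1]` examples.

```python
def apply_mode(text, entities, mode, fakes=None):
    """Replace each (start, end, type) span according to the chosen mode."""
    counters = {}
    out, cursor = [], 0
    for start, end, etype in sorted(entities):
        out.append(text[cursor:start])  # keep the text between entities
        if mode == "token":
            counters[etype] = counters.get(etype, 0) + 1
            out.append(f"[{etype}_{counters[etype]}]")
        elif mode == "redact":
            out.append("[REDACTED]")
        elif mode == "fake_substitute":
            out.append(fakes[text[start:end]])  # caller-supplied synthetic value
        else:
            raise ValueError(f"unknown mode: {mode!r}")
        cursor = end
    out.append(text[cursor:])
    return "".join(out)


text = "Call John Smith at john@acme.com"
entities = [(5, 15, "PERSON"), (19, 32, "EMAIL")]
fakes = {"John Smith": "Michael Torres", "john@acme.com": "m.torres@email.net"}

token_out = apply_mode(text, entities, "token")
fake_out = apply_mode(text, entities, "fake_substitute", fakes)
redact_out = apply_mode(text, entities, "redact")
```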

Label neutralization

After substituting values, raipii also rewrites sensitive context phrases. Labels such as "SSN" can trigger LLM safety refusals even when the actual values have been replaced, so they are swapped for neutral wording.

Original phrase | Rewritten as
SSN / social security number | ID number
credit card number | account number
date of birth / DOB | date
bank account | account reference
passport number | document number
driver's license | document number
tax ID / EIN | reference number
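
A phrase rewrite like the one listed above can be approximated with whole-word regular expressions. The phrase pairs come from this page; the matching rules (case-insensitive, whole words only) and the name `neutralize_labels` are assumptions, and the service's actual rules may differ.

```python
import re

# Phrase pairs taken from this page; the matching rules are assumptions.
REWRITES = [
    (r"social security number|SSN", "ID number"),
    (r"credit card number", "account number"),
    (r"date of birth|DOB", "date"),
    (r"bank account", "account reference"),
    (r"passport number", "document number"),
    (r"driver's license", "document number"),
    (r"tax ID|EIN", "reference number"),
]


def neutralize_labels(text: str) -> str:
    """Rewrite sensitive context phrases, matching whole words case-insensitively."""
    for pattern, neutral in REWRITES:
        text = re.sub(rf"\b(?:{pattern})\b", neutral, text, flags=re.IGNORECASE)
    return text


neutral = neutralize_labels("Customer SSN is 847-23-1956 and DOB is 01/02/1990")
```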

Entity types

Supported entity types. Pass any of these in the entities array to limit detection.

Type | Example | Tier
PERSON | John Smith | All tiers
EMAIL | john@acme.com | All tiers
PHONE | 555-867-5309 | All tiers
US_SSN | 392-45-7810 | All tiers
DATE_OF_BIRTH | 03/14/1985 | All tiers
ADDRESS | 742 Evergreen Terrace, Springfield IL | All tiers
IP_ADDRESS | 192.168.1.1 | All tiers
MEDICAL_RECORD_NUMBER | MRN 00123456 | All tiers
TAX_ID | 12-3456789 | All tiers
IBAN | GB29 NWBK 6016 1331 9268 19 | All tiers
JWT | eyJhbGci... | All tiers
AWS_KEY | AKIA... | All tiers
CREDIT_CARD | 4111 1111 1111 1111 | Growth+
BANK_ACCOUNT | 123456789012 | Growth+
PASSPORT | A12345678 | Growth+
DRIVERS_LICENSE | A1234567 | Growth+
NPI | NPI 1234567890 | Growth+

All tiers detect structured PII (SSNs, emails, phones, addresses, dates) using regex and local NLP, with no external API calls. Growth and Business tiers add AWS Comprehend for higher recall on names and unstructured free-form entities, plus extended financial and identity document types.

Multi-turn conversations

Without a conversation session, each /v1/sanitize call generates independent substitutions. The same real name may map to different synthetic values across turns.

Create a conversation session once and pass its ID to all sanitize calls. raipii ensures the same entity always maps to the same synthetic value for the lifetime of the conversation.

python
conv = ps.conversations.create(ttl=3600)

# Turn 1 — "John Smith" → "Michael Torres"
turn1 = ps.sanitize(
    "My name is John Smith.",
    mode="fake_substitute",
    conversation_id=conv.conversation_id,
)

# Turn 2 — "John Smith" → same "Michael Torres" from turn 1
turn2 = ps.sanitize(
    "Tell me more about John Smith.",
    mode="fake_substitute",
    conversation_id=conv.conversation_id,
)

HIPAA mode

HIPAA mode ensures no text is sent to any external service during detection. All analysis runs entirely within your AWS region using local engines only — no data ever leaves your region.

HIPAA mode is enabled by default on the Starter tier and available as a toggle for Business tier accounts. It reliably detects all structured PHI — SSNs, medical record numbers, dates of birth, addresses, contact information — as well as names and contextual entities via local NLP. No external cloud services are called.

raipii is HIPAA-compliant and can provide a Business Associate Agreement (BAA). Review our BAA template or contact us to execute a signed copy.

Errors

All errors return a JSON body with an error field.

json
{ "error": "Monthly character limit exceeded" }

Status | Meaning | How to handle
400 | Bad request: missing or invalid field | Check request body
401 | Invalid or missing API key | Check Authorization header
402 | Monthly character limit exceeded | Upgrade plan at raipii.com
403 | Feature not available on current tier, or session belongs to a different account | Upgrade to Growth or Business tier, or use the key for the account that owns the session
404 | Session not found or expired | Re-sanitize the original text
429 | Too many requests | Back off and retry (SDKs do this automatically)
503 | Service temporarily unavailable | Retry after a short delay (SDKs do this automatically)
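
When calling the HTTP API directly, it can help to turn a status code and error body into one structured exception. The sketch below is one way to do that. `RaipiiHTTPError` is a hypothetical class and not part of either SDK; the status meanings are taken from this page.

```python
import json

RETRYABLE = {429, 503}
ACTIONS = {
    400: "check request body",
    401: "check Authorization header",
    402: "upgrade plan",
    403: "upgrade tier, or check which account owns the session",
    404: "re-sanitize the original text",
    429: "back off and retry",
    503: "retry after a short delay",
}


class RaipiiHTTPError(Exception):
    """Hypothetical exception type for callers using the HTTP API directly."""

    def __init__(self, status: int, body: str):
        try:
            message = json.loads(body).get("error", body)
        except (ValueError, AttributeError):
            message = body  # body was not a JSON object
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message
        self.retryable = status in RETRYABLE
        self.action = ACTIONS.get(status, "inspect the response")


err = RaipiiHTTPError(402, '{"error": "Monthly character limit exceeded"}')
```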

Retry & timeouts

Both SDKs automatically retry on 429 and 503 with exponential backoff. If calling the HTTP API directly, implement your own backoff.

Attempt | Delay
1st retry | 1 second
2nd retry | 2 seconds
3rd retry (final) | 4 seconds

Default request timeout is 30 seconds. Configurable via SDK options.
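
For direct HTTP callers, the retry schedule above could be mirrored roughly as follows. This is a sketch under assumptions: `call_with_retry` is a hypothetical helper, `send` stands for whatever function performs your request and returns a (status, body) pair, and the SDKs' real retry logic may include details not shown here.

```python
import time

RETRY_DELAYS = (1, 2, 4)  # seconds, matching the 1st, 2nd, and 3rd retry
RETRYABLE_STATUSES = {429, 503}


def call_with_retry(send, sleep=time.sleep):
    """Call send() and retry on 429/503, sleeping 1 s, 2 s, then 4 s.

    send is any zero-argument callable returning (status, body).
    """
    status, body = send()
    for delay in RETRY_DELAYS:
        if status not in RETRYABLE_STATUSES:
            break
        sleep(delay)
        status, body = send()
    return status, body


# Usage with a stand-in transport that fails twice, then succeeds.
responses = iter([(503, ""), (429, ""), (200, "{}")])
slept = []
result = call_with_retry(lambda: next(responses), sleep=slept.append)
```

Passing `sleep` as a parameter keeps the helper testable without real delays.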

Caveats

Session expiry

Sessions expire after session_ttl seconds (default 3600 — 1 hr). Always call /v1/restore promptly after receiving the LLM response. Expired sessions return 404. Pass a larger session_ttl to extend — max 3600 (Starter), 86400 (Growth), 604800 (Business).
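
A client might cap the requested session_ttl at the tier maximum before sending it. The sketch below assumes lowercase tier names as dictionary keys and a hypothetical helper `effective_ttl`. Whether the API rejects or silently clamps an over-limit value is not stated on this page.

```python
# Per-tier maximums quoted in the paragraph above (seconds).
MAX_SESSION_TTL = {"starter": 3600, "growth": 86400, "business": 604800}
DEFAULT_SESSION_TTL = 3600


def effective_ttl(requested, tier):
    """Return the session_ttl to send, capped at the tier's maximum."""
    if requested is None:
        return DEFAULT_SESSION_TTL
    if requested <= 0:
        raise ValueError("session_ttl must be positive")
    return min(requested, MAX_SESSION_TTL[tier])


ttl = effective_ttl(7200, "starter")
```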

redact mode has no restore

When using redact mode, original values are not stored. Calling /v1/restore on a redact session returns the text unchanged. Use token or fake_substitute if you need to restore.

Characters billed

The /v1/sanitize, /v1/restore, and /v1/detect endpoints each bill by the character count of their input text. The monthly limit resets on the first of each calendar month. Free tier: 2M chars/month.
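
Because both sanitize and restore bill on their input, a full round trip costs roughly the prompt length plus the LLM reply length. The estimate below assumes a character is counted the way Python's `len` counts it, which this page does not specify; the helper names are hypothetical.

```python
FREE_TIER_MONTHLY_CHARS = 2_000_000  # free-tier limit stated above


def round_trip_chars(prompt: str, llm_reply: str) -> int:
    """Estimate characters billed for one sanitize + restore pair.

    Assumes each call bills len() of its own input text.
    """
    return len(prompt) + len(llm_reply)


def remaining_after(used: int, prompt: str, llm_reply: str) -> int:
    """Characters left in the monthly allowance after one more round trip."""
    return FREE_TIER_MONTHLY_CHARS - used - round_trip_chars(prompt, llm_reply)


cost = round_trip_chars("Call John Smith at john@acme.com", "Done.")
left = remaining_after(0, "Call John Smith at john@acme.com", "Done.")
```

For authoritative numbers, rely on the usage.chars_billed field returned with each response.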

Detection accuracy by tier

raipii runs a multi-layer detection pipeline on every request. Regex patterns fire first at 100% confidence for all structured PII. A local contextual NLP engine then runs fully within your region — offline and HIPAA-safe — to catch names, dates, and addresses in natural language. On Growth and Business tiers, an additional cloud NLP service provides the highest recall for unstructured free-form text.
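
The layering described above (regex first at full confidence, then NLP, then a confidence cutoff) can be sketched as below. The two patterns are simplified stand-ins, the NLP layer is a pluggable placeholder, and none of this is raipii's actual pipeline; only the output shape and the 0.85 default threshold come from this page.

```python
import re

# Illustrative patterns only; raipii's real patterns are not published here.
STRUCTURED_PATTERNS = {
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def detect_structured(text):
    """Layer 1: regex matches, reported at confidence 1.0."""
    found = []
    for etype, pattern in STRUCTURED_PATTERNS.items():
        for m in pattern.finditer(text):
            found.append({
                "type": etype,
                "value": m.group(),
                "confidence": 1.0,
                "position": [m.start(), m.end()],
                "detection_method": "structured",
            })
    return found


def detect(text, nlp_layer=lambda text: [], confidence_threshold=0.85):
    """Run the regex layer, then an optional NLP layer, then apply the threshold."""
    candidates = detect_structured(text) + nlp_layer(text)
    kept = [e for e in candidates if e["confidence"] >= confidence_threshold]
    return sorted(kept, key=lambda e: e["position"])


entities = detect("My SSN is 392-45-7810")
```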

LLM Proxy

The raipii proxy lets you add PII protection to any LLM call with a small client configuration change: point base_url at raipii and pass your raipii and LLM keys as headers. No raipii SDK is required, and the rest of your existing OpenAI, Anthropic, or Gemini code works unchanged.

raipii intercepts the request, sanitizes all PII in the message body, forwards the clean request to the real LLM with your key, then restores original values in the response before returning it to your app.

LLM Proxy requires Growth tier or above. Your LLM API key is passed in the X-LLM-API-Key header and is never logged or stored.

Providers: OpenAI, Anthropic, Gemini, Groq, Mistral, DeepSeek

Proxy quick start

OpenAI

python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.raipii.com/v1/proxy/openai",
    api_key="ignored",          # raipii handles auth
    default_headers={
        "Authorization": "Bearer ps_live_...",   # your raipii key
        "X-LLM-API-Key": "sk-...",               # your OpenAI key
    },
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Help John Smith (SSN 392-45-7810)"}],
)
# PII was never sent to OpenAI — response has original values restored

Anthropic

python
import anthropic

client = anthropic.Anthropic(
    base_url="https://api.raipii.com/v1/proxy/anthropic",
    api_key="ignored",
    default_headers={
        "Authorization": "Bearer ps_live_...",
        "X-LLM-API-Key": "sk-ant-...",
    },
)

message = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Help John Smith (SSN 392-45-7810)"}],
)

Node.js (OpenAI SDK)

typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.raipii.com/v1/proxy/openai",
  apiKey: "ignored",
  defaultHeaders: {
    Authorization: "Bearer ps_live_...",
    "X-LLM-API-Key": "sk-...",
  },
});

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Help John Smith (SSN 392-45-7810)" }],
});

Supported providers

All providers are accessed via https://api.raipii.com/v1/proxy/{provider}/...

Provider | Base URL | Compatible SDKs
openai | /v1/proxy/openai | openai-python, openai-node, LangChain
anthropic | /v1/proxy/anthropic | anthropic-python, anthropic-node
gemini | /v1/proxy/gemini | google-generativeai, LangChain
groq | /v1/proxy/groq | groq-python (OpenAI-compatible)
mistral | /v1/proxy/mistral | mistralai SDK (OpenAI-compatible)
deepseek | /v1/proxy/deepseek | openai SDK with DeepSeek base_url

Streaming (stream=True) is not yet supported on the proxy. Use the standard sanitize → LLM → restore flow for streaming use cases.
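
Since every provider follows the same URL pattern and header pair, the client settings can be generated from the provider name. The sketch below assumes a hypothetical helper `proxy_config`; the URL pattern and header names are the ones used in the proxy examples on this page.

```python
PROXY_ROOT = "https://api.raipii.com/v1/proxy"
PROVIDERS = {"openai", "anthropic", "gemini", "groq", "mistral", "deepseek"}


def proxy_config(provider: str, raipii_key: str, llm_key: str) -> dict:
    """Return the base URL and headers to hand to an LLM client."""
    if provider not in PROVIDERS:
        raise ValueError(f"unsupported provider: {provider!r}")
    return {
        "base_url": f"{PROXY_ROOT}/{provider}",
        "default_headers": {
            "Authorization": f"Bearer {raipii_key}",  # your raipii key
            "X-LLM-API-Key": llm_key,                 # your provider key
        },
    }


cfg = proxy_config("groq", "ps_live_...", "your-provider-key")
```

With the OpenAI Python SDK, the result could then be passed as `OpenAI(api_key="ignored", **cfg)`, mirroring the OpenAI example in the proxy quick start.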

Python SDK

bash
pip install raipii

python
import raipii

ps = raipii.Raipii(api_key="ps_live_...")  # or RAIPII_API_KEY env var

result   = ps.sanitize("John Smith, john@acme.com", mode="fake_substitute")
restored = ps.restore(llm_response, result.session_id)  # llm_response: text returned by your LLM
detected = ps.detect("SSN 392-45-7810")
conv     = ps.conversations.create(ttl=3600)

Full docs and options in the PyPI README.

Node.js / TypeScript SDK

bash
npm install raipii

ts
import { Raipii } from "raipii";

const ps = new Raipii({ apiKey: "ps_live_..." }); // or RAIPII_API_KEY env var

const result   = await ps.sanitize("John Smith, john@acme.com", { mode: "fake_substitute" });
const restored = await ps.restore(llmResponse, result.sessionId); // llmResponse: text returned by your LLM
const detected = await ps.detect("SSN 392-45-7810");
const conv     = await ps.conversations.create({ ttl: 3600 });

Zero runtime dependencies. Ships ESM + CJS with full TypeScript types.

HTTP (curl)

bash
# Sanitize
curl -X POST https://api.raipii.com/v1/sanitize \
  -H "Authorization: Bearer ps_live_..." \
  -H "Content-Type: application/json" \
  -d '{"text": "John Smith, john@acme.com", "mode": "fake_substitute"}'

# Restore
curl -X POST https://api.raipii.com/v1/restore \
  -H "Authorization: Bearer ps_live_..." \
  -H "Content-Type: application/json" \
  -d '{"text": "...", "session_id": "ps_sess_..."}'

# Detect
curl -X POST https://api.raipii.com/v1/detect \
  -H "Authorization: Bearer ps_live_..." \
  -H "Content-Type: application/json" \
  -d '{"text": "My SSN is 392-45-7810"}'
