Data & Privacy

Shadow AI is designed to give you governance visibility without surveillance. It captures usage metadata only — what tools are used, by whom, how often, and with what file metadata — and is built so that prompt and response content is never read or stored.

The one-line summary: Shadow AI sees that an AI tool was used and by whom — never what was said to it or what it replied.

What Shadow AI collects

Usage metadata, attributed to a user and device:

Tools & sessions — which AI app / CLI / website was used, when, and how many interactions or sessions.
Counts — interaction counts, session counts, tool-call counts, and (for some coding agents) token counts reported by the tool itself.
Model identifiers — the model name a tool reports, when available.
File metadata — for files sent to AI tools: filename, size, MIME type, and how the file reached the tool (drag/drop, file picker, CLI read/write, MCP tool). Not contents.
Network activity (desktop agent) — the hostname of AI services contacted and byte counts, derived from the TLS SNI. No request or response bodies.
Device & user identity — hostname, OS version, hardware serial, app/extension versions, permission status, and the enrolled user's name / email (plus directory attributes if you connect a directory).
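
To make the categories above concrete, here is a minimal sketch of what a single file-metadata event might look like. The class name, field names, and sample values are hypothetical and chosen only for illustration; they are not Shadow AI's actual schema. The point is that every field describes the interaction, and none carries the file's contents.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class FileMetadataEvent:
    """Hypothetical event shape; field names are illustrative, not the real schema."""
    user_email: str
    device_hostname: str
    tool: str                      # which AI app / CLI / website was used
    transfer_method: str           # e.g. "drag_drop", "file_picker", "cli_read", "mcp_tool"
    filename: str
    size_bytes: int
    mime_type: str
    model: Optional[str] = None    # model identifier, only when the tool reports one


# Sample values are made up for the example.
event = FileMetadataEvent(
    user_email="jane@example.com",
    device_hostname="jane-laptop",
    tool="example-ai-chat",
    transfer_method="drag_drop",
    filename="q3-forecast.xlsx",
    size_bytes=48213,
    mime_type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
)

payload = asdict(event)  # plain dict of metadata; there is no field for file contents
```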

What Shadow AI never collects


The following are never read or stored. They are excluded at the source, before anything is sent:

Prompt text or response text, on any surface.
Message / conversation content.
Request bodies or response bodies.
File contents — only the metadata above.
Page content / DOM text / form input from websites.
Cookies, session storage, or credentials from pages.

How that's enforced

Privacy isn't just policy — it's enforced in code at multiple layers:

Allowlist-only collection. Clients read only a declared list of structural fields (host, path, filename, size, type). Content-bearing fields are out of scope by construction.
Banned-key guard. Before any event is sent, a guard scans it and drops the event if it contains content-bearing keys (e.g. promptText, responseText, messageContent, requestBody, responseBody, tokensInput/tokensOutput) or any oversized string value. A dropped event is logged as a privacy violation and flagged on the next check-in.
Server-side validation. The ingest endpoint enforces strict schemas and rejects payloads carrying unknown or banned keys.
Registry validation. Detection rules in the AI-tool registry can't reference content-bearing fields — the registry is validated before it ships to clients.
Query-string stripping. The browser extension records host and path but strips query strings, so search terms entered into AI sites aren't captured.
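
The client-side layers above can be sketched roughly as follows. This is an illustrative approximation, not Shadow AI's implementation: the function names are hypothetical, the allowlist and banned keys are copied from the examples in the list, and the 512-character cutoff for an "oversized" string is an assumption, since the documentation does not state the threshold.

```python
from urllib.parse import urlsplit

# Structural fields named in the allowlist description above.
ALLOWED_FIELDS = {"host", "path", "filename", "size", "type"}

# Example content-bearing keys named in the banned-key guard description above.
BANNED_KEYS = {
    "promptText", "responseText", "messageContent",
    "requestBody", "responseBody", "tokensInput", "tokensOutput",
}

MAX_STRING_LENGTH = 512  # assumed threshold; the real limit is not documented here


def strip_query(url):
    """Keep only host and path; the query string and fragment are discarded."""
    parts = urlsplit(url)
    return {"host": parts.hostname or "", "path": parts.path}


def collect(raw):
    """Allowlist-only collection: copy declared fields, ignore everything else."""
    return {key: value for key, value in raw.items() if key in ALLOWED_FIELDS}


def violates_guard(value):
    """Return True if a banned key or an oversized string appears at any depth."""
    if isinstance(value, dict):
        return any(k in BANNED_KEYS or violates_guard(v) for k, v in value.items())
    if isinstance(value, list):
        return any(violates_guard(item) for item in value)
    if isinstance(value, str):
        return len(value) > MAX_STRING_LENGTH
    return False


def prepare(raw):
    """Build an outgoing event, or return None if the guard says to drop it."""
    event = collect(raw)
    if violates_guard(event):
        return None  # a real client would also record the privacy violation
    return event
```

In this sketch the guard runs after the allowlist, so it acts as a second check: a banned key should already have been filtered out, and the guard only fires if something unexpected slips through, such as an oversized string in an allowed field.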

Directory data

If you connect Directory Integration, Shadow AI enriches already-enrolled clients with department, job title, and manager from your IdP. The sync uses read-only credentials, never modifies your directory, and never creates or deletes users — it only annotates clients you've already enrolled. Credentials are encrypted at rest.

Authentication & keys

Clients authenticate with your project's public and secret keys. The secret key is shown in full only once and should be stored in your MDM's secure policy payloads. Keys can be rotated (with a 7-day grace window) or revoked at any time. See Deploying Agents & Extensions → Project keys.
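
As a rough illustration of the grace window, the check below assumes that the previous secret key keeps working for 7 days after rotation and is rejected from that point on. The function name and the exact boundary behavior are assumptions made for this sketch; the documentation above only states that rotation comes with a 7-day grace window.

```python
from datetime import datetime, timedelta, timezone

GRACE_WINDOW = timedelta(days=7)  # from the documentation above


def old_key_accepted(rotated_at, now):
    """Assumed behavior: the old key is valid until rotated_at + 7 days (exclusive)."""
    return now < rotated_at + GRACE_WINDOW


# Example timestamps are made up for the sketch.
rotated = datetime(2024, 1, 1, tzinfo=timezone.utc)
during_window = old_key_accepted(rotated, rotated + timedelta(days=6))
after_window = old_key_accepted(rotated, rotated + timedelta(days=7))
```

Under these assumptions, a rollout of the new key through MDM would need to reach every device within the window, since clients still holding the old key would stop authenticating afterwards.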

Auditability

Key reveal, rotation, and enrollment-script downloads are recorded in the audit log.
Each client reports a check-in with its health, versions, and permission status — visible under Shadow AI → Devices.
Privacy-guard violations are surfaced on device check-ins so you can detect and investigate any misbehaving client.

Next steps

AI Tools, Registry & Policy
Troubleshooting