Your institutional knowledge doesn’t live in a database. It lives in PDFs, Word docs, policy binders, field reports, and shared drives. Delphi treats documents as first-class data sources alongside your connectors and KPIs, so every answer can be grounded in your own material. Retrieval is semantic, not keyword-based — you can ask questions in the language your team actually uses and Delphi will find the passages that matter.

Add a document
Open the Data tab on any command center and upload a file, or promote a Google Drive file directly from chat by asking Delphi something like “add this doc as a data source.” PDFs, Word documents, spreadsheets, plain text, and Google Docs are all supported, up to 30 MB per file.
Files uploaded through the Data tab start indexing automatically as soon as the upload completes. Drive files are promoted through chat because Delphi needs your own Google OAuth to fetch them the first time. Once a Drive file is indexed, any teammate with access to the command center can ask questions about it without needing their own Drive connection.
You’ll see a live status pill next to each file: pending, processing, indexed, or failed. If a document stays in pending or ends up in failed, the reindex button on the Documents list re-runs the whole pipeline.
How indexing works
When a document is added, Delphi extracts its text, breaks it into overlapping passages, and builds a semantic index so the meaning of each passage can be searched — not just the exact words. This is the RAG (retrieval-augmented generation) pattern: when you ask a question, Delphi finds the most relevant passages first, then uses them as grounded context for the answer.
Because the index is semantic, a question about “staff turnover” will surface a passage that talks about “attrition” or “people leaving the program.” You don’t need to match the document’s exact phrasing.
Every command center has its own private index. Document content never leaks across command centers or tenants.
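As a rough illustration of the “overlapping passages” step described in this section, here is a minimal Python sketch. It is not Delphi’s actual implementation: the function name `split_into_passages`, the word-based windows, and the `size` and `overlap` defaults are all assumptions made for the example. The embedding and search steps are left out.

```python
def split_into_passages(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into windows of `size` words; each window repeats the
    last `overlap` words of the previous one (hypothetical parameters)."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap  # how far each new window advances
    passages = []
    for start in range(0, len(words), step):
        passages.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return passages


if __name__ == "__main__":
    sample = " ".join(f"w{i}" for i in range(12))
    for passage in split_into_passages(sample, size=5, overlap=2):
        print(passage)
```

The overlap means a sentence that straddles a window boundary still appears whole in at least one passage, which is the usual reason for splitting this way.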
Ask Delphi about your documents
Open the chat panel on any command center and ask in plain language. Two patterns work well:
- Summarize a specific document. Ask “summarize the Q3 safeguarding report” and Delphi will read the full document and give you a structured summary.
- Search across everything. Ask “what do our policies say about conflict-of-interest disclosures?” and Delphi will retrieve the most relevant passages from every indexed document and answer with citations.
See Chatting with Delphi for more on how the agent picks tools and cites sources. If you want a formal written deliverable rather than a quick answer, ask for a “report” or “brief” explicitly — otherwise Delphi will answer conversationally and keep the chat fast.
Authority levels
Not every document deserves the same weight. Delphi lets you tag each one with an authority level so the agent knows how much to trust it:
- Canonical — the official source of truth. Signed policies, board-approved strategies, published financial statements. Delphi will prefer these over anything else when they conflict.
- Reference — a credible secondary source. Vendor reports, partner briefings, peer-reviewed research.
- Field report — first-hand observations from the ground. Useful and often urgent, but unverified.
- Unverified — unknown provenance. The default for anything uploaded without context.
Owners, admins, and editors can change a document’s authority from the Documents list. The agent reflects these labels in its answers — if a field report contradicts a canonical policy, it will say so explicitly rather than picking a side.
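To make the weighting idea concrete, here is a small Python sketch of how retrieved passages could be ordered by authority and how a contradiction could be reported rather than hidden. Everything in it is hypothetical: the `Passage` type, the `relevance` score, and both function names are invented for illustration. The text only states that canonical documents win conflicts, so the relative order of the other three levels is an assumption taken from the order of the list above.

```python
from dataclasses import dataclass
from enum import IntEnum


class Authority(IntEnum):
    # Higher value = more trusted. Only "canonical is preferred" comes from
    # the documentation; the order of the other three is an assumption.
    UNVERIFIED = 0
    FIELD_REPORT = 1
    REFERENCE = 2
    CANONICAL = 3


@dataclass(frozen=True)
class Passage:
    doc: str
    text: str
    authority: Authority
    relevance: float  # hypothetical similarity score


def order_for_answer(passages: list[Passage]) -> list[Passage]:
    """Sort by authority first, then by relevance within the same level."""
    return sorted(passages, key=lambda p: (p.authority, p.relevance), reverse=True)


def conflict_note(preferred: Passage, other: Passage) -> str:
    """Name both sources and their levels instead of silently dropping one."""
    def label(p: Passage) -> str:
        return p.authority.name.lower().replace("_", " ")
    return f"{other.doc} ({label(other)}) contradicts {preferred.doc} ({label(preferred)})"
```

Sorting on authority before relevance is one possible design. A real ranking could just as well blend the two signals into a single score.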
Privacy and PII scanning
Every document is scanned for sensitive information before it’s indexed. Delphi uses Google Cloud Data Loss Prevention to look for names, emails, phone numbers, addresses, government IDs, medical record numbers, and payment card data. Documents with high-sensitivity findings (social security numbers, medical records, or more than ten PII items) are blocked from indexing. Lower-risk findings are redacted automatically before the content is stored.
PII scanning is fail-closed: if the scanner can’t run for any reason, the document won’t be indexed. No unscanned content ever makes it into the searchable index.
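The block, redact, and fail-closed rules above can be sketched as a single decision function. This is an illustrative Python sketch, not Delphi’s code and not the Google Cloud DLP API: the `scan` callable stands in for whatever scanner is used, and the finding labels such as `"SSN"` and `"MEDICAL_RECORD"` are placeholders rather than real DLP infoType names.

```python
from typing import Callable, Iterable

# Placeholder labels for the high-sensitivity categories named in the text.
HIGH_SENSITIVITY = {"SSN", "MEDICAL_RECORD"}
MAX_PII_ITEMS = 10  # "more than ten PII items" blocks the document


def indexing_decision(text: str, scan: Callable[[str], Iterable[str]]) -> str:
    """Return "block", "redact", or "index" for one document.

    `scan` is a hypothetical scanner that yields one label per PII finding.
    """
    try:
        findings = list(scan(text))
    except Exception:
        # Fail closed: if the scanner cannot run, the document is not indexed.
        return "block"
    if any(f in HIGH_SENSITIVITY for f in findings) or len(findings) > MAX_PII_ITEMS:
        return "block"
    # Lower-risk findings are redacted before storage; clean documents pass.
    return "redact" if findings else "index"
```

The key design point is that the `except` branch returns the same result as a high-sensitivity finding, so a scanner outage can never let unscanned content through.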
Access to a document’s contents is governed by your command center’s role model. Each document inherits a classification (public, internal, confidential, or restricted), and Delphi only returns its content in a chat answer to users whose role is cleared for that tier. See Roles and access for the full matrix.
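The tier check can be pictured as a filter applied to retrieved passages before an answer is written. The Python sketch below is an assumption-laden illustration: the `ClassifiedPassage` type and `visible_passages` function are hypothetical, and the set of tiers a role is cleared for is passed in directly because the real role-to-tier mapping lives in the Roles and access matrix, not here.

```python
from dataclasses import dataclass

# The four classifications named in the text.
TIERS = ("public", "internal", "confidential", "restricted")


@dataclass(frozen=True)
class ClassifiedPassage:
    text: str
    classification: str


def visible_passages(
    passages: list[ClassifiedPassage], cleared_tiers: set[str]
) -> list[ClassifiedPassage]:
    """Keep only passages whose tier the caller's role is cleared for."""
    unknown = set(cleared_tiers) - set(TIERS)
    if unknown:
        raise ValueError(f"unknown tiers: {sorted(unknown)}")
    # A passage with an unrecognized classification can never match, so it is dropped.
    return [p for p in passages if p.classification in cleared_tiers]
```

For example, a role cleared for `{"public", "internal"}` would receive only passages carrying one of those two classifications.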