Overview
When you upload a PDF or DOCX file, Raycaster Doc automatically parses it into structured Markdown in the background. This parsed content powers AI features like chat document reading, semantic search, and review analysis, without you having to do anything.
How Parsing Works
Upload triggers parsing
When a file is uploaded or replaced, a SHA-256 hash of the source file is computed.
Cache check
If parsed content already exists for this exact file hash, parsing is skipped entirely (cache hit).
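
The hash-then-check flow described above can be sketched as follows. This is a minimal illustration, not Raycaster Doc's implementation: the in-memory dictionary, the `get_or_parse` helper, and the `parse` callback are all hypothetical names, and a real service would presumably back the cache with a database or object store.

```python
import hashlib
from pathlib import Path
from typing import Callable, Dict

# Hypothetical cache: SHA-256 hex digest of the source file -> parsed Markdown.
_parsed_cache: Dict[str, str] = {}


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in fixed-size blocks so large uploads are not loaded at once."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def get_or_parse(path: Path, parse: Callable[[Path], str]) -> str:
    """Return cached Markdown on a hash hit; otherwise parse once and store it."""
    key = sha256_of_file(path)
    if key in _parsed_cache:
        return _parsed_cache[key]  # cache hit: parsing is skipped entirely
    markdown = parse(path)  # cache miss: run the (assumed) parser
    _parsed_cache[key] = markdown
    return markdown
```

Because the key depends only on file bytes, two files with identical content map to the same entry regardless of filename or uploader.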
Parsing by File Type
| Format | Parser | Output |
|---|---|---|
| PDF | Mistral OCR | Per-page Markdown + extracted images |
| DOCX | Reducto | Per-page Markdown + media |
| DOC | Reducto (legacy) | Per-page Markdown |
| Markdown / Plaintext | Native (no parsing needed) | Indexed directly |
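
As a rough illustration of routing by file type, the sketch below maps a filename extension to a parser label. The extension strings, the label values, and the `choose_parser` function are assumptions made for this example; only the format-to-parser pairing comes from this page.

```python
from pathlib import Path

# Illustrative mapping only; not Raycaster Doc's actual configuration.
PARSER_BY_EXTENSION = {
    ".pdf": "mistral-ocr",
    ".docx": "reducto",
    ".doc": "reducto-legacy",
    ".md": "native",   # indexed directly, no parsing step
    ".txt": "native",  # indexed directly, no parsing step
}


def choose_parser(filename: str) -> str:
    """Pick a parser label from the file extension (case-insensitive)."""
    ext = Path(filename).suffix.lower()
    if ext not in PARSER_BY_EXTENSION:
        raise ValueError(f"Unsupported file type: {filename!r}")
    return PARSER_BY_EXTENSION[ext]
```

For example, `choose_parser("Report.PDF")` would select the OCR path, while `choose_parser("notes.md")` would skip parsing.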
Cache Properties
- Hash-based — Identical files uploaded by different users reuse the same parsed output
- Idempotent — Re-uploading the same file doesn’t trigger redundant parsing
- Automatic cleanup — When an artifact is deleted or a project is removed, its cached content is cleaned up
Parse Status
Each artifact tracks its parsing state internally:
| Status | Meaning |
|---|---|
| `none` | No parsed content exists yet |
| `pending` | Parse job is queued or in progress |
| `ready` | Parsed content is available for AI features |
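
One way to model these states in client or service code is a small enum. The class name `ParseStatus` and the helper `can_use_for_ai` are hypothetical; only the three status values are taken from the table above.

```python
from enum import Enum


class ParseStatus(str, Enum):
    """The three parse states listed above, stored as their string values."""
    NONE = "none"        # no parsed content exists yet
    PENDING = "pending"  # parse job is queued or in progress
    READY = "ready"      # parsed content is available for AI features


def can_use_for_ai(status: ParseStatus) -> bool:
    """Only a 'ready' artifact has parsed content that AI features can read."""
    return status is ParseStatus.READY
```

A stored string converts back with `ParseStatus("pending")`, and an unknown value raises `ValueError`, which surfaces unexpected states early.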
What Uses Parsed Content
Parsed Markdown is consumed by several AI features behind the scenes:
- Chat `view` tool: When the agent reads a document in text mode, it uses parsed Markdown for PDFs and DOCX files
- Semantic search: Parsed content is chunked, embedded, and indexed in the vector database
- Review runs: The review agent reads parsed content to analyze documents
You never interact with parsed files directly. The document viewer always shows the original source file. Parsing is a backend optimization that makes AI features fast and accurate.
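
To make the "chunked" step of semantic search more concrete, here is a minimal sketch of one common approach: fixed-size word windows with overlap. The chunking strategy Raycaster Doc actually uses is not described on this page, so the function name `chunk_text` and the `max_words` and `overlap` defaults are assumptions for illustration only.

```python
from typing import List


def chunk_text(text: str, max_words: int = 200, overlap: int = 40) -> List[str]:
    """Split text into overlapping word windows (assumed strategy, not the product's).

    Note: splitting on whitespace discards line breaks, so Markdown structure
    such as tables is flattened inside each chunk.
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

Each chunk would then be passed to an embedding model and stored in the vector index. The overlap keeps a sentence that straddles a boundary retrievable from either neighboring chunk.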
Limitations
- PDF parsing supports files up to 25 MB
- Very large documents may take a few minutes to parse
- Complex layouts (multi-column, heavy tables) may have reduced parsing accuracy; use the `visual` mode in chat for layout-sensitive analysis
