Supported Formats
| Type | Extensions | Max Size |
|---|---|---|
| PDF | .pdf | 50 MB |
| Word | .docx | 50 MB |
| Text | .txt, .md | 50 MB |
| Data | .csv, .json | 50 MB |
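A quick client-side check against the limits in the table can catch rejected files before they are uploaded. The sketch below is illustrative only: the function name `check_upload` is hypothetical, and it assumes that 50 MB means 50 × 1024 × 1024 bytes and that extensions are matched case-insensitively, neither of which is stated above.

```python
from pathlib import Path

# Limits copied from the table above.
# Assumption: "50 MB" means 50 * 1024 * 1024 bytes.
MAX_BYTES = 50 * 1024 * 1024
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt", ".md", ".csv", ".json"}


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    # Assumption: extensions are compared case-insensitively.
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes; limit is {MAX_BYTES}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue with a file in a single pass.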
Quick Start
1. Create a Dataset
2. Upload Documents
Use the Files API to upload documents, then associate them with the dataset.
3. Link to Model
Document Processing
Documents are processed asynchronously after upload:

| Status | Description |
|---|---|
| processing | Chunking and embedding in progress |
| ready | Available for queries |
| failed | Processing error (check file integrity) |
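Because processing is asynchronous, a client typically polls until the document reaches `ready` or `failed`. The following is a minimal sketch of such a loop, not the platform's own client: `wait_until_processed` and its `get_status` callback are hypothetical names, and the callback is assumed to return one of the three status strings from the table (how the status is actually fetched is left to the caller).

```python
import time
from typing import Callable


def wait_until_processed(
    get_status: Callable[[], str],  # hypothetical: returns the current status string
    timeout_s: float = 300.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the document is 'ready'; raise on 'failed' or on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status == "ready":
            return status
        if status == "failed":
            raise RuntimeError("processing failed; check file integrity")
        if status != "processing":
            raise ValueError(f"unexpected status: {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still processing after {timeout_s} seconds")
        sleep(interval_s)
```

The `sleep` parameter is injectable so the loop can be exercised in tests without real delays; the default timeout and interval are arbitrary placeholders, not documented values.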
Best Practices
Organize by topic
Create separate datasets for different knowledge domains (e.g., “Product Docs”, “Legal”, “HR Policies”). This improves retrieval relevance and lets you control which knowledge each model can access.
Keep documents focused
Prefer multiple focused documents over one large document. The chunking algorithm works best with well-structured content.
Use descriptive filenames
Filenames appear in source citations. Use descriptive names like password-reset-guide.pdf instead of doc123.pdf.
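One way to keep filenames descriptive is to derive them from each document's title before upload. The helper below is a small sketch under that assumption; `descriptive_filename` is a hypothetical name, and the lowercase, hyphen-separated style simply mirrors the password-reset-guide.pdf example rather than any documented requirement.

```python
import re


def descriptive_filename(title: str, extension: str) -> str:
    """Turn a human-readable title into a lowercase, hyphen-separated filename."""
    # Collapse every run of characters other than a-z and 0-9 into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    if not slug:
        raise ValueError("title has no usable characters")
    # Accept the extension with or without a leading dot.
    return f"{slug}.{extension.lstrip('.')}"


print(descriptive_filename("Password Reset Guide", "pdf"))  # password-reset-guide.pdf
```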
Related
Knowledge Bases Guide: Best practices for RAG
Files API: Upload and manage documents