# Datasets API

> Create knowledge bases, upload documents, and manage data sources for RAG-powered AI responses in Cuadra AI.

## Supported Formats

| Type | Extensions      | Max Size |
| ---- | --------------- | -------- |
| PDF  | `.pdf`          | 50 MB    |
| Word | `.docx`         | 50 MB    |
| Text | `.txt`, `.md`   | 50 MB    |
| Data | `.csv`, `.json` | 50 MB    |
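
Checking a file against these limits before uploading avoids a round trip that is bound to fail. The following is a minimal sketch, not part of the Cuadra API: `check_upload` is a hypothetical helper, and it assumes that "50 MB" means 50 × 1024 × 1024 bytes and that extensions are matched case-insensitively. The server may apply either rule differently.

```python
from pathlib import Path

# Extensions copied from the Supported Formats table.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt", ".md", ".csv", ".json"}
# Assumption: "50 MB" is 50 * 1024 * 1024 bytes; the API may count differently.
MAX_SIZE_BYTES = 50 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    # Assumption: extension matching is case-insensitive.
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append(f"file is {size_bytes} bytes, over the {MAX_SIZE_BYTES}-byte limit")
    return problems


print(check_upload("password-reset-guide.pdf", 2_000_000))  # []
print(check_upload("archive.zip", 60 * 1024 * 1024))  # two problems: extension and size
```

For a file on disk, `Path(path).stat().st_size` gives the value to pass as `size_bytes`.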

***

## Quick Start

### 1. Create a Dataset

<CodeGroup>
  ```bash curl theme={null}
  curl -X POST https://api.cuadra.ai/v1/datasets \
    -H "Authorization: Bearer YOUR_TOKEN" \
    -H "Content-Type: application/json" \
    -H "Idempotency-Key: create-ds-001" \
    -d '{"name": "Support KB", "description": "FAQs and guides"}'
  ```

  ```python Python theme={null}
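  import httpx

  # json= serializes the body and sets Content-Type: application/json.
  response = httpx.post(
      "https://api.cuadra.ai/v1/datasets",
      headers={"Authorization": "Bearer YOUR_TOKEN", "Idempotency-Key": "create-ds-001"},
      json={"name": "Support KB", "description": "FAQs and guides"},
  )
  response.raise_for_status()  # raise on 4xx/5xx responses
  dataset = response.json()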
  ```

  ```typescript Node.js theme={null}
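  // fetch does not serialize JSON automatically, so stringify the body.
  const response = await fetch('https://api.cuadra.ai/v1/datasets', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer YOUR_TOKEN',
      'Content-Type': 'application/json',
      'Idempotency-Key': 'create-ds-001'
    },
    body: JSON.stringify({ name: 'Support KB', description: 'FAQs and guides' })
  });
  // fetch resolves even on 4xx/5xx, so check the status explicitly.
  if (!response.ok) throw new Error(`Request failed: ${response.status}`);
  const dataset = await response.json();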
  ```
</CodeGroup>

### 2. Upload Documents

Use the [Files API](/api-reference/files) to upload documents, then associate them with the dataset.

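### 3. Link to a Model

Attach the dataset to a model so the model can draw on it for retrieval. Replace `model_abc` and `ds_xyz` with your own model and dataset IDs.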

```bash theme={null}
curl -X POST https://api.cuadra.ai/v1/models/model_abc/datasets \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"datasetId": "ds_xyz", "usageType": "rag"}'
```
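
The same request can be assembled in Python. This sketch only builds the arguments and sends nothing, which makes it easy to inspect or test. `build_link_request` and `API_BASE` are hypothetical names introduced for illustration; the URL, headers, and body mirror the curl command above.

```python
API_BASE = "https://api.cuadra.ai/v1"


def build_link_request(model_id: str, dataset_id: str, token: str) -> dict:
    """Build keyword arguments for httpx.post that link a dataset to a model."""
    return {
        "url": f"{API_BASE}/models/{model_id}/datasets",
        "headers": {"Authorization": f"Bearer {token}"},
        # Same body as the curl example; "rag" is the only usageType shown in this guide.
        "json": {"datasetId": dataset_id, "usageType": "rag"},
    }


request_kwargs = build_link_request("model_abc", "ds_xyz", "YOUR_TOKEN")
print(request_kwargs["url"])  # https://api.cuadra.ai/v1/models/model_abc/datasets
```

To send it, pass the result to httpx: `httpx.post(**request_kwargs)`.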

***

## Document Processing

Documents are processed asynchronously after upload:

| Status       | Description                             |
| ------------ | --------------------------------------- |
| `processing` | Chunking and embedding in progress      |
| `ready`      | Available for queries                   |
| `failed`     | Processing error (check file integrity) |

<Tip>
  Processing time depends on document size. PDFs with complex layouts take longer. Poll the file status or use webhooks to know when processing completes.
</Tip>
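
If you poll, back off between checks and stop at a deadline instead of looping forever. The sketch below shows one way to do that. It takes the status lookup as a callable, so it makes no assumption about the Files API's endpoint or response shape; `wait_until_processed`, its parameters, and the default timings are hypothetical, and only the three status values come from the table above.

```python
import time
from typing import Callable

# Terminal status values from the Document Processing table.
TERMINAL_STATUSES = {"ready", "failed"}


def wait_until_processed(
    get_status: Callable[[], str],
    timeout_seconds: float = 300.0,
    initial_delay: float = 1.0,
    max_delay: float = 30.0,
) -> str:
    """Poll get_status until it returns "ready" or "failed", backing off between calls."""
    deadline = time.monotonic() + timeout_seconds
    delay = initial_delay
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        # Give up if the next sleep would run past the deadline.
        if time.monotonic() + delay > deadline:
            raise TimeoutError(f"still {status!r} after {timeout_seconds} seconds")
        time.sleep(delay)
        delay = min(delay * 2, max_delay)  # exponential backoff, capped at max_delay


# Stand-in for a real status lookup: reports "processing" twice, then "ready".
fake_statuses = iter(["processing", "processing", "ready"])
print(wait_until_processed(lambda: next(fake_statuses), initial_delay=0.01))  # ready
```

In real use, `get_status` would wrap whatever Files API call returns the file's current status. A return value of `failed` should be handled explicitly, since retrying the poll will not change it.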

***

## Best Practices

### Organize by topic

Create separate datasets for different knowledge domains (e.g., "Product Docs", "Legal", "HR Policies"). This improves retrieval relevance and lets you control which knowledge each model can access.

### Keep documents focused

Prefer multiple focused documents over one large document. The chunking algorithm works best with well-structured content.

### Use descriptive filenames

Filenames appear in source citations. Use descriptive names like `password-reset-guide.pdf` instead of `doc123.pdf`.

***

## Related

<CardGroup cols={2}>
  <Card title="Knowledge Bases Guide" icon="book" href="/guides/knowledge-bases">
    Best practices for RAG
  </Card>

  <Card title="Files API" icon="file" href="/api-reference/files">
    Upload and manage documents
  </Card>
</CardGroup>