# API Overview

> The Cuadra AI REST API provides endpoints for chat completions, model management, datasets, and usage tracking. Requests and responses are JSON, and every request is authenticated with a Bearer token.

## Base URL

```
https://api.cuadra.ai/v1
```

***

## Authentication

Include your access token in the `Authorization` header:

```http
Authorization: Bearer YOUR_TOKEN
```

| Method            | Use Case                                   |
| ----------------- | ------------------------------------------ |
| **JWT Sessions**  | Frontend apps (from Stytch B2B auth)       |
| **M2M OAuth 2.0** | Backend services (client credentials flow) |

See [Authentication](/api-reference/authentication) for setup details.

***

## Request Format

Request bodies are JSON. Set `Content-Type: application/json` on any request that sends a body:

<CodeGroup>
  ```bash curl
  curl -X POST https://api.cuadra.ai/v1/chats \
    -H "Authorization: Bearer YOUR_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{"modelId": "model_abc", "messages": [{"role": "user", "content": "Hello"}]}'
  ```

  ```python Python
  import httpx

  response = httpx.post(
      "https://api.cuadra.ai/v1/chats",
      headers={"Authorization": "Bearer YOUR_TOKEN"},
      json={"modelId": "model_abc", "messages": [{"role": "user", "content": "Hello"}]}
  )
  ```

  ```typescript Node.js
  const response = await fetch('https://api.cuadra.ai/v1/chats', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer YOUR_TOKEN',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ modelId: 'model_abc', messages: [{ role: 'user', content: 'Hello' }] })
  });
  ```
</CodeGroup>

### Idempotency

For POST requests, include an `Idempotency-Key` header so that a retried request does not create a duplicate. Send the same key on every retry of a given request:

```http
Idempotency-Key: unique-request-id-123
```
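
A minimal sketch of how a client might attach this header, assuming any unique string is accepted as a key (a random UUID is used here; the required key format is not specified on this page). The helper name `idempotent_headers` is hypothetical.

```python
import uuid
from typing import Dict, Optional


def idempotent_headers(token: str, key: Optional[str] = None) -> Dict[str, str]:
    """Build request headers that carry an Idempotency-Key.

    Assumption: any unique string is a valid key, so a random UUID is
    generated when the caller does not supply one.
    """
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        "Idempotency-Key": key or str(uuid.uuid4()),
    }


# Build the headers once per logical request, then reuse the same dict
# for every retry so the key stays identical across attempts.
headers = idempotent_headers("YOUR_TOKEN")
```

Generating the key outside the retry loop is the important part: a fresh key per attempt would make each retry look like a new request.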

***

## Response Format

### Success

```json
{
  "id": "chat_xyz789",
  "message": {
    "role": "assistant",
    "content": "Hello! How can I help?"
  },
  "usage": {
    "inputTokens": 15,
    "outputTokens": 8,
    "totalTokens": 23
  }
}
```
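
As a sketch of reading this payload in Python, the snippet below extracts the assistant text and the total token count. It parses a copy of the sample body from a string instead of calling the API, and the function name `summarize_chat` is illustrative only.

```python
import json
from typing import Tuple

# Copy of the sample success body, used here in place of a live response.
SAMPLE = """
{
  "id": "chat_xyz789",
  "message": {"role": "assistant", "content": "Hello! How can I help?"},
  "usage": {"inputTokens": 15, "outputTokens": 8, "totalTokens": 23}
}
"""


def summarize_chat(body: dict) -> Tuple[str, int]:
    """Return (assistant text, total tokens) from a chat response body."""
    text = body["message"]["content"]
    total = body["usage"]["totalTokens"]
    return text, total


text, total = summarize_chat(json.loads(SAMPLE))
print(text)   # Hello! How can I help?
print(total)  # 23
```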

### Error (RFC 7807)

```json
{
  "type": "about:blank",
  "title": "Unauthorized",
  "status": 401,
  "detail": "Invalid or expired token."
}
```
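
One way to surface these errors in client code is to convert the body into an exception. This is a sketch only: the names `ApiProblem` and `raise_for_problem` are hypothetical, and it assumes just the four fields shown in the example, each read defensively in case one is absent.

```python
class ApiProblem(Exception):
    """Exception carrying the problem-details fields from an error body."""

    def __init__(self, status: int, title: str, detail: str = "",
                 type_: str = "about:blank") -> None:
        super().__init__(f"{status} {title}: {detail}")
        self.status = status
        self.title = title
        self.detail = detail
        self.type = type_


def raise_for_problem(status_code: int, body: dict) -> None:
    """Raise ApiProblem for 4xx/5xx status codes; do nothing otherwise."""
    if status_code < 400:
        return
    raise ApiProblem(
        status=body.get("status", status_code),
        title=body.get("title", "Error"),
        detail=body.get("detail", ""),
        type_=body.get("type", "about:blank"),
    )
```

A caller would pass the HTTP status code and the decoded JSON body, then catch `ApiProblem` and branch on `status`.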

***

## Rate Limits

| Scope            | Limit               |
| ---------------- | ------------------- |
| Per organization | 300 requests/minute |
| Per user         | 60 requests/minute  |

Rate-limited requests return HTTP 429 with a `Retry-After` header.
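
The sketch below shows one way to turn that header into a wait time. Under the HTTP specification, `Retry-After` may be either a number of seconds or an HTTP date; which form this API sends is not stated here, so the helper (hypothetical name `retry_after_seconds`) handles both and falls back to a default.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(value: Optional[str], default: float = 1.0,
                        now: Optional[datetime] = None) -> float:
    """Convert a Retry-After header value into seconds to wait."""
    if not value:
        return default
    value = value.strip()
    if value.isdigit():
        # Delay-seconds form, e.g. "30".
        return float(value)
    try:
        # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means the caller may retry immediately.
    return max(0.0, (when - now).total_seconds())
```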

***

## Pagination

List endpoints use cursor-based pagination:

```http
GET /v1/models?limit=20
```

```json
{
  "data": [...],
  "nextCursor": "cursor_abc123",
  "hasMore": true
}
```
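
To walk every page, a client keeps requesting until `hasMore` is false, passing `nextCursor` back on each call. The sketch below assumes the cursor is sent as a `cursor` query parameter, which this page does not confirm; check the endpoint reference for the actual name. `fetch_page` is a hypothetical callable standing in for the HTTP request, so the loop can be shown without network access.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_items(fetch_page: Callable[[Dict[str, object]], dict],
               limit: int = 20) -> Iterator[object]:
    """Yield every item from a cursor-paginated list endpoint."""
    cursor: Optional[str] = None
    while True:
        params: Dict[str, object] = {"limit": limit}
        if cursor is not None:
            params["cursor"] = cursor  # assumed parameter name
        page = fetch_page(params)
        yield from page["data"]
        if not page.get("hasMore"):
            break
        cursor = page.get("nextCursor")
        if cursor is None:
            # Defensive stop: hasMore without a cursor would loop forever.
            break
```

With `httpx`, `fetch_page` could be as small as `lambda params: httpx.get(url, headers=headers, params=params).json()`, where `url` and `headers` are supplied by the caller.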

***

## Endpoints

| Endpoint            | Description            |
| ------------------- | ---------------------- |
| `POST /v1/chats`    | Create chat completion |
| `GET /v1/models`    | List models            |
| `POST /v1/models`   | Create model           |
| `GET /v1/datasets`  | List datasets          |
| `POST /v1/datasets` | Create dataset         |
| `GET /v1/usage`     | Get usage metrics      |

<CardGroup cols="2">
  <Card title="Chat API" icon="comments" href="/api-reference/chat">
    Completions with streaming
  </Card>

  <Card title="Authentication" icon="lock" href="/api-reference/authentication">
    JWT and M2M setup
  </Card>
</CardGroup>

***

## FAQ

### Is the API RESTful?

Yes. The Cuadra AI API follows REST conventions with resource-based URLs, standard HTTP methods (GET, POST, PATCH, DELETE), and JSON payloads.

### What's the latency?

Latency depends on the LLM provider and the response length. Typical first-token latency is 200-500 ms. Set `stream: true` to receive output as it is generated, which makes responses feel faster.

### Is there a sandbox environment?

No, there is no separate sandbox environment. Use the Free plan for testing.

### How do I handle rate limits?

Retry with exponential backoff, and honor the `Retry-After` header on 429 responses. See [Errors](/api-reference/errors) for retry logic examples.
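
A minimal sketch of such a backoff policy, with illustrative defaults (base delay 0.5 s, cap 30 s, full jitter) that are not values published by Cuadra AI. `backoff_delay` is a hypothetical helper: it uses the server's `Retry-After` value when one is available in seconds, and otherwise grows the delay exponentially with the attempt number.

```python
import random
from typing import Optional


def backoff_delay(attempt: int, retry_after: Optional[float] = None,
                  base: float = 0.5, cap: float = 30.0) -> float:
    """Seconds to sleep before retry number `attempt` (0-based)."""
    if retry_after is not None:
        # The server's hint takes priority over the local schedule.
        return max(0.0, float(retry_after))
    # Exponential growth, capped, with full jitter so that many
    # clients retrying at once do not synchronize.
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(0, ceiling)
```

A retry loop would call `time.sleep(backoff_delay(attempt, retry_after))` between attempts and give up after a fixed number of tries.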
