# Chat Completions

## Request

### Parameters
| Field | Type | Required | Description |
|---|---|---|---|
| `model` | string | yes | Model ID (e.g., `qwen3.5:9b`) |
| `messages` | array | yes | Conversation messages |
| `max_tokens` | integer | no | Max tokens to generate (default: 2048) |
| `temperature` | float | no | Sampling temperature (default: 0.7) |
| `stream` | boolean | no | Enable SSE streaming (default: false) |
| `timeout_secs` | integer | no | Timeout in seconds (default: 120) |
| `response_format` | object | no | Structured output schema |
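Putting the parameters together, a request body might look like the following sketch (the model name is illustrative):

```json
{
  "model": "qwen3.5:9b",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "max_tokens": 2048,
  "temperature": 0.7,
  "stream": false
}
```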
### Message format

Each message has a `role` and a `content` field:
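A minimal message object, assuming the common OpenAI-style roles (`system`, `user`, `assistant`):

```json
{"role": "user", "content": "What is the capital of France?"}
```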
### Vision (multimodal)

For vision models, `content` can be an array of parts:
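A sketch of a multimodal message, assuming OpenAI-style `text` and `image_url` part types (the exact part names may differ for this API); the base64 payload is a placeholder:

```json
{
  "role": "user",
  "content": [
    {"type": "text", "text": "What is in this image?"},
    {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}}
  ]
}
```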
### Structured output

Use `response_format` to get JSON conforming to a schema:
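A sketch assuming an OpenAI-style `json_schema` wrapper; the exact wrapper keys for this API may differ:

```json
{
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "city_info",
      "schema": {
        "type": "object",
        "properties": {
          "city": {"type": "string"},
          "population": {"type": "integer"}
        },
        "required": ["city", "population"]
      }
    }
  }
}
```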
## Response
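A non-streaming response might look like this sketch, assuming OpenAI-compatible fields such as `choices` and `usage` (the exact shape for this API may differ):

```json
{
  "id": "chatcmpl-123",
  "model": "qwen3.5:9b",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hello! How can I help?"},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 12, "completion_tokens": 9, "total_tokens": 21}
}
```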
## Streaming

Set `"stream": true` to receive Server-Sent Events. Each event is a `data:` line carrying a JSON chunk, and the stream ends with a final `data: [DONE]` line.
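A minimal client-side sketch of consuming such a stream, assuming each chunk carries OpenAI-style `choices[0].delta.content` (this API's exact chunk shape may differ):

```python
import json

def accumulate_sse(lines):
    """Collect streamed text from SSE 'data:' lines.

    Assumes OpenAI-style chunks with choices[0].delta.content;
    stops at the terminating 'data: [DONE]' sentinel.
    """
    text = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip comments, blank keep-alive lines, etc.
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        text.append(delta.get("content", ""))
    return "".join(text)

sample = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    'data: [DONE]',
]
print(accumulate_sse(sample))  # → Hello
```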