# Chat

`POST /v1/chat/completions`

Use chat completions for LLM and vision models. Streaming is supported with Server-Sent Events (SSE).
## Request

```json
{
  "model": "gemini-3-flash",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is 2+2?"}
  ],
  "temperature": 0.7,
  "max_tokens": 1024,
  "stream": false
}
```
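The same request, expressed through an OpenAI-compatible Python client. This is a minimal sketch: the base URL and API key are placeholders, not values documented here.

```python
from openai import OpenAI

# Placeholder base URL and key -- point these at your deployment.
client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="gemini-3-flash",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is 2+2?"},
    ],
    temperature=0.7,
    max_tokens=1024,
)

# The assistant's reply lives on the first choice.
print(response.choices[0].message.content)
```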
## Key fields

| Field | Type | Required | Description |
|---|---|---|---|
| model | string | yes | Model ID |
| messages | array | yes | Chat messages |
| temperature | number | no | Sampling temperature |
| top_p | number | no | Nucleus sampling |
| max_tokens | integer | no | Maximum output tokens |
| stream | boolean | no | Enable SSE streaming |
| stop | string or array | no | Stop sequences |
| tools | array | no | Function calling tools |
| tool_choice | string or object | no | Tool selection strategy |
| response_format | object | no | JSON mode with `{"type": "json_object"}` |
| seed | integer | no | Reproducibility seed when supported |
| think | boolean | no | Enable reasoning mode when supported |
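As a sketch of how the optional fields compose, the example below requests JSON mode via response_format and pins a seed. Seed support is model-dependent per the table, and the base URL and key are placeholders.

```python
import json

from openai import OpenAI

client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_API_KEY")  # placeholders

response = client.chat.completions.create(
    model="gemini-3-flash",
    messages=[
        {"role": "system", "content": "Reply with a single JSON object."},
        {"role": "user", "content": "List three primary colors under a 'colors' key."},
    ],
    response_format={"type": "json_object"},  # JSON mode
    seed=42,  # reproducibility, when the model supports it
)

# With JSON mode, the content should parse directly.
print(json.loads(response.choices[0].message.content))
```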
## Vision input

```json
{
  "model": "gemini-3-flash",
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "Describe this image."},
        {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
      ]
    }
  ]
}
```
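The same multimodal message through the Python client, again as a sketch with placeholder base URL and key. Whether a base64 `data:` URL is also accepted in `image_url` is an assumption to verify against your server.

```python
from openai import OpenAI

client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_API_KEY")  # placeholders

response = client.chat.completions.create(
    model="gemini-3-flash",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            # Many OpenAI-compatible servers also accept a base64 "data:" URL here (unverified).
            {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}},
        ],
    }],
)
print(response.choices[0].message.content)
```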
## Streaming

```python
from openai import OpenAI

# Placeholder base URL and key -- point these at your deployment.
client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_API_KEY")

stream = client.chat.completions.create(
    model="gemini-3-flash",
    messages=[{"role": "user", "content": "Write a short poem"}],
    stream=True,
)

# Each chunk carries a delta; content is None on role/finish chunks.
for chunk in stream:
    text = chunk.choices[0].delta.content
    if text:
        print(text, end="")
```
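Without the SDK, the stream arrives as Server-Sent Events. The sketch below assumes the common OpenAI-compatible framing (each event is a `data:` line carrying a JSON chunk, terminated by `data: [DONE]`); the endpoint and key are placeholders.

```python
import json

import requests

resp = requests.post(
    "https://api.example.com/v1/chat/completions",  # placeholder endpoint
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "model": "gemini-3-flash",
        "messages": [{"role": "user", "content": "Write a short poem"}],
        "stream": True,
    },
    stream=True,
)

for line in resp.iter_lines(decode_unicode=True):
    # SSE events look like: data: {...json chunk...}
    if not line or not line.startswith("data: "):
        continue
    payload = line[len("data: "):]
    if payload == "[DONE]":  # assumed end-of-stream sentinel
        break
    chunk = json.loads(payload)
    choices = chunk.get("choices") or []
    if choices:
        text = choices[0].get("delta", {}).get("content")
        if text:
            print(text, end="")
```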