Chat

POST /v1/chat/completions

Use the chat completions endpoint for text and vision models. Streaming is supported via Server-Sent Events (SSE).

{
  "model": "gemini-3-flash",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is 2+2?"}
  ],
  "temperature": 0.7,
  "max_tokens": 1024,
  "stream": false
}
| Field           | Type             | Required | Description                              |
|-----------------|------------------|----------|------------------------------------------|
| model           | string           | yes      | Model ID                                 |
| messages        | array            | yes      | Chat messages                            |
| temperature     | number           | no       | Sampling temperature                     |
| top_p           | number           | no       | Nucleus sampling                         |
| max_tokens      | integer          | no       | Maximum output tokens                    |
| stream          | boolean          | no       | Enable SSE streaming                     |
| stop            | string or array  | no       | Stop sequences                           |
| tools           | array            | no       | Function-calling tools                   |
| tool_choice     | string or object | no       | Tool selection strategy                  |
| response_format | object           | no       | JSON mode with `{"type": "json_object"}` |
| seed            | integer          | no       | Reproducibility seed, when supported     |
| think           | boolean          | no       | Enable reasoning mode, when supported    |
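As a sketch of how the optional fields combine, the payload below requests JSON-mode output with a stop sequence and a seed. The field values are illustrative, not recommendations, and whether `seed` and `response_format` are honored depends on the model.

```python
import json

# Illustrative request body for POST /v1/chat/completions,
# exercising several optional fields from the table above.
payload = {
    "model": "gemini-3-flash",
    "messages": [
        {"role": "system", "content": "Reply with a JSON object."},
        {"role": "user", "content": "List three primary colors."},
    ],
    "temperature": 0.2,
    "max_tokens": 256,
    "stop": ["\n\n"],
    "seed": 42,
    "response_format": {"type": "json_object"},
}

body = json.dumps(payload)  # serialized body, ready to POST
```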
{
  "model": "gemini-3-flash",
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "Describe this image."},
        {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
      ]
    }
  ]
}
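For local images, the `image_url` entry can carry a base64 data URL instead of an HTTP URL, a common convention for OpenAI-compatible APIs; confirm your deployment accepts it. A minimal sketch (the helper name is illustrative):

```python
import base64

def image_message(image_bytes: bytes, prompt: str, mime: str = "image/jpeg") -> dict:
    """Build a multimodal user message embedding the image as a data URL."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url",
             "image_url": {"url": f"data:{mime};base64,{b64}"}},
        ],
    }

# Fake JPEG header bytes stand in for real image data here.
msg = image_message(b"\xff\xd8\xff", "Describe this image.")
```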
# Streaming with an OpenAI-compatible Python SDK client.
from openai import OpenAI

# Placeholder values: point base_url and api_key at your deployment.
client = OpenAI(base_url="https://your-endpoint/v1", api_key="YOUR_API_KEY")

stream = client.chat.completions.create(
    model="gemini-3-flash",
    messages=[{"role": "user", "content": "Write a short poem"}],
    stream=True,
)
for chunk in stream:
    text = chunk.choices[0].delta.content
    if text:
        print(text, end="", flush=True)