GeminiChat
Google: Gemini 2.5 Flash Lite
google/gemini-2.5-flash-lite
Context Window: 1049K
Max Output: 66K
Supported Protocols: reasoning, include_reasoning, structured_outputs, response_format, max_tokens, temperature, top_p, seed, tools, tool_choice, stop
Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance across common benchmarks compared to earlier Flash models. By default, "thinking" (i.e. multi-pass reasoning) is disabled to prioritize speed, but developers can enable it via the [Reasoning API parameter](https://openrouter.ai/docs/use-cases/reasoning-tokens) to selectively trade off cost for intelligence.
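Because thinking is off by default, enabling it means passing the reasoning parameter alongside a normal chat completion. A minimal sketch, assuming the endpoint accepts an OpenRouter-style `reasoning` object forwarded via the OpenAI SDK's `extra_body` (the 1024-token budget is an illustrative value, not a documented default):

```python
# Sketch: opting into "thinking" for Gemini 2.5 Flash-Lite.
# Assumes an OpenRouter-style `reasoning` object is passed through;
# the token budget below is illustrative, not a default.
request_kwargs = {
    "model": "google/gemini-2.5-flash-lite",
    "messages": [
        {"role": "user", "content": "Walk through this step by step."}
    ],
    # Non-standard field, so it goes through extra_body with the OpenAI SDK:
    "extra_body": {"reasoning": {"max_tokens": 1024}},
}

# With a configured OpenAI client, this would be sent as:
# response = client.chat.completions.create(**request_kwargs)
```

Leaving the field out keeps the default fast, non-thinking behavior.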
Capabilities
👁 Vision · 🧠 Reasoning · 🔧 Function Calling · Text Generation · Code Generation · Analysis & Reasoning
Technical Specs
Input Modality
Text, Image, Audio, Video
Output Modality
Text
Arch
—
Default Temperature
0.7
Default Top_P
1
Pricing
Pay per use, no monthly fees.

| Billing Type | Price |
|---|---|
| Text Input | $0.10 / M tokens |
| Text Output | $0.40 / M tokens |
| Cache Read | $0.01 / M tokens |
| Cache Write (1h) | $0.0833 / M tokens |
| Cache Write | $0.0833 / M tokens |
| Audio Cache Read | $0.03 / M tokens |
| Image Input | < $0.001 / image |
| Audio Input | $0.30 / minute |
| Web Search | $0.014 / request |
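The per-token rates above make per-request costs straightforward to estimate. A small helper using just the text input/output rates from the table (cache and media rates omitted for brevity):

```python
# Estimate the text-token cost of one request at this model's rates:
# $0.10 per million input tokens, $0.40 per million output tokens.
INPUT_PER_M = 0.10
OUTPUT_PER_M = 0.40

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a single text-only request."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000

# A 2,000-token prompt with a 500-token reply:
# 2000 * $0.10/M + 500 * $0.40/M = $0.0002 + $0.0002 = $0.0004
print(f"${estimate_cost(2000, 500):.4f}")
```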
Quick Start
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.uniontoken.ai/v1",
    api_key="YOUR_UNIONTOKEN_API_KEY",
)

response = client.chat.completions.create(
    model="google/gemini-2.5-flash-lite",
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
)

print(response.choices[0].message.content)
```
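The supported-parameter list above includes response_format, which constrains the model to schema-valid JSON. A sketch of such a request payload, assuming the standard OpenAI `json_schema` shape (the schema itself is illustrative):

```python
# Sketch: constraining output via `response_format`, listed among this
# model's supported parameters. The schema below is an illustrative example.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "city_info",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "population": {"type": "integer"},
            },
            "required": ["city", "population"],
            "additionalProperties": False,
        },
    },
}

# Passed alongside the usual arguments on a configured client:
# response = client.chat.completions.create(
#     model="google/gemini-2.5-flash-lite",
#     messages=[{"role": "user", "content": "Largest city in Japan, as JSON."}],
#     response_format=response_format,
# )
```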