Gemini 2.5 Flash

Google · Active · Updated May 18, 2026

Google's fastest and most cost-efficient model with upgraded reasoning, supporting high-volume production workloads.

Input Price: $0.30 per million tokens
Output Price: $2.50 per million tokens
Context Window: 1,048,576 tokens
Max Output: 8,192 tokens

Technical Specifications

Provider: Google
Release Date: March 1, 2026
Pricing Type: per token
Input Price: $0.30 / 1M tokens
Output Price: $2.50 / 1M tokens
Cached Input: $0.03 / 1M tokens
Context Window: 1,048,576 tokens
Max Output: 8,192 tokens
Input Modalities: text, image, audio
Output Modalities: text
Status: active
Availability: API
Latency: very fast
Rate Limit: 30,000 RPM

Capability Scores

Coding: 76/100
Reasoning: 74/100
Math: 75/100
Image: 66/100
Speed: 96/100

Overview

Gemini 2.5 Flash builds on Google's speed-optimized model line with improved reasoning and coding capabilities over the previous generation. It retains the 1M-token context window (1,048,576 tokens). At $0.30 per million input tokens and $2.50 per million output tokens, it remains one of the more cost-effective models for high-throughput applications that need to process very long contexts.
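To make the pricing concrete, here is a minimal sketch of a per-request cost estimate in Python. The prices are the ones listed under Technical Specifications on this page; the function name `estimate_cost` and the assumption that cached tokens are a subset of the input tokens billed at the cached rate are illustrative, not taken from Google's billing documentation.

```python
from decimal import Decimal

# USD per one million tokens, copied from the specifications on this page.
INPUT_PRICE_PER_M = Decimal("0.30")
OUTPUT_PRICE_PER_M = Decimal("2.50")
CACHED_INPUT_PRICE_PER_M = Decimal("0.03")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> Decimal:
    """Hypothetical helper: estimate the USD cost of a single request.

    Assumption: cached_tokens is the portion of input_tokens that is
    billed at the cached-input rate instead of the regular input rate.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * INPUT_PRICE_PER_M
        + cached_tokens * CACHED_INPUT_PRICE_PER_M
        + output_tokens * OUTPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # A prompt of 1,000,000 tokens with the maximum 8,192-token reply.
    print(estimate_cost(1_000_000, 8_192))
```

Under these assumptions, a request with one million input tokens and a full 8,192-token reply costs roughly $0.32, and most of that is the input side. Actual charges depend on the provider's current price list.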

Pros

  • Fastest inference among long-context models (speed: 96/100)
  • +1M context window at a budget-friendly price
  • +Significantly improved reasoning over previous generation

Cons

  • Moderate coding and reasoning scores (76/100 and 74/100)
  • Text-only output: no audio or image generation
  • Not suitable for complex multi-step agentic tasks

Use Cases

Real-time content moderation at scale
High-volume data extraction with long-context understanding
Cost-sensitive QA systems with extensive reference documents

Frequently Asked Questions about Gemini 2.5 Flash

How much does Gemini 2.5 Flash cost?
Gemini 2.5 Flash costs $0.30 per million input tokens and $2.50 per million output tokens. Cached input costs $0.03 per million tokens.
What is the context window of Gemini 2.5 Flash?
Gemini 2.5 Flash has a 1,048,576 token context window, with a maximum output of 8,192 tokens.
Is Gemini 2.5 Flash good for coding?
Gemini 2.5 Flash scores 76/100 on coding benchmarks, which this listing rates as moderate coding performance.
What modalities does Gemini 2.5 Flash support?
Gemini 2.5 Flash accepts text, image, and audio input and produces text output.
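The context window and maximum output figures quoted above can be turned into a simple pre-flight check before a request is sent. The sketch below is illustrative only: `output_budget` is a hypothetical helper, and it makes the conservative assumption that prompt and output tokens together must fit inside the 1,048,576-token window, which this page does not state either way.

```python
# Limits as listed on this page.
CONTEXT_WINDOW = 1_048_576
MAX_OUTPUT = 8_192


def output_budget(prompt_tokens: int) -> int:
    """Hypothetical helper: how many output tokens can still be requested.

    Conservative assumption: prompt and output share the context window,
    so the budget is capped by both MAX_OUTPUT and the remaining window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    # Never return a negative budget when the prompt alone is too large.
    return max(0, min(MAX_OUTPUT, remaining))


if __name__ == "__main__":
    print(output_budget(500_000))        # short of the window: full output cap
    print(output_budget(1_048_576 - 100))  # nearly full window: reduced budget
```

A result of 0 signals that the prompt should be shortened or split before it is sent.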