Gemini 2.5 Flash

Google · Active · Updated May 18, 2026

Google's fastest and most cost-efficient model with upgraded reasoning, supporting high-volume production workloads.

Input Price: $0.30 per million tokens
Output Price: $2.50 per million tokens
Context Window: 1,048,576 tokens
Max Output: 8,192 tokens

Technical Specifications

Provider	Google
Release Date	March 1, 2026
Pricing Type	per token
Input Price	$0.30 / 1M tokens
Output Price	$2.50 / 1M tokens
Cached Input	$0.03 / 1M tokens
Context Window	1,048,576 tokens
Max Output	8,192 tokens
Input Modalities	text, image, audio
Output Modalities	text
Status	active
Availability	api
Latency	very fast
Rate Limit	30,000 RPM
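
The listed rate limit of 30,000 RPM works out to 500 requests per second, or one request every 2 ms. The following is a minimal client-side pacing sketch under that figure. The class name `RequestPacer` and its parameters are hypothetical, and even spacing is only one simple way to stay under a per-minute cap; it is not a description of how the provider enforces the limit, and real quotas may differ by account.

```python
import time

# Rate limit as listed in the specifications above (requests per minute).
RATE_LIMIT_RPM = 30_000


class RequestPacer:
    """Spaces requests evenly so they never exceed `rpm` per minute.

    Hypothetical helper for illustration; `clock` and `sleep` are
    injectable so the pacing logic can be exercised without real waiting.
    """

    def __init__(self, rpm=RATE_LIMIT_RPM, clock=time.monotonic, sleep=time.sleep):
        if rpm <= 0:
            raise ValueError("rpm must be positive")
        self.min_interval = 60.0 / rpm  # seconds between consecutive requests
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = clock()

    def wait(self) -> float:
        """Block until the next request may be sent; return seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # Schedule the following slot one interval after this one.
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return delay
```

Calling `pacer.wait()` before each API request keeps a single process at or below the configured rate; several processes sharing one quota would need a shared limiter instead.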

Capability Scores

Categories scored out of 100: Coding, Reasoning, Math, Image, Speed.

Overview

Gemini 2.5 Flash builds on Google's speed-optimized model line with significantly improved reasoning and coding capabilities. It retains the 1,048,576-token (1M) context window while offering better benchmark scores across the board. At $0.30 per million input tokens, it remains one of the most cost-effective models for high-throughput applications that need to process very long contexts.

Pros

+Fastest inference among long-context models (speed: 96/100)
+1M context window at a budget-friendly price
+Significantly improved reasoning over previous generation

Cons

−Moderate coding and reasoning performance
−Text-only output; no audio or image generation
−Not suitable for complex multi-step agentic tasks

Compare with Alternatives

vs GPT-5.4 mini

GPT-5.4 mini wins on coding and reasoning; Gemini 2.5 Flash wins on speed, pricing, and context window.

Use Cases

Real-time content moderation at scale

High-volume data extraction with long-context understanding

Cost-sensitive QA systems with extensive reference documents

Frequently Asked Questions about Gemini 2.5 Flash

How much does Gemini 2.5 Flash cost?

Gemini 2.5 Flash costs $0.30 per million input tokens and $2.50 per million output tokens. Cached input costs $0.03 per million tokens.
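
To make the per-token pricing concrete, here is a small estimation sketch that uses the three prices quoted above. The function name `estimate_cost_usd` is hypothetical, and the code assumes that cached tokens are a subset of the input tokens and are billed at the cached rate instead of the regular input rate; actual billing rules may differ.

```python
# Prices in USD per 1M tokens, as quoted in this listing.
INPUT_USD_PER_M = 0.30
OUTPUT_USD_PER_M = 2.50
CACHED_INPUT_USD_PER_M = 0.03


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      cached_input_tokens: int = 0) -> float:
    """Estimate the cost of one request in USD.

    Assumption: `cached_input_tokens` is the portion of `input_tokens`
    served from cache, billed at the cached rate rather than the input rate.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")

    fresh_input_tokens = input_tokens - cached_input_tokens
    total = (
        fresh_input_tokens * INPUT_USD_PER_M
        + cached_input_tokens * CACHED_INPUT_USD_PER_M
        + output_tokens * OUTPUT_USD_PER_M
    )
    return total / 1_000_000
```

Under these assumptions, a request with 100,000 input tokens and 2,000 output tokens comes to about $0.035; if 80,000 of those input tokens are cached, the estimate drops to about $0.0134.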

What is the context window of Gemini 2.5 Flash?

Gemini 2.5 Flash has a 1,048,576 token context window, with a maximum output of 8,192 tokens.
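
These two limits can be checked before sending a request. The sketch below is illustrative only: the function names are hypothetical, and it assumes that prompt tokens and generated tokens share the same context window, which this listing does not state explicitly.

```python
# Limits as listed above.
CONTEXT_WINDOW_TOKENS = 1_048_576
MAX_OUTPUT_TOKENS = 8_192


def fits_limits(prompt_tokens: int, requested_output_tokens: int) -> bool:
    """Return True if a request stays within both listed limits.

    Assumption: prompt and output tokens count against the same window.
    """
    if prompt_tokens < 0 or requested_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if requested_output_tokens > MAX_OUTPUT_TOKENS:
        return False
    return prompt_tokens + requested_output_tokens <= CONTEXT_WINDOW_TOKENS


def max_prompt_tokens(reserved_output_tokens: int = MAX_OUTPUT_TOKENS) -> int:
    """Largest prompt that still leaves room for the reserved output."""
    if not 0 <= reserved_output_tokens <= MAX_OUTPUT_TOKENS:
        raise ValueError("reserved_output_tokens is outside the output limit")
    return CONTEXT_WINDOW_TOKENS - reserved_output_tokens
```

With the full 8,192-token output reserved, this leaves 1,040,384 tokens for the prompt under the shared-window assumption.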

Is Gemini 2.5 Flash good for coding?

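Gemini 2.5 Flash scores 76/100 on coding benchmarks, a moderate result. It is adequate for routine coding tasks, but GPT-5.4 mini is rated higher for coding.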

What modalities does Gemini 2.5 Flash support?

Gemini 2.5 Flash accepts text, image, and audio input and produces text output only.
