Google has announced Gemini 2.0 Flash Thinking, an experimental AI model that uses test-time compute to "reason" and solve harder problems. Unlike OpenAI with o1, Google makes the model's internal reasoning process fully visible to users in real time, offering a window into how the model arrives at its responses.
Key Points
- This is the first model to bring test-time compute to Google's Gemini 2.0 Flash architecture
- It is available through Google AI Studio and the Gemini API
- The model has a 32K token input limit and produces text-only output
Unlike traditional language models, Gemini 2.0 Flash Thinking pauses during computation to reason through a problem, considering related prompts and explaining its thinking before offering a solution. The release represents Google's entry into the growing field of "reasoning" AI models.
Jeff Dean, chief scientist at Google DeepMind, explained on social media that the model is "trained to use thoughts to strengthen its reasoning," suggesting a deliberate approach to making AI decision-making more transparent and potentially more reliable.
In demos shared by Google, the model handles both visual and text-only challenges, offering robust insights into problems that range from programming puzzles to physics equations.
Being able to see the reasoning traces of the model is a big deal. As noted by Andrej Karpathy (a founding member of OpenAI) on X, this brings significant value to both users and developers. This transparency not only enhances trust but also provides an educational component, allowing users to learn from the model's logical process and iterative thought. For developers, it opens avenues to analyze and improve the model’s decision-making, making it a more collaborative and insightful tool.
Still, Flash Thinking is very much an experimental model: it has a 32K token input limit, accepts only text and image inputs, and produces text-only outputs. Additionally, many of the built-in tools available in other models, such as search and code execution, are not supported.
If you want to explore Gemini 2.0 Flash Thinking for yourself, you have two ways to dive in. You can head to Google AI Studio and simply select the Gemini 2.0 Flash Thinking Experimental model in the model drop-down menu in the Settings pane. There's a dedicated "Thoughts" panel that shows you exactly how the model reasons through problems. Or, if you prefer to work with code, you can access it through the Gemini API. When using the API, you'll find the model's thoughts as the first element in your response content. Just specify either gemini-2.0-flash-thinking-exp or gemini-2.0-flash-thinking-exp-1219 as your model code.
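To make the API route concrete, here is a minimal Python sketch that uses only the standard library. It rests on several assumptions that are not taken from Google's announcement: the REST endpoint layout and the v1beta version string follow the general Gemini API pattern and may differ for this experimental model, and the helper names build_request, split_thoughts, and ask are hypothetical. The split logic simply follows the description above, treating the first element of the response content as the model's thoughts and the rest as the answer.

```python
import json
import urllib.request

# Assumption: general Gemini REST endpoint layout; check the current API docs.
BASE_URL = "https://generativelanguage.googleapis.com"


def build_request(prompt, api_key,
                  model="gemini-2.0-flash-thinking-exp",
                  api_version="v1beta"):
    """Build (but do not send) a generateContent request for the given model."""
    url = f"{BASE_URL}/{api_version}/models/{model}:generateContent"
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "x-goog-api-key": api_key},
        method="POST",
    )


def split_thoughts(response):
    """Return (thoughts, answer) from a parsed response dict.

    Assumption: the first part holds the thoughts, as the article describes.
    If only one part comes back, it is treated as the answer.
    """
    parts = response["candidates"][0]["content"]["parts"]
    texts = [part.get("text", "") for part in parts]
    if len(texts) < 2:
        return "", "".join(texts)
    return texts[0], "".join(texts[1:])


def ask(prompt, api_key, **kwargs):
    """Send the prompt over the network and return (thoughts, answer)."""
    request = build_request(prompt, api_key, **kwargs)
    with urllib.request.urlopen(request) as resp:
        return split_thoughts(json.load(resp))
```

With a valid API key, calling ask("Why is the sky blue?", api_key) should return the reasoning trace and the final answer as two separate strings. Passing model="gemini-2.0-flash-thinking-exp-1219" selects the dated variant instead.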