
Grok 3, xAI's latest entry in a very competitive AI race, was a highly anticipated launch across the AI community. Benchmark results shared by the xAI team show Grok 3 outperforming leading models from OpenAI, Google, and Anthropic in specialized areas including mathematics competitions (AIME), scientific reasoning (GPQA), and coding challenges (LiveCodeBench, LCB).
Key Points
- Grok 3 achieved a record-breaking Elo score of 1400+ in Chatbot Arena blind testing
- Powered by a massive 200,000-GPU infrastructure, representing a 10x increase in computing power
- Introduces advanced reasoning modes and an AI-powered DeepSearch capability
- Available through a tiered subscription model, starting with X Premium+ access
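For a sense of what an Elo score means in practice, the sketch below uses the standard Elo expected-score formula to turn a rating gap into an approximate head-to-head preference rate. This is only an illustration under stated assumptions: the article does not describe how Chatbot Arena computes its ratings, the function name is made up for this example, and the 1350-rated opponent is a hypothetical figure, not a reported score.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo logistic
    model (400-point scale). A tie counts as half a win."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Hypothetical ratings for illustration only; the 1350 opponent
# is an assumption and does not come from the article.
grok3_rating = 1400.0
rival_rating = 1350.0

share = elo_expected_score(grok3_rating, rival_rating)
print(f"Expected preference rate: {share:.1%}")
```

Under this model, a 50-point lead corresponds to being preferred in roughly 57% of head-to-head comparisons, so even a record score implies a modest edge over close rivals rather than a sweep.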
During a live-streamed presentation, xAI showcased Grok 3's enhanced capabilities through impressive demonstrations. The model successfully created a novel game combining Tetris with Bejeweled and generated an animated visualization of optimal Earth-Mars spacecraft trajectories, highlighting its ability to handle complex creative and computational tasks.

The Grok 3 family introduces several specialized models:
- Grok 3 Base Model: Enhanced core chatting capabilities balancing utility and engagement
- Grok 3 Mini: Optimized for speed with slightly reduced accuracy for quick queries
- Grok 3 Reasoning and Mini Reasoning: Specialized versions incorporating advanced problem-solving capabilities
- "Big Brain" mode: Optional enhanced computational resources for tackling especially complex problems
A standout feature is DeepSearch, xAI's counterpart to the AI-powered research tools launched by other labs. It searches and analyzes content from across the internet and the X platform, potentially changing how users access and interpret large volumes of information. The company also announced voice interaction features, promising natural conversational experiences in the coming weeks.
The technical foundation for these advances is xAI's Colossus supercomputer, which doubled its capacity from 100,000 to 200,000 GPUs in just a few months. According to the benchmark results xAI shared, this computing power has translated into stronger performance, with Grok 3 surpassing competitors on several tests.
Independent validation has also come from notable industry figures. Andrej Karpathy, former OpenAI researcher and ex-Tesla AI lead, praised Grok 3's logical reasoning capabilities, noting its performance rivals OpenAI's o1-pro model, which commands a $200 monthly subscription. However, he also identified some limitations, including occasional hallucinations and factual inaccuracies in the DeepSearch feature.
Karpathy wrote on X (@karpathy, February 18, 2025):
I was given early access to Grok 3 earlier today, making me I think one of the first few who could run a quick vibe check.
Thinking
✅ First, Grok 3 clearly has an around state of the art thinking model ("Think" button) and did great out of the box on my Settler's of Catan…
Access to Grok 3 follows a tiered model, with basic features available to X Premium+ subscribers ($40/month) and advanced capabilities, including enhanced reasoning and DeepSearch, offered through a separate SuperGrok subscription ($30/month).
The launch represents a remarkable achievement for xAI, which has developed competitive AI technology in less than two years since its founding in 2023. The company's progress has been driven largely by its approach to computing infrastructure and its access to substantial computational resources. With additional computing clusters planned, xAI appears positioned for continued advancement in model capabilities.
As the AI landscape becomes increasingly competitive, Grok 3's debut suggests a significant shift in industry dynamics. While established players like OpenAI and Google maintain strong positions, xAI's rapid progress and technical achievements point to the emergence of a serious new contender in advanced artificial intelligence.