Elon Musk’s AI company, xAI, has launched two new iterations of its Grok chatbot, Grok-2 and Grok-2 mini, further advancing the company’s ambitious AI roadmap. Despite being just over a year old, xAI has made remarkable progress, releasing Grok-1 in November 2023, followed by the multimodal Grok-1.5V in April this year. The introduction of Grok-2, with its upgraded performance and new capabilities, marks another significant step in the company’s rapid development trajectory.
Grok-2, the flagship model, has demonstrated competitive results in recent benchmarks. An early version, tested as "sus-column-r" on the LMSYS Chatbot Arena, took third place overall and did particularly well in coding, hard prompts, and math tasks.
xAI says that Grok-2 outperforms both Claude 3.5 Sonnet and GPT-4 Turbo on the LMSYS leaderboard in overall Elo score. On popular academic benchmarks, the model's performance was broadly in line with other frontier models such as GPT-4o, Claude 3.5, Llama 3 and Gemini 1.5.
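To put a leaderboard gap in perspective, an Elo difference can be converted into an expected head-to-head preference rate. The snippet below is a minimal sketch using the standard Elo expected-score formula. The baseline rating and the gaps are hypothetical values chosen for illustration, not figures reported by xAI or LMSYS, and the function name is our own.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected share of head-to-head votes won by model A over model B
    under the standard Elo formula (base 10, scale 400)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    baseline = 1250  # hypothetical rating, for illustration only
    for gap in (0, 10, 50, 100):  # hypothetical rating gaps
        p = elo_expected_score(baseline + gap, baseline)
        print(f"Elo gap of {gap:>3} points -> expected win rate {p:.1%}")
```

Under this formula, a 10-point lead corresponds to winning only slightly more than half of direct comparisons, while a 100-point lead corresponds to roughly 64%, so small differences near the top of a leaderboard should be read with caution.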
Grok-2 mini, a smaller but capable variant, aims to balance speed and answer quality. Both models are now available to Grok users on the X social platform, with an enterprise API release planned later this month.
A major addition to the new Grok models is image generation, powered by Black Forest Labs’ FLUX.1 model. Users can generate and share images directly on X via a post or DM.
However, this new capability raises important questions about content authenticity on social media. Currently, there appear to be very limited guardrails around what content can be generated, with users already creating images featuring the likenesses of political figures.
Moreover, X displays no visual indicator that an image has been AI-generated, which could lead to misinformation or misrepresentation.
Additionally, there doesn't appear to be any support for digital watermarking or embedded content credentials in the generated images.
In defense of xAI, it’s worth noting that FLUX.1 has openly released variants, so users would have access to these capabilities regardless of its integration into Grok. The company could argue that it values freedom of speech and expression, principles that have been central to Elon Musk’s vision for X.
However, this situation raises several important questions about responsibility and liability for AI-generated content. Who should be held accountable if inappropriate or illegal content is created: the individual, the model provider, the platform, or some combination of the three? Should AI tools be designed to censor certain language or restrict the creation of specific types of content, or does that infringe on personal freedoms?
These are complex issues, and it would be prudent for the AI industry and society as a whole to debate and address them before these technologies become more capable and pervasive.
It's worth noting that xAI has shared minimal technical details about Grok-2 and Grok-2 mini. Critical information such as context length and model sizes remains undisclosed. This makes it challenging to properly assess and rank the capabilities and potential limitations of these new models.
Access on X is currently limited to Premium and Premium+ subscribers. As this is a beta release, further refinements and more detailed technical information may be forthcoming.
It is also unclear whether Grok-2 is a multimodal model. While its predecessor, Grok-1.5V, could process various visual inputs including documents, diagrams, and photographs, xAI has not specified whether Grok-2 retains these abilities. The company previously touted Grok-1.5V's performance on its RealWorldQA benchmark, which tests real-world spatial understanding, alongside its multi-disciplinary reasoning. RealWorldQA results are notably absent from Grok-2's announcement.
We've reached out to Elon Musk and the xAI team for clarification on Grok-2's multimodal status.