
xAI Launches Grok-2 Models with Image Generation Capabilities

August 14, 2024 • 3 min read

Elon Musk’s AI company, xAI, has launched two new iterations of its Grok chatbot, Grok-2 and Grok-2 mini, further advancing the company’s ambitious AI roadmap. Despite being just over a year old, xAI has moved quickly, releasing Grok-1 in November 2023 and the multimodal Grok-1.5V in April 2024. Grok-2, with its upgraded performance and new capabilities, marks another significant step in the company’s rapid development trajectory.

Grok-2, the flagship model, has demonstrated competitive results in recent benchmarks. An early version, tested as "sus-column-r" on the LMSYS Chatbot Arena, ranked third overall and did particularly well in coding, hard prompts, and math tasks.

xAI says that Grok-2 outperforms both Claude 3.5 Sonnet and GPT-4-Turbo on the LMSYS leaderboard in overall Elo score. On popular academic benchmarks, the model performs broadly in line with other frontier models such as GPT-4o, Claude 3.5, Llama 3, and Gemini 1.5.

Woah, another exciting update from Chatbot Arena❤️‍🔥

The results for @xAI’s sus-column-r (Grok 2 early version) are now public!

With over 12,000 community votes, sus-column-r has secured the #3 spot on the overall leaderboard, even matching GPT-4o! It excels in Coding (#2),… https://t.co/gqSWSwYN0z pic.twitter.com/j9UYDBYNt4
— lmsys.org (@lmsysorg) August 14, 2024

Grok-2 mini, a smaller but capable variant, aims to balance speed and answer quality. Both models are now available to Grok users on the X social platform, with an enterprise API release planned later this month.

A major new feature of the Grok models is image generation, powered by Black Forest Labs’ Flux 1 model. Users can generate and share images directly on X via a post or DM.

My first Grok 2 image generated in a couple of seconds… unreal https://t.co/FC0cgWWhh3 pic.twitter.com/XTb1spoiIn
— Adriano 🚀 (@AdrianoinJapan) August 14, 2024

However, this new capability raises important questions about content authenticity on social media. Currently, there appear to be very limited guardrails around what content can be generated, with users already creating images featuring the likenesses of political figures.

Ok Boys and Girls Grok 2.0 is the best image generator available right now in the AI world. pic.twitter.com/a1gyV77O7o
— Uttkarsh Singh (@Uttupaaji) August 14, 2024

Moreover, there is no visual indicator on X to signify that an image has been AI-generated, which could lead to misinformation or misrepresentation.

some images I’ve managed to create using X’s new Grok 2 AI image generator. Mario with a cigarette and a beer, and Master Chief playing on a PS5 pic.twitter.com/RvluDXqnw2
— Tom Warren (@tomwarren) August 14, 2024

Additionally, the generated images don't appear to carry digital watermarks or embedded content credentials.

In defense of xAI, it’s worth noting that Flux 1 is an open-source model, so users would have access to these capabilities regardless of its integration into Grok. The company could argue that it values freedom of speech and expression, principles that have been central to Elon Musk’s vision for X.

However, this situation brings several important questions to the forefront of the AI ethics debate around responsibility and liability associated with AI-generated content. Who should be held accountable if inappropriate or illegal content is created—the individual, the model provider and/or the platform? Should AI tools be designed to censor certain language or restrict the creation of specific types of content, or does that infringe on personal freedoms?

These are complex issues, and it would be prudent for the AI industry and society as a whole to debate and address them before these technologies become more capable and pervasive.

It's worth noting that xAI has shared minimal technical details about Grok-2 and Grok-2 mini. Critical information such as context length and model sizes remains undisclosed. This makes it challenging to properly assess and rank the capabilities and potential limitations of these new models.

Both models are now available to X Premium and Premium+ users, with an enterprise API release planned later this month. As this is a beta release, further refinements and more detailed technical information may be forthcoming.

One more thing: it's not clear whether Grok-2 is a multimodal model. While its predecessor, Grok-1.5V, could process various visual inputs including documents, diagrams, and photographs, xAI has not specified whether Grok-2 retains these abilities. The company previously touted Grok-1.5V's performance on its RealWorldQA benchmark, demonstrating its prowess in multi-disciplinary reasoning and understanding spatial relationships. Such benchmark results are notably absent from Grok-2's announcement.

We've reached out to Elon Musk and the xAI team for clarification on Grok-2's multimodal status.

Chris McKay is the founder and chief editor of Maginative. His thought leadership in AI literacy and strategic AI adoption has been recognized by top academic institutions, media, and global brands.
