Gemma 2, Google's latest release in its family of open-weight models, is now available to researchers and developers worldwide. The company is releasing the model in two sizes: 9 billion and 27 billion parameters. Here's the technical report.
Earlier this year, Google introduced the Gemma family of lightweight, state-of-the-art open models, building on the same research and technology used to create the Gemini models. The family has since expanded with variants like CodeGemma, RecurrentGemma, and PaliGemma, each tailored for specific AI tasks. With Gemma 2, which Google first announced last month, the company is pushing the boundaries even further.
The 27B model, in particular, delivers performance competitive with proprietary models more than twice its size, approaching larger models such as Llama 3 70B and Claude 3 Sonnet despite its significantly smaller footprint.
One of the most striking aspects of Gemma 2 is its efficiency. Google claims the 27B model can run at full precision on a single Google Cloud TPU host, NVIDIA A100 80GB Tensor Core GPU, or NVIDIA H100 Tensor Core GPU. This significantly reduces the hardware requirements and costs of deploying such a capable model. Nor does the efficiency come at the expense of speed: Gemma 2 is optimized for fast inference across a wide range of hardware, from high-end cloud setups to consumer-grade gaming laptops.
For developers and researchers, Gemma 2 is well worth considering. It's released under a commercially friendly license, allowing both research use and monetization of products built on top of it. The model is designed for broad compatibility, integrating with popular AI frameworks such as Hugging Face Transformers, JAX, PyTorch, and TensorFlow.
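As a rough illustration of that framework integration, the sketch below loads a Gemma 2 checkpoint through Hugging Face Transformers and generates text. The model id `google/gemma-2-9b-it` and the generation settings are assumptions for illustration; check the model card on the Hub (and accept its license terms) before running.

```python
"""Minimal sketch: running a Gemma 2 checkpoint via Hugging Face Transformers.

Assumptions: the `transformers` and `torch` packages are installed, and the
instruction-tuned 9B variant is published under the id below.
"""

MODEL_ID = "google/gemma-2-9b-it"  # assumed Hub identifier for the 9B IT variant


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Load the model and tokenizer, then generate a completion for `prompt`."""
    # Imported lazily so the module can be inspected without the heavy deps.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        device_map="auto",   # place weights on available GPU/TPU/CPU devices
        torch_dtype="auto",  # use the precision stored in the checkpoint
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("Summarize what an open-weight model is in one sentence."))
```

Because `device_map="auto"` shards the weights across whatever accelerators are present, the same script runs on a single large GPU or on a smaller multi-GPU setup without code changes.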
Gemma 2 is already available in Google AI Studio and is coming soon to the Vertex AI Model Garden. You can also download Gemma 2’s model weights from Kaggle and Hugging Face Models.