Stability AI Unveils 1.6B Parameter Stable LM 2 Model


Stability AI has released Stable LM 2 1.6B, the first model in its new series of multilingual language models. Its small size, multilingual capabilities, and strong performance across a variety of natural language tasks make it a valuable tool for developers.

The model was pre-trained on 2 trillion tokens drawn from a filtered mixture of large-scale open-source datasets, supplemented with multilingual data from CulturaX. Training ran on 512 NVIDIA A100 40GB GPUs (AWS P4d instances). The model fluently handles English, Spanish, German, Italian, French, Portuguese, and Dutch. Stability AI says the architecture incorporates new algorithmic advancements in language modeling to balance speed and performance while allowing for quicker training and iteration.
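To put the training figures in perspective, here is a back-of-the-envelope sketch in Python. The parameter and token counts are the rounded numbers quoted above. The `6 * N * D` compute estimate is a commonly used rule of thumb for dense transformers, not a figure published by Stability AI, and the function names are purely illustrative.

```python
# Rounded figures quoted in the article.
PARAMS = 1_600_000_000          # 1.6B parameters
TOKENS = 2_000_000_000_000      # 2 trillion pre-training tokens
GPUS = 512                      # NVIDIA A100 40GB GPUs


def tokens_per_parameter(tokens: int, params: int) -> float:
    """Ratio of training tokens to model parameters."""
    return tokens / params


def approx_training_flops(params: int, tokens: int) -> int:
    """Rough compute estimate using the common 6 * N * D rule of thumb.

    This is an approximation (an assumption of this sketch),
    not a number reported for this model.
    """
    return 6 * params * tokens


if __name__ == "__main__":
    ratio = tokens_per_parameter(TOKENS, PARAMS)
    flops = approx_training_flops(PARAMS, TOKENS)
    print(f"Tokens per parameter: {ratio:,.0f}")
    print(f"Approximate training compute: {flops:.2e} FLOPs")
    print(f"Tokens processed per GPU: {TOKENS / GPUS:.2e}")
```

By this estimate the model saw roughly 1,250 training tokens per parameter, which is a heavily data-rich regime for a model of this size.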

Open LLM Leaderboard evaluations.

Among small language models, Stable LM 2 1.6B is a standout performer. Benchmarks show it achieves state-of-the-art results for a model under 2 billion parameters. It outscored models like Microsoft's Phi-1.5 (1.3B), TinyLlama 1.1B, and Falcon 1B on the majority of tasks in the Open LLM Leaderboard.

0-shot average accuracy performance on Okapi translated benchmarks and multilingual LAMBADA.

Thanks to its multilingual training data, Stable LM 2 1.6B also delivered stronger accuracy on translated versions of ARC, HellaSwag, TruthfulQA, and other benchmarks. On MT Bench, it shows competitive results, matching or even surpassing significantly larger models.

According to MT Bench results, Stable LM 2 1.6B shows competitive performance, matching or even surpassing significantly larger models.

One of the most appealing aspects of Stable LM 2 1.6B is its compact size and speed, which mean significantly lower hardware requirements for both training and deployment. However, smaller models do have drawbacks, such as a higher rate of hallucination and weaker reasoning and other emergent capabilities.

Stability AI has released the base model as well as an instruction-tuned version. The company has also provided the last checkpoint before the pre-training cooldown, complete with optimizer states. This transparency is a boon for developers looking to fine-tune and experiment with the model. The company says it will release a technical report that will provide more specifics on the data details and training procedures.
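For developers weighing the lower hardware requirements mentioned above, the following Python sketch estimates how much memory the weights alone would occupy at common numeric precisions. It assumes a rounded count of 1.6 billion parameters and ignores activations, the KV cache, and optimizer states, all of which add to the real footprint. The names used here are illustrative only.

```python
# Rounded parameter count from the model name (assumption: exactly 1.6e9).
PARAMS = 1_600_000_000

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {
    "float32": 4.0,
    "float16": 2.0,   # same size as bfloat16
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gb(num_params: int, bytes_per_param: float) -> float:
    """Memory for the weights only, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for precision, size in BYTES_PER_PARAM.items():
        gb = weight_memory_gb(PARAMS, size)
        print(f"{precision:>8}: {gb:.1f} GB")
```

At 16-bit precision the weights come to roughly 3.2 GB, which is why a model of this size can fit on a single consumer GPU. Full fine-tuning needs considerably more memory than this, since gradients and optimizer states are stored alongside the weights.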

Chris McKay is the founder and chief editor of Maginative. His thought leadership in AI literacy and strategic AI adoption has been recognized by top academic institutions, media, and global brands.
