G42's Inception Unveils Jais, A Cutting-Edge Arabic Language Model


Inception, a G42 company focused on applied AI research in the Middle East, has announced the release of Jais, a 13-billion-parameter Arabic language model. Developed in partnership with Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), Jais represents a major advancement in Arabic natural language processing capabilities.

The release includes two models: Jais-13b, the base Arabic-centric model with 13B parameters, and Jais-13b-chat, an instruction-tuned version. By open-sourcing Jais, Inception says it aims to engage the scientific, academic, and developer communities to accelerate the growth of a vibrant Arabic language AI ecosystem. This approach could serve as a model for other languages currently underrepresented in mainstream AI.

[Image: Summary of Jais models]

The new model was trained on G42's Condor Galaxy 1 supercomputer, which was built in collaboration with Cerebras Systems. Condor Galaxy 1 delivers multi-exaFLOP AI computing power, enabling the rapid development of complex models like Jais. G42 and Cerebras first partnered in 2021 to bring high-performance AI infrastructure to the region.

Located in Santa Clara, California, Condor Galaxy 1 links 64 Cerebras CS-2 systems together into a single, easy-to-use AI supercomputer with an AI training capacity of 4 exaFLOPs.

With its 13 billion parameters, Jais significantly outperforms previous open-source Arabic language models. It was trained on a dataset of 395 billion Arabic and English tokens, allowing it to process both languages with high accuracy. Unlike other multilingual models, Jais gives Arabic substantial weight: Arabic makes up 33% of its training data.
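To put those figures in perspective, the short calculation below estimates how many tokens the stated Arabic share corresponds to. It is only a back-of-the-envelope sketch: the variable names are illustrative, and both inputs are taken at face value from the numbers quoted above.

```python
# Figures as quoted in the article; names are illustrative only.
TOTAL_TOKENS = 395_000_000_000  # Arabic and English training tokens
ARABIC_SHARE = 0.33             # stated fraction of the data that is Arabic

arabic_tokens = TOTAL_TOKENS * ARABIC_SHARE
other_tokens = TOTAL_TOKENS - arabic_tokens

print(f"Arabic tokens:    ~{arabic_tokens / 1e9:.0f}B")   # ~130B
print(f"Remaining tokens: ~{other_tokens / 1e9:.0f}B")    # ~265B
```

On these numbers, roughly 130 billion of the training tokens would be Arabic.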

"We believe that innovation thrives when we collaborate," says Andrew Jackson, CEO of Inception. "With this release, we are setting a new standard for AI advancement in the Middle East and ensuring that the Arabic language, with its depth and heritage, finds its voice within the AI landscape."

According to G42, Jais is the "world's highest quality Arabic Large Language Model" and represents a milestone for AI advancement in the Arabic-speaking world. By open-sourcing the model, G42 aims to spur innovation and collaboration in Arabic natural language processing.

Researchers praised the careful methodology behind the model, including the use of specialized techniques like ALiBi positional embeddings and SwiGLU activation functions. These optimizations unlock Jais' ability to understand nuanced linguistic patterns, provide improved context handling, and generate human-like text in Arabic.
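For readers unfamiliar with the two techniques, the sketch below shows the core idea of each in plain Python. It is not Jais' actual implementation: the function names, the choice of 8 attention heads, and the toy inputs are assumptions made for illustration. ALiBi adds a per-head penalty to attention scores that grows linearly with the distance between tokens, and SwiGLU multiplies one projection of the input by a Swish-activated gate (the learned linear projections are omitted here).

```python
import math

def alibi_slopes(num_heads):
    """Geometric sequence of per-head slopes (num_heads assumed a power of two)."""
    return [2 ** (-8 * (h + 1) / num_heads) for h in range(num_heads)]

def alibi_bias(slope, seq_len):
    """Causal bias added to attention scores: the farther back a key is
    from the query, the larger the penalty. Future keys are masked out."""
    return [
        [slope * (k - q) if k <= q else float("-inf") for k in range(seq_len)]
        for q in range(seq_len)
    ]

def swish(z):
    """Swish (SiLU) activation: z * sigmoid(z)."""
    return z / (1.0 + math.exp(-z))

def swiglu(gate, value):
    """Elementwise SwiGLU gating: swish(gate) * value."""
    return [swish(g) * v for g, v in zip(gate, value)]

slopes = alibi_slopes(8)          # illustrative head count
bias = alibi_bias(slopes[0], 4)   # bias matrix for the first head, 4 tokens

print(slopes[0], slopes[-1])      # 0.5 0.00390625
print(bias[3])                    # [-1.5, -1.0, -0.5, 0.0]
print(swiglu([0.0, 2.0], [1.0, 3.0]))
```

Because the ALiBi penalty depends only on relative distance rather than on learned position vectors, models that use it are generally reported to handle inputs longer than those seen in training more gracefully.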

[Image: English Results]

G42 positions the release of Jais as an important step toward technology that bridges divides rather than exacerbating them. With over 400 million Arabic speakers worldwide, the availability of advanced Arabic language models helps democratize access to AI capabilities.

Jais is now available on Hugging Face, allowing developers and researchers to leverage its Arabic and English proficiencies. G42 plans to continue expanding the capabilities of Jais to further enhance Arabic language understanding.

Chris McKay is the founder and chief editor of Maginative. His thought leadership in AI literacy and strategic AI adoption has been recognized by top academic institutions, media, and global brands.
