When it comes to large language models (LLMs), size is a crucial factor because it affects where and how a model can be used. Stability AI, best known for its Stable Diffusion text-to-image technology, has just released one of its smallest models yet: Stable LM 2 1.6B. Stability AI first introduced Stable LM in April 2023 with 3-billion- and 7-billion-parameter models. Stable LM 2 1.6B is the company's second model release of 2024, following the launch of Stable Code 3B earlier this week.
This new, more compact Stable LM aims to make generative AI more accessible to developers and incorporates multilingual data in seven languages: English, Spanish, German, Italian, French, Portuguese, and Dutch. Leveraging recent advances in language modeling algorithms, Stability AI says it aims to strike an optimal balance between speed and performance.
Carlos Riquelme, Head of the Language Team at Stability AI, mentioned that generally, larger models trained on similar data tend to perform better. However, recent advancements in algorithms and higher-quality data have allowed newer smaller models to sometimes outperform older, larger ones.
Stability AI claims that Stable LM 2 1.6B outperforms other small language models with under 2 billion parameters on most benchmarks, such as TinyLlama 1.1B and Falcon 1B, and even edges out Microsoft's larger Phi-2 (2.7B). It also surpasses some bigger models, including Stability AI's own earlier Stable LM 3B.
Despite its impressive performance, the smaller size of Stable LM 2 1.6B comes with some limitations. Stability AI warns that, because of their size, smaller models like this one may exhibit higher rates of hallucination or toxic language.
Stability AI has been focusing on smaller, more powerful LLM options for the past few months. In December 2023, they released the StableLM Zephyr 3B model, offering better performance in a smaller size compared to the initial April version.
Riquelme elaborated that the new Stable LM 2 models are trained on more and varied data, including multilingual documents. An intriguing aspect he highlighted is the sequence in which the data is presented to the model during training, suggesting that varying data types at different stages could be beneficial.
Additionally, Stability AI is offering the new models in both pre-trained and fine-tuned options, as well as in a format it describes as the last model checkpoint before the pre-training cooldown.
Riquelme explained that during training, the model is continually updated and its performance improves. Initially, the model knows nothing, but by the end, it has learned most aspects of the data. However, models may become less flexible towards the end of training as they finalize their learning.
To address this, Stability AI is providing the model from just before that final training stage, which it hopes will make the checkpoint easier to adapt to other tasks or datasets. The company believes individual developers will find innovative ways to use these new tools and models.
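To make that developer workflow concrete, here is a minimal sketch of how one might load a Stable LM 2 1.6B checkpoint with the Hugging Face transformers library and generate text. The repository name stabilityai/stablelm-2-1_6b is an assumption based on Stability AI's usual naming conventions and should be verified on the model hub; the same pattern would apply to the fine-tuned or pre-cooldown checkpoints under their respective repo names.

```python
# Minimal sketch (not an official example): load a Stable LM 2 1.6B checkpoint
# and run a short generation to sanity-check it before fine-tuning on your own data.
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hugging Face repo id; confirm the exact name on the hub.
model_id = "stabilityai/stablelm-2-1_6b"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Generate a short completion from the base (pre-trained) checkpoint.
inputs = tokenizer("Small language models are useful because", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

From here, a developer could continue training this checkpoint on a domain-specific or additional-language corpus, which is precisely the kind of adaptation the pre-cooldown release is meant to make easier.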