2023: A Remarkable Year for Open-Source Language Models

The arrival of ChatGPT in late 2022 ignited a fierce competition among AI companies and tech giants, all aiming to lead the growing market for large language model (LLM) applications. Due to this competition, many firms chose to offer their language models as proprietary services. They sold API access but kept the model weights, training datasets, and techniques under wraps.

Despite the trend toward proprietary models, 2023 saw a rise in open-source LLMs: models that can be downloaded, run on an organization's own servers, and tailored for specific uses. The open-source community has kept pace with proprietary offerings, cementing its importance in the LLM landscape.
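
For teams that want to try this, the snippet below is a minimal sketch of downloading and querying an open model locally with the Hugging Face transformers library; the model ID and generation settings are illustrative assumptions, not recommendations from the article.

```python
# Minimal sketch: downloading and querying an open-weight LLM locally with the
# Hugging Face transformers library. The model ID and generation settings are
# illustrative assumptions; any open checkpoint you can download works the same way.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-v0.1"  # assumed example checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")  # device_map needs the accelerate package

prompt = "Open-source language models are"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```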

Here’s how the open-source LLM ecosystem evolved in 2023.

Is Bigger Better?

Before 2023, the common belief was that making LLMs more capable required making them bigger. Open-source models such as BLOOM and OPT, built at a scale comparable to OpenAI's 175-billion-parameter GPT-3, embodied this idea. Although publicly available, these large models required significant computational resources and expertise to run well.

This view changed in February 2023 when Meta introduced Llama, a family of models ranging from 7 to 65 billion parameters. Llama showed that smaller language models could perform as well as larger ones by training on a significantly larger dataset. While GPT-3 had trained on about 300 billion tokens, Llama’s models used up to 1.4 trillion tokens. This approach, training smaller models on more data, proved to be revolutionary.
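
A back-of-the-envelope comparison makes the shift concrete. Using the figures above, the ratio of training tokens to parameters works out roughly as follows (the comparison itself is illustrative, not from the original article):

```python
# Illustrative ratio of training tokens to model parameters, using the figures
# cited above (GPT-3: ~300B tokens, 175B parameters; Llama 65B: ~1.4T tokens,
# 65B parameters). A rough way to see how much more data-heavy Llama's recipe was.
models = {
    "GPT-3 175B": (300e9, 175e9),   # (training tokens, parameters)
    "Llama 65B": (1.4e12, 65e9),
}

for name, (tokens, params) in models.items():
    print(f"{name}: ~{tokens / params:.1f} training tokens per parameter")
# GPT-3 175B: ~1.7 training tokens per parameter
# Llama 65B: ~21.5 training tokens per parameter
```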

The Benefits of Open-Source Models

Llama’s success was due to two main features: its ability to run on just a few GPUs and its open-source release. This allowed the research community to build on its findings quickly. Llama’s release sparked the emergence of several open-source LLMs, each adding new elements to the ecosystem.

Notable models included Cerebras-GPT by Cerebras, Pythia by EleutherAI, MPT by MosaicML, XGen by Salesforce, and Falcon by the Technology Innovation Institute (TII).

In July, Meta released Llama 2, which spawned numerous derivative models. Mistral AI stood out with its models Mistral 7B and Mixtral. Mixtral, in particular, drew praise for its capabilities and cost-effectiveness.

Models such as Alpaca, Vicuna, Dolly, and Koala were built on top of these foundation models and fine-tuned for specific applications. Data from Hugging Face, a machine learning model hub, shows the scale of this activity: a search returns over 14,500 models for “Llama,” 3,500 for “Mistral,” and 2,400 for “Falcon.” Despite being released only in December, Mixtral has already inspired 150 projects.
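
Those counts come from searching the Hugging Face Hub and keep growing; the sketch below shows one way to run the same kind of query with the huggingface_hub client library (the figures it prints will differ from the snapshot above).

```python
# Minimal sketch: counting Hugging Face Hub models that match a search term,
# using the huggingface_hub client library. The Hub changes constantly, so the
# totals will not match the snapshot quoted in the article.
from huggingface_hub import HfApi

api = HfApi()

for term in ("llama", "mistral", "falcon", "mixtral"):
    matches = api.list_models(search=term)  # iterator of ModelInfo objects
    print(f"{term}: {sum(1 for _ in matches)} models")
```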

The open-source nature of these models facilitates the creation of new models and allows developers to combine them in various ways, enhancing their practicality and effectiveness in real-world applications.

The Future of Open-Source Models

While proprietary models continue to advance and compete, the open-source community remains a strong contender. Tech giants are also recognizing this, increasingly integrating open-source models into their products.

Microsoft, a major backer of OpenAI, has released two open-source models, Orca and Phi-2, and has improved the integration of open-source models on its Azure AI Studio platform. Similarly, Amazon, a significant investor in Anthropic, introduced Bedrock, a cloud service for both proprietary and open-source models.
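
As one concrete illustration, calling an open-weight model through Bedrock looks roughly like the boto3 sketch below; the model ID and request body format are assumptions based on how Bedrock exposed Llama 2 at the time and should be checked against the current model catalog.

```python
# Minimal sketch: invoking an open-weight model hosted on Amazon Bedrock via
# boto3. The model ID and request body follow the Llama 2 format Bedrock
# documented at the time; treat both as assumptions to verify against the
# current Bedrock model catalog.
import json
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.invoke_model(
    modelId="meta.llama2-13b-chat-v1",  # an open-weight model offered on Bedrock (assumed ID)
    body=json.dumps({
        "prompt": "Summarize the benefits of open-source LLMs.",
        "max_gen_len": 128,
        "temperature": 0.5,
    }),
    contentType="application/json",
    accept="application/json",
)

print(json.loads(response["body"].read())["generation"])
```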

In 2023, many enterprises were caught off guard by the capabilities of LLMs, highlighted by ChatGPT’s success. As CEOs asked their teams to define generative AI use cases, companies quickly built proof-of-concept applications on top of closed-model APIs.

However, relying on external APIs for core technologies poses significant risks, including exposing sensitive source code and customer data. This is not a sustainable strategy for companies prioritizing data privacy and security.

The growing open-source ecosystem offers a valuable option for businesses looking to integrate generative AI while addressing these concerns.

As AI becomes a standard method of building technology, it will need to be developed and managed in-house, with all the necessary privacy, security, and compliance measures. And as history suggests, this will likely involve open-source solutions.