Mistral Introduces Efficient Tools for Streamlined Customization of Its Models

Fine-tuning is essential for improving large language model (LLM) outputs and customizing them for specific enterprise needs. When done right, it leads to more accurate and useful model responses, allowing organizations to get more value and precision from their AI applications. However, the process is costly, putting it out of reach for some businesses.

Mistral, an open-source AI model provider, is stepping into the fine-tuning arena. Just 14 months after launching, the company is poised to reach a $6 billion valuation. They’re now offering new customization options on their AI developer platform, La Plateforme. These new tools are designed to provide efficient fine-tuning, significantly cutting training costs and lowering entry barriers.

The company, named after the strong winds in southern France, has been rapidly rolling out innovations and attracting substantial funding. In a blog post, the company explained that a smaller model fine-tuned for a specific use case can match the performance of larger models while cutting deployment costs and boosting application speed.

Mistral, known for releasing powerful LLMs under open source licenses, allows anyone to adapt these models for free. They also offer paid tools like an API and their developer platform, La Plateforme, to simplify development. Instead of running a Mistral LLM on your servers, you can build an app using their API calls.

Customers can now customize Mistral models on La Plateforme, on their own infrastructure using Mistral's open-source code on GitHub, or through custom training services. Developers can use mistral-finetune, a lightweight codebase built on the LoRA paradigm, which trains only a small fraction of a model's parameters. This enables efficient fine-tuning without compromising performance or memory efficiency.
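The LoRA paradigm mentioned above can be illustrated with a minimal sketch. This is not Mistral's actual code; it is an illustrative toy showing the core idea: the pretrained weight matrix W stays frozen, and only two small low-rank matrices A and B are trained, so the effective weight becomes W + (alpha/r)·BA.

```python
import numpy as np

# Minimal sketch of the LoRA idea (illustrative only, not mistral-finetune's
# implementation): freeze the full weight W and train a low-rank update B @ A.

rng = np.random.default_rng(0)

d_out, d_in, r, alpha = 64, 64, 4, 8   # rank r is much smaller than d_in

W = rng.normal(size=(d_out, d_in))     # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01  # trainable, small random init
B = np.zeros((d_out, r))               # trainable, zero init => no change at start

def lora_forward(x):
    """Forward pass: frozen base projection plus the scaled low-rank update."""
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

x = rng.normal(size=(2, d_in))

# With B initialised to zero, the adapted layer matches the base layer exactly,
# which is why the base model's knowledge is preserved at the start of training.
assert np.allclose(lora_forward(x), x @ W.T)

full_params = W.size                   # parameters updated by full fine-tuning
lora_params = A.size + B.size          # parameters updated by LoRA
print(lora_params / full_params)       # fraction of parameters actually trained
```

Here only 512 of 4,096 weights are trainable; in a real transformer layer with dimensions in the thousands and a small rank, the fraction is far smaller, which is where the cost savings come from.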

For serverless fine-tuning, Mistral offers new services built on techniques refined through its R&D. Its LoRA adapters preserve the base model's knowledge while enabling efficient serving. The new service aims to give AI application developers fast and cost-effective model adaptations.

These fine-tuning services are compatible with Mistral’s 7.3B parameter model Mistral 7B and Mistral Small. Current users can immediately start customizing their models using Mistral’s API, and new models will be added soon. Custom training services also allow fine-tuning Mistral AI models on specific applications using proprietary data, often incorporating techniques like continuous pretraining to integrate unique knowledge into the models. This approach results in highly specialized and optimized models for specific domains.
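As a rough sketch of what starting such a job through the API might look like, the snippet below builds a request payload. The endpoint path, field names, and model identifier here are assumptions for illustration, not verified details of Mistral's API; consult La Plateforme's documentation for the real interface.

```python
import json
import os

# Hypothetical sketch of launching a fine-tuning job. The endpoint, payload
# fields, and model name below are assumptions, not confirmed API details.
API_URL = "https://api.mistral.ai/v1/fine_tuning/jobs"  # assumed endpoint

payload = {
    "model": "open-mistral-7b",          # assumed identifier for Mistral 7B
    "training_files": ["file-abc123"],   # placeholder ID of an uploaded dataset
    "hyperparameters": {
        "training_steps": 100,
        "learning_rate": 1e-4,
    },
}

headers = {
    "Authorization": f"Bearer {os.environ.get('MISTRAL_API_KEY', '')}",
    "Content-Type": "application/json",
}

body = json.dumps(payload)
# A real call would POST the body, e.g. with the requests library:
#   requests.post(API_URL, headers=headers, data=body)
print(body)
```

The pattern — upload training data, reference it by ID, submit a job with hyperparameters — is common to hosted fine-tuning services generally.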

Mistral has also launched an AI fine-tuning hackathon, running until June 30, allowing developers to experiment with their new fine-tuning API.

Since its founding in April 2023 by former Google DeepMind and Meta employees, Mistral has rapidly grown. They secured a record $118 million seed round, the largest in European history, and partnered with IBM and others. In February, they released Mistral Large through a deal with Microsoft to make it available via Azure cloud. Recently, SAP and Cisco announced their support for Mistral, and the company introduced Codestral, a code-centric LLM claimed to outperform others. Mistral is reportedly nearing a $600 million funding round, which would value the company at $6 billion.

Mistral Large competes directly with OpenAI and Meta’s Llama 3, and according to benchmarks, it is the second most capable commercial language model behind OpenAI’s GPT-4. Mistral 7B, released in September 2023, claims to outperform Llama on multiple benchmarks and is closing in on CodeLlama 7B’s performance on code tasks.

What’s next for Mistral? We’ll likely find out soon.