Braintrust Data Provides Enterprises with a Swift Approach to Assess LLMs

Braintrust Data, a startup based in California, has raised $5.1 million in seed funding led by Greylock Partners. The company, founded by Ankur Goyal (who previously sold his AI venture Impira to Figma), focuses on helping enterprises evaluate and improve AI models more efficiently. Although only two months old, Braintrust has already attracted dozens of customers and investments from industry veterans including Elad Gil, Clem Delangue, and Greg Brockman.

Braintrust aims to solve the common problem of AI evaluation with a specialized tool that lets development teams assess and refine their AI models before they go live. The company plans to use the new funding to expand its team and continue developing its tools, so that developers can build and deploy AI solutions more rapidly.

AI now sits at the core of many modern business applications, but keeping its performance high can be challenging. Even minor code changes intended to improve an application can disrupt entire workflows, forcing backend teams into a reactive mode to identify and fix issues. Because this can hurt the customer experience, teams put heavy emphasis on the evaluation phase of development. They measure AI performance, analyze relevant data and metrics, and experiment with various models and techniques to achieve the best results.

While effective, this traditional evaluation method is time-consuming and labor-intensive, often delaying feature releases. This was a problem Goyal encountered in his previous roles. Inspired by these challenges, he created Braintrust Data to facilitate faster, real-world testing of code changes. With Braintrust, teams can quickly set up evaluations, capture user feedback, and log interactions with AI models. This instant feedback loop allows them to see improvements or regressions immediately and debug issues before full deployment.
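To make the idea of an evaluation feedback loop concrete, here is a minimal Python sketch of scoring a baseline and a candidate version of an AI application against the same test cases and flagging an improvement or a regression. It is an illustration only and does not use Braintrust's API: `Case`, `exact_match`, `run_eval`, `compare`, and the two stand-in models are all hypothetical names, and exact-match scoring is just one simple choice of metric.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Case:
    """One evaluation case: an input prompt and the answer we expect."""
    prompt: str
    expected: str


def exact_match(output: str, expected: str) -> float:
    """Score 1.0 when the output matches the expected answer, else 0.0."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def run_eval(model: Callable[[str], str], cases: List[Case],
             scorer: Callable[[str, str], float] = exact_match) -> float:
    """Return the mean score of `model` over all cases."""
    scores = [scorer(model(case.prompt), case.expected) for case in cases]
    return sum(scores) / len(scores)


def compare(baseline: Callable[[str], str], candidate: Callable[[str], str],
            cases: List[Case], tolerance: float = 0.0) -> str:
    """Label the candidate relative to the baseline on the same cases."""
    base_score = run_eval(baseline, cases)
    cand_score = run_eval(candidate, cases)
    if cand_score > base_score + tolerance:
        return "improvement"
    if cand_score < base_score - tolerance:
        return "regression"
    return "no change"


# Hypothetical stand-ins for two versions of an AI application.
# A real setup would call a model or an application endpoint here.
def baseline_model(prompt: str) -> str:
    answers = {"2+2": "4", "capital of France": "Paris"}
    return answers.get(prompt, "unknown")


def candidate_model(prompt: str) -> str:
    answers = {"2+2": "4", "capital of France": "Paris",
               "opposite of hot": "cold"}
    return answers.get(prompt, "unknown")


CASES = [
    Case("2+2", "4"),
    Case("capital of France", "Paris"),
    Case("opposite of hot", "cold"),
]

if __name__ == "__main__":
    print(compare(baseline_model, candidate_model, CASES))
```

Running the same fixed set of cases against every change is what turns evaluation into a quick check rather than a manual review; in practice the scorer and the case set would be chosen to match the application being tested.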

Braintrust launched its product in August 2023 and has since secured hundreds of enterprise and startup customers, including notable names like Airtable, Zapier, Coda, and Instacart. These customers have reported significant improvements in AI accuracy (over 30% in just a few weeks), along with faster development cycles, increased user engagement, and better team collaboration. Because the product can run inside an enterprise’s own cloud environment, it also addresses critical security needs, especially for AI tasks involving sensitive information.

In addition to performance evaluations, Braintrust offers features to speed up AI development. These include a prompt playground for comparing multiple prompts, benchmarking tools, dataset management, and an AI proxy for accessing popular models, including those from OpenAI, Anthropic, and Mistral, as well as Llama 2.

As businesses increasingly invest in AI, tools that can evaluate and refine AI models will become essential. Braintrust differentiates itself by providing insights before models reach production, unlike many competitors that focus on post-deployment observability. In doing so, it helps engineering teams move significantly faster and more efficiently.

With the recent funding from Greylock, Braintrust’s total raised capital now stands at $8.3 million. The company plans to use these funds to hire more talent and aggressively advance its product roadmap, enhancing its evaluation tools and expanding its offerings.