Join our daily and weekly newsletters to stay updated with the latest news and exclusive insights on industry-leading AI developments.
Enterprises are very optimistic about the potential of generative AI, investing billions and creating various applications like chatbots and search tools. However, making a commitment to AI and actually deploying it in production are two entirely different challenges.
Maxim, a startup based in California and founded by former Google and Postman executives Vaibhavi Gangwar and Akshay Deo, has launched a platform to tackle this problem. The startup has also secured $3 million in funding from Elevation Capital and other angel investors.
Maxim aims to solve the biggest challenge developers face when building large language model (LLM)-powered AI applications: monitoring different components throughout the development lifecycle. A minor error can disrupt the entire project, causing trust issues and delays.
Maxim’s platform focuses on testing and improving AI quality and safety both before and after release. It establishes a standard for evaluation, helping organizations streamline their AI development lifecycle and deliver high-quality products quickly.
Developing generative AI applications is particularly challenging because traditional software development followed a predictable path with standardized testing practices. In contrast, generative AI introduces numerous variables, leading to a complex, non-deterministic development process. Developers must monitor everything from the model used to the data and the way questions are framed by users.
Most organizations address this complexity by either hiring experts to manage every variable or by building internal tools, both of which are costly and distract from core business functions. Recognizing this gap, Gangwar and Deo launched Maxim to provide end-to-end evaluation across the AI development lifecycle. This includes prompt engineering and testing before release, as well as monitoring and optimization post-release.
Maxim’s platform consists of four core components: an experimentation suite, an evaluation toolkit, observability, and a data engine.
The experimentation suite, featuring a prompt CMS, IDE, visual workflow builder, and connectors to external data sources, serves as a testing ground for teams to iterate on different components of their AI systems. Imagine testing a single prompt on various models for a customer service chatbot to see which performs best.
The evaluation toolkit provides a unified framework for both AI and human-driven evaluations, enabling teams to measure improvements or regressions across large test suites. It displays evaluation results on dashboards covering aspects like tone, faithfulness, toxicity, and relevance.
The observability component works post-release, allowing users to monitor real-time production logs. It automates online evaluations to track and resolve issues, ensuring the application maintains the expected quality.
Using the observability suite, users can set up automated controls for various quality, safety, and security signals, such as toxicity, bias, hallucinations, and jailbreak attempts. Real-time alerts can notify them of any performance, cost, or quality regressions.
If issues are related to data, the data engine helps users curate and enrich datasets for fine-tuning, addressing problems more efficiently.
Although Maxim is still in its early stages, it claims to have helped a few dozen early partners test, iterate, and deploy their AI products up to five times faster. Most of its clients are in the B2B tech, gen AI services, BFSI, and Edtech industries, where evaluation challenges are most significant. The company focuses on mid-market and enterprise clients and plans to expand its market reach with its general availability.
Maxim also offers features tailored for enterprises, such as role-based access controls, compliance options, team collaboration, and the possibility of deploying within a virtual private cloud.
The company aims to standardize testing and evaluation across the AI development lifecycle. While competing with well-funded players like Dynatrace and Datadog, Maxim claims its integrated approach covers all phases of testing, experimentation, and observability in one platform.
The next steps for Maxim involve expanding its team and scaling operations to partner with more enterprises developing AI products. The company also plans to enhance platform capabilities, including proprietary domain-specific evaluations for quality and security, as well as a multi-modal data engine.