Patronus AI emerged from stealth today with the launch of its automated evaluation and security platform for large language models (LLMs). The start-up also announced $3 million in seed funding to support its mission of helping enterprises deploy LLMs safely.
Patronus AI was founded by AI experts Anand Kannappan and Rebecca Qian, who previously led research efforts at Meta. They recognized the industry's need for rigorous, scalable testing and benchmarking as interest in generative AI surges.
The Patronus platform aims to address key challenges enterprises face in leveraging LLMs: the inability to predict failures, lack of testing standards, and ineffective manual evaluation. Patronus provides scoring, test generation, and benchmarking to proactively detect potential model mistakes. It enables companies to:
- Score Model Performance: Using key criteria like hallucinations and safety, the platform grades models based on their performance in real-world scenarios.
- Automate Testing: Patronus AI can generate adversarial test suites at scale, significantly reducing the time and cost currently involved in manual evaluations.
- Benchmark Models: The platform helps businesses choose the best model for their specific use cases by providing comparative assessments.
"The risk of unexpected model behavior and incorrect outputs has made many companies hesitant to fully embrace LLMs. Our platform automates and scales the labor-intensive and costly evaluation methods currently prevalent in enterprises," said CEO Anand Kannappan.
The seed round was led by Lightspeed Venture Partners, with participation from Factorial Capital and others. Lightspeed partner Nnamdi Iregbulem cited the company's technology and team as reasons for investing.
Early customers include AI companies like Cohere, Nomic, and Naologic. Traditional enterprises in sectors like finance are also piloting Patronus.
"This launch is just the tip of the iceberg for us," said Rebecca Qian, co-founder of Patronus AI. "We're eager to scale our world-class team and continue to partner with enterprises to make generative AI a trusted and integral part of their operations."
Patronus joins other companies like Arthur that are tackling business challenges in the emerging field of machine learning operations (MLOps). MLOps focuses on the operational side of machine learning, providing tools and processes for monitoring, testing, and troubleshooting models so that companies can roll out AI systems reliably at scale. As generative AI reaches broader enterprise adoption, MLOps is becoming increasingly critical infrastructure.