San Francisco-based Scale AI has announced the launch of a new research initiative called the Safety, Evaluations and Analysis Lab (SEAL) that will focus on enhancing transparency and standardization around the evaluation and safety testing processes for large language models (LLMs).
The lab will be led by Dr. Summer Yue, formerly a research lead for safety on Bard, Google DeepMind's LLM. Dr. Yue brings expertise in AI safety and standards from her time at DeepMind and will spearhead Scale's efforts around frontier AI evaluations.
Scale's goal with SEAL is to collaborate with government and industry stakeholders to develop robust benchmarking methodologies and products for evaluating the potential risks posed by LLMs. These include risks outlined in the U.S. government's Executive Order on AI, such as the enabling of dangerous content and cybersecurity vulnerabilities.
The company sees an opportunity to bring more structure and transparency to how safety standards are set and applied in the AI industry. Scale plans to share its progress openly and engage with the broader research community as it refines its evaluation offerings.
Key focus areas for research at SEAL include:
- Designing comprehensive LLM safety evaluation benchmarks
- Improving reproducibility and reliability in testing methodologies
- Advancing automated rating systems using Scale's data resources
- Developing "red teaming" techniques to rigorously probe for flaws
Scale will continue to provide partners with customized evaluation services tailored to their specific product needs. However, the company believes that establishing broadly applicable safety benchmarks will bring more efficiency and transparency to the LLM development process.
With major AI players such as Google, Microsoft and Anthropic racing to release more capable LLMs, Scale's investments in safety testing reflect the industry's acknowledgement of the critical importance of trustworthy AI. The SEAL lab represents a meaningful commitment to pinpointing risks before large language models reach production applications.