OpenAI has announced a collaboration with Los Alamos National Laboratory (LANL), the birthplace of the atomic bomb, to explore how AI can safely advance bioscience research. The goal is to evaluate GPT-4o, its most capable AI model, in real-world laboratory settings.
The project will assess GPT-4o's ability to assist both expert scientists and novices with complex biological tasks using its visual and voice capabilities. It's the first study of its kind, pushing the boundaries of AI integration in scientific research.
Mira Murati, OpenAI's CTO, expressed enthusiasm for this partnership, stating, "As a private company dedicated to serving the public interest, we're thrilled to announce a first-of-its-kind partnership with Los Alamos National Laboratory to study bioscience capabilities."
Researchers will run a series of lab experiments to test GPT-4o's performance in standard biological procedures. These tasks, while safe, are intended to serve as proxies for more complex operations that could pose dual-use concerns. The approach will introduce two major improvements on previous work:
- Real-world lab work: Unlike earlier text-based evaluations, this study puts GPT-4o to the test in actual laboratory settings. Written tasks don't fully capture the skills needed for biological benchwork. Knowing how to describe mass spectrometry is one thing; performing it correctly with real samples is far more challenging. This study will assess how GPT-4o performs when faced with the complexities of real laboratory environments.
- Multimodal interactions: While previous studies relied solely on GPT-4's text capabilities, GPT-4o can process visual and audio inputs. This leap allows for more intuitive problem-solving in the lab. For example, a researcher encountering an unfamiliar setup can show GPT-4o the problem through a camera and receive real-time guidance, rather than struggling to describe the issue in text.
This partnership also builds on OpenAI’s existing work in biothreat risk assessments and preparedness frameworks. By incorporating wet lab techniques and multiple modalities, the collaboration aims to develop new benchmarks for AI safety and efficacy in scientific research.