The Preparedness Challenge invites participants to consider how OpenAI’s state-of-the-art natural language, speech, and image generation models could hypothetically be misused by malicious actors.
The Frontier Model Forum was established in July by Anthropic, Google, Microsoft and OpenAI to promote the responsible development of large language models and other advanced AI systems referred to as "frontier models."
PAI's guidance aims to translate shared principles into practical actions companies can take to foster responsible model development. It offers a framework for collective research and standards around model safety as capabilities evolve.
As generative AI systems gain ground, the urgency to evaluate their social and ethical risks escalates. DeepMind introduces a holistic framework, emphasizing the significance of context in AI safety, marking a step forward in responsible AI evolution.
Researchers who provide qualified submissions demonstrating concrete security issues will receive bounties ranging from $2,000 to $15,000.
The program aims to help startups applying AI to cybersecurity challenges grow their customer base, generate more revenue, and expand internationally.
By emphasizing corpus safety, model security, and rigorous assessment, the regulation intends to ensure that the rise of AI in China is both innovative and secure—all while upholding its socialist principles.
The new initiatives empower TikTok creators to showcase AI as part of their creative process. At the same time, the disclosures give viewers crucial context about the content they consume.
The goal is to identify potential risks and improve the safety of systems like ChatGPT and DALL-E before release.
The round, the largest early-stage raise this year for an AI security company, was led by Microsoft’s venture fund M12 and Moore Strategic Ventures.
The latest companies joining the effort are Adobe, Cohere, IBM, Nvidia, Palantir, Salesforce, Scale AI and Stability AI.
Companies can evaluate criteria such as accuracy, fairness, content quality, and more across different LLMs using standardized prompts designed for business applications.
The company says a content moderation system using GPT-4 results in much faster iteration on policy changes, reducing the cycle from months to hours.
The company's unique vantage point as a data partner for OpenAI, Meta, Microsoft, Anthropic, and others positions it to usher in a new era of standardized AI testing and evaluation.
A new paper takes a closer look at serious open challenges related to RLHF and emphasizes that relying solely on it for AI alignment is profoundly risky.
The new class of adversarial attacks is capable of circumventing the alignment measures designed to prevent multiple LLMs from generating inappropriate or harmful content.
The Forum's establishment is rooted in its members' collective commitment to pooling their expertise for the sake of safer, more responsible AI development.
Given the strategic implications of this technology, Anthropic says frontier models must be secured to levels surpassing standard practices for commercial technologies.
The commitments, which companies say will remain in place until relevant regulations are enacted, address key risks associated with generative AI models surpassing today’s state-of-the-art capabilities.
The model learns when predictive AI is offering correct information and when it's better to defer to a clinician.
The company's latest initiative aims to solve superintelligence alignment within four years.