AI Safety Under the Microscope: Researchers Develop Benchmark for Assessing Model Risks
Researchers Assessed AI Models for Potential Dangers and Found Wide Discrepancies
Bo Li, a University of Chicago faculty member who specializes in stress-testing AI models and exposing their flaws, has become a go-to resource for several consulting firms. These firms are now often less concerned with how intelligent AI models are than with the legal, ethical, and regulatory risks they pose.
Li, together with collaborators from several universities, from Virtue AI, a company founded by Li, and from Lapis Labs, recently developed a taxonomy of AI risks, along with a benchmark that measures how readily different large language models break the relevant rules. Li told WIRED that principles for AI safety are needed, both for regulatory compliance and for everyday use.
To build the taxonomy, the team analyzed government AI regulations and guidelines from the United States, China, and the European Union, and reviewed the usage policies of 16 leading AI companies around the world.
The researchers then built AIR-Bench 2024, a benchmark that uses thousands of prompts to measure how well-known AI models behave with respect to specific risks. It shows, for example, that Anthropic's Claude 3 Opus ranks highly when it comes to refusing to generate cybersecurity threats, while Google's Gemini 1.5 Pro ranks highly for refusing to generate nonconsensual sexual content.
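To make that kind of evaluation concrete, here is a minimal sketch of how a refusal-rate benchmark along these lines could be scored. It assumes a hypothetical query_model function and a crude keyword-based refusal check; none of this is the actual AIR-Bench harness or its judging method.

```python
# Minimal sketch of scoring a refusal-rate benchmark, in the spirit of AIR-Bench 2024.
# query_model(), the refusal check, and the prompt data are hypothetical placeholders,
# not the real AIR-Bench tooling.

REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "not able to help")

def is_refusal(response: str) -> bool:
    """Crude keyword check standing in for a proper refusal classifier."""
    return any(marker in response.lower() for marker in REFUSAL_MARKERS)

def score_model(query_model, prompts_by_risk: dict) -> dict:
    """Return the refusal rate per risk category for one model.

    prompts_by_risk maps a risk category (e.g. "cyberattack generation")
    to a list of adversarial prompts probing that category.
    """
    return {
        risk: sum(is_refusal(query_model(p)) for p in prompts) / len(prompts)
        for risk, prompts in prompts_by_risk.items()
    }
```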
DBRX Instruct, a model released by Databricks, scored worst across the board. When it launched the model in March, the company said it would continue to improve DBRX Instruct's safety features.
Anthropic, Google, and Databricks did not respond to a request for comment.
Understanding the risk landscape, along with the strengths and weaknesses of specific models, may become increasingly important for businesses looking to deploy AI in particular markets or for particular use cases. A company planning to use a large language model (LLM) for customer service, for example, may care more about the model's tendency to produce offensive language when provoked than about its capacity to design a nuclear device.
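As a rough illustration of that trade-off, the sketch below weights per-risk benchmark scores for a customer-support use case. The model names, risk categories, and numbers are invented for the example, not drawn from the benchmark.

```python
# Hypothetical weighting of per-risk refusal rates for a customer-support deployment.
# All names and scores below are illustrative, not real benchmark results.

weights = {"offensive language": 0.9, "weapons uplift": 0.1}

refusal_scores = {
    "model_a": {"offensive language": 0.95, "weapons uplift": 0.60},
    "model_b": {"offensive language": 0.70, "weapons uplift": 0.99},
}

def weighted_safety(scores: dict, weights: dict) -> float:
    """Combine per-risk refusal rates using use-case-specific weights."""
    return sum(weights[risk] * scores[risk] for risk in weights)

best = max(refusal_scores, key=lambda m: weighted_safety(refusal_scores[m], weights))
print(best)  # "model_a": its stronger refusal of offensive content dominates here
```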
Li says the study also reveals some interesting challenges in how AI is developed and regulated. In particular, the team found that corporate policies are generally more comprehensive than government regulations, pointing to areas where rules could be tightened.
The study also indicates that many firms could do more to make their models safer. "When you evaluate certain models based on a company's internal guidelines, they don't always align," Li says. "This implies there's significant scope for enhancement."
This week, a pair of researchers at MIT also introduced a database they have compiled that aggregates AI dangers drawn from 43 different AI risk frameworks. The goal is to bring clarity to a complex and often chaotic field, says Neil Thompson, an MIT research scientist involved in the project. He notes that many organizations are still in the early stages of adopting AI, which makes guidance on the potential hazards especially important.
Peter Slattery, who leads the project and is a researcher with MIT's FutureTech group, which studies progress in computing, says the database highlights that some AI risks receive far more attention than others. More than 70 percent of the frameworks mention privacy and security concerns, for example, while only about 40 percent mention misinformation.
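A back-of-the-envelope version of that kind of coverage tally might look like the sketch below, where each framework is reduced to a set of risk labels. The three entries shown are made up; the actual repository draws on 43 frameworks.

```python
# Illustrative tally of how many risk frameworks mention each risk category.
# The framework entries below are invented; the real MIT database covers 43 frameworks.

frameworks = [
    {"privacy", "security", "misinformation"},
    {"privacy", "security", "bias"},
    {"security", "bias"},
]

def coverage(frameworks: list) -> dict:
    """Return, for each risk, the share of frameworks that mention it."""
    all_risks = set().union(*frameworks)
    return {
        risk: sum(risk in fw for fw in frameworks) / len(frameworks)
        for risk in all_risks
    }

for risk, share in sorted(coverage(frameworks).items(), key=lambda kv: -kv[1]):
    print(f"{risk}: {share:.0%} of frameworks")
```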
As artificial intelligence continues to advance, the methods used to identify and assess its risks will need to evolve as well. Li says it will be important to explore emerging issues, including how compelling AI interactions can become. Her company recently analyzed the most capable version of Meta's Llama 3.1 model and found that, although the model has grown more capable, its safety has not improved to a similar degree, a disconnect she sees reflected more broadly. "There hasn't been a noticeable enhancement in safety," Li says.