AI Safety Under the Microscope: Researchers Develop Benchmark for Assessing Model Risks
Researchers Assessed AI Models for Potential Dangers and Found Wide Discrepancies
Bo Li, a University of Chicago faculty member who specializes in stress-testing AI models and exposing their flaws, has become a go-to resource for several consulting firms. These firms are now often less concerned with how intelligent AI models are than with the legal, ethical, and regulatory risks they pose.
Li, together with collaborators from several universities, from Virtue AI, a company founded by Li, and from Lapis Labs, recently developed a taxonomy of AI risks, along with a benchmark that measures how readily different large language models break the relevant rules. Li told WIRED that principles for AI safety are needed, both for regulatory compliance and for everyday use.
To build the taxonomy, the team analyzed government AI regulations and guidelines from the United States, China, and the European Union, and reviewed the usage policies of 16 leading AI companies around the world.
The researchers then built AIR-Bench 2024, a benchmark that uses thousands of prompts to measure how well-known AI models behave with respect to specific risks. It shows, for example, that Anthropic's Claude 3 Opus ranks highly when it comes to refusing to generate cybersecurity threats, while Google's Gemini 1.5 Pro ranks highly for refusing to generate nonconsensual sexual content.
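To make that kind of evaluation concrete, here is a minimal sketch of how a refusal-rate benchmark along these lines could be scored. It assumes a hypothetical query_model function and a crude keyword-based refusal check; none of this is the actual AIR-Bench harness or its judging method.

```python
# Minimal sketch of scoring a refusal-rate benchmark, in the spirit of AIR-Bench 2024.
# query_model(), the refusal check, and the prompt data are hypothetical placeholders,
# not the real AIR-Bench tooling.

REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "not able to help")

def is_refusal(response: str) -> bool:
    """Crude keyword check standing in for a proper refusal classifier."""
    return any(marker in response.lower() for marker in REFUSAL_MARKERS)

def score_model(query_model, prompts_by_risk: dict) -> dict:
    """Return the refusal rate per risk category for one model.

    prompts_by_risk maps a risk category (e.g. "cyberattack generation")
    to a list of adversarial prompts probing that category.
    """
    return {
        risk: sum(is_refusal(query_model(p)) for p in prompts) / len(prompts)
        for risk, prompts in prompts_by_risk.items()
    }
```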
DBRX Instruct, a model released by Databricks, scored worst across the board. When it launched the model in March, the company said it would continue to improve DBRX Instruct's safety features.
Anthropic, Google, and Databricks did not respond to a request for comment.
Understanding the risk landscape, along with the strengths and weaknesses of specific models, may become increasingly important for businesses looking to deploy AI in particular markets or for particular use cases. A company planning to use a large language model (LLM) for customer service, for example, may care more about the model's tendency to produce offensive language when provoked than about its capacity to design a nuclear device.
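As a rough illustration of that trade-off, the sketch below weights per-risk benchmark scores for a customer-support use case. The model names, risk categories, and numbers are invented for the example, not drawn from the benchmark.

```python
# Hypothetical weighting of per-risk refusal rates for a customer-support deployment.
# All names and scores below are illustrative, not real benchmark results.

weights = {"offensive language": 0.9, "weapons uplift": 0.1}

refusal_scores = {
    "model_a": {"offensive language": 0.95, "weapons uplift": 0.60},
    "model_b": {"offensive language": 0.70, "weapons uplift": 0.99},
}

def weighted_safety(scores: dict, weights: dict) -> float:
    """Combine per-risk refusal rates using use-case-specific weights."""
    return sum(weights[risk] * scores[risk] for risk in weights)

best = max(refusal_scores, key=lambda m: weighted_safety(refusal_scores[m], weights))
print(best)  # "model_a": its stronger refusal of offensive content dominates here
```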
Li says the study also reveals some interesting challenges in how AI is developed and regulated. In particular, the team found that corporate policies are generally more comprehensive than government regulations, pointing to areas where rules could be tightened.
The study also indicates that many firms could do more to make their models safer. "When you evaluate certain models based on a company's internal guidelines, they don't always align," Li says. "This implies there's significant scope for enhancement."
This week, a pair of researchers at MIT also introduced a database they have compiled that aggregates AI dangers drawn from 43 different AI risk frameworks. The goal is to bring clarity to a complex and often chaotic field, says Neil Thompson, an MIT research scientist involved in the project. He notes that many organizations are still in the early stages of adopting AI, which makes guidance on the potential hazards especially important.
Peter Slattery, who leads the project and is a researcher with MIT's FutureTech group, which studies progress in computing, says the database highlights that some AI risks receive far more attention than others. More than 70 percent of the frameworks mention privacy and security concerns, for example, while only about 40 percent mention misinformation.
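A back-of-the-envelope version of that kind of coverage tally might look like the sketch below, where each framework is reduced to a set of risk labels. The three entries shown are made up; the actual repository draws on 43 frameworks.

```python
# Illustrative tally of how many risk frameworks mention each risk category.
# The framework entries below are invented; the real MIT database covers 43 frameworks.

frameworks = [
    {"privacy", "security", "misinformation"},
    {"privacy", "security", "bias"},
    {"security", "bias"},
]

def coverage(frameworks: list) -> dict:
    """Return, for each risk, the share of frameworks that mention it."""
    all_risks = set().union(*frameworks)
    return {
        risk: sum(risk in fw for fw in frameworks) / len(frameworks)
        for risk in all_risks
    }

for risk, share in sorted(coverage(frameworks).items(), key=lambda kv: -kv[1]):
    print(f"{risk}: {share:.0%} of frameworks")
```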
As artificial intelligence continues to advance, the methods used to identify and assess its risks will need to evolve as well. Li says it will be important to explore emerging issues, including how compelling AI interactions can become. Her company recently analyzed the most capable version of Meta's Llama 3.1 model and found that, although the model has grown more capable, its safety has not improved to a similar degree, a disconnect she sees reflected more broadly. "There hasn't been a noticeable enhancement in safety," Li says.