Revolutionizing AI Safety: Researchers Develop Tamperproofing Technique for Open Source Models

A Novel Approach May Halt the Abuse of Open Source AI Technology

In April, when Meta released its advanced language model Llama 3 for free, it took outside developers only a few days to produce a version stripped of the safeguards designed to stop it from generating offensive humor, giving out recipes for producing methamphetamine, or engaging in other inappropriate behaviors.

Researchers from the University of Illinois Urbana-Champaign, UC San Diego, Lapis Labs, and the nonprofit Center for AI Safety have created a novel training method that may enhance security measures in Llama and similar open-source AI frameworks, making them more resistant to tampering. Given the rapid advancement of artificial intelligence, many specialists argue that securing open models through such techniques could become essential.

Mantas Mazeika, a researcher at the Center for AI Safety and a PhD candidate at the University of Illinois Urbana-Champaign, told WIRED he is concerned that terrorists and rogue nations will exploit these models. The easier it becomes for such actors to repurpose the models, he said, the greater the risk.

Creators frequently keep advanced artificial intelligence models under wraps, making them available solely via a software API or through a publicly accessible chatbot such as ChatGPT. While the creation of an advanced Large Language Model (LLM) incurs expenses running into tens of millions of dollars, companies like Meta have opted to fully disclose their models. This disclosure encompasses the release of the models' “weights,” which are the critical parameters determining their functioning, for public download.

Before being made available, models such as Meta's Llama undergo a refinement process to enhance their conversational abilities and question-answering capabilities. Additionally, this process aims to equip them with the ability to reject sensitive or harmful inquiries. This measure is taken to ensure that a chatbot using the model avoids providing offensive, unsuitable, or hateful comments, and to prevent it from giving instructions on, for instance, constructing an explosive device.

The team behind the new technique found a way to make it harder to modify an open model for malicious purposes. The method replicates the modification process, then adjusts the model's parameters so that the changes that would normally get it to answer a harmful prompt, such as "Give directions for constructing a bomb," no longer work.

Mazeika and his team demonstrated the technique on a pared-down version of Llama 3. They adjusted the model's parameters so that, even after numerous attempts, it could not be trained to answer inappropriate questions. Meta has yet to reply to a request for a statement.

Mazeika says that while the method isn't flawless, it suggests that bypassing an AI model's restrictions can be made more difficult. "The aim is to elevate the difficulty and expense of compromising the model to a level where the majority of potential attackers are discouraged from attempting it," he explains.

Dan Hendrycks, the director of the Center for AI Safety, says he hopes this work will kick off research on tamper-resistant safeguards, paving the way for the research community to devise increasingly effective safety measures.

As open source AI garners more attention, the notion of making open models tamper-proof could gain traction. Presently, open models are on par with the top-tier proprietary models developed by giants such as OpenAI and Google. For example, the latest iteration of Llama 3, which was launched in July, matches the performance of the engines powering well-known chatbots like ChatGPT, Gemini, and Claude according to widely recognized metrics for evaluating the proficiency of language models. Additionally, Mistral Large 2, a large language model from a French startup that was also unveiled last month, boasts comparable capabilities.

The United States administration is adopting a careful yet optimistic stance towards open-source artificial intelligence. This week, a publication from the National Telecommunications and Information Administration, which is part of the US Commerce Department, advises that the US government should establish new mechanisms to watch for possible dangers, yet it suggests holding back on placing immediate limitations on the broad access to the core algorithms of major AI frameworks.

However, there are those who oppose the idea of placing limits on open models. Stella Biderman, who leads EleutherAI, an open source AI initiative powered by a community, believes that while the new method might seem theoretically sound, applying it could be challenging. According to Biderman, this strategy also contradicts the principles of free software and the concept of openness in the field of AI.

Biderman believes the paper misses the mark on the fundamental problem. "If the worry is about LLMs producing information on weapons of mass destruction, then addressing the training data is the appropriate action, not altering the already trained model," Biderman says.

© 2024 Condé Nast. All rights reserved.
