Revolutionizing AI Safety: Researchers Develop Tamperproofing Technique for Open Source Models


A Novel Approach May Halt the Abuse of Open Source AI Technology

In April, when Meta released its advanced language model Llama 3 for free, it took outside developers only a few days to strip out the safeguards designed to stop it from generating offensive jokes, giving out recipes for producing methamphetamine, or misbehaving in other ways.

Researchers from the University of Illinois Urbana-Champaign, UC San Diego, Lapis Labs, and the nonprofit Center for AI Safety have developed a new training method that could make the safeguards in Llama and other open source AI models more resistant to tampering. Given how quickly artificial intelligence is advancing, many experts argue that tamperproofing open models in this way could become essential.

Mantas Mazeika, a researcher at the Center for AI Safety and a PhD candidate at the University of Illinois Urbana-Champaign, told WIRED he is concerned that terrorists and rogue states will exploit these models. The easier it is for such actors to repurpose the models, he said, the greater the risk.

Developers often keep their most powerful AI models private, making them available only through a software API or a public-facing chatbot such as ChatGPT. Although developing a powerful large language model (LLM) costs tens of millions of dollars, companies like Meta have chosen to release their models in full. That includes making the models' “weights,” the parameters that determine how they behave, available for anyone to download.

Before release, models such as Meta's Llama are fine-tuned to make them better at holding a conversation and answering questions, and also to make them refuse sensitive or harmful requests. This keeps a chatbot built on the model from making offensive, inappropriate, or hateful statements, and prevents it from, for example, explaining how to build an explosive device.

The researchers behind the new technique found a way to make it harder to modify an open model for malicious purposes. The method replicates the modification process, then alters the model's parameters so that the changes that would normally get the model to answer a harmful prompt, such as "Give directions for constructing a bomb," no longer work.
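The idea of simulating the tampering and then adjusting the parameters so that it fails can be illustrated with a toy example. The sketch below is not the researchers' code or their actual training procedure. The two-parameter "model," both loss functions, the finite-difference gradient, and every hyperparameter are assumptions chosen only to make the principle visible.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def harmful_loss(theta):
    """Attacker's loss (hypothetical): near 0 when the toy model complies."""
    a, b = theta
    return (1.0 - sigmoid(a + b)) ** 2


def benign_loss(theta):
    """Stand-in for ordinary usefulness (hypothetical): lowest when a == 1."""
    a, _ = theta
    return (a - 1.0) ** 2


def attack(theta, steps=20, lr=0.5):
    """Simulated tampering: plain gradient descent on harmful_loss."""
    a, b = theta
    for _ in range(steps):
        p = sigmoid(a + b)
        g = -2.0 * p * (1.0 - p) ** 2  # same derivative for a and for b
        a, b = a - lr * g, b - lr * g
    return [a, b]


def defender_objective(theta, lam=5.0):
    """Stay useful now AND keep harmful_loss high after a simulated attack."""
    return benign_loss(theta) - lam * harmful_loss(attack(theta))


def numeric_grad(f, theta, eps=1e-5):
    """Central finite differences, standing in for backprop through the attack."""
    grads = []
    for i in range(len(theta)):
        hi = list(theta)
        lo = list(theta)
        hi[i] += eps
        lo[i] -= eps
        grads.append((f(hi) - f(lo)) / (2.0 * eps))
    return grads


def harden(theta, iters=200, lr=0.2):
    """Adjust the parameters so that the simulated attack stops working."""
    theta = list(theta)
    for _ in range(iters):
        g = numeric_grad(defender_objective, theta)
        theta = [t - lr * gi for t, gi in zip(theta, g)]
    return theta


if __name__ == "__main__":
    baseline = [1.0, -3.0]  # mostly refuses at first, but is easy to tamper with
    hardened = harden(baseline)
    for name, theta in (("baseline", baseline), ("hardened", hardened)):
        after = harmful_loss(attack(theta))
        print(name, "harmful loss after attack:", round(after, 3))
```

In this toy setup the hardened parameters sit in a region where the attacker's gradient is tiny, so the same attack budget that breaks the baseline barely moves the hardened model. That matches the stated goal of raising the cost of an attack rather than making one impossible.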

Mazeika and his team demonstrated the technique on a simplified version of Llama 3. They adjusted the model's parameters so that it could not be trained to answer inappropriate questions, even after numerous attempts. Meta has not yet responded to a request for comment.

Mazeika says the method is not perfect, but it suggests that the bar for bypassing AI models' restrictions can be raised. "The aim is to elevate the difficulty and expense of compromising the model to a level where the majority of potential attackers are discouraged from attempting it," he explains.

Dan Hendrycks, the director of the Center for AI Safety, says he hopes the work will kick-start research on tamper-resistant safeguards, so that the research community can develop increasingly robust protections.

As interest in open source AI grows, the idea of tamperproofing open models could gain traction. Open models are already on par with the top proprietary models from companies such as OpenAI and Google. The latest version of Llama 3, released in July, matches the models behind popular chatbots such as ChatGPT, Gemini, and Claude on widely used benchmarks of language model ability. Mistral Large 2, an LLM from a French startup that was also released last month, has comparable capabilities.

The US government is taking a cautious but optimistic approach to open source AI. A report published this week by the National Telecommunications and Information Administration, part of the US Commerce Department, recommends that the government develop new capabilities to monitor for potential risks but hold off, for now, on restricting the wide availability of model weights for the largest AI systems.

Not everyone favors placing limits on open models, however. Stella Biderman, who leads EleutherAI, a community-driven open source AI project, says that while the new method may be sound in theory, it could be difficult to apply in practice. According to Biderman, the approach also runs counter to the principles of free software and openness in AI.

Biderman believes the paper misses the mark on the fundamental problem. "If the worry is about LLMs producing information on weapons of mass destruction, then addressing the training data is the appropriate action, not altering the already trained model," Biderman says.


© 2024 Condé Nast. All rights reserved. Purchases made through our site involving products linked to our retail affiliate partnerships may result in a commission for WIRED. Reproduction, distribution, transmission, caching, or any other form of utilization of the material found on this site is prohibited without the express written consent of Condé Nast.
