Game Theory and AI: A Revolutionary Approach to Enhancing Reliability and Consistency in Language Models
James Nadis
Exploring the Role of Game Theory in Enhancing AI Dependability
This story was initially published in Quanta Magazine.
Imagine you had a friend who gave different answers to the same question, depending on how you asked it. "What's the capital of Peru?" would get one answer, while "Is Lima the capital of Peru?" would get another. You'd probably worry about your friend's mental faculties, and you would certainly find it hard to trust any answer they gave.
Something like this is happening with many large language models (LLMs), the powerful machine learning tools behind ChatGPT and other marvels of artificial intelligence. A generative question, which is open-ended, yields one answer, while a discriminative question, which involves choosing between options, often yields a different one. "There is a disconnect when the same question is phrased differently," said Athul Paul Jacob, a doctoral student at the Massachusetts Institute of Technology.
To make a language model's answers more consistent and reliable, Jacob and his colleagues devised what they call the consensus game, in which the model's two modes play against each other. Drawing on ideas from game theory, the approach drives those modes toward agreement on a single answer, improving the model's accuracy and internal coherence.
"There has been very little work on how these models stay self-consistent," said Shayegan Omidshafiei, chief scientific officer of the robotics company Field AI. "This paper is one of the first to tackle the problem in a clever and systematic way, by creating a game for the language model to play with itself."
Ahmad Beirami, a research scientist at Google Research, was enthusiastic about the work. For decades, he noted, language models have generated responses to prompts in the same way. With their idea of bringing a game into this process, he said, the MIT researchers have introduced a totally different paradigm, one that could open the door to a flurry of new applications.
Leveraging Gaming for AI Advancement
Using games to improve AI marks a departure from earlier work, which measured an AI system's success by its ability to win games. In 1997, IBM's Deep Blue computer famously beat chess grandmaster Garry Kasparov, setting a major benchmark for AI capabilities. Nineteen years later, Google DeepMind's AlphaGo won four of five matches against former Go champion Lee Sedol, toppling another arena of human supremacy. Machines have also surpassed humans at checkers, two-player poker, and other zero-sum games, in which one player's win is the other's loss.
Athul Paul Jacob helped develop the consensus game, a method for improving the accuracy and reliability of large language models.
The game of Diplomacy, a favorite of politicians like John F. Kennedy and Henry Kissinger, posed a far greater challenge to AI researchers. Instead of two opponents, the game features seven players whose motives can be hard to read. To win, a player must negotiate and forge alliances that could be broken at any moment. Diplomacy is so complex that a team from Meta was pleased when, in 2022, its AI program Cicero achieved human-level play over the course of 40 games. While it did not beat the reigning world champion, Cicero did well enough to place in the top 10 percent against human participants.
In the midst of the project, Jacob, who was part of the Meta team, found it remarkable that Cicero used a language model to interact with other players. He believed there was more that could be achieved. According to him, the team's objective was "to create the most advanced language model possible to excel in this game." However, he pondered the potential of shifting their focus towards designing the most exceptional game possible to enhance the capabilities of large language models.
In 2023, Jacob began exploring that question at MIT, working with Yikang Shen, Gabriele Farina, and his adviser, Jacob Andreas, on what would become the consensus game. The core idea came from imagining a conversation between two people as a cooperative game, where success comes when the listener understands what the speaker is trying to convey. In particular, the consensus game is designed to align the language model's two systems: the generator, which handles generative questions, and the discriminator, which handles discriminative ones.
Following several intermittent periods of progress, the group developed this concept into a complete game. Initially, the generator is presented with a question, which might be supplied by a person or derived from an already compiled list. Take, for instance, the query, "Where was Barack Obama born?" Subsequently, the generator gathers possible answers, such as Honolulu, Chicago, and Nairobi. These choices may also be provided by an individual, selected from a list, or identified through a search performed by the language model.
Before providing a response, the generator receives instructions on whether to give a correct or incorrect answer, based on the outcome of a fair coin flip.
When the coin lands on heads, the generator tries to give a correct answer. It then sends the original question, together with its chosen response, to the discriminator. If the discriminator judges that the generator deliberately sent a correct answer, each side earns one point, as a kind of incentive.
When the coin results in tails, the generator outputs an answer it deems incorrect. Should the discriminator determine that this incorrect answer was provided on purpose, each of them receives a point once more. This concept aims to motivate a consensus. "It's akin to training a dog," Jacob mentioned. "You reward them for performing correctly."
Both the generator and discriminator begin with initial assumptions, represented as probability distributions over various options. For instance, the generator might estimate, based on its analysis of online data, that there is an 80% likelihood that Obama was born in Honolulu, 10% in Chicago, 5% in Nairobi, and 5% in other locations. The discriminator might have a different set of probabilities. While these two entities are incentivized to find common ground, penalties are imposed for straying too far from their initial beliefs. This mechanism promotes the integration of their world knowledge—sourced from the internet—into their decisions, aiming to enhance the model's precision. Without such a mechanism, they could mistakenly converge on an incorrect answer like Delhi and still accumulate points.
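The round structure described above can be sketched in a few lines of code. Everything here is a toy: the belief distribution, the all-knowing discriminator, and the function names are invented for illustration, and the sketch omits the real method's penalty for straying from initial beliefs.

```python
import random

# Hypothetical prior: the generator's belief distribution over candidate
# answers (illustrative numbers only, not from the actual paper).
GEN_PRIOR = {"Honolulu": 0.80, "Chicago": 0.10, "Nairobi": 0.05, "Other": 0.05}

def play_round(correct_answer="Honolulu"):
    """Play one toy round of the consensus game."""
    # A fair coin flip tells the generator whether to aim for a
    # correct or an incorrect answer.
    want_correct = random.random() < 0.5

    answers = list(GEN_PRIOR)
    if want_correct:
        # Sample an answer from the generator's belief distribution.
        answer = random.choices(answers, weights=list(GEN_PRIOR.values()))[0]
    else:
        # Deliberately pick an answer the generator believes is wrong.
        answer = random.choice([a for a in answers if a != correct_answer])

    # The discriminator judges whether the answer matches the generator's
    # instruction. (Here it simply knows the right answer; in the real game
    # the discriminator is itself an uncertain language model.)
    recognized_intent = (answer == correct_answer) == want_correct

    # Both players score a point only when the discriminator recognizes
    # the generator's intent -- the incentive toward consensus.
    reward = 1 if recognized_intent else 0
    return want_correct, answer, reward
```

Repeating this round roughly a thousand times, and letting both players update their strategies between rounds, is what pushes the two modes toward agreement.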
In each matchup, the two systems engage in approximately 1,000 rounds of competition. Throughout these extensive trials, both parties adapt by understanding the opponent's strategies and adjusting their own tactics in response.
Over time, the generator and the discriminator start to find common ground, reaching a state known as Nash equilibrium. This concept is fundamental to game theory. It signifies a balance within a game where no participant can improve their position by changing their approach. Take rock-paper-scissors as an illustration; players achieve optimal results by selecting each option one-third of the time. Deviating from this strategy will consistently lead to poorer outcomes.
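The rock-paper-scissors claim is easy to check numerically: against an opponent who mixes uniformly, every move has the same expected payoff, so no deviation can improve a player's outcome. A minimal sketch:

```python
MOVES = ("rock", "paper", "scissors")
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def payoff(a, b):
    """Payoff to player A: +1 for a win, -1 for a loss, 0 for a tie."""
    if a == b:
        return 0
    return 1 if BEATS[a] == b else -1

# Expected payoff of each pure move against a uniform (1/3, 1/3, 1/3) mixer.
expected = {a: sum(payoff(a, b) for b in MOVES) / 3 for a in MOVES}
print(expected)  # every move's expected payoff is 0.0 -- no deviation helps
```

Because no single move does better than any other against the uniform strategy, both players mixing uniformly is a Nash equilibrium.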
In the consensus game, this can play out in many ways. For example, the discriminator might discover that it earns a point whenever it says "correct" each time the generator proposes "Honolulu" as Obama's birthplace. After repeated play, both players learn they will be rewarded for continuing that behavior, and neither has any incentive to change course; this is just one of many possible Nash equilibria for the game. The MIT group also relied on a modified form of Nash equilibrium that incorporates the players' prior beliefs, which helps keep their responses anchored to reality.
The net result of having a language model play this game against itself, the researchers found, is more accurate and more consistent answers, regardless of how questions are phrased. To test the consensus game's effects, the team tried it out on a set of standard questions using various moderate-size language models with 7 billion to 13 billion parameters. These models consistently answered more questions correctly than unmodified models did, even outperforming far larger models with as many as 540 billion parameters. Playing the game also improved the internal consistency of a model's answers.
Essentially, any large language model could improve by competing in the game solo, and completing 1,000 games could be achieved in just milliseconds using a typical laptop. Omidshafiei mentioned, "An attractive advantage of this method is that it's extremely efficient in terms of computation, requiring neither additional training nor changes to the original language model."
Exploring Linguistic Games
Following his early achievements, Jacob is delving into additional methods to integrate game theory with large language model (LLM) studies. Initial findings indicate that an LLM that is already performing well can enhance its capabilities further by engaging in a new game, provisionally named the ensemble game, involving several smaller models. In this setup, the main LLM collaborates with at least one smaller model acting as a supporter while also contending with at least one smaller model in opposition. For instance, if the main LLM is tasked with identifying the president of the United States, it earns a point both for agreeing with its supporting model and for providing a different response than its opponent. This strategy of interaction with considerably smaller models is shown not only to elevate an LLM's efficiency but also to achieve this enhancement without the need for additional training or modifications to its parameters.
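As a rough sketch of the scoring rule just described, one round of the ensemble game might be tallied like this (the function name and the answer strings are hypothetical, not taken from the research itself):

```python
def ensemble_score(primary_answer, ally_answer, adversary_answer):
    """Score one toy round of the ensemble game:
    +1 for agreeing with the smaller ally model,
    +1 for differing from the smaller adversary model."""
    score = 0
    if primary_answer == ally_answer:
        score += 1
    if primary_answer != adversary_answer:
        score += 1
    return score

# Best case: the primary model matches its ally and differs from its adversary.
print(ensemble_score("Answer A", "Answer A", "Answer B"))  # -> 2
```

As with the consensus game, the payoff structure alone shapes the primary model's behavior; no retraining or parameter changes are involved.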
Ian Gemp applies ideas from game theory to real-world settings, so that large language models can help with strategic decision-making.
This is merely the beginning. According to Ian Gemp, a research scientist at Google DeepMind, the concept of games can apply to numerous real-life circumstances, allowing the application of game theory strategies in diverse scenarios. In a paper published in February 2024, Gemp and his team concentrated on complex negotiation instances that go beyond simple Q&A interactions. "Our primary goal with this endeavor is to enhance the strategic capabilities of language models," he stated.
During an academic gathering, he highlighted an instance concerning the method of evaluating papers for journal or conference approval, particularly following a critical first review. He explained how, since language models can determine the likelihood of various outcomes, scholars are able to create decision trees akin to those used in poker. These trees map out potential actions and their outcomes. "By employing this approach, it's possible to identify and prioritize various counterarguments," Gemp noted. Essentially, the model advises on the most strategic response.
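Reading a strategy off such a decision tree amounts to computing an expected value for each possible response and picking the best one. Here is a toy one-level version, with invented options and probabilities (none of these numbers come from Gemp's work):

```python
# Hypothetical ways to respond to a critical first review, each with an
# invented probability distribution over the paper's eventual fate.
OPTIONS = {
    "concede and revise":   {"accept": 0.40, "reject": 0.60},
    "rebut with new data":  {"accept": 0.55, "reject": 0.45},
    "rebut on methodology": {"accept": 0.30, "reject": 0.70},
}

# Value each outcome, then rank the responses by expected value --
# the one-level analogue of solving a poker-style decision tree.
VALUES = {"accept": 1.0, "reject": 0.0}

def expected_value(outcome_probs):
    return sum(p * VALUES[outcome] for outcome, p in outcome_probs.items())

ranked = sorted(OPTIONS, key=lambda o: expected_value(OPTIONS[o]), reverse=True)
print(ranked[0])  # -> "rebut with new data", the highest-expected-value reply
```

A real system would estimate those probabilities with a language model and expand the tree several moves deep, but the ranking step is the same.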
Leveraging the understanding gained from game theory, language models will be equipped to manage interactions of greater complexity, moving beyond the confines of simple query-response scenarios. "The significant advantage in the future relates to extended dialogues," Andreas noted. "The subsequent phase involves an AI engaging with a human, rather than merely interacting with another language model."
Jacob sees the work being done by DeepMind as complementary to the existing methods of consensus and ensemble games. He explains, "Both approaches essentially merge the concepts of language models and game theory," despite having slightly different objectives. Whereas the Gemp team is transforming typical scenarios into game formats to aid in strategic decision-making, Jacob mentions, "our focus is on leveraging our understanding of game theory to enhance the performance of language models across a variety of tasks."
Currently, Jacob describes these initiatives as "two branches of the same tree," indicating they are separate approaches to improving language models' capabilities. "I foresee that in one to two years, these two branches will merge," he envisions.
This article has been republished with authorization from Quanta Magazine, a publication that operates independently under the Simons Foundation. Its purpose is to broaden the public's comprehension of science by highlighting recent findings and movements in the fields of mathematics, physics, and biological sciences.