Game Theory and AI: A Revolutionary Approach to Enhancing Reliability and Consistency in Language Models
James Nadis
Exploring the Role of Game Theory in Enhancing AI Dependability
This story was initially published in Quanta Magazine.
Imagine you had a friend who gave different answers to the same question, depending on how you asked it. "What's the capital of Peru?" would get one answer, while "Is Lima the capital of Peru?" would get another. You'd probably worry about your friend's mental faculties, and you would certainly find it hard to trust any answer they gave.
Something like this is happening with many large language models (LLMs), the powerful machine learning tools behind ChatGPT and other marvels of artificial intelligence. A generative question, which is open-ended, yields one answer, while a discriminative question, which involves choosing between options, often yields a different one. "There is a disconnect when the same question is phrased differently," said Athul Paul Jacob, a doctoral student at the Massachusetts Institute of Technology.
To make a language model's answers more consistent and reliable, Jacob and his colleagues devised what they call the consensus game, in which the model's two modes play against each other. Drawing on ideas from game theory, the approach drives those modes toward agreement on a single answer, improving the model's accuracy and internal coherence.
"There has been very little work on how these models stay self-consistent," said Shayegan Omidshafiei, chief scientific officer of the robotics company Field AI. "This paper is one of the first to tackle the problem in a clever and systematic way, by creating a game for the language model to play with itself."
Ahmad Beirami, a research scientist at Google Research, was enthusiastic about the work. For decades, he noted, language models have generated responses to prompts in the same way. With their idea of bringing a game into this process, he said, the MIT researchers have introduced a totally different paradigm, one that could open the door to a flurry of new applications.
Leveraging Gaming for AI Advancement
Using games to improve AI marks a departure from earlier work, which measured an AI system's success by its ability to win games. In 1997, IBM's Deep Blue computer famously beat chess grandmaster Garry Kasparov, setting a major benchmark for AI capabilities. Nineteen years later, Google DeepMind's AlphaGo won four of five matches against former Go champion Lee Sedol, toppling another arena of human supremacy. Machines have also surpassed humans at checkers, two-player poker, and other zero-sum games, in which one player's win is the other's loss.
Athul Paul Jacob helped develop the consensus game, a method for improving the accuracy and reliability of large language models.
The game of Diplomacy, a favorite of politicians like John F. Kennedy and Henry Kissinger, posed a far greater challenge to AI researchers. Instead of two opponents, the game features seven players whose motives can be hard to read. To win, a player must negotiate and forge alliances that could be broken at any moment. Diplomacy is so complex that a team from Meta was pleased when, in 2022, its AI program Cicero achieved human-level play over the course of 40 games. While it did not beat the reigning world champion, Cicero did well enough to place in the top 10 percent against human participants.
In the midst of the project, Jacob, who was part of the Meta team, found it remarkable that Cicero used a language model to interact with other players. He believed there was more that could be achieved. According to him, the team's objective was "to create the most advanced language model possible to excel in this game." However, he pondered the potential of shifting their focus towards designing the most exceptional game possible to enhance the capabilities of large language models.
In 2023, Jacob began exploring that question at MIT, working with Yikang Shen, Gabriele Farina, and his adviser, Jacob Andreas, on what would become the consensus game. The core idea came from imagining a conversation between two people as a cooperative game, where success comes when the listener understands what the speaker is trying to convey. In particular, the consensus game is designed to align the language model's two systems: the generator, which handles generative questions, and the discriminator, which handles discriminative ones.
Following several intermittent periods of progress, the group developed this concept into a complete game. Initially, the generator is presented with a question, which might be supplied by a person or derived from an already compiled list. Take, for instance, the query, "Where was Barack Obama born?" Subsequently, the generator gathers possible answers, such as Honolulu, Chicago, and Nairobi. These choices may also be provided by an individual, selected from a list, or identified through a search performed by the language model.
Before providing a response, the generator receives instructions on whether to give a correct or incorrect answer, based on the outcome of a fair coin flip.
When the coin lands on heads, the generator tries to give a correct answer. It then sends the original question, together with its chosen response, to the discriminator. If the discriminator judges that the generator deliberately sent a correct answer, each side earns one point, as a kind of incentive.
When the coin results in tails, the generator outputs an answer it deems incorrect. Should the discriminator determine that this incorrect answer was provided on purpose, each of them receives a point once more. This concept aims to motivate a consensus. "It's akin to training a dog," Jacob mentioned. "You reward them for performing correctly."
Both the generator and discriminator begin with initial assumptions, represented as probability distributions over various options. For instance, the generator might estimate, based on its analysis of online data, that there is an 80% likelihood that Obama was born in Honolulu, 10% in Chicago, 5% in Nairobi, and 5% in other locations. The discriminator might have a different set of probabilities. While these two entities are incentivized to find common ground, penalties are imposed for straying too far from their initial beliefs. This mechanism promotes the integration of their world knowledge—sourced from the internet—into their decisions, aiming to enhance the model's precision. Without such a mechanism, they could mistakenly converge on an incorrect answer like Delhi and still accumulate points.
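The round structure described above can be sketched in a few lines of code. Everything here is a toy: the belief distribution, the all-knowing discriminator, and the function names are invented for illustration, and the sketch omits the real method's penalty for straying from initial beliefs.

```python
import random

# Hypothetical prior: the generator's belief distribution over candidate
# answers (illustrative numbers only, not from the actual paper).
GEN_PRIOR = {"Honolulu": 0.80, "Chicago": 0.10, "Nairobi": 0.05, "Other": 0.05}

def play_round(correct_answer="Honolulu"):
    """Play one toy round of the consensus game."""
    # A fair coin flip tells the generator whether to aim for a
    # correct or an incorrect answer.
    want_correct = random.random() < 0.5

    answers = list(GEN_PRIOR)
    if want_correct:
        # Sample an answer from the generator's belief distribution.
        answer = random.choices(answers, weights=list(GEN_PRIOR.values()))[0]
    else:
        # Deliberately pick an answer the generator believes is wrong.
        answer = random.choice([a for a in answers if a != correct_answer])

    # The discriminator judges whether the answer matches the generator's
    # instruction. (Here it simply knows the right answer; in the real game
    # the discriminator is itself an uncertain language model.)
    recognized_intent = (answer == correct_answer) == want_correct

    # Both players score a point only when the discriminator recognizes
    # the generator's intent -- the incentive toward consensus.
    reward = 1 if recognized_intent else 0
    return want_correct, answer, reward
```

Repeating this round roughly a thousand times, and letting both players update their strategies between rounds, is what pushes the two modes toward agreement.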
In each matchup, the two systems engage in approximately 1,000 rounds of competition. Throughout these extensive trials, both parties adapt by understanding the opponent's strategies and adjusting their own tactics in response.
Over time, the generator and the discriminator start to find common ground, reaching a state known as Nash equilibrium. This concept is fundamental to game theory. It signifies a balance within a game where no participant can improve their position by changing their approach. Take rock-paper-scissors as an illustration; players achieve optimal results by selecting each option one-third of the time. Deviating from this strategy will consistently lead to poorer outcomes.
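The rock-paper-scissors claim is easy to check numerically: against an opponent who mixes uniformly, every move has the same expected payoff, so no deviation can improve a player's outcome. A minimal sketch:

```python
MOVES = ("rock", "paper", "scissors")
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def payoff(a, b):
    """Payoff to player A: +1 for a win, -1 for a loss, 0 for a tie."""
    if a == b:
        return 0
    return 1 if BEATS[a] == b else -1

# Expected payoff of each pure move against a uniform (1/3, 1/3, 1/3) mixer.
expected = {a: sum(payoff(a, b) for b in MOVES) / 3 for a in MOVES}
print(expected)  # every move's expected payoff is 0.0 -- no deviation helps
```

Because no single move does better than any other against the uniform strategy, both players mixing uniformly is a Nash equilibrium.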
In the consensus game, this can play out in many ways. For example, the discriminator might discover that it earns a point whenever it says "correct" each time the generator proposes "Honolulu" as Obama's birthplace. After repeated play, both players learn they will be rewarded for continuing that behavior, and neither has any incentive to change course; this is just one of many possible Nash equilibria for the game. The MIT group also relied on a modified form of Nash equilibrium that incorporates the players' prior beliefs, which helps keep their responses anchored to reality.
The net result of having a language model play this game against itself, the researchers found, is more accurate and more consistent answers, regardless of how questions are phrased. To test the consensus game's effects, the team tried it out on a set of standard questions using various moderate-size language models with 7 billion to 13 billion parameters. These models consistently answered more questions correctly than unmodified models did, even outperforming far larger models with as many as 540 billion parameters. Playing the game also improved the internal consistency of a model's answers.
Essentially, any large language model could improve by competing in the game solo, and completing 1,000 games could be achieved in just milliseconds using a typical laptop. Omidshafiei mentioned, "An attractive advantage of this method is that it's extremely efficient in terms of computation, requiring neither additional training nor changes to the original language model."
Exploring Linguistic Games
Following his early achievements, Jacob is delving into additional methods to integrate game theory with large language model (LLM) studies. Initial findings indicate that an LLM that is already performing well can enhance its capabilities further by engaging in a new game, provisionally named the ensemble game, involving several smaller models. In this setup, the main LLM collaborates with at least one smaller model acting as a supporter while also contending with at least one smaller model in opposition. For instance, if the main LLM is tasked with identifying the president of the United States, it earns a point both for agreeing with its supporting model and for providing a different response than its opponent. This strategy of interaction with considerably smaller models is shown not only to elevate an LLM's efficiency but also to achieve this enhancement without the need for additional training or modifications to its parameters.
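As a rough sketch of the scoring rule just described, one round of the ensemble game might be tallied like this (the function name and the answer strings are hypothetical, not taken from the research itself):

```python
def ensemble_score(primary_answer, ally_answer, adversary_answer):
    """Score one toy round of the ensemble game:
    +1 for agreeing with the smaller ally model,
    +1 for differing from the smaller adversary model."""
    score = 0
    if primary_answer == ally_answer:
        score += 1
    if primary_answer != adversary_answer:
        score += 1
    return score

# Best case: the primary model matches its ally and differs from its adversary.
print(ensemble_score("Answer A", "Answer A", "Answer B"))  # -> 2
```

As with the consensus game, the payoff structure alone shapes the primary model's behavior; no retraining or parameter changes are involved.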
Ian Gemp applies ideas from game theory to real-world settings, so that large language models can help with strategic decision-making.
This is merely the beginning. According to Ian Gemp, a research scientist at Google DeepMind, the concept of games can apply to numerous real-life circumstances, allowing the application of game theory strategies in diverse scenarios. In a paper published in February 2024, Gemp and his team concentrated on complex negotiation instances that go beyond simple Q&A interactions. "Our primary goal with this endeavor is to enhance the strategic capabilities of language models," he stated.
During an academic gathering, he highlighted an instance concerning the method of evaluating papers for journal or conference approval, particularly following a critical first review. He explained how, since language models can determine the likelihood of various outcomes, scholars are able to create decision trees akin to those used in poker. These trees map out potential actions and their outcomes. "By employing this approach, it's possible to identify and prioritize various counterarguments," Gemp noted. Essentially, the model advises on the most strategic response.
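Reading a strategy off such a decision tree amounts to computing an expected value for each possible response and picking the best one. Here is a toy one-level version, with invented options and probabilities (none of these numbers come from Gemp's work):

```python
# Hypothetical ways to respond to a critical first review, each with an
# invented probability distribution over the paper's eventual fate.
OPTIONS = {
    "concede and revise":   {"accept": 0.40, "reject": 0.60},
    "rebut with new data":  {"accept": 0.55, "reject": 0.45},
    "rebut on methodology": {"accept": 0.30, "reject": 0.70},
}

# Value each outcome, then rank the responses by expected value --
# the one-level analogue of solving a poker-style decision tree.
VALUES = {"accept": 1.0, "reject": 0.0}

def expected_value(outcome_probs):
    return sum(p * VALUES[outcome] for outcome, p in outcome_probs.items())

ranked = sorted(OPTIONS, key=lambda o: expected_value(OPTIONS[o]), reverse=True)
print(ranked[0])  # -> "rebut with new data", the highest-expected-value reply
```

A real system would estimate those probabilities with a language model and expand the tree several moves deep, but the ranking step is the same.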
Leveraging the understanding gained from game theory, language models will be equipped to manage interactions of greater complexity, moving beyond the confines of simple query-response scenarios. "The significant advantage in the future relates to extended dialogues," Andreas noted. "The subsequent phase involves an AI engaging with a human, rather than merely interacting with another language model."
Jacob sees the work being done by DeepMind as complementary to the existing methods of consensus and ensemble games. He explains, "Both approaches essentially merge the concepts of language models and game theory," despite having slightly different objectives. Whereas the Gemp team is transforming typical scenarios into game formats to aid in strategic decision-making, Jacob mentions, "our focus is on leveraging our understanding of game theory to enhance the performance of language models across a variety of tasks."
Currently, Jacob describes these initiatives as "two branches of the same tree," indicating they are separate approaches to improving language models' capabilities. "I foresee that in one to two years, these two branches will merge," he envisions.
This article has been republished with authorization from Quanta Magazine, a publication that operates independently under the Simons Foundation. Its purpose is to broaden the public's comprehension of science by highlighting recent findings and movements in the fields of mathematics, physics, and biological sciences.