AI’s Incremental Advance: Anthropic’s Claude 3.5 Sonnet Edges Out Rivals Amidst Speculation on GPT-5’s Potential Breakthrough
Will Knight
The Anticipation for the Next Major Advancement in AI Continues
The tech community was abuzz when OpenAI revealed GPT-4, its most advanced large language model to date, in March 2023. This iteration was unmistakably superior in its ability to engage in conversation, write code, and tackle a wide array of complex problems, even extending to academic assignments.
Today, Anthropic, a competitor of OpenAI, revealed its latest development in artificial intelligence that promises to enhance chatbots and various applications. While this new model is considered top-notch by certain standards, it represents a modest progression rather than a significant breakthrough.
Anthropic has introduced an enhanced version of its AI, named Claude 3.5 Sonnet, which builds upon the capabilities of the existing Claude 3 series. The updated model excels at mathematics, programming, and logical reasoning, outperforming its predecessors on standard benchmarks. According to Anthropic, this version also operates more swiftly, captures subtleties in language more effectively, and even exhibits an improved sense of humor.
This information is undoubtedly beneficial for individuals who are developing applications and services utilizing Anthropic's AI frameworks. However, this announcement also serves as a reminder that the anticipation for a significant advancement in AI technology, similar to what was achieved with GPT-4, continues to grow. The tech community has been eagerly awaiting the arrival of GPT-5 from OpenAI for over a year, with the company's CEO, Sam Altman, fueling rumors about its potential to revolutionize AI capabilities once again. Training GPT-4 already exceeded a cost of $100 million, and it is widely anticipated that GPT-5 will be substantially larger and carry a heftier price tag.
Even as OpenAI, Google, and other AI developers have introduced models that edge past GPT-4's capabilities, the field is still waiting for a true breakthrough. Recent progress has come in gradual increments, driven by creative changes to model architectures and training processes rather than by the brute-force scaling of size and computing power that produced GPT-4.
Michael Gerstenhaber, who leads product at Anthropic, says Claude 3.5 Sonnet owes its enhanced capabilities primarily to advancements in its training methodology, including targeted feedback aimed at sharpening the model's ability to reason, rather than to a simple increase in size. According to Anthropic, Claude 3.5 Sonnet outperforms the leading models from OpenAI, Google, and Meta on several widely used AI benchmarks: GPQA, which assesses graduate-level knowledge in subjects like biology, physics, and chemistry; MMLU, which spans a range of fields including computer science and history; and HumanEval, which tests a model's coding skills. The margin over rival models is slim, only a few percentage points. Even so, the pace of development is noteworthy, especially considering that Anthropic unveiled its previous model lineup just three months earlier. "The speed at which AI intelligence is evolving really puts into perspective the rapid progress we are making," Gerstenhaber says.
More than a year after the launch of GPT-4 ignited a surge in AI investment, pushing machine intelligence significantly further is proving increasingly difficult. One challenge lies in sourcing novel data for training models like GPT-4, which have already ingested vast amounts of internet-based text, images, and video. Scaling these models up to enhance their learning capabilities could require an investment of billions. OpenAI's most recent update, GPT-4o, introduced last month, emphasized making interactions more seamless and intuitive, improving the quality of the experience rather than making major strides in the AI's problem-solving capacity.
Evaluating the advancement of AI through standard benchmarks, such as those Anthropic cites for Claude, may not provide an accurate picture. Developers are strongly motivated to tune their algorithms to excel on these evaluations, and the data used in common benchmarks can leak into training datasets. "The benchmarks used in the research community are plagued by issues of data contamination, inconsistent scoring and reporting, and the unverified expertise of those grading the tests," says Summer Yue, research director at Scale AI, a firm that helps numerous AI companies develop their models.
Through its Safety, Evaluations, and Alignment Lab, Scale is working to improve how AI intelligence is assessed, crafting evaluations from undisclosed data and vetting the qualifications of the individuals who grade a model's performance.
Yue hopes that businesses will strive to demonstrate their models' effectiveness in more meaningful ways: by highlighting practical applications with tangible business impact, and by sharing clear performance indicators, case studies, and feedback from clients.
Anthropic is promoting the significant effects of Claude 3.5 Sonnet. According to Gerstenhaber, organizations that have adopted this new iteration are experiencing advantages from its enhanced responsiveness and capability to tackle problems. Among the users is the investment company Bridgewater Associates, which employs Claude for assistance in programming tasks. Additionally, some other financial institutions, whose names Gerstenhaber chose not to reveal, are utilizing the model for investment recommendations. "The feedback from the initial access phase has been overwhelmingly favorable," he mentions.
The timeline for the next major advance in artificial intelligence remains uncertain. OpenAI has announced that training for its next significant model is under way. In the meantime, better methods for evaluating the technology's real-world effectiveness will be essential.
© 2024 Condé Nast. All rights reserved.