Anthropic Unveils Claude: The AI That Can Operate Your Computer, Aiming to Transform Routine Tasks into Automated Ease
Anthropic Aims for Its AI to Operate Your PC
It took people a while to get comfortable with the idea of chatbots that seem to have minds of their own. The next challenge may be trusting artificial intelligence to operate our computers as well.
Today, Anthropic, a leading rival of OpenAI, announced that its AI model, Claude, has been trained to perform a range of tasks on a computer, such as browsing the internet, launching apps, and entering text using the mouse and keyboard.
Jared Kaplan, Anthropic's chief science officer and an associate professor at Johns Hopkins University, believes we are on the verge of a new era in which models will be able to use all the tools humans use to accomplish tasks.
Kaplan showed WIRED a prerecorded demonstration in which a tool-using version of Claude was asked to plan a trip to watch the sunrise at the Golden Gate Bridge with a friend. In response, Claude opened the Chrome web browser, searched Google for relevant details such as the best viewing spot and the ideal time to arrive, and then used a calendar app to create an event to share with the friend. (It did not go further and supply extra guidance, such as the fastest route to get there.)
In a second demonstration, Claude was asked to build a simple website to promote itself. In a somewhat surreal moment, the model typed a prompt into its own interface to generate the necessary code. It then used Visual Studio Code, a popular code editor from Microsoft, to write a simple website, and opened a text terminal to start a basic web server for testing the site. The result was a decent landing page with a retro, 1990s feel. When asked to fix a problem on the website, Claude returned to the code editor, found the offending snippet of code, and deleted it.
Anthropic's Chief Product Officer, Mike Krieger, expresses the company's aspiration for AI assistants to handle mundane administrative duties, allowing individuals to focus on more creative pursuits. “Imagine eliminating several hours spent on repetitive tasks like copying and pasting. What would you do with that extra time?” he muses. “For me, it would mean more time for playing guitar.”
Starting today, Anthropic is making these computer-use capabilities available through the application programming interface (API) for its most sophisticated multimodal large language model, Claude 3.5 Sonnet. The company also announced an upgraded version of a smaller model, Claude 3.5 Haiku.
Demonstrations of AI systems are often impressive, but getting the technology to work reliably in practice, without significant (or expensive) mistakes, remains difficult. Current models can answer questions and hold a conversation with almost humanlike skill, and they are the foundation of chatbots such as OpenAI's ChatGPT and Google's Gemini. Given a simple instruction, they can also carry out tasks on a computer, either by reading the screen and controlling input devices such as a keyboard and trackpad, or by calling software interfaces directly.
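The screen-and-input approach described above amounts to a loop: observe the screen, choose an action, apply it, and repeat. The following is a minimal Python sketch of that loop, not Anthropic's implementation. The names `Action`, `FakeDesktop`, `scripted_policy`, and `run_agent` are hypothetical, the "screen" is a text summary rather than pixels, and a hard-coded rule stands in for the model.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """One step the agent asks the computer to perform (hypothetical schema)."""
    kind: str          # "click", "type", or "done"
    target: str = ""   # app to click on, or text to type


@dataclass
class FakeDesktop:
    """Stand-in for a real screen: tracks the open app and what was typed."""
    open_app: str = "none"
    typed: list = field(default_factory=list)

    def screenshot(self) -> str:
        # A real agent would receive pixels; here the "screen" is a text summary.
        return f"app={self.open_app} typed={len(self.typed)}"

    def apply(self, action: Action) -> None:
        if action.kind == "click":
            self.open_app = action.target
        elif action.kind == "type":
            self.typed.append(action.target)


def scripted_policy(goal: str, observation: str) -> Action:
    """Placeholder for the model: picks the next action from the observation."""
    if observation.startswith("app=none"):
        return Action("click", "browser")
    if observation.endswith("typed=0"):
        return Action("type", goal)
    return Action("done")


def run_agent(goal: str, desktop: FakeDesktop, max_steps: int = 10) -> list:
    """Observe, decide, act, and repeat until the policy reports it is done."""
    history = []
    for _ in range(max_steps):
        action = scripted_policy(goal, desktop.screenshot())
        history.append(action.kind)
        if action.kind == "done":
            break
        desktop.apply(action)
    return history


desktop = FakeDesktop()
steps = run_agent("sunrise time Golden Gate Bridge", desktop)
print(steps)  # ['click', 'type', 'done']
```

The `max_steps` cap is a deliberate choice in this sketch: an agent that misreads the screen could otherwise loop indefinitely.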
Anthropic reports that its AI system, Claude, surpasses other AI models in various critical evaluations, among them SWE-bench, a benchmark assessing the software engineering prowess of an agent, and OSWorld, a test measuring an agent's proficiency in navigating and utilizing a computer OS. These assertions, however, await confirmation from independent sources. According to Anthropic, Claude achieves a success rate of 14.9 percent in correctly executing tasks within OSWorld. Although this performance is significantly below the average human success rate of about 75 percent, it is notably better than the performance of the leading competitors, including OpenAI's GPT-4, which has a success rate of approximately 7.7 percent.
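To put the reported OSWorld figures side by side, the short Python calculation below uses only the three percentages quoted above. The variable names are illustrative, and the numbers are Anthropic's reported values, not independent measurements.

```python
# OSWorld success rates as quoted in the article (percent).
claude_rate = 14.9
gpt4_rate = 7.7
human_rate = 75.0

ratio_vs_gpt4 = claude_rate / gpt4_rate      # how many times GPT-4's rate
gap_to_human = human_rate - claude_rate      # percentage points still missing
share_of_human = claude_rate / human_rate    # fraction of the human rate

print(f"{ratio_vs_gpt4:.2f}x GPT-4's rate")              # 1.94x GPT-4's rate
print(f"{gap_to_human:.1f} points below the human rate")  # 60.1 points below the human rate
print(f"{share_of_human:.1%} of the human rate")         # 19.9% of the human rate
```

In other words, the reported score is roughly double that of the nearest competitor but only about a fifth of what people achieve on the same tasks.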
Anthropic says that several companies are already testing the agentic version of Claude. These include Canva, which is using it to automate design and editing tasks, and Replit, which uses the model to help with programming. Other early adopters include The Browser Company, Asana, and Notion.
Ofir Press, a postdoctoral researcher at Princeton University who helped create SWE-bench, says that agentic AI tends to lack the ability to plan far ahead and often struggles to recover from its mistakes. To show that such systems are useful, he says, they need to perform strongly on tough, realistic benchmarks, for instance by reliably planning a wide range of trips for a user and booking all the necessary tickets.
Kaplan notes that Claude can already recover from some mistakes surprisingly well. When it hit a fatal error while trying to start a web server, for example, the model changed its command to fix the problem. It also worked out that it needed to enable popups when it got stuck while browsing the web.
Many technology companies are now racing to build such AI tools as they chase market share and prominence. In fact, it may not be long before many people have these tools readily at hand. Microsoft, which has invested over $13 billion in OpenAI, is currently testing tools that can operate Windows computers. Amazon, which has invested heavily in Anthropic, is exploring how such tools could recommend and, eventually, buy goods for its customers.
Sonya Huang, a partner at the venture firm Sequoia who focuses on AI companies, says that for all the hype around AI agents, most companies are really just rebranding AI-powered tools. Speaking to WIRED before Anthropic's announcement, she said the technology currently works best in narrow domains such as coding-related tasks. "It's essential to select areas where it's acceptable for the model to make mistakes," she says. "These are the areas where companies genuinely dedicated to agent-based solutions will emerge."
A major difficulty with AI that acts on a user's behalf is that its mistakes can have far more serious consequences than a garbled chatbot reply. To limit the risk, Anthropic has placed restrictions on what Claude can do, such as limiting its ability to use a person's credit card to make purchases.
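The article does not say how Anthropic enforces these restrictions. Purely as an illustration of the idea, the Python sketch below screens an agent's proposed actions before they run. The keyword list and the function names `is_allowed` and `filter_actions` are assumptions made for this example, and a production system would need something far more robust than keyword matching.

```python
# Hypothetical keywords that mark an action as purchase-related.
PURCHASE_KEYWORDS = ("buy", "checkout", "credit card", "payment")


def is_allowed(action_description: str) -> bool:
    """Return False if the described action looks like a purchase."""
    lowered = action_description.lower()
    return not any(keyword in lowered for keyword in PURCHASE_KEYWORDS)


def filter_actions(proposed: list) -> tuple:
    """Split proposed actions into those that may run and those held back."""
    allowed, blocked = [], []
    for description in proposed:
        (allowed if is_allowed(description) else blocked).append(description)
    return allowed, blocked


allowed, blocked = filter_actions([
    "Open Chrome",
    "Search for sunrise time",
    "Enter credit card number on checkout page",
])
print(allowed)  # ['Open Chrome', 'Search for sunrise time']
print(blocked)  # ['Enter credit card number on checkout page']
```

Blocked actions could then be dropped or passed to the user for explicit approval, which keeps the costliest class of error out of the agent's hands.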
Press, the Princeton researcher, says that if errors can be avoided well enough, users may come to see AI, and computers, in an entirely new way. "I'm extremely enthusiastic about this forthcoming period," he says.
© 2024 Condé Nast. All rights reserved. WIRED may receive a share of revenue from items bought via our website, as a result of our affiliate agreements with retail partners. Content from this website cannot be duplicated, shared, broadcast, stored, or used in any form without explicit written consent from Condé Nast.