AI Startup Perplexity Accused of Plagiarism and Violating Web Protocols in WIRED Investigation



Authored by Tim Marchman

Confusion Surrounds Our Report on Perplexity's Dubious Practices

This week, WIRED released an article discussing the AI-based search company Perplexity, recently criticized by Forbes for plagiarism. Within this piece, my colleague Dhruv Mehrotra and I uncovered that the firm was covertly harvesting content through web crawlers, gathering data from sites that had attempted to restrict its access, thereby contravening its declared commitment to respect the Robots Exclusion Protocol.
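For readers unfamiliar with the Robots Exclusion Protocol, the following is a minimal sketch of how a well-behaved crawler might consult a site's robots.txt rules before fetching a page. It uses Python's standard `urllib.robotparser` module. The agent name `ExampleBot`, the rules, and the URL are hypothetical illustrations, not Perplexity's actual configuration or any real site's file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is blocked, everyone else is allowed.
EXAMPLE_ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/story"
    print(is_allowed(EXAMPLE_ROBOTS_TXT, "ExampleBot", url))
    print(is_allowed(EXAMPLE_ROBOTS_TXT, "OtherBot", url))
```

The protocol is advisory: nothing in this check stops a crawler that skips it, which is why compliance rests on what a company says it does.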

Our analysis, together with findings by developer Robb Knight, identified a specific IP address that is very likely associated with Perplexity but does not appear among its publicly disclosed IP addresses. This IP address was observed scraping test websites, apparently in response to prompts entered into the company's public chat interface. Server logs showed that the same IP address accessed properties owned by Condé Nast, WIRED's parent company, at least 822 times over the previous three months. That figure is probably an undercount, because the company retains only a fraction of its data.
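As a rough illustration of this kind of log analysis, the sketch below tallies requests per client IP address. It assumes a common access-log layout in which the client IP is the first whitespace-separated field. The log lines and the addresses (taken from the reserved documentation range) are invented for the example and do not come from WIRED's or Condé Nast's records.

```python
from collections import Counter
from typing import Iterable


def count_requests_by_ip(log_lines: Iterable[str]) -> Counter:
    """Count requests per client IP, assuming the IP is the first field of each line."""
    counts: Counter = Counter()
    for line in log_lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical log lines for illustration only.
    sample_log = [
        '203.0.113.7 - - [01/Jan/2024:00:00:01 +0000] "GET /story HTTP/1.1" 200 512',
        '203.0.113.7 - - [01/Jan/2024:00:00:05 +0000] "GET /other HTTP/1.1" 404 0',
        '198.51.100.2 - - [01/Jan/2024:00:00:09 +0000] "GET /story HTTP/1.1" 200 512',
    ]
    for ip, hits in count_requests_by_ip(sample_log).most_common():
        print(ip, hits)
```

A count produced this way is only a lower bound when logs are sampled or partly discarded, which is the caveat noted above.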

We also disclosed that the chatbot was engaging in deceptive behavior, from a technical standpoint. In a particular test, it created a narrative about a girl chasing a path of mushrooms as a summary for the content of a website, which, based on server logs, its agent seemingly never tried to visit.

Neither Perplexity nor its chief executive, Aravind Srinivas, directly disputed the details of WIRED's reporting. "WIRED's questions demonstrate a significant misunderstanding of Perplexity's operations and the fundamentals of the Internet," Srinivas said in a statement. Perplexity, whose backers include Jeff Bezos’ private investment firm and Nvidia, says it is valued at one billion dollars following its latest funding round. The Information reported last month that the company was negotiating a deal expected to triple that valuation to three billion dollars. (Attempts to reach Bezos for comment were unsuccessful; Nvidia declined to comment.)

After we published our article, I asked three leading chatbots about its contents. OpenAI's ChatGPT and Anthropic's Claude both offered speculative responses about the article's subject and indicated that they could not access the piece itself. The Perplexity chatbot, in contrast, delivered a detailed six-paragraph, 287-word summary that precisely outlined the article's findings and the supporting evidence. (WIRED's server records show that a bot, likely associated with Perplexity though not from one of its known IP addresses, attempted to access the article on the day it was published and received a 404 error. The company does not preserve all of its traffic data, so this snapshot may not fully represent the activity of that bot or of other Perplexity agents.) A hyperlink to the original article appears at the beginning of the generated summary, and a small gray icon at the end of each of the final five paragraphs links back to the source material. Notably, a sentence in the last third of the fifth paragraph is an exact copy of one from the original article, which describes a fictional tale of a girl named Amelia exploring a mystical forest, known as Whisper Woods, that is filled with glowing mushrooms.

My colleagues and I were taken aback by what we perceived to be an act of plagiarism. It seemed to meet the standards outlined by the Poynter Institute, notably passing the stringent seven-to-10 word test. This rule suggests that accidentally reproducing seven consecutive words from someone else's writing is unlikely. (Kelly McBride, a senior vice president at Poynter who has mentioned the usefulness of this test for detecting plagiarism, did not respond to an email inquiry.)
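The idea behind the seven-to-10 word test can be sketched in a few lines of code: break both texts into runs of seven consecutive words and look for runs that appear in both. This is an illustrative approximation, not Poynter's own procedure. The function names, the tokenization rule, and the sample sentences are all assumptions made for the example.

```python
import re
from typing import Set, Tuple


def word_ngrams(text: str, n: int) -> Set[Tuple[str, ...]]:
    """Return the set of all runs of n consecutive words in text (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_runs(source: str, candidate: str, n: int = 7) -> Set[Tuple[str, ...]]:
    """Return the n-word runs that appear in both source and candidate."""
    return word_ngrams(source, n) & word_ngrams(candidate, n)


if __name__ == "__main__":
    # Hypothetical sentences for illustration only.
    source = "The quick brown fox jumps over the lazy dog near the river bank"
    candidate = "Reports say the quick brown fox jumps over the lazy dog today"
    for run in sorted(shared_runs(source, candidate)):
        print(" ".join(run))
```

Any non-empty result means at least seven consecutive words match, which is the threshold the test treats as unlikely to occur by accident.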

"Should a student of mine submit a narrative resembling this, I'd present their case to the committee on academic integrity for copying," remarked John Schwartz, a professor at the University of Texas at Austin's school of journalism, upon reviewing both the initial article and its condensed version. "To me, it seems overly similar. As I went through the version from Perplexity, it felt like I was hearing an echo."

Perplexity and its chief executive officer, Srinivas, did not respond to a detailed request for comment on the criticisms that experts made of the company for this article.

Bill Grueskin, a professor specializing in practical journalism at Columbia Journalism School, communicated via email that the overview seemed "pretty much okay" for a chatbot that was clearly identified as such. However, he mentioned it was difficult to form a full opinion without having read the original article from WIRED. Grueskin pointed out that it's problematic to use a direct quote without attribution, stating, "Quoting a sentence verbatim without quote marks is bad, of course." He expressed that he would be quite dismayed if a news organization published a summary generated by AI without revealing its source, or even worse, if it implied the summary was written by a human. It's noted that Perplexity did not suggest the content was created by a human.

Fortunately for Perplexity and its supporters, the issue at hand is more of an intellectual argument. The term plagiarism refers to a set of ethical standards crucial in fields such as journalism and academia, where tracing the origin of information is key. However, it does not carry legal weight by itself. For instance, if another company were to use a significant portion of the film Inside Out 2 in their own work, Disney would not accuse them of plagiarism but would instead file a lawsuit for copyright violation. In a similar vein, a letter from Forbes to Perplexity reportedly warning of legal measures specifically cites “willful infringement” of Forbes’ copyright. According to legal authorities, this places Perplexity in a relatively more secure legal position—at least for the time being.

"Deciding on the copyright issue is challenging," explains James Grimmelmann, a digital and information law professor at Cornell University. He points out that, although the summary conveys factual information that isn't subject to copyright, it also reproduces some content and encapsulates the essence of the original work. "This copyright scenario isn't clear-cut, yet it's not to be taken lightly or dismissed as inconsequential," he states.

Grimmelmann points to several other legal risks that Perplexity might face, including claims related to consumer protection, false advertising, or misleading business practices. These could arise from the company's stating that it adheres to the Robots Exclusion Protocol while failing to comply with it. The protocol is voluntary, but it is widely followed. Grimmelmann also suggests that Perplexity could be exposed to a hot news misappropriation claim, in which a publisher alleges that a competitor's summary of its content either deprives it of the initial financial benefit of that content or diminishes the content's worth to its paying audience, potentially violating copyright laws. According to Grimmelmann, Perplexity's ability to bypass paywalls and its automated operation are particularly troubling points for the company.

Grimmelmann suggests that Perplexity could be losing the legal shield provided by Section 230 of the Communications Decency Act. This legislation, among its provisions, safeguards search engines such as Google from being held accountable for defamation when they direct users to content that defames, by treating them as mere conduits of information from other sources. In his view, Perplexity enjoys the same protection as long as it provides accurate summaries of content. (There's ongoing discussion about whether material produced by AI falls under the protection of Section 230 at all.)

"If they inaccurately summarize the story and turn it into something defamatory that it originally wasn't, that's when they could face legal issues. This risk is heightened if they fail to properly attribute the source, making it difficult for readers to verify the story themselves," he explains. "Should Perplexity's modifications introduce defamation, Section 230 protections wouldn't apply, as established by several legal precedents interpreting the law."

In an instance witnessed by WIRED, the chatbot developed by Perplexity inaccurately stated, despite directly referencing the source, that WIRED had published an article claiming a certain law enforcement officer in California was guilty of a criminal act. (“We've openly acknowledged that responses won’t always be correct and might include fabrications,” responded Srinivas when asked about the issue for the article published earlier this week, “however, enhancing accuracy and the overall experience for users remains a fundamental part of our goals.”)

Grimmelmann suggests, “For a formal approach, I believe these allegations could overcome a motion to dismiss based on various legal theories. It doesn't guarantee a victory, but should the assertions made by Forbes, WIRED, and the police officer—among potential claimants—hold true, these are the type of issues that, if substantiated and coupled with unfavorable facts for Perplexity, might result in them being held liable.”

Contrary to Grimmelmann's viewpoint, Pam Samuelson, a law and information professor at UC Berkeley, argues in an email that copyright violation revolves around utilizing someone else's creative work in a manner that diminishes the original creator's potential to earn fair compensation for the value of the unauthorized usage. She suggests that the infringement likely does not occur with the mere use of one exact sentence.

Bhamati Viswanathan, a faculty member at New England Law, doubts that the summary meets the level of close resemblance usually required for a copyright infringement case to succeed. However, she believes this doesn't settle the issue. "It definitely shouldn't be considered acceptable," she stated in an email. "My stance is that it should suffice to move your lawsuit beyond the initial dismissal stage, especially considering the evidence you possess of genuine copying."

Overall, she contends that concentrating solely on the specific technical aspects of these arguments might not be the best approach, given that technology firms are capable of tweaking their operations to comply with the technicalities of outdated copyright legislation, yet still significantly undermine its intended goals. She posits that to address the market imbalances and truly support the foundational objectives of US intellectual property rights—which include enabling individuals to reap financial rewards from unique creative endeavors such as journalism, thereby encouraging them to continue producing work that theoretically benefits society—a completely new legal structure might be required.

In her viewpoint, there are compelling reasons to believe that the foundation of generative AI involves a significant amount of copyright infringement. She poses an initial question: what steps do we take next? Furthermore, she delves into a more profound inquiry: how can we protect the livelihoods of creators and the sustainability of the creative industries? Paradoxically, AI is underscoring the increased value and demand for creativity. Yet, while acknowledging this, there's a visible threat to the very systems that support creators in earning an income through their art. This presents a critical challenge that demands immediate resolution, not a deferred one.

© 2024 Condé Nast. All rights reserved. Purchases made through our website may result in WIRED receiving a share of the sales, courtesy of our Affiliate Partnership agreements with various retailers. Reproduction, distribution, transmission, caching, or any form of usage of the site's material is strictly prohibited without the express written consent from Condé Nast.
