AI Startup Perplexity Accused of Plagiarism and Violating Web Protocols in WIRED Investigation

Published June 22, 2024

Authored by Tim Marchman

Confusion Surrounds Our Report on Perplexity's Dubious Practices

This week, WIRED released an article discussing the AI-based search company Perplexity, recently criticized by Forbes for plagiarism. Within this piece, my colleague Dhruv Mehrotra and I uncovered that the firm was covertly harvesting content through web crawlers, gathering data from sites that had attempted to restrict its access, thereby contravening its declared commitment to respect the Robots Exclusion Protocol.

Our research, together with findings by developer Robb Knight, identified a specific IP address that is very likely associated with Perplexity but does not appear among its publicly listed IP addresses. This IP was observed scraping test websites, apparently in response to prompts entered into the company's public chat interface. Server logs show that the same IP accessed properties owned by Condé Nast, WIRED's parent company, at least 822 times over the past three months, a figure that probably understates the true count because the company retains only a limited fraction of its data.
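For readers curious how a tally like this can be produced, the following is a minimal sketch, not WIRED's actual method. It assumes access logs in Common Log Format, and the log lines, the IP addresses (taken from reserved documentation ranges), and the function names are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical access-log lines in Common Log Format. The addresses come from
# reserved documentation ranges and are NOT the ones observed in the reporting.
SAMPLE_LOG = """\
203.0.113.7 - - [10/Jun/2024:12:00:01 +0000] "GET /story/a HTTP/1.1" 200 5120
198.51.100.4 - - [10/Jun/2024:12:00:02 +0000] "GET /story/b HTTP/1.1" 200 2048
203.0.113.7 - - [10/Jun/2024:12:00:05 +0000] "GET /story/c HTTP/1.1" 404 312
"""

# The client address is the first whitespace-delimited field of each line.
LINE_RE = re.compile(r"^(\S+) ")


def count_hits_by_ip(log_text):
    """Return a Counter mapping each client IP to its number of requests."""
    hits = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            hits[match.group(1)] += 1
    return hits


def unlisted_ips(hits, published_ips):
    """Return the IPs seen in the log that are absent from a published list."""
    return sorted(ip for ip in hits if ip not in published_ips)


hits = count_hits_by_ip(SAMPLE_LOG)
print(hits["203.0.113.7"])
print(unlisted_ips(hits, {"198.51.100.4"}))
```

Because only retained log lines can be counted, a tally of this kind is a lower bound on the real number of requests.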

We also reported that the chatbot was, in a technical sense, making things up. In one test, it produced a story about a girl following a trail of mushrooms as a summary of a website that, according to server logs, its agent apparently never tried to visit.

Neither Perplexity nor its chief executive, Aravind Srinivas, directly disputed the details presented in WIRED's article. "WIRED's questions demonstrate a significant misunderstanding of Perplexity's operations and the fundamentals of the Internet," Srinivas said in a statement. Backed by Jeff Bezos’ private investment firm and Nvidia, among others, Perplexity says it is valued at one billion dollars following its latest funding round. A report by The Information last month indicated that the company was negotiating a deal expected to triple that valuation to three billion dollars. (Attempts to reach Bezos for comment were unsuccessful; Nvidia declined to comment.)

After our article was published, I asked three leading chatbots about it. OpenAI's ChatGPT and Anthropic's Claude both offered speculative responses about the article's subject and indicated that they could not access the piece itself. The Perplexity chatbot, by contrast, delivered a detailed six-paragraph, 287-word summary that accurately outlined the article's findings and the supporting evidence. (WIRED's server records show that a bot likely associated with Perplexity, though not from one of its known IP addresses, tried to access the article on the day it was published and received a 404 error. Condé Nast does not retain all of its traffic data, so this snapshot may not fully represent the activity of that bot or of other Perplexity agents.) The summary opens with a hyperlink to the original article, and a small gray icon at the end of each of the final five paragraphs links back to the source. Notably, a sentence in the last third of the fifth paragraph is an exact copy of one from the original article, describing a fictional tale of a girl named Amelia exploring Whisper Woods, a mystical forest filled with glowing mushrooms.

My colleagues and I were taken aback by what looked to us like plagiarism. The summary appeared to meet the standards outlined by the Poynter Institute, including the strict seven-to-10-word test, which holds that a writer is unlikely to reproduce seven consecutive words from someone else's work by accident. (Kelly McBride, a senior vice president at Poynter who has described the test as useful for detecting plagiarism, did not respond to an email inquiry.)
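The seven-consecutive-word idea described above can be checked mechanically. The sketch below is one simple way to do so, assuming lowercase word matching with punctuation ignored. The two sample sentences are invented for illustration and are not the sentence at issue, and the function names are hypothetical.

```python
import re


def word_ngrams(text, n):
    """Return the set of n-word sequences in text, lowercased, punctuation ignored."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_runs(original, candidate, n=7):
    """Return the n-word sequences that appear in both texts."""
    return word_ngrams(original, n) & word_ngrams(candidate, n)


# Invented sample sentences, used only to exercise the functions.
original = "The girl followed a winding trail of glowing mushrooms deep into the forest."
summary = "It says she followed a winding trail of glowing mushrooms deep into the woods."

overlap = shared_runs(original, summary)
print(len(overlap) > 0)  # True when at least one seven-word run is shared
```

A shared run flags a passage for human review. It does not by itself settle whether the reuse was properly attributed.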

"Should a student of mine submit a narrative resembling this, I'd present their case to the committee on academic integrity for copying," remarked John Schwartz, a professor at the University of Texas at Austin's school of journalism, upon reviewing both the initial article and its condensed version. "To me, it seems overly similar. As I went through the version from Perplexity, it felt like I was hearing an echo."

Perplexity and its chief executive, Srinivas, did not reply to a detailed request for comment on the critical assessments of the company that experts offered for this article.

Bill Grueskin, a professor specializing in practical journalism at Columbia Journalism School, communicated via email that the overview seemed "pretty much okay" for a chatbot that was clearly identified as such. However, he mentioned it was difficult to form a full opinion without having read the original article from WIRED. Grueskin pointed out that it's problematic to use a direct quote without attribution, stating, "Quoting a sentence verbatim without quote marks is bad, of course." He expressed that he would be quite dismayed if a news organization published a summary generated by AI without revealing its source, or even worse, if it implied the summary was written by a human. It's noted that Perplexity did not suggest the content was created by a human.

Fortunately for Perplexity and its supporters, the issue at hand is more of an intellectual argument. The term plagiarism refers to a set of ethical standards crucial in fields such as journalism and academia, where tracing the origin of information is key. However, it does not carry legal weight by itself. For instance, if another company were to use a significant portion of the film Inside Out 2 in their own work, Disney would not accuse them of plagiarism but would instead file a lawsuit for copyright violation. In a similar vein, a letter from Forbes to Perplexity reportedly warning of legal measures specifically cites “willful infringement” of Forbes’ copyright. According to legal authorities, this places Perplexity in a relatively more secure legal position—at least for the time being.

"Deciding on the copyright issue is challenging," explains James Grimmelmann, a digital and information law professor at Cornell University. He points out that, although the summary conveys factual information that isn't subject to copyright, it also reproduces some content and encapsulates the essence of the original work. "This copyright scenario isn't clear-cut, yet it's not to be taken lightly or dismissed as inconsequential," he states.

Grimmelmann points to several other legal theories Perplexity might face, including consumer protection, false advertising, and deceptive trade practices claims. Such claims could be brought over the company's stated adherence to the Robots Exclusion Protocol if it in fact failed to honor it. Although the protocol is voluntary, it is widely followed. Grimmelmann also suggests that Perplexity could face a hot news misappropriation claim, which arises when a competitor is accused of summarizing content in a way that deprives the original publisher of its initial financial benefits or diminishes the content's worth to its paying audience. Perplexity's ability to bypass paywalls and its automated operation are particularly troubling points for the company, according to Grimmelmann.
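For context on what honoring the Robots Exclusion Protocol involves, here is a minimal sketch of the check a compliant crawler makes before fetching a page, using Python's standard `urllib.robotparser` module. The robots.txt content, the crawler names `ExampleBot` and `OtherBot`, and the URL are hypothetical and do not reflect any real site's rules or any real crawler's behavior.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is barred from the whole site,
# while every other crawler is allowed.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/story"))
print(is_allowed(ROBOTS_TXT, "OtherBot", "https://example.com/story"))
```

Nothing in the protocol enforces the result. A crawler that skips this check, or that identifies itself under a different user agent, can still request the page, which is why compliance rests on the operator's good faith.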

Grimmelmann suggests that Perplexity could lose the legal shield provided by Section 230 of the Communications Decency Act. Among other things, this law protects search engines such as Google from liability for defamation when they link to defamatory content, by treating them as mere conduits for information from other sources. In his view, Perplexity enjoys the same protection only as long as it summarizes content accurately. (There is ongoing debate about whether AI-generated material is covered by Section 230 at all.)

"If they inaccurately summarize the story and turn it into something defamatory that it originally wasn't, that's when they could face legal issues. This risk is heightened if they fail to properly attribute the source, making it difficult for readers to verify the story themselves," he explains. "Should Perplexity's modifications introduce defamation, Section 230 protections wouldn't apply, as established by several legal precedents interpreting the law."

In an instance witnessed by WIRED, the chatbot developed by Perplexity inaccurately stated, despite directly referencing the source, that WIRED had published an article claiming a certain law enforcement officer in California was guilty of a criminal act. (“We've openly acknowledged that responses won’t always be correct and might include fabrications,” responded Srinivas when asked about the issue for the article published earlier this week, “however, enhancing accuracy and the overall experience for users remains a fundamental part of our goals.”)

Grimmelmann suggests, “For a formal approach, I believe these allegations could overcome a motion to dismiss based on various legal theories. It doesn't guarantee a victory, but should the assertions made by Forbes, WIRED, and the police officer—among potential claimants—hold true, these are the type of issues that, if substantiated and coupled with unfavorable facts for Perplexity, might result in them being held liable.”

Contrary to Grimmelmann's viewpoint, Pam Samuelson, a law and information professor at UC Berkeley, argues in an email that copyright violation revolves around utilizing someone else's creative work in a manner that diminishes the original creator's potential to earn fair compensation for the value of the unauthorized usage. She suggests that the infringement likely does not occur with the mere use of one exact sentence.

Bhamati Viswanathan, a faculty member at New England Law, doubts that the summary meets the threshold of substantial similarity usually needed for a copyright infringement case to succeed. However, she believes this doesn't settle the issue. "It definitely shouldn't be considered acceptable," she stated in an email. "My stance is that it should suffice to move your lawsuit beyond the initial dismissal stage—especially considering the evidence you possess of genuine copying."

Overall, she contends that concentrating solely on the specific technical aspects of these arguments might not be the best approach, given that technology firms are capable of tweaking their operations to comply with the technicalities of outdated copyright legislation, yet still significantly undermine its intended goals. She posits that to address the market imbalances and truly support the foundational objectives of US intellectual property rights—which include enabling individuals to reap financial rewards from unique creative endeavors such as journalism, thereby encouraging them to continue producing work that theoretically benefits society—a completely new legal structure might be required.

In her viewpoint, there are compelling reasons to believe that the foundation of generative AI involves a significant amount of copyright infringement. She poses an initial question: what steps do we take next? Furthermore, she delves into a more profound inquiry: how can we protect the livelihoods of creators and the sustainability of the creative industries? Paradoxically, AI is underscoring the increased value and demand for creativity. Yet, while acknowledging this, there's a visible threat to the very systems that support creators in earning an income through their art. This presents a critical challenge that demands immediate resolution, not a deferred one.
