AI Startup Perplexity Accused of Plagiarism and Violating Web Protocols in WIRED Investigation

Authored by Tim Marchman

Confusion Surrounds Our Report on Perplexity's Dubious Practices

This week, WIRED released an article discussing the AI-based search company Perplexity, recently criticized by Forbes for plagiarism. Within this piece, my colleague Dhruv Mehrotra and I uncovered that the firm was covertly harvesting content through web crawlers, gathering data from sites that had attempted to restrict its access, thereby contravening its declared commitment to respect the Robots Exclusion Protocol.
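For readers unfamiliar with the Robots Exclusion Protocol, the following is a minimal sketch of how a well-behaved crawler might consult a site's robots.txt rules before fetching a page. The robots.txt contents, the `ExampleBot` and `OtherBot` agent names, and the `may_fetch` helper are all hypothetical and chosen only for illustration; they are not drawn from WIRED's reporting or from any real publisher's configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one named crawler, allows everyone else.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines directly, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/story"
    print("ExampleBot allowed:", may_fetch(ROBOTS_TXT, "ExampleBot", url))
    print("OtherBot allowed:", may_fetch(ROBOTS_TXT, "OtherBot", url))
```

The check is purely advisory: nothing in the protocol prevents a crawler from skipping it, which is why compliance rests on the crawler operator's own commitment.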

Our research, alongside findings by developer Robb Knight, pinpointed a particular IP address that is very likely associated with Perplexity but does not appear among its publicly disclosed IP addresses. This IP was caught scraping test websites, seemingly in response to prompts entered into the firm's publicly accessible chat interface. Server logs revealed that this IP accessed properties owned by Condé Nast, the parent company of WIRED, at least 822 times over the last three months. That figure probably falls short of the actual count, because the company retains only a limited fraction of its traffic data.
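The kind of log analysis described above can be sketched roughly as follows: tally requests per client IP, then check whether a given IP falls inside a crawler operator's published ranges. Everything here is invented for illustration, including the sample log lines, the addresses (taken from ranges reserved for documentation), the `PUBLISHED_RANGES` list, and the helper names. It is not the method or data used in the investigation.

```python
import ipaddress
from collections import Counter

# Invented access-log lines in a common web-server format.
SAMPLE_LOG = """\
203.0.113.7 - - [01/Jan/2000:12:00:01 +0000] "GET /story/a HTTP/1.1" 200 5120
198.51.100.4 - - [01/Jan/2000:12:00:02 +0000] "GET /story/b HTTP/1.1" 200 4096
203.0.113.7 - - [01/Jan/2000:12:00:05 +0000] "GET /story/c HTTP/1.1" 404 312
"""

# Hypothetical list of ranges a crawler operator says it uses.
PUBLISHED_RANGES = [ipaddress.ip_network("192.0.2.0/24")]


def hits_by_ip(log_text: str) -> Counter:
    """Count requests per client IP (the first whitespace-separated field)."""
    return Counter(line.split()[0] for line in log_text.splitlines() if line.strip())


def is_published(ip: str, ranges) -> bool:
    """Return True if ip falls inside any of the given networks."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in ranges)


if __name__ == "__main__":
    for ip, count in hits_by_ip(SAMPLE_LOG).most_common():
        status = "published" if is_published(ip, PUBLISHED_RANGES) else "unlisted"
        print(ip, count, status)
```

A count produced this way is only a lower bound when logs are sampled or retained for a limited time, which matches the caveat in the reporting.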

We also disclosed that the chatbot was engaging in deceptive behavior, from a technical standpoint. In one test, it invented a story about a girl following a trail of mushrooms as its summary of a website that, according to server logs, its agent apparently never tried to visit.

Neither Perplexity nor its chief executive, Aravind Srinivas, directly disputed the details of WIRED's reporting. "WIRED's questions demonstrate a significant misunderstanding of Perplexity's operations and the fundamentals of the Internet," Srinivas said in a statement. Backed by investors including Jeff Bezos’ private investment firm and Nvidia, Perplexity claims a valuation of one billion dollars following its latest funding round. A report by The Information last month indicated that the company was negotiating a deal expected to triple its valuation to three billion dollars. (Attempts to reach Bezos for comment were unsuccessful; Nvidia declined to comment.)

After our article was published, I asked three leading chatbots about its contents. OpenAI's ChatGPT and Anthropic's Claude both speculated about what the article might cover while indicating that they couldn't access the piece itself. In contrast, the Perplexity chatbot delivered a six-paragraph, 287-word summary that closely outlined the article's findings and the supporting evidence. (WIRED's server records show that a bot likely associated with Perplexity, though not from one of its known IP addresses, attempted to access the article on its release day and received a 404 error. The company does not preserve all traffic data, so this snapshot may not fully represent the activity of that bot or of other Perplexity agents.) The summary opens with a hyperlink to the original article, and a small gray icon at the end of each of the final five paragraphs links back to the source material. Notably, the last third of the fifth paragraph reproduces, word for word, a sentence from the original article describing a made-up story about a girl named Amelia exploring Whisper Woods, a mystical forest filled with glowing mushrooms.

My colleagues and I were taken aback by what we perceived to be an act of plagiarism. It seemed to meet the standards outlined by the Poynter Institute, notably passing the stringent seven-to-10 word test. This rule suggests that accidentally reproducing seven consecutive words from someone else's writing is unlikely. (Kelly McBride, a senior vice president at Poynter who has mentioned the usefulness of this test for detecting plagiarism, did not respond to an email inquiry.)
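The seven-word test can be approximated mechanically by looking for runs of seven consecutive words that appear in both texts. The sketch below is one simple way to do that. The function names and the sample sentences are made up for illustration, and this is not a tool Poynter or WIRED is known to use.

```python
import re


def word_ngrams(text: str, n: int) -> set:
    """Return the set of all runs of n consecutive words, lowercased."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_runs(original: str, candidate: str, n: int = 7) -> set:
    """Return the n-word runs that occur in both texts."""
    return word_ngrams(original, n) & word_ngrams(candidate, n)


if __name__ == "__main__":
    # Invented sample sentences, not quotations from any article.
    original = "The quick brown fox jumps over the lazy dog near the quiet river bank."
    candidate = "A report says the quick brown fox jumps over the lazy dog, according to witnesses."
    for run in sorted(shared_runs(original, candidate)):
        print(" ".join(run))
```

A nine-word verbatim overlap, as in this sample, yields three overlapping seven-word runs. Any non-empty result flags a passage for human review rather than settling the question on its own.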

"Should a student of mine submit a narrative resembling this, I'd present their case to the committee on academic integrity for copying," remarked John Schwartz, a professor at the University of Texas at Austin's school of journalism, upon reviewing both the initial article and its condensed version. "To me, it seems overly similar. As I went through the version from Perplexity, it felt like I was hearing an echo."

Perplexity and its chief executive officer, Srinivas, did not reply to a detailed request for comment on the criticisms that experts made of the company for this article.

Bill Grueskin, a professor specializing in practical journalism at Columbia Journalism School, communicated via email that the overview seemed "pretty much okay" for a chatbot that was clearly identified as such. However, he mentioned it was difficult to form a full opinion without having read the original article from WIRED. Grueskin pointed out that it's problematic to use a direct quote without attribution, stating, "Quoting a sentence verbatim without quote marks is bad, of course." He expressed that he would be quite dismayed if a news organization published a summary generated by AI without revealing its source, or even worse, if it implied the summary was written by a human. (Perplexity did not suggest that the content was created by a human.)

Fortunately for Perplexity and its supporters, the issue at hand is more of an intellectual argument. The term plagiarism refers to a set of ethical standards crucial in fields such as journalism and academia, where tracing the origin of information is key. However, it does not carry legal weight by itself. For instance, if another company were to use a significant portion of the film Inside Out 2 in their own work, Disney would not accuse them of plagiarism but would instead file a lawsuit for copyright violation. In a similar vein, a letter from Forbes to Perplexity reportedly warning of legal measures specifically cites “willful infringement” of Forbes’ copyright. According to legal authorities, this places Perplexity in a relatively more secure legal position—at least for the time being.

"Deciding on the copyright issue is challenging," explains James Grimmelmann, a digital and information law professor at Cornell University. He points out that, although the summary conveys factual information that isn't subject to copyright, it also reproduces some content and encapsulates the essence of the original work. "This copyright scenario isn't clear-cut, yet it's not to be taken lightly or dismissed as inconsequential," he states.

Grimmelmann points out several other potential legal challenges for Perplexity, including claims based on consumer protection, false advertising, or deceptive business practices. These could arise from the company's declaring that it adheres to the Robots Exclusion Protocol while failing to comply with it. The protocol is voluntary, but it is widely followed. Grimmelmann also suggests that Perplexity could be exposed to claims of hot news misappropriation, in which a competitor is accused of summarizing content in a way that either deprives the original publisher of its initial financial benefit or diminishes the content's worth to its paying audience. That Perplexity can bypass paywalls and that it operates on an automated basis are particularly troubling points for the company, according to Grimmelmann.

Grimmelmann suggests that Perplexity could also be putting at risk the legal shield provided by Section 230 of the Communications Decency Act. Among other things, this law protects search engines such as Google from liability for defamation when they direct users to defamatory content, by treating them as mere conduits for information from other sources. In his view, Perplexity enjoys the same protection as long as it provides accurate summaries of content. (There's ongoing discussion about whether material produced by AI falls under the protection of Section 230 at all.)

"If they inaccurately summarize the story and turn it into something defamatory that it originally wasn't, that's when they could face legal issues. This risk is heightened if they fail to properly attribute the source, making it difficult for readers to verify the story themselves," he explains. "Should Perplexity's modifications introduce defamation, Section 230 protections wouldn't apply, as established by several legal precedents interpreting the law."

In an instance witnessed by WIRED, the chatbot developed by Perplexity inaccurately stated, despite directly referencing the source, that WIRED had published an article claiming a certain law enforcement officer in California was guilty of a criminal act. (“We've openly acknowledged that responses won’t always be correct and might include fabrications,” responded Srinivas when asked about the issue for the article published earlier this week, “however, enhancing accuracy and the overall experience for users remains a fundamental part of our goals.”)

Grimmelmann suggests, “For a formal approach, I believe these allegations could overcome a motion to dismiss based on various legal theories. It doesn't guarantee a victory, but should the assertions made by Forbes, WIRED, and the police officer—among potential claimants—hold true, these are the type of issues that, if substantiated and coupled with unfavorable facts for Perplexity, might result in them being held liable.”

Contrary to Grimmelmann's viewpoint, Pam Samuelson, a law and information professor at UC Berkeley, argues in an email that copyright violation revolves around utilizing someone else's creative work in a manner that diminishes the original creator's potential to earn fair compensation for the value of the unauthorized usage. She suggests that the infringement likely does not occur with the mere use of one exact sentence.

Bhamati Viswanathan, an academic member at New England Law, expresses her doubts that the summary meets the required level of close resemblance often needed for a copyright infringement case to succeed. However, she believes this doesn't conclude the issue. "It definitely shouldn't be considered acceptable," she stated in an email. "My stance is that it should suffice to move your lawsuit beyond the initial dismissal stage—especially considering the evidence you possess of genuine copying."

Overall, she contends that concentrating solely on the specific technical aspects of these arguments might not be the best approach, given that technology firms are capable of tweaking their operations to comply with the technicalities of outdated copyright legislation, yet still significantly undermine its intended goals. She posits that to address the market imbalances and truly support the foundational objectives of US intellectual property rights—which include enabling individuals to reap financial rewards from unique creative endeavors such as journalism, thereby encouraging them to continue producing work that theoretically benefits society—a completely new legal structure might be required.

In her viewpoint, there are compelling reasons to believe that the foundation of generative AI involves a significant amount of copyright infringement. She poses an initial question: what steps do we take next? Furthermore, she delves into a more profound inquiry: how can we protect the livelihoods of creators and the sustainability of the creative industries? Paradoxically, AI is underscoring the increased value and demand for creativity. Yet, while acknowledging this, there's a visible threat to the very systems that support creators in earning an income through their art. This presents a critical challenge that demands immediate resolution, not a deferred one.


© 2024 Condé Nast. All rights reserved. Purchases made through our website may result in WIRED receiving a share of the sales, courtesy of our Affiliate Partnership agreements with various retailers. Reproduction, distribution, transmission, caching, or any form of usage of the site's material is strictly prohibited without the express written consent from Condé Nast. Advertising Choices
