
Taking Back Control: How to Opt Out and Protect Your Data from AI Training



When you make a purchase through links featured in our articles, we might receive a commission. This contributes to our journalistic efforts. Find out more. Additionally, think about subscribing to WIRED.

Every piece of content you've ever shared on the internet—whether it's an awkward tweet, an old blog entry, a glowing review of a diner, or a grainy selfie on Instagram—has likely been ingested and utilized in the development of today's wave of generative artificial intelligence.

Tools such as ChatGPT and various image generators rely heavily on the vast amounts of data we provide. Whether it ends up powering a chatbot or another kind of generative application, the information you've uploaded to servers across the internet can be put to work for machine learning.

Technology corporations have harvested extensive amounts of online data to build generative AI, frequently overlooking the concerns of content creators, copyright rules, or individual privacy. Additionally, companies in possession of massive collections of users' contributions are now aiming to capitalize on the AI boom by monetizing this data. This includes, notably, Reddit.

Nonetheless, amid mounting lawsuits and investigations into generative AI and its secretive handling of data, there have been modest steps toward giving people more control over what happens to their online posts. Some companies now let individual users and commercial clients opt out of having their content used for AI training or sold for that purpose. Here's what you can currently opt out of—and where those options fall short.

Revision Notice: The contents of this guide were revised in October 2024. Fresh websites and services have been incorporated into the ensuing list, and certain instructions have been updated to reflect current practices. This document will be subject to ongoing modifications to align with the changing features and guidelines of the tools mentioned.

Setting Expectations on Opting Out

Before diving into the methods for opting out, it's important to temper expectations. The reality is that numerous AI developers have extensively mined the internet, meaning that any content you've shared online is likely already incorporated into their databases. Furthermore, these companies are often tight-lipped regarding the specifics of the data they've gathered, bought, or utilized for training their algorithms. Niloofar Mireshghallah, an AI privacy specialist at the University of Washington, points out, “Our knowledge in this area is quite limited. By and large, the operations remain opaque.”

Mireshghallah points out that firms often create complex procedures for individuals to withdraw consent for their data to be utilized in AI development. Additionally, many individuals are not fully aware of the agreements they have entered into or the manner in which their data is employed. This issue becomes even more complicated when considering the impact of legal frameworks, including copyright laws and Europe's stringent privacy regulations. Major corporations like Facebook, Google, and X have stipulated in their privacy agreements the potential use of consumer data for AI training purposes.

Mireshghallah points out that although there are numerous technical methods for deleting data or enabling AI systems to "unlearn" information, there's a significant lack of knowledge regarding the procedures being utilized. These methods might be either hidden or require considerable effort. Efforts to have content eliminated from AI training datasets are expected to face substantial challenges. When it comes to companies beginning to permit users to opt out of future data collection or sharing, they typically set the default to automatically include users unless they specifically choose not to participate.

"Many businesses introduce obstacles because they understand individuals aren't likely to search for it," Thorin Klosowski, an advocate for security and privacy at the Electronic Frontier Foundation, mentions. "Choosing to opt-in requires intentional effort, unlike opting out, which necessitates awareness of its existence."

Unlike many, a few firms engaged in developing AI capabilities and machine learning algorithms choose not to enroll users automatically. "By default, we don't utilize user-generated data to refine our models. We might incorporate user interactions and responses into training Claude, but only when we receive explicit consent, for instance, when users rate a particular Claude response with a thumbs up or down to offer feedback," explains Jennifer Martinez, representing Anthropic. In this case, the latest version of Anthropic's Claude chatbot is crafted using publicly accessible data and information from third-party sources—material shared by individuals on the internet—but it excludes personal user data.

The bulk of this guide focuses on opting out of text-based data collection, yet visual artists are also utilizing the "Have I Been Trained?" platform to indicate their artwork should not be utilized in training datasets. Spawning, the company behind this initiative, offers a service that enables individuals to check if their work has been collected without consent and to withdraw from any subsequent training uses. "You can opt out of anything that has a URL. We primarily search for images, but through our browser extension, you can opt out of any type of media," explains Jordan Meyer, Spawning's cofounder and CEO. Stability AI, known for its text-to-image generator Stable Diffusion, is one of the firms that has acknowledged adhering to this opt-out mechanism.

The following roster comprises only those organizations that have established procedures for opting out. For instance, Meta does not provide such a facility. "At this moment, we lack a specific opt-out option, but we have developed tools within our platform enabling users to remove their personal data from conversations involving Meta AI across our applications," states Emil Vazquez, a representative for Meta. The complete methodology for this can be found here.

Additionally, Microsoft's Copilot is set to introduce a new method allowing users to opt out of having their data used for training its generative AI, expected to launch in the near future. "A select number of interactions from users with Copilot and Copilot Pro are utilized to refine the service," explains Donny Turnbaugh, who represents the company. "Microsoft implements measures to anonymize data prior to its usage, aiming to safeguard user identities." Despite the anonymization process—where entered data is stripped of any details that could trace back to the individual—users concerned about privacy might seek further control over their personal information and decide to opt out once the option is made available.

Opting Out of AI Training Participation

When using Adobe's Creative Cloud for file storage, your data might be utilized by Adobe to enhance its programs. This analysis is exclusive to files stored in the cloud and does not extend to data solely kept on your personal devices. Moreover, Adobe asserts that it does not employ these files for generative AI model training purposes, except under a specific condition. As stated in the recently revised FAQ section of Adobe, "Your content will not be used for training generative AI models unless you voluntarily contribute content to the Adobe Stock marketplace."

For individuals with a personal Adobe account, withdrawing from content analysis is straightforward. Simply navigate to Adobe's privacy settings page, proceed to the section titled "Content analysis for product improvement," and switch the toggle to the off position. Those holding a business or educational account are exempt by default.

Amazon Web Services offers AI solutions such as Amazon Rekognition and Amazon CodeWhisperer, which might utilize client data to enhance their services. However, customers have the option to withdraw from this AI enhancement training. Previously, this was considered a complex procedure, but it has been significantly simplified recently. Amazon has detailed the entire opt-out process for entities on a dedicated support page.

Figma, a widely-used design tool, might utilize your data to train its models. Accounts under an Organization or Enterprise subscription are automatically exempt from this. However, accounts with Starter or Professional plans are enrolled by default. To modify this preference, team administrators can navigate to the settings, access the AI section, and deactivate the Content training option.

For those who use Google's Gemini chatbot, there may be instances where your conversations are chosen for review by actual people to enhance the AI's capabilities. However, if you prefer to opt out, the process is straightforward. Simply access Gemini on your web browser, navigate to the Activity section, and locate the Turn Off option in the dropdown menu. From there, you have the choice to disable the Gemini Apps Activity or to both withdraw from this feature and erase your chat history. It's important to note, though, that opting out doesn't remove data already selected for review, and according to the Gemini privacy center on Google, such data could remain for up to three years.

Grammarly has made changes to its policy, allowing individual users to exclude their data from AI training. To activate this option, navigate to your Account, enter Settings, and disable the toggle for Product Improvement and Training. If your account is associated with a corporate or educational license, you're already exempt from this by default.

Kate O'Flaherty authored an insightful article for WIRED covering Grok AI and how to safeguard your privacy on X, the platform hosting the chatbot. This instance mirrors numerous others where countless site users discovered they had been defaulted into AI training programs with scant warning. If your X account is still active, you can withdraw your data from being utilized to refine Grok. Simply navigate to the Settings and privacy area, proceed to Privacy and safety, access the Grok section, and uncheck the option to share your data.

HubSpot, a widely used platform for marketing and sales software, leverages customer data to enhance its AI model. However, there is no direct option to disable the data utilization for AI enhancement. Instead, users are required to email privacy@hubspot.com, specifically asking for their account data to be excluded from the process.

In September, members of a professional networking platform were taken aback upon discovering their information might be utilized to enhance artificial intelligence systems. "Ultimately, individuals are seeking an advantage in their professional lives, and our next-generation AI offerings are designed to provide them with that support," stated Eleanor Crum, a representative for LinkedIn.

To prevent your new LinkedIn updates from being utilized in AI training, navigate to your profile and access the Settings. Click on Data Privacy and toggle off the option marked as Use my data for training content creation AI models.

Individuals often share private details when interacting with chatbots. OpenAI offers several choices regarding the fate of the information shared with ChatGPT, such as the option to prevent it from being used to train future AI models. "We provide users with numerous straightforward methods to manage their data, encompassing tools for retrieving, downloading, and erasing personal data via ChatGPT. Options to exclude their data from being utilized in model training are also readily available," explains Taya Christianson, a representative of OpenAI. (The specific options available can differ based on the type of account one holds, and data from corporate clients are not employed in training models).

OpenAI advises on its support pages that individuals wishing to disable ChatGPT web user data contribution should proceed to Settings, then Data Controls, and deselect the option labeled "Improve the model for everyone." Beyond ChatGPT, OpenAI encompasses a broader spectrum of initiatives. Regarding its Dall-E 3 image creation tool, the company offers a specific request form for users aiming to exclude images from being utilized in upcoming training collections. This form requires the submission of the user's name, email address, confirmation of image ownership or representation of a corporation, a description of the image, and the option to upload the image(s) in question.

OpenAI suggests that for those who have a large number of images available on the internet that they wish to exclude from training datasets, it might be "more practical" to incorporate GPTBot into the robots.txt file on the site hosting the images.

Historically, the robots.txt file, a basic text document typically located at websitename.com/robots.txt, has served the purpose of instructing search engines and similar entities about which pages they may or may not index. Recently, it has also been adopted as a means to request AI crawlers to refrain from extracting content published on the site—and AI firms have agreed to respect these directives.
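To make that concrete, here is a minimal sketch of a robots.txt for a site that wants to block only OpenAI's crawler while leaving ordinary search engines alone. GPTBot is the user-agent string OpenAI documents; the blanket paths here are placeholders you would tailor to your own site:

```
# Ask OpenAI's crawler to skip the entire site
User-agent: GPTBot
Disallow: /

# All other crawlers may index normally
User-agent: *
Allow: /
```

Note that robots.txt is advisory: it only works against crawlers that choose to honor it.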

Perplexity is a startup that uses AI to help you search the web and find answers. As with the other tools mentioned here, your data and interactions are used by default to improve Perplexity's AI. To disable this, click your account name, go to the Account section, and switch off the AI Data Retention toggle.

Quora says that, for now, it does not use its users' answers, posts, or comments to train AI, and according to a representative it has not sold user data for AI training either. Still, the platform lets users opt out in case that policy changes: navigate to the Settings menu, select Privacy, and disable the "Allow large language models to be trained on your content" option, which is on by default. Even so, some Quora content may still be used to train large language models: if you reply to a machine-generated answer, those replies may be used for AI training. The company also acknowledges that third parties may scrape its content without permission.

Rev, a transcription service that employs AI and freelance individuals to convert speech to text, states that it continuously and anonymously utilizes data to enhance its AI algorithms. This training process persists with the data even after an account is terminated.

Kendell Kelton, who leads the brand and corporate communications at Rev, claims the company possesses the biggest and most varied collection of voice data, totaling over 7 million hours of recorded speech. Kelton emphasizes that Rev never trades user information with external entities. According to the company's service agreement, the data is employed for developmental purposes, and users have the option to withdraw their consent. Those wishing to prevent their data from being utilized can do so by reaching out to support@rev.com, as indicated on the company's support documentation.

The numerous spontaneous Slack messages you send while working could also be harnessed by your employer to enhance their algorithms. "For a long time, Slack has incorporated artificial intelligence into its services. This encompasses system-wide AI algorithms for features such as suggesting channels and emojis," mentions Jackie Rocca, a product vice president at Slack with a specialization in AI.

Despite the company's assurance that it doesn't employ user data to enhance a substantial language model for its Slack AI offering, Slack might leverage your engagements to refine the application's AI learning functions. According to Slack's privacy policy, this may involve details such as your texts, materials, and documents.

To opt out, an administrator must email feedback@slack.com with the subject line “Slack Global model opt-out request” and include your organization's web address. Slack gives no timeline for processing the request, but it sends a confirmation email once the opt-out is complete.

Squarespace has introduced a feature that allows users to prevent AI scraping of their hosted websites. This is achieved by modifying the robots.txt file of your website, signaling to AI companies that the content is not to be accessed. Users can activate this protection by navigating to Settings in their account, locating Crawlers, and choosing the option to Block known artificial intelligence crawlers. This blocking feature is effective against several crawlers, including Anthropic AI, Applebot-Extended, CCBot, Claude-Web, cohere-ai, FacebookBot, Google Extended, GPTBot and ChatGPT-User, and PerplexityBot.

For those utilizing Substack for their blogging, newsletter activities, or beyond, the platform offers a straightforward method for opting out of AI data training. Simply head to the Settings menu, find the Publication area, and activate the Block AI training switch. As noted on their support page, it's important to remember: “This setting is effective only for AI technologies that adhere to this preference.”

Tumblr, a blogging and content-sharing platform under the Automattic umbrella—which also includes WordPress—has announced its collaboration with AI firms. These companies are eager to explore the extensive and distinct collection of publicly available content across the broader network of Automattic's services. According to a representative from Automattic, this partnership does not extend to personal emails or confidential material.

Tumblr offers a feature that allows users to block their content from being utilized in AI training or shared with external parties like researchers. To activate this feature in the Tumblr app, navigate to your account settings, choose your blog, tap on the settings icon, select Visibility, and enable the “Prevent third-party sharing” option. According to Tumblr's support documentation, content that is explicit, from deleted blogs, or from blogs that are either password-protected or private, is automatically excluded from being distributed to any third-party organizations.

Similar to Tumblr, WordPress offers a feature that blocks the sharing of content by external parties. You can activate this feature by going to your site's dashboard, navigating to Settings, then General, and finally Privacy, where you should check the box that says Prevent third-party sharing. An Automattic representative mentioned, "We're also in the process of collaborating with web crawlers (such as commoncrawl.org) to block the unauthorized collection and sale of our users' content, ensuring they have the power to decide if and how their content is utilized."

If you manage your own website, you have the option to revise your robots.txt file to prevent AI bots from scraping its content. Many news platforms restrict their content from being accessed by AI scrapers. For instance, WIRED has configured its robots.txt file to block access from bots associated with major companies like Google, Amazon, Facebook, Anthropic, and Perplexity, among others. However, this restriction isn't exclusive to large publishers; any website, regardless of its size, can modify its robots file to deter AI bots. This can be done by inserting a simple disallow directive; practical examples are available online.
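As a quick way to sanity-check a robots.txt before deploying it, here is a minimal sketch using Python's standard `urllib.robotparser`. The crawler names and URL below are illustrative, and the rules are parsed from an inline string rather than fetched from a live site:

```python
from urllib import robotparser

# A robots.txt that disallows two AI crawlers and allows everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""

parser = robotparser.RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Crawlers that honor robots.txt should treat these URLs as off-limits.
print(parser.can_fetch("GPTBot", "https://example.com/post"))       # False
print(parser.can_fetch("CCBot", "https://example.com/post"))        # False
# Other user agents fall through to the wildcard rule.
print(parser.can_fetch("SomeBrowser", "https://example.com/post"))  # True
```

Again, this only simulates well-behaved crawlers; a scraper that ignores robots.txt will not be stopped by any of these directives.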


© 2024 Condé Nast. All rights reserved. WIRED might receive a share of revenue from items bought via our website, as part of our Affiliate Agreements with retail partners. Content from this website is not allowed to be copied, shared, broadcast, stored, or used in any form without explicit permission from Condé Nast. Advertisement Choices
