Mumsnet vs. OpenAI: A David and Goliath Battle Over Copyrights and AI Ethics in the Parenting Forum Arena
OpenAI Clashed with a Widely Beloved Parenting Forum
Imagine any subject even remotely connected to child-rearing, and chances are it's been discussed on Mumsnet, the enduring, highly frequented, and often contentious parenting forum from the UK that caters to mothers. Throughout its history of over twenty years, Mumsnet has accumulated a vast repository exceeding six billion words contributed by its active community, covering everything from messy nappies to unhelpful spouses. (And even an outrageous tirade about dolphins.)
In the spring, after Mumsnet realized that its data was being harvested by AI firms, the company decided to pursue licensing agreements with major industry players, among them OpenAI, which initially seemed open to the idea when Mumsnet first made contact. However, after negotiations with OpenAI broke down, Mumsnet announced in July that it planned to take legal action.
According to Mumsnet, in the initial discussions an OpenAI strategic partnerships lead told the company that OpenAI was interested in datasets containing more than 1 billion words. The news was well received by Mumsnet's leadership. “There was a significant amount of dialogue between us,” Justine Roberts, the founder and CEO of Mumsnet, told WIRED. “We had to agree to non-disclosure agreements, and they requested a considerable amount of data from us.”
More than a month after initial discussions, OpenAI informed Mumsnet that it had decided against forming a partnership, as seen in email communications examined by WIRED. When queried about the reason, an employee from OpenAI described Mumsnet's dataset, which contains 6 billion words, as insufficiently large for a licensing deal, according to Roberts. The employee further mentioned that OpenAI mainly seeks out extensive datasets not readily available to the public online, aiming for collections that encompass a wide range of human experiences.
When WIRED reached out for comment, OpenAI spokesperson Kayla Wood reiterated the company's stance. "Our strategy involves seeking out partnerships that provide extensive datasets mirroring human societies, rather than just focusing on information that's already out in the public domain," she stated. "We're committed to respecting the choices of publishers and creators, providing them with options to dictate how their websites and content are utilized by AI in search outcomes and in the development of foundational generative AI models."
Roberts expressed her annoyance with the situation. She remembered how initially, OpenAI appeared to be particularly drawn to Mumsnet due to its content predominantly written by women. "The conversational data is of a very high quality," she states. "It’s composed of 90 percent female dialogue, which is quite rare."
In the last year, OpenAI has successfully negotiated several data licensing agreements with a number of media organizations and platforms. These partnerships include collaborations with Vox Media, the Atlantic, Axel Springer, Time, and WIRED's parent entity, Condé Nast, alongside platforms that host content created by users, such as Reddit. (There were also discussions earlier this year about a potential licensing agreement with Automattic, the company behind WordPress.com and Tumblr.) The specific details of these agreements, including the volume of data involved, remain undisclosed.
When WIRED asked about the size of datasets OpenAI would consider for commercial licensing, the company declined to provide details. However, Wood emphasized that the firm's collaborations with publishers are aimed at featuring their material in its products and driving traffic to their sites.
Alex Bestall, the head of Rightsify, a company specializing in managing music copyrights, finds it understandable if OpenAI prefers to engage with larger entities. "Startups have a greater ability to adapt, whereas large organizations have specific data volume thresholds for entering into agreements," he explains.
OpenAI now faces the possibility of its first copyright infringement lawsuit in the UK. Beyond its copyright claims, Mumsnet is also asserting a breach of its terms of use and alleging database right infringement, which involves extracting a substantial portion or the entirety of a database without the owner's permission.
In July, Mumsnet first communicated that it was considering legal action. It subsequently received a reply from OpenAI that included several questions. According to Roberts, OpenAI did not dispute the scraping allegations. Mumsnet is now proceeding with its legal strategy, though it remains undecided whether to bring the case before the UK's High Court or a court focused on intellectual property issues. (OpenAI confirmed to WIRED that it had received Mumsnet's complaint and had responded, but it declined to comment on Mumsnet's legal claims.)
Currently, Mumsnet is in the process of seeking licensing agreements with additional AI firms. According to Roberts, the company is in discussions with Google and also engaging with emerging startups that specialize in brokering data licensing agreements. (Google has not replied to WIRED's inquiry to verify these discussions.)
Roberts worries about an environment in which the makers of large language models (LLMs) can dominate smaller publishers to train their models, leaving those publishers' websites with less traffic. She emphasizes the need for a fair agreement that ensures creators are adequately rewarded for their contributions.
WIRED asked whether Mumsnet, whose content comes mostly from its users, intends to create a payment model for them once it secures licensing agreements. Roberts stated that there are currently no such plans in place, but she is open to the idea of compensating users if monetizing data for AI purposes turns out to be highly profitable in the future.
She explains that following the announcement about Mumsnet considering legal measures, the feedback suggests most users grasp the company's intentions behind licensing their data. "We're really worried about AI having a gender bias," she states. "There's a valid point in ensuring it's educated on authentic female perspectives."
Roberts expresses confidence about the legal proceedings Mumsnet is considering. "We believe we stand a strong chance," she states. In the United States, a notable number of copyright infringement lawsuits have been filed against AI firms. In these ongoing disputes, AI firms often argue that they are protected by the "fair use" doctrine, which permits the use of copyrighted material without permission under specific conditions. The UK has an analogous principle known as "fair dealing," although it is notably narrower in its application.
No matter what happens, Roberts is pleased that her platform is making a stand. "It's likely more to do with the principle of the matter than anything else," she states.
Copyright © 2024 by Condé Nast. All rights are protected.