Ensuring Ethical AI: Fairly Trained Introduces Certification for Gen AI Tools Using Licensed Data

One of the main controversies of generative AI is that many of the top models, like those from OpenAI and Meta, have been trained on data scraped from the web without the knowledge or permission of the original creators.

AI companies defend this practice as fair and legal. OpenAI, for instance, stated in a blog post that using publicly available internet materials to train AI models counts as fair use. The company argues that this approach is fair to creators, necessary for innovation, and crucial for US competitiveness.

Data scraping isn’t a new practice; it predates the rise of generative AI. Researchers and developers have long used it to build databases and commercial products, including search engines like Google, which drive traffic to the content posted online.

However, opposition to this kind of data scraping is growing. Best-selling authors and artists are suing various AI companies for allegedly using their work without permission. VentureBeat itself uses tools from AI companies such as Midjourney and OpenAI to create header images for its articles, and some of these companies are among those being sued.

A new organization, Fairly Trained, has been formed to support those who believe content creators should be asked for consent before their work is used to train AI models. The non-profit, co-founded by CEO Ed Newton-Rex, a former Stability AI employee who objected to the company’s practices, aims to promote ethical AI training.

Fairly Trained’s website states, “We believe there are many consumers and companies who would prefer to work with generative AI companies who train on data provided with the consent of its creators.”

Ed Newton-Rex emphasized his point in a post on X, saying that licensing training data respects creators and is essential for the future of generative AI. He urges companies to seek certification if they follow this respectful approach.

Identifying which AI companies use ethically sourced data versus scraped data is challenging. Fairly Trained plans to address this with its certification, making it easier for consumers to choose AI tools trained on licensed data.

Newton-Rex suggests that companies should transition to training on data obtained with permission, ideally through licensing. For example, OpenAI has recently started licensing data from news outlets, paying millions annually, although it still defends its right to scrape public data.

In a blog post, Fairly Trained explained that there’s a growing divide between AI companies that respect creators’ rights and those that don’t. Its certification aims to help consumers make informed choices about which AI tools to use.

To get the “Licensed Model (L)” certification, AI companies must fill out an online form, submit written documentation, and potentially answer follow-up questions. Fairly Trained charges fees for this service based on the company’s annual revenue.

Newton-Rex explained via email to VentureBeat that the fees cover the organization’s costs and are kept low to remain accessible for generative AI companies.

A number of companies have already received this certification, including Beatoven.AI, Boomy, BRIA AI, and more. The certification process took about a month, but specific details about the fees paid by these companies were not disclosed.

Fairly Trained avoids commenting on specific models from companies like Adobe or Shutterstock that have their own terms of service for training AI models. Newton-Rex hopes these companies will seek certification if they meet the required standards.

Advisers for Fairly Trained include Tom Gruber, former chief technologist of Siri, and Maria Pallante, President & CEO of the Association of American Publishers. The organization is supported by prominent groups like the Association of American Publishers, the Association of Independent Music Publishers, and Universal Music Group. However, Fairly Trained is not involved in any AI-related lawsuits and does not receive funding from these groups, relying instead on certification fees.
