Adobe hit with proposed class-action, accused of misusing authors’ work in AI training

Like almost every other tech company out there, Adobe has leaned heavily into artificial intelligence over the past several years. The software company has launched a number of different AI services since 2023, including Firefly – an AI-powered media generation suite. However, the company’s full embrace of the technology may have led to problems, as a new lawsuit alleges that it used pirated books to train one of its AI models.

A proposed class-action lawsuit filed on behalf of Elizabeth Lyon, an Oregon author, alleges that Adobe used pirated copies of several books, including her own, to train the company's SlimLM program.

Adobe describes SlimLM as a series of small language models that can be “optimized for document help tasks on mobile devices.” It states that SlimLM was pre-trained on SlimPajama-627B, an “open source deduplicated, multi-source dataset” released by Cerebras in June of 2023. Lyon, who has written a number of how-to guides for nonfiction writing, says some of her work was included in the pre-training dataset used by Adobe.

Lyon’s lawsuit, which was originally reported by Reuters, says her writings were included in a manipulated subset of a manipulated dataset that was the basis of Adobe’s software: “The SlimPajama dataset was created by copying and processing the RedPajama dataset (including copies of Books3).” “Therefore, because it is a derivative version of the RedPajama Dataset, SlimPajama contains the Books3 Dataset, including the copyrighted works of Plaintiff and Class Members.”

Books3, a huge collection of 191,000 books that has been used to train generative AI systems, has been a constant source of legal problems for the tech community. RedPajama has also been cited in a number of court cases. In September, a lawsuit against Apple claimed that the company used copyrighted material to train its Apple Intelligence model. The lawsuit cited the dataset and accused the tech company of copying protected works “without consent and without credit or compensation.” In October, a similar lawsuit against Salesforce also claimed that the company used RedPajama for training purposes.

Unfortunately for the tech industry, such lawsuits are now fairly common. AI algorithms are trained on massive datasets, and in some cases, these datasets are alleged to include pirated material. In September, Anthropic agreed to pay $1.5 billion to a number of authors who had sued it, accusing it of using pirated copies of their work to train its chatbot, Claude. The case was seen as a potential turning point in the ongoing legal battles over copyrighted material in AI training data, of which there are many.
