
Complaint filed against Google for “secretly stealing” data to train Bard

A California law firm has accused Google of “secretly stealing” massive quantities of data from the internet to train its artificial intelligence systems.

The Clarkson Law Firm has filed a lawsuit against the internet giant, alleging that it violated its clients’ privacy rights, stole their intellectual property, and profited from information obtained by illicit means. According to the complaint, filed on July 11 in the Northern District of California, “Google has taken all our personal and professional information, our creative and copywritten works, our photographs, and even our emails—virtually the entirety of our digital footprint and is using it to build commercial Artificial Intelligence (‘AI’) Products like ‘Bard.’”

The lawsuit comes after Google quietly amended its privacy policy last week to say that any public information may be used to train AI products like Bard. The law firm claims that scraping data without payment or consent to train AI models is a significant breach of privacy, while Google maintains that anything published online is fair game. The complaint claims that Google, a multi-billion dollar firm with more than a billion users worldwide, forces its customers into an “untenable” choice: “either use the internet and surrender all your personal and copyrighted information to Google’s insatiable AI models — or avoid the internet entirely.”

When asked about the allegations, Google’s general counsel Halimah DeLaine Prado told Reuters that they were “baseless,” adding, “we use data from public sources — like information published to the open web and public datasets – to train the AI models behind services like Google Translate, responsibly and in line with our AI Principles.”

Clarkson recently filed a class action complaint against OpenAI, the maker of ChatGPT, for “theft and misappropriation of personal data,” alleging the same data-scraping practice. Both Bard and ChatGPT are built on large language models, which need massive volumes of training data to become conversational and capable. That need has given rise to privacy and intellectual property disputes.

The complaint against Google alleges that the company took data from websites such as Medium and Kickstarter, as well as from the non-profit organization Common Crawl, which makes its data freely available for research and education. It also alleges that Gmail and Google Search data are used to fuel Google’s algorithms, and that the corporation scrapes copyrighted materials, such as e-books, from digital libraries and even from pirate websites, without paying the rightful creators.

Clarkson’s case hinges on what it means for information to be public. As the lawsuit states, “‘publicly available’ has never meant free to use for any purpose.” Whether data may be used or sold can depend on the intended purpose and on the permissions users have given. When consumers post anything online, they agree to privacy terms but still have a right to know whether that content is being used elsewhere. In other words, Clarkson believes that “Google must understand, once and for all: it does not own the internet.”
