Alphabet Inc. (NASDAQ: GOOGL) is facing a lawsuit alleging that the company has run a secret and extensive data collection operation. The newly filed class action, spearheaded by Clarkson Law Firm, accuses Google, along with its AI sister company DeepMind and parent company Alphabet Inc., of “secretly stealing everything ever created and shared on the internet by hundreds of millions of Americans.” The alleged purpose of this data collection is to train Google’s generative AI products, including its much-hyped chatbot, Bard.
This lawsuit was filed on July 11, 2023, in the US District Court for the Northern District of California. Plaintiffs, namely J.L., C.B., K.S., P.M., N.G., R.F., J.D., and G.R., have come together to initiate legal action against Defendants Alphabet Inc., Google DeepMind, and Google LLC. Acting on their own behalf and seeking to represent all others in a similar situation, the Plaintiffs present their case against the Defendants. The allegations made by the Plaintiffs are based upon personal knowledge of their own experiences and actions, as well as information and belief pertaining to all other relevant aspects of the matter.
Google’s Secret Data Collection
The latest lawsuit claims that Google has obtained extensive personal and professional information, creative and copyrighted works, photographs, and even emails from the Internet. This essentially covers the entire digital presence of millions of Americans. This data is being used to train commercial AI (Artificial Intelligence) products such as Bard, which Google recently released as a competitor to OpenAI’s ChatGPT. The lawsuit also alleges that Google has been clandestinely harvesting this data for years without informing or obtaining consent from anyone.
According to the complaint, Google, in its quest for personal data, unlawfully penetrated secured, subscription-based websites to take the content of countless individuals without permission. It also allegedly infringed the copyright in a staggering 200 million explicitly protected materials, including previously stolen property sourced from websites known for harbouring pirated collections of books and other creative works.
The defendants (Alphabet, Google, and DeepMind) are accused of continuously supplying their AI products with stolen data through regular updates and scraping new personal and protected information from internet users without obtaining consent.
What makes this situation even more worrisome is the fact that Google’s AI products, such as Bard, heavily rely on people’s data. All forms of personal data, particularly conversational data exchanged between humans, play a vital role in the training process of AI systems. This data is instrumental in enabling products like Bard to develop communication capabilities that closely resemble human interactions. Similarly, creative and expressive works hold immense value as they contribute to the AI’s understanding and ability to “create” art. Therefore, these AI products wouldn’t be able to function or exist without the mass theft of private information and the infringement of copyrighted materials.
The alleged secret mass data collection by Google to train AI products has left many internet users in a state of shock. However, it is important to recognize that Google is not the sole culprit in the AI industry. According to the Federal Trade Commission (FTC), the entire technology industry is fervently engaged in a race to accumulate as much data as possible. This race stems from the fact that large language models, which power AI products, rely on extensive data consumption for effective training. In the absence of such data, these AI products would be rendered worthless.
Concerned about the burgeoning trend of data collection in the AI industry, the FTC recently issued a strong warning. It emphasized that machine learning should not serve as an excuse to violate the law, and that the data used to enhance algorithms must be obtained in a lawful manner. The FTC’s message was clear: companies must take heed of this lesson and ensure that their data collection practices align with legal requirements.
Google’s Updated Online Privacy Policy
Surprisingly, despite the FTC’s warning and the concerns raised by the public, Google chose a different path. On July 1, 2023, the company made a significant update to its online privacy policy, reinforcing its stance that everything on the internet is fair game for the company’s private gain and commercial use. This includes using the collected data to build and enhance AI products like Bard.
The updated policy reads: “Google uses information to improve our services and to develop new products, features, and technologies that benefit our users and the public. For example, we use publicly available information to help train Google’s AI models and build products and features like Google Translate, Bard, and Cloud AI capabilities.”
The policy update served as the first public acknowledgement of Google’s covert actions over the years – scraping the entire internet without discrimination, regardless of whether the content was contributed on Google platforms or not. Throughout this process, Google showed little regard for the privacy, property rights, and consumer protection interests of the countless Americans who shared their insights, talents, artwork, personal information, and other data on the internet for specific purposes unrelated to training large language models for Google’s profit.
The company’s actions not only put its own financial gains above the well-being of Internet users but also potentially endangered the world by deploying untested and volatile AI products.
The lawsuit seeks an injunction against the ongoing violations of privacy and property rights, demanding that Google and its affiliates cease the illegal data theft. Additionally, it calls for the option for everyday internet users to opt out of Google’s illicit data collection practices and requests either the deletion of illegally obtained data or compensation for the owners of that data in the form of ongoing data dividends or other fair remuneration.
At a deeper level, the lawsuit emphasizes that Google must acknowledge that it does not possess ownership over the internet, creative works, expressions of individuality, family photographs, or any other shared content simply because it is available online. The notion of “publicly available” content does not imply free use for any purpose.
Google’s disclosure of its scraping practices occurred three days after OpenAI faced a lawsuit for engaging in similar activities. OpenAI was accused of unlawfully scraping Internet users’ personal data without their consent as part of its own extensive operation.
The revelation that people’s personal and other sensitive information on the Internet played a crucial role in fueling AI products like Bard sparked a wave of anger and frustration among the public. It is entirely reasonable for individuals to feel violated when their privacy rights are disregarded and trampled upon. Google’s audacious claim of ownership over anything and everything on the internet only added fuel to the fire, as it represents a clear overreach and violation of established norms.
The public outrage directed towards Google is justified, particularly given the company’s poor record on privacy. By asserting such sweeping ownership claims, Google has crossed a line, which only underscores the disregard for privacy rights inherent in its actions.
The lawsuit claims that Google had alternatives to unlawfully acquiring personal and copyrighted information. Internet data is available for purchase, similar to any other form of content or property. A mature commercial market exists for such data, underscoring its value to companies. Legal acquisition of data typically relies on consent and fair consideration.
There are also specialized companies that curate and sell datasets, specifically for AI training purposes, obtained with the explicit consent of content creators or individuals whose personal or copyrighted information is involved. While using these datasets might be more expensive than resorting to theft, they possess a critical advantage: they are obtained legally.
In contrast, Google’s decision to unlawfully acquire personal data without notice, consent, or fair compensation not only violates the rights of millions of individuals but also grants Google an unfair advantage over smaller competitors who lawfully purchase or obtain AI training data in the marketplace. This unfair advantage undermines competition and perpetuates an imbalance in the industry.
These allegations against Google raise crucial questions about privacy, ownership, and the ethics of AI training. Will this lawsuit lead to a reevaluation of data collection practices and establish new norms regarding privacy and ownership in the digital age? Let us know in the comment section!