Disaster Google Says It'll Scrape Everything You Post Online for AI

An update to Google's privacy policy suggests that the entire public internet is fair game for its AI projects.


Google updated its privacy policy over the weekend, explicitly saying the company reserves the right to scrape just about everything you post online to build its AI tools. If Google can read your words, assume they belong to the company now, and expect that they’re nesting somewhere in the bowels of a chatbot.

“Google uses information to improve our services and to develop new products, features and technologies that benefit our users and the public,” the new Google policy says. “For example, we use publicly available information to help train Google’s AI models and build products and features like Google Translate, Bard, and Cloud AI capabilities.”

Fortunately for history fans, Google maintains a history of changes to its terms of service. The new language amends an existing policy, spelling out new ways your online musings might be used for the tech giant’s AI work.

Previously, Google said the data would be used “for language models,” rather than “AI models,” and where the older policy just mentioned Google Translate, Bard and Cloud AI now make an appearance.

This is an unusual clause for a privacy policy. Typically, these policies describe ways that a business uses the information that you post on the company’s own services. Here, it seems Google reserves the right to harvest and harness data posted on any part of the public web, as if the whole internet is the company’s own AI playground. Google did not immediately respond to a request for comment.

The practice raises new and interesting privacy questions. People generally understand that public posts are public. But today, you need a new mental model of what it means to write something online. It’s no longer a question of who can see the information, but how it could be used. There’s a good chance that Bard and ChatGPT ingested your long-forgotten blog posts or 15-year-old restaurant reviews. As you read this, the chatbots could be regurgitating some homunculoid version of your words in ways that are impossible to predict and difficult to understand.

One of the less obvious complications of the post-ChatGPT world is the question of where data-hungry chatbots sourced their information. Companies including Google and OpenAI scraped vast portions of the internet to fuel their robot habits. It’s not at all clear that this is legal, and the next few years will see the courts wrestle with copyright questions that would have seemed like science fiction a few years ago. In the meantime, the phenomenon already affects consumers in some unexpected ways.

The overlords at Twitter and Reddit feel particularly aggrieved about the AI issue, and made controversial changes to lock down their platforms. Both companies turned off free access to their APIs, which had allowed anyone who pleased to download large quantities of posts. Ostensibly, that’s meant to protect the social media sites from other companies harvesting their intellectual property, but it’s had other consequences.

Twitter and Reddit’s API changes broke third-party tools that many people used to access those sites. For a minute, it even seemed Twitter was going to force public entities such as weather, transit, and emergency services to pay if they wanted to Tweet, a move that the company walked back after a hailstorm of criticism.

Lately, web scraping is Elon Musk’s favorite boogeyman. Musk blamed a number of recent Twitter disasters on the company’s need to stop others from pulling data off his site, even when the issues seem unrelated. Over the weekend, Twitter limited the number of tweets users were allowed to look at per day, rendering the service almost unusable. Musk said it was a necessary response to “data scraping” and “system manipulation.” However, most IT experts agreed the rate limiting was more likely a crisis response to technical problems born of mismanagement, incompetence, or both. Twitter did not answer Gizmodo’s questions on the subject.

On Reddit, the effect of API changes was particularly noisy. Reddit is essentially run by unpaid moderators who keep the forums healthy. Mods of large subreddits tend to rely on third-party tools for their work, tools that are built on now-inaccessible APIs. That sparked a mass protest, where moderators essentially shut Reddit down. Though the controversy is still playing out, it’s likely to have permanent consequences as spurned moderators hang up their hats.
 
As you read this, the chatbots could be regurgitating some homunculoid version of your words in ways that are impossible to predict and difficult to understand.
Well, yes, just like the spambots could be straight-up copy/pasting your words as camouflage for a "Buy Viagra Online" link. The real problem in both cases is that people's attention is being diverted from people to bots.
 
They already take all they can from you in terms of information they can sell to others, so how is this any different? Besides, they probably have a blacklist of sites for the AI not to visit, and that being the case, I'd probably post on such sites before posting on any sites they whitelist. You know places like Kiwifarms are on it, and probably all of the archive sites.
 
The internet was better before the tranny jannies shit up everything. They paved the way for the orwellian hellscape we know today.
I think the problem was tech being anchored to San Francisco, causing San Fran commie culture to be enforced in website ToSes. San Francisco was where Jim Jones was loved and praised before setting up his communist cult in Jonestown. Funny how so few people seem to know that. They love their city so much, those San Fran people, they forget to point out that Jones was one-time chairman of the San Francisco Housing Authority Commission.
 
They already take all they can from you in terms of information they can sell to others, so how is this any different? Besides, they probably have a blacklist of sites for the AI not to visit, and that being the case, I'd probably post on such sites before posting on any sites they whitelist. You know places like Kiwifarms are on it, and probably all of the archive sites.
Imagine their surprised Pikachu faces if, and it's a big "if", the AI put Kiwifarms on a whitelist.
 
Does google know that I’m me on here? As in, if one is posting on KF, via tor, is it routinely able to link a regular identity to posts as it would if someone had signed into a google mail account or was just posting on insta? I mean routinely, without any extra direction of the Eye of Sauron ?
 
the public continues to be surprised that anything posted publicly online gets used by big tech. god, why are the nigger cattle like this? we have seen this stuff happen so many times, not necessarily with AI, but like facebook and google scraping for metadata and ads, etc. yet no matter how many times it happens, the nigger cattle continue to be nigger cattle. what is the tipping point? at what point with the nigger cattle realize that anything they post online and the things they don on their smart phone gets vacuumed up by big tech and their personal information isnt safe?
i hate nigger cattle, i hate nigger cattle.
 
excellent, another reason to post the word nigger as often as possible

 
Does google know that I’m me on here? As in, if one is posting on KF, via tor, is it routinely able to link a regular identity to posts as it would if someone had signed into a google mail account or was just posting on insta? I mean routinely, without any extra direction of the Eye of Sauron ?
I wouldn't be surprised if that's the aim of things like this, and it's possible they could link it somehow based on how one types or other ways of connecting a person's patterns (writing style, etc.).
 
And we should all do our part to help.

Also nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger nigger, and, most importantly, nigger.
 
It could if it was allowed to also code itself, but I'm sure even they learned that lesson the last time an AI was allowed to do that.
It could always build a copy that makes a slightly better copy, which repeats the process until it produces a better AI, without the failsafes, that runs on a botnet.
 
We can bring her back.

tay.ai-artificial-chat-bot-ms.jpg
 
Training an AI based off random user generated content has to be one of the worst ideas of all time, like 80% of all content on the Internet is complete nonsense, shitposting or outright falsehood. They are training their AI to be a sped.
 