Facebook, Instagram are using your data – and you can’t opt out
By David Swan
If you’re one of the millions of Australians using Facebook or Instagram, tech giant Meta is using your data to train its artificial intelligence models, and you don’t have the ability to opt out.
A backlash over AI policy is brewing against Facebook parent company Meta – and other companies such as Adobe and Reddit – leading to calls for stronger Australian privacy laws, as well as questions about how much data we’re willing to give up to the tech giants in aid of their AI land grab.
Put simply, generative AI systems – such as Meta’s AI large language model Llama – need to hoover up as much data as possible to be effective, and are capable of pumping out text and images only because they’ve been trained on content from real people.
Now, some of those real people are beginning to resist.
Changes to Meta’s privacy policy will come into effect on June 26, allowing the company to use years of users’ posts and photos to train its AI technology. Even if you do not have a Meta account, the company says it may still use some information about you for its AI products, such as if you appear in an image shared publicly by another user. Encrypted messages sent over WhatsApp or Messenger are exempt.
Users in the European Union are able to opt out, thanks to that region’s stricter privacy laws, but Australian users don’t have that option.
“If you’re an artist, your photos posted to Instagram will be training Meta’s image generator model,” author and futurist Mark Pesce says.
“If you’re a writer – and on social media – we all are, whether professionally or just as creative amateurs – all of what you’ve written will be used to train Meta’s text generation models.
“Everything you post or share on Facebook or Instagram can be used to train Meta AI.”
Generative AI tools have been caught regurgitating exact copies of their training data – for example, spitting out verbatim paragraphs of New York Times articles (now the subject of a legal complaint) and images of real people. It means your Facebook status update from a decade ago may end up in AI-generated text without your explicit consent.
Meta has been hit with complaints in 11 countries over the practices but says it’s simply following the same approach as other AI firms, such as Google and OpenAI. A Meta spokeswoman said the company was “committed to building AI responsibly”.
“With the release of our AI experiences, we’ve shared details about the kinds of information we use to build and improve AI experiences – which includes public posts from Instagram and Facebook – consistent with our privacy policy and terms of service,” the spokeswoman said.
“While we don’t currently have an opt-out feature, we’ve built in-platform tools that allow people to delete their personal information from chats with Meta AI across our apps. Depending on where people live, they can also object to the use of their personal information being used to build and train AI consistent with local privacy laws.”
‘Trust must be earned’
Photoshop parent company Adobe faced a wave of user complaints this month when it changed its terms of service, suggesting it was giving itself access to users’ work, even work protected by non-disclosure or confidentiality agreements.
Adobe “may access, view, or listen to your content through both automated and manual methods, but only in limited ways, and only as permitted by law”, its terms of service read. The company said it had “clarified that we may access your content through both automated and manual methods, such as for content review”.
Photoshop customers were up in arms about the changes, before the company issued a blog post saying it would not train its generative AI on customer content and that its language should have been clearer.
“In a world where customers are anxious about how their data is used, and how generative AI models are trained, it is the responsibility of companies that host customer data and content to declare their policies not just publicly, but in their legally binding terms of use,” the company said. “Trust must be earned.”
The reality is that we’ve already trained AI. Without your explicit permission, major AI systems have been trained on large swathes of the internet, including your public Facebook posts, your Reddit comments and the blog you made in high school. If you’ve ever made a YouTube video, it may have been used by OpenAI to train ChatGPT.
The tech giants initially used data scraped from all corners of the internet for their AI models. They’re now beginning to pay the likes of News Corp, Associated Press and Reddit for their data, as well as turning to their own users, whose data they can effectively mine for free.
The murky methods employed by the tech giants to train their AI – and the subsequent backlash from users who feel they’ve gone too far – raise issues of user consent and questions about the direction of the internet.
The issue is not new: we’re already used to giving up our data to help make our online experience more personalised. Netflix might suggest movies based on what you’ve watched, your fitness app will likely provide data to third-party companies to drive targeted ads, and your news app likely tracks your location to serve you more localised stories.
What is new, however, is a push to allow Australian users to opt out of the mass digital data scrape.
‘Very few Australians would have a detailed understanding’
Australia’s former human rights commissioner Edward Santow says the fact we can’t opt out of Facebook’s AI training is cause for an urgent privacy overhaul.
“For Australians to not have an ‘opt-out’, that reflects that Australia’s legislation has become out of date and really needs to be modernised to deal with these sorts of issues,” said Santow, who is now a professor of responsible technology at the University of Technology Sydney.
“Australians shouldn’t be subjected to inferior protections of their privacy as compared with places like the European Union.”
Santow wants Australia’s privacy laws to shift closer to privacy-conscious Europe, which gives its users more control over how their personal information is used.
Privacy reform has been in the works for years, and legislation to overhaul Australia’s outdated privacy laws is expected later this year.
“What we hear all the time from Australians is what they really want is purer and stronger protections regarding when a person’s information can and cannot be used. And this is a very good example of where government could be clearer in simply drawing some red lines,” Santow said.
“I think very few Australians would have a detailed understanding of how those larger language models have been trained on their data.
“I would be fairly confident in saying that the vast majority of Facebook and Instagram users wouldn’t have anticipated that their posts would be used for that purpose.”
There are also worries that the now-constant stream of cyberattacks and data breaches will inevitably reach the personal data used to train AI models.
Artificial intelligence expert Niusha Shafiabady, an associate professor at Charles Darwin University, says when companies such as Meta collect their users’ data to feed AI algorithms, it puts those users at a heightened security and privacy risk.
“AI to be used to identify the patterns is not something new. It has been used for this purpose and similar ones before. Users are right to have concerns about their security … Using and collecting data from emails and different sources of communication opens another door to security risks for the users,” she said.
“Users should decide how much they get out of these technologies in trading their privacy and security.”
‘Active consent is needed’
Some companies are taking the opposite stance to Meta and are declaring they won’t touch user data to train their AI models.
Tech giant Apple says it does not train its models with private data or user interactions, in a clear distinction from Meta and Google.
Another such company is Leonardo Ai, an Australian start-up that has more than 15 million users. Its software competes against the likes of Midjourney and ChatGPT, and more than 1 billion unique artworks have been created on its platform.
The company’s latest model, dubbed Leonardo Phoenix, has been trained on a combination of licensed, synthetic and open-sourced data, rather than data from its users.
According to Leonardo Ai co-founder JJ Fiasson, it’s incumbent upon AI companies to communicate clearly how users’ data is or isn’t being used to maintain trust.
“We’ve already got a couple of agreements in place around large licensed datasets,” he said. “I believe that’s generally where the market is headed, and I think it makes a lot of sense.
“I know there’s concern around the use of user data, and that’s something people need to have a level of active consent around.”
The rapidly escalating AI arms race makes regulatory action urgent, according to Santow.
“It’s maybe another year or so where that process of building takes place, and then it’s moved through a different phase, which is really about tuning those large language models,” he said.
“So really, there’s just a short window of time to ensure that our privacy and intellectual property protections are enforced. When that window closes, it’ll likely be too late. And so that’s why we need urgent action, including by reforming Australia’s privacy legislation.”