How to ‘untrain’ AI on your content
An AI company that provides the “plumbing” between businesses and AI engines dives into the world of data unlearning.
An AI plumbing company that provides the “pipes” connecting businesses to AI engines has warned of the difficulty companies face in untraining AI on their content.
As companies search for viable AI use cases and a return on their investments, some experts are warning them to slow down and experiment only with data they are willing to see become public.
One such expert is Michael Bachman, AI chief at software integration specialist Boomi, who has begun advising the company’s customers to start by training AI products on the information already on their websites.
Mr Bachman said he advised companies to test-drive AI products on online resources such as frequently asked questions, and to test extensively before ingesting anything sensitive, so they could become familiar with how the technology worked and the results it returned.
“This technology is powerful and if you don’t know your process or your process is bad and then you put data into something that can operate at speed and scale, you could make a bad problem completely detrimental to your business,” he told The Australian.
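In practical terms, that means pointing a system only at material the business has already published. Below is a minimal Python sketch of the idea, assuming a simple text-similarity index; the FAQ entries and `answer` helper are invented for illustration, not a description of Boomi’s product.

```python
# A minimal sketch of "start with your own public data": answer questions
# only from FAQ content a business has already published.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical FAQ entries already published on the company website.
faq_entries = [
    "How do I reset my password? Use the 'Forgot password' link on the login page.",
    "What are your support hours? The help desk is open 9am-5pm AEST, Monday to Friday.",
    "Do you offer refunds? Refunds are available within 30 days of purchase.",
]

vectorizer = TfidfVectorizer()
faq_matrix = vectorizer.fit_transform(faq_entries)

def answer(question: str) -> str:
    """Return the published FAQ entry most similar to the question."""
    scores = cosine_similarity(vectorizer.transform([question]), faq_matrix)
    return faq_entries[scores.argmax()]

print(answer("What time can I call support?"))  # -> the support-hours entry
```

Because the index holds only published content, the worst a wrong answer can expose is information that was public to begin with.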
The warning arrives after Australian companies reported mixed experiences with, and policies on, the use of AI in the workplace.
Last year The Australian revealed several companies including Dexus and Samsung had restricted employees from using ChatGPT at work.
Many were concerned employees would volunteer sensitive information which might later resurface and aid competitors.
Mr Bachman warned removing content from an AI engine was no easy task and was often an expensive process.
The removal of content from a dataset ingested by an AI engine typically involved retraining the models, he said, and retraining was not a process companies should undertake each time they wished to remove data.
Other ways to remove data involved using the tags or metadata associated with the unwanted material to locate it and pull it from a training set, he said.
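As a rough illustration of that tag-based approach (the `Record` structure and tags below are hypothetical), removal amounts to filtering flagged records out of the dataset before the expensive retraining step:

```python
from dataclasses import dataclass, field

@dataclass
class Record:
    text: str
    tags: set = field(default_factory=set)

# A toy training set in which every record carries metadata tags.
training_set = [
    Record("Public product FAQ content", tags={"public", "faq"}),
    Record("Internal pricing strategy memo", tags={"internal", "pricing"}),
    Record("Press release, June 2023", tags={"public", "press"}),
]

# Tags flagged for removal, e.g. after a takedown or privacy request.
remove_tags = {"internal"}

# Drop anything carrying a flagged tag; keep the rest.
cleaned = [r for r in training_set if not (r.tags & remove_tags)]

# `cleaned` would then feed a retraining run -- the slow, expensive step
# Mr Bachman describes. The tags only make the unwanted data findable.
```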
Newer techniques were also emerging, however, such as “retrieval augmentation”, which involves building tools that can identify content a business has singled out as material the engine should not train on.
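A hedged sketch of that retrieval-augmentation idea follows, assuming a document store in which the business flags each entry as usable or off-limits: excluded content is simply never placed in the context handed to the model, so there is nothing to untrain later. The store and `build_context` helper are invented for illustration.

```python
# Hypothetical document store; the business flags what is off-limits.
documents = {
    "doc-001": {"text": "Public shipping policy: orders ship within 2 days.", "excluded": False},
    "doc-002": {"text": "Confidential supplier contract terms.", "excluded": True},
    "doc-003": {"text": "Published FAQ on returns and refunds.", "excluded": False},
}

def build_context(query: str) -> str:
    """Assemble retrieval context, skipping excluded documents.

    A real system would also rank documents by relevance to the query;
    this sketch only enforces the do-not-use rule."""
    allowed = [d["text"] for d in documents.values() if not d["excluded"]]
    return "\n".join(allowed)

# The prompt handed to the model never contains excluded material.
prompt = f"Answer using only this context:\n{build_context('returns policy')}"
print(prompt)
```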
Some companies would also be able to implement remediation processes “but that’s super sophisticated and I don’t think the average company, by and large, is going to have that”, he said.
A number of niche AI businesses offering services such as removing data from an AI engine would emerge as the technology continued to grow in popularity, Mr Bachman said.
“I think there’s a lot of growing markets that we don’t really know are going to exist right at this very moment but in six months will just begin to spring up.”
Australian tech darling Canva has also warned in the past about AI engines and the “untraining” of content.
Last year the company opened a $200m creator fund for those who allow their designs to train the platform’s new AI engine. The engine powers a suite of new tools that can generate images, text and video from simple prompts.
Once someone opted in to the program there was no going back, with the engine unable to be “untrained” on a person’s designs or content. The platform could, however, prevent the engine from training further on a particular person’s content, head of AI Danny Wu told The Australian.
Mr Bachman’s own advice to companies was to “find use cases that allow you to grab your own public data so that you can control the answers a little bit better”.