NewsBite

Here come the robots and this is how to stop them ‘hallucinating’

Artificial intelligence is at times delivering results that are disastrously incorrect but there are things that can be done to prevent this, says Snowflake’s Theo Hourmouzis.

The most common causes of AI hallucinations are errors in the data used to train it.

Ask ChatGPT “who was the first person to kayak from Australia to New Zealand?” and it confidently responds that Paul Caffyn completed the ‘remarkable’ journey in 1977, apparently covering 2200km over 63 days.

There’s just one problem. That remarkable journey never happened.

Paul Caffyn is a renowned New Zealand sea kayaker, whose accolades include circumnavigating New Zealand, Britain, Australia and New Caledonia. All impressive feats, but very different to traversing the Tasman. When asked again, ChatGPT says Dave Alley completed the voyage first. Alley did set a record for kayaking down the Murray River, but has never crossed the Tasman.

Both of these responses are examples of AI hallucinations.

They occur when the model confidently generates something that isn’t actually there in its data. The most common causes of AI hallucinations are errors in the data used to train the model, limited training data, or the misclassification of data.

When AI invents what would have been a world-famous adventure, it can be amusing. But it’s no laughing matter when an employee relies on incorrect information to make a business-critical decision. Businesses in every industry are paddling at speed to implement AI, as the technology is revolutionising the way we work. But if organisations do so before implementing a proper data strategy, they can introduce massive risk to the business.

Snowflake Australia New Zealand vice-president Theo Hourmouzis.

To avoid AI hallucinations, the most important thing to understand is that AI is entirely dependent on the data it is fed. Generative AI (Gen AI) and large language models (LLMs) find patterns in data and predict what should come next. They cannot think, they cannot reason, and their predictive capability is restricted to the data they have access to. If that data is incomplete, managed incorrectly, or stored across several silos, the results may be incorrect and unreliable.
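To make that concrete, consider the deliberately tiny “language model” below: it predicts the next word purely by counting which words followed which in its training text. The training sentences and the helper function are invented for illustration, but the limitation they demonstrate is real: ask the model about anything outside its data and it has nothing reliable to offer.

```python
from collections import Counter, defaultdict

# A toy bigram "language model": it predicts the next word purely from
# patterns counted in its training text. It cannot reason; it can only
# echo what its data contains. (Training text invented for illustration.)
training_text = (
    "paul caffyn circumnavigated new zealand "
    "paul caffyn circumnavigated australia "
    "paul caffyn circumnavigated britain"
)

# Count which word follows which in the training data.
bigrams = defaultdict(Counter)
words = training_text.split()
for current, nxt in zip(words, words[1:]):
    bigrams[current][nxt] += 1

def predict_next(word: str) -> str:
    """Return the word most often seen after `word` in the training data."""
    followers = bigrams.get(word)
    if not followers:
        return "<no data, no prediction>"
    return followers.most_common(1)[0][0]

print(predict_next("caffyn"))   # -> "circumnavigated": the dominant pattern
print(predict_next("kayaked"))  # -> "<no data, no prediction>"
```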

Public-facing LLMs that use the entire internet as their training data can cover a wide range of topics, but this often comes at a cost to accuracy. Private enterprise LLMs have a narrower scope, and if developed correctly, are far more accurate given they’re only using enterprise data.

Grounding LLMs with proprietary and thoroughly vetted datasets allows companies to significantly increase the accuracy of their AI deployments, turning them into dependable, factual tools.
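In practice, grounding often means retrieving relevant passages from a vetted internal store and instructing the model to answer only from them. The sketch below is a minimal, generic illustration of that idea; the document store, the keyword-overlap scorer and the prompt wording are all assumptions made for the example, not any particular vendor’s API.

```python
# Minimal grounding sketch: retrieve vetted documents, then build a prompt
# that confines the model to that context. Everything here (the store, the
# naive keyword scorer, the prompt wording) is a hypothetical illustration.
VETTED_DOCS = [
    "Paul Caffyn circumnavigated New Zealand, Britain, Australia and "
    "New Caledonia by kayak.",
    "Richard Barnes kayaked across the Tasman from Australia to New Zealand.",
]

def retrieve(question: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the question."""
    q_words = set(question.lower().split())
    return sorted(
        docs,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )[:k]

def build_grounded_prompt(question: str) -> str:
    """Assemble a prompt that restricts the model to vetted context."""
    context = "\n".join(retrieve(question, VETTED_DOCS))
    return (
        "Answer using ONLY the context below. If the context does not "
        "contain the answer, say you do not know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

# This prompt would then be sent to whichever LLM the business uses.
print(build_grounded_prompt("Who first kayaked from Australia to New Zealand?"))
```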

Having a centralised view of an organisation’s data ensures that enterprise AI is gathering context from a complete data source. In doing so, it is more likely to produce accurate responses. Many large organisations have myriad data silos and sources. This is particularly true for large multinationals where data is spread across multiple countries and departments. For AI to be implemented across the organisation, collapsing data silos into one location should be a priority.

AI models are exceptionally good at finding patterns within data. But the patterns they learn from can include their own responses to previous queries. So if a model is fed inaccurate data from the beginning, incorrect responses are not just produced initially – they proliferate.
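A deliberately simplified simulation makes that feedback loop visible. Assume, purely for illustration, that 5 per cent of the initial training data is wrong and that each answer carries a small chance of a fresh error; recycling the model’s own answers as training data compounds the problem generation after generation.

```python
import random

# Toy feedback-loop simulation (all rates are illustrative assumptions).
random.seed(0)

error_rate = 0.05           # 5% of the initial training data is wrong
FRESH_HALLUCINATION = 0.02  # small chance of a brand-new error per answer

for generation in range(1, 6):
    # Each answer is wrong if it inherits a data error or hallucinates afresh.
    answers_wrong = [
        random.random() < error_rate or random.random() < FRESH_HALLUCINATION
        for _ in range(10_000)
    ]
    # The model's own answers are recycled into the next round of training data.
    error_rate = sum(answers_wrong) / len(answers_wrong)
    print(f"generation {generation}: ~{error_rate:.1%} of answers incorrect")
```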

This completely undermines any AI deployment. As such, extracting value from this technology means businesses need to first develop and implement a robust data strategy before pursuing an AI strategy.

Without first ensuring the data fundamentals are in place, AI projects will be hamstrung by hallucinations.

Just as you wouldn’t kayak from Sydney to Auckland without immense preparation – a crossing Richard Barnes actually completed last year – you should never roll out AI without a comprehensive data foundation.

It is critical that organisations set a strong data foundation before implementing an AI strategy.

Theo Hourmouzis is vice-president Australia and New Zealand at Snowflake.

