The Dark side of AI Part 2: Big brother

Posted in Cyber security assessments
on July 30, 2024
by Prism Infosec

AI: Data source or data sink?

The idea of artificial intelligence is not a new one. For decades, people have been finding ways to emulate the pliable nature of the human brain, with machine learning being mankind’s latest attempt. Artificial intelligence models are expected to learn how to form appropriate responses to a given set of inputs. With each “incorrect” response, the model’s parameters are iteratively adjusted until a “correct” response is reached without further outside intervention.
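As a rough illustration of that iterative correction, the sketch below fits a single number (a "weight") so that the model's answer matches a handful of example inputs. It is a toy under stated assumptions, not how any particular product is trained: the function name `train`, the sample data and the learning rate are all made up for this example, and real models adjust billions of such weights rather than one.

```python
def train(samples, learning_rate=0.05, tolerance=1e-6, max_steps=10_000):
    """Fit one weight w so that w * x approximates y for each (x, y) pair."""
    w = 0.0  # initial guess; the model starts out "wrong"
    for _ in range(max_steps):
        # Gradient of the mean squared error: how wrong the current
        # responses are, and in which direction to nudge the weight.
        grad = sum(2 * (w * x - y) * x for x, y in samples) / len(samples)
        if abs(grad) < tolerance:
            break  # responses are now close enough to "correct"
        w -= learning_rate * grad  # small correction, then try again
    return w


# Hypothetical training data where the "correct" response is y = 2 * x.
samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
weight = train(samples)
print(round(weight, 3))
```

Nobody tells the loop that the answer is 2; it arrives there by repeatedly measuring its own error against the data it was given, which is why the quality and origin of that data matter so much.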

To achieve this, the model would be fed vast amounts of training data, which would typically include the interactions of end-users themselves. Well-known AI models, such as those behind ChatGPT and Llama, are made available to a large population. That is a lot of input for a select few entities to capture, and it would have to be stored [1] somewhere before being fed to the model.

That is also a lot of responsibility for the data holders, who must make sure it doesn’t fall into the wrong hands. In fact, in March 2023 [2] OpenAI stated that it would no longer use customer data to train its models by default; incidentally, a later report in July 2024 revealed that OpenAI had suffered a data breach in early 2023 [3]. Though the company claims no customer or partner information was accessed, at this point we only have its word to go by.

AI companies are like any other tech company: they still must store and process data, and so they carry the same targets on their backs.

The nature of nurturing AI

As with a child learning from a parent, an AI model would begin to learn from the data it is fed and may begin to spot trends in the datasets. These trends would then manifest in the form of opinions, whereby the AI would attempt to provide a response that it thinks would satisfy the user.

Putting it another way, companies would be able to leverage AI to understand preferences [4] of each user and aim to serve content or services that would closely match their tastes, arguably to a finer level of detail than traditional approaches. User data is too valuable an asset for companies and hackers alike to pass up, and it is no secret that everyone using AI would have a unique profile tailored to them.

Surpassing the creator?

It’s also no secret that, in one form or another, these profiles can be used to influence big decisions. For instance, AI is increasingly being used to aid [5] medical professionals in analysing ultrasound measurements and predicting chronic illnesses such as cardiovascular diseases. The time saved in making decisions could literally be a matter of life and death.

However, this can be turned on its head if AI is used as a crutch [6] rather than as an aid. Imagine a scenario where a company is looking to hire and decides to leverage an AI to profile all candidates before an interview. For it to work, the candidate must submit some basic personal information, which the AI would then use to scour the internet for other pieces of data pertaining to the individual. With potentially hundreds of candidates to choose from, the recruiter may lean upon the services of the AI and base their choice on its decision. Logically speaking, this would be a wise decision, as a recruiter would not want to hire someone who is qualified but has a questionable work ethic or a history of being a liability.

While this would effectively automate the same processes that a recruiter would carry out themselves, it would be disheartening for a candidate who meets the job requirements to be refused an interview on the basis of a background profile that the AI has created of them, and which may not be fully accurate. Conversely, another candidate may be hired due to a more favourable background profile when in reality they are underqualified to do the job. In both cases the profile would not be a true representation of the candidate.

Today, AI is not yet mature enough to discern what is true of a person and what is not; it sees data for what it is and acts upon it regardless. All the while, the AI would continue to violate the privacy of the user and build an imperfect profile which could potentially impact their life for better or worse.

Final conclusions

As with all things, if there is no price for the product, then the user is the product. With AI, even if users are charged a price, they will become part of the product one way or another, whatever companies say to the contrary. Many users choose to accept this so long as big tech keeps its word on keeping their information safe and secure. But one should ask: safe and secure from whom?

References

[1] https://tech.co/news/does-chatgpt-save-my-data (2023)
[2] https://techcrunch.com/2023/03/01/addressing-criticism-openai-will-no-longer-use-customer-data-to-train-its-models-by-default/ (2023)
[3] https://www.nytimes.com/2024/07/04/technology/openai-hack.html (2024)
[4] https://www.digitalocean.com/resources/article/ai-and-privacy (accessed 10/07/2024)
[5] https://www.philips.com/a-w/about/news/archive/features/2022/20221124-10-real-world-examples-of-ai-in-healthcare.html (2022)
[6] https://link.springer.com/article/10.1007/s13347-023-00616-9 (2023)

This post was written by Leon Yu.

About the author

Prism Infosec

Prism Infosec’s innovative approach to the delivery of PCI projects and technical security testing was recognised with a PCI Award for Technical Excellence in January 2020. The award was presented for the delivery of a client project that was considered by the review panel to be an outstanding example of best practice.