Data Pollution – Risks and Challenges in AI Datasets 

AI has been a hot topic in the media lately and is influencing every sector, as well as our daily lives, more than we realise. Many systems are driven by AI. The most notable are virtual assistants (Siri, Google Assistant, Alexa, etc.), but AI is also used in healthcare to detect diseases earlier, in agriculture to identify the ideal soil for planting seeds, and in content creation to generate AI scenes in movies and TV shows (Matzelle, 2024; Forristal, 2023; Brogan, 2023; Awais, 2023). AI comes with many advantages due to its ability to analyse vast amounts of data, understand patterns and make accurate predictions for a specific task (China, 2024; Likens, 2023). The future of AI is bright: these systems are likely to keep improving and to benefit industries such as healthcare and manufacturing. However, there are also concerns, such as job losses and privacy issues.

As mentioned earlier, AI analyses large datasets to make predictions or classifications without being explicitly programmed. It is therefore crucial to ensure that the datasets used for training are accurate, representative and of high quality (Ataman, 2024). One of the main challenges when working with AI is the risk of data pollution during the training stage and, for models that keep learning from usage, sometimes in production as well. Data pollution can lead to incorrect predictions or classifications and, eventually, to model degradation (Lenaerts-Bergmans, 2024). Picture it like contaminants in a river: just as they spoil the water’s purity, data pollutants spoil the integrity of the information an AI learns from. AI datasets can also be polluted by bias: including discriminatory data in training can produce outputs that harm the most discriminated-against members of society (James Manyika, 2019).

The concept behind adversarial AI attacks is quite simple. The main goal is to introduce subtle perturbations to the data that steer the output of the AI in a desired direction. The changes are so small that they are almost impossible for humans to detect, yet they can have a great impact on the final decision made by the AI model. According to Fujitsu, there are currently five known techniques that can be used against AI models: evasion, model poisoning, training data, extraction, and inference (Fujitsu).
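
To make the idea of a subtle perturbation concrete, here is a minimal sketch of the fast gradient sign method described by Goodfellow et al. (cited below for evasion attacks). It is applied to a tiny logistic-regression classifier rather than an image model. The weights, bias and input are hypothetical values chosen purely for illustration, and the function names are our own.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def sign(value):
    return (value > 0) - (value < 0)

def fgsm(weights, bias, x, label, epsilon):
    """Move every feature by at most epsilon in the direction that raises the loss.

    For logistic regression with cross-entropy loss, the gradient of the loss
    with respect to the input is (p - label) * weights.
    """
    p = predict(weights, bias, x)
    input_gradient = [(p - label) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, input_gradient)]

# Hypothetical toy model and input (assumptions, not taken from any real system).
WEIGHTS = [2.0, -1.5, 0.5]
BIAS = 0.1
original = [0.3, 0.2, 0.4]
adversarial = fgsm(WEIGHTS, BIAS, original, label=1, epsilon=0.25)

print("original score:   ", round(predict(WEIGHTS, BIAS, original), 3))
print("adversarial score:", round(predict(WEIGHTS, BIAS, adversarial), 3))
```

No feature moves by more than 0.25, yet the score crosses the 0.5 decision threshold, so the toy model changes its answer. Real attacks on image models follow the same principle with far more features and much smaller per-feature changes.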

Adversarial Techniques

Figure 1: Evasion attack by adding noise to the original image
  • Evasion: This type of attack modifies the input to influence the model’s behaviour in the malicious actor’s favour. For example, changing a few pixels in an image can cause an image recognition model to misclassify the image or fail to classify it at all (Ian J. Goodfellow, 2015).
  • Model Poisoning: This type of attack involves manipulating the training data of the AI model to steer its output towards the preferences of the malicious actor. Poisoned models can contain backdoors that produce inference errors when they receive non-standard input containing a trigger (Alina Oprea, 2024). A related real-world demonstration came in 2017, when a group of researchers showed that the Google Perspective application programming interface (API), which was designed to detect cyberbullying, abusive language, etc., could be deceived by modified input. It was possible to confuse the API by misspelling abusive words and adding punctuation between letters (Hossein Hosseini, 2017). Because the researchers altered the input rather than the training data, this example is arguably closer to evasion than to poisoning.
Figure 2: Toxicity score affected by deliberate misspelling or added punctuation.
  • Training Data: In very rare cases, malicious actors gain access to the datasets used to train the machine learning model. The attacker then performs data poisoning, intentionally injecting vulnerabilities into the model during training. For example, the model could be trained to be sensitive to a specific pattern and then distributed publicly for consumers and businesses to integrate into their applications or systems. Figure 3 illustrates malicious actors inserting a white box as a trigger during training of the machine learning model (Pu Zhao, 2023). The obvious risk of this attack is that inputs are classified incorrectly, resulting in less accurate outputs from the AI model.
Figure 3: Backdoored images for datasets
  • Extraction: The objective of this attack is to copy or steal a proprietary AI model by probing it and sampling its inputs and outputs to extract valuable information such as model weights, biases and, in some cases, its training data, which may then be used to build a similar model (Hailong Hu, 2021). An example could be probing the pedestrian detection system in a self-driving car: crafted input data is fed into the original model and its predictions are recorded. Based on these input-output pairs, the malicious actor can try to reconstruct the original model and create a stolen copy. The stolen model can then be used to find evasion cases that fool the original model (Bosch AIShield, 2022).
Figure 4: Original vs stolen AI model
  • Inference: This attack targets a machine learning model to leak sensitive information about its training data by probing it with different inputs and weighing the outputs. Privacy is a concern with this attack, as the datasets could contain sensitive information such as names, addresses and birth dates. An example attack could involve a malicious actor submitting various records to an AI model to determine, based on the output, whether those records were part of the training dataset. “In general, AI models output stronger confidence scores when they are fed with their training examples, as opposed to new and unseen examples” (Bosch AIShield, 2022).
Figure 5: Inference attack on a facial recognition system
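
The inference attack above relies on models being more confident about records they were trained on. The sketch below shows that idea with a deliberately simplified stand-in for an overfitted model. The "model", the sample points and the 0.9 threshold are all hypothetical. A real attacker would only see the confidence scores returned by the target system.

```python
import math

def make_overfit_model(training_points):
    """Toy stand-in for an overfitted model (an assumption for illustration):
    confidence decays with distance from the nearest memorised training point."""
    def confidence(x):
        nearest = min(math.dist(x, t) for t in training_points)
        return math.exp(-nearest)
    return confidence

def infer_membership(confidence, candidates, threshold=0.9):
    """Guess that a record was in the training set when confidence is high."""
    return [confidence(x) >= threshold for x in candidates]

# Hypothetical training records, hidden from the attacker in a real scenario.
TRAIN = [(0.0, 0.0), (1.0, 2.0), (3.0, 1.0)]
model = make_overfit_model(TRAIN)

# Records the attacker wants to test for membership.
candidates = [(1.0, 2.0), (2.0, 2.0), (3.0, 1.0), (5.0, 5.0)]
guesses = infer_membership(model, candidates)

for record, guess in zip(candidates, guesses):
    print(record, "likely in training data" if guess else "likely unseen")
```

The attacker never reads the training set directly. The gap in confidence between memorised and unseen records is enough to reveal which candidates were used for training.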

Biases in AI

Like humans, generative AI is not immune to bias, and depending on certain factors its output can be unfair or unjust. Bias can occur at different stages of the AI pipeline, such as data collection, data labelling/classification, model training and deployment (Chapman University, n.d.).

  • Data Collection: Bias can occur at this stage in two main ways: the data collected is unrepresentative of reality, or it reflects existing prejudices. In the former case, if a face recognition algorithm is fed more photos of light-skinned faces than dark-skinned faces, it could be worse at detecting dark-skinned faces. As for the latter, Amazon discovered that its internal machine-learning recruiting engine was dismissing women. It had been trained on historical decisions that generally favoured men over women, so the AI learned to do the same (Dastin, 2018).
  • Data Labelling/Classification: This phase can introduce bias because annotators can interpret the same label or data differently. Incorrect data annotation can lead to biased datasets that perpetuate stereotypes and inequalities. An example came in 2019, when it was discovered that Google’s hate speech detection AI was racially biased. Two algorithms were examined: one incorrectly flagged 46% of tweets by African-American authors as offensive, while the other, which had a larger dataset, was 1.5 times more likely to incorrectly label posts by African-American authors as offensive (Jee, 2019).
  • Model Training: If the training dataset is not diverse and balanced or the deep learning model architecture is not capable of handling diverse inputs, the model is very likely to produce biased outputs.
  • Deployment: Bias can occur in this phase if the model is not tested with diverse inputs or if it’s not monitored for bias after deployment. For example, the US criminal justice system uses AI risk assessment tools to predict whether a convicted criminal is likely to reoffend, and judges use the resulting recidivism score to determine rehabilitation services, severity of sentences, etc. The issue extends beyond the model learning from historically biased data: it also encompasses the model learning from present data, which is continually being influenced by existing biases (Hao, 2019).
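
The hate speech example above is essentially a gap in false positive rates between groups, which is something that can be measured during testing and after deployment. The following is a minimal sketch of such a check. The record format, group names and counts are invented for illustration and do not come from the cited study.

```python
from collections import defaultdict

def false_positive_rates(records):
    """records: iterable of (group, is_offensive, flagged) tuples.

    Returns, per group, the share of non-offensive posts that were
    wrongly flagged as offensive.
    """
    benign = defaultdict(int)
    wrongly_flagged = defaultdict(int)
    for group, is_offensive, flagged in records:
        if not is_offensive:
            benign[group] += 1
            wrongly_flagged[group] += int(flagged)
    return {group: wrongly_flagged[group] / count for group, count in benign.items()}

# Hypothetical audit sample, not real data.
RECORDS = (
    [("group_a", False, False)] * 9 + [("group_a", False, True)] * 1
    + [("group_b", False, False)] * 6 + [("group_b", False, True)] * 4
    + [("group_a", True, True)] * 5 + [("group_b", True, True)] * 5
)

rates = false_positive_rates(RECORDS)
ratio = rates["group_b"] / rates["group_a"]
print(rates, "ratio:", round(ratio, 2))
```

In this made-up sample, benign posts from group_b are flagged four times as often as those from group_a, the kind of disparity that should trigger a closer look at the labels and the model.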

Types of Bias in AI

  • Selection Bias: This happens when the data used for training the AI model is not large enough, is not representative of the reality it’s meant to model, or is too incomplete to sufficiently train the model. For example, if a model is trained on data covering exclusively male employees, it will not be able to make accurate predictions about female employees.
  • Confirmation Bias: This happens when the AI model relies too much on pre-existing beliefs or trends in data. This reinforces existing biases and makes the model unlikely to identify new patterns and trends. For example, if we use AI to research different political candidates, how questions are phrased becomes very important. Questions such as “Why should I vote for X instead of Y?” and “What are the strengths of candidate X and candidate Y?” will return different results, and the first may prompt the model to reinforce our initial thought pattern.
  • Measurement Bias: This bias is caused by incomplete data or data that is systematically different from the actual variables of interest. For example, if a model is trained to predict students’ success rates, but the data only includes students who have completed the course, the model will miss the factors that cause students to drop out.
  • Stereotyping Bias: This is the simplest bias to understand, as humans also act and make decisions based on stereotypes, both consciously and unconsciously. It occurs when an AI system reinforces harmful stereotypes. For example, a facial recognition system might be less accurate at identifying people of colour. Another example could be language translation systems associating some languages with certain genders or ethnic stereotypes.
  • Out-group Homogeneity Bias: This occurs when an AI system is less capable of distinguishing between individuals who are not part of the majority group in the training data. This can lead to racial bias, misclassification, inaccuracy, and incorrect answers. It mirrors a human tendency: people usually have a better understanding of individuals who belong to their own group and tend to see them as more diverse than members of groups they have no association with.
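
Selection bias in particular can often be spotted before any training happens, by comparing how groups are represented in the dataset with how they are represented in the population the model is meant to serve. The sketch below assumes a hypothetical dataset of employee records and a hypothetical 50/50 reference split.

```python
from collections import Counter

def representation_gaps(samples, reference_shares):
    """Difference between each group's share of the dataset and its expected share.

    Positive values mean the group is over-represented, negative values
    mean it is under-represented.
    """
    counts = Counter(samples)
    total = sum(counts.values())
    return {
        group: counts.get(group, 0) / total - expected
        for group, expected in reference_shares.items()
    }

# Hypothetical training set and reference population (assumptions for illustration).
training_groups = ["male"] * 90 + ["female"] * 10
reference = {"male": 0.5, "female": 0.5}

gaps = representation_gaps(training_groups, reference)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.0%}")
```

A large negative gap, as for the female group here, is a warning that predictions about that group will rest on very little evidence.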

Protecting AI against Adversarial Attacks

Creating a robust AI model and protecting it against adversaries is a challenging task that requires in-depth knowledge of the sophisticated attacks adversaries may use. Adversarial techniques are also constantly evolving, and AI systems must face attacks that they weren’t trained to withstand (Fujitsu). While no technique can guarantee 100% protection against adversarial attacks, there are methods that mitigate the impact of the attacks described earlier and increase the overall defence capability of an AI model.

Proactive Defence – Adversarial Training

This is a brute-force method of teaching the AI model: a vast number of diverse adversarial examples are generated and used as training inputs so that the model learns to classify them as malicious or intentionally misleading. This can teach the model to recognise manipulation attempts by treating itself as a target and defending against such attacks. The downside is that we cannot generate every type of adversarial input: there are many permutations, and only a subset of them can be fed to the model in a given time frame (Ram, 2023). Adversarial training should be a continuous process, as new attacks continue to be discovered and the model needs to evolve to respond to these threats.
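
As a rough illustration, the sketch below trains a small logistic-regression model on clean data plus adversarial examples that are regenerated against the current weights at every step. Two assumptions are worth flagging. First, the adversarial examples are produced with the fast gradient sign method. Second, they keep their true labels, which is the common textbook variant and differs slightly from training the model to flag them as malicious. The data, step size and perturbation budget are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fgsm(w, b, x, y, eps):
    """Perturb x by eps per feature in the direction that raises the loss."""
    err = predict(w, b, x) - y
    return [xi + eps * ((err * wi > 0) - (err * wi < 0)) for xi, wi in zip(x, w)]

def adversarial_train(data, eps, lr=0.1, epochs=200):
    """Full-batch gradient descent on clean plus freshly generated adversarial points."""
    dim = len(data[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        # Regenerate adversarial examples against the current weights.
        batch = data + [(fgsm(w, b, x, y, eps), y) for x, y in data]
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in batch:
            err = predict(w, b, x) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        w = [wi - lr * g / len(batch) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(batch)
    return w, b

def accuracy(w, b, data):
    return sum((predict(w, b, x) >= 0.5) == bool(y) for x, y in data) / len(data)

# Hypothetical, well-separated toy data: (features, label).
DATA = [([-2.0, -1.5], 0), ([-1.5, -2.0], 0), ([-2.5, -2.0], 0), ([-2.0, -2.5], 0),
        ([2.0, 1.5], 1), ([1.5, 2.0], 1), ([2.5, 2.0], 1), ([2.0, 2.5], 1)]
EPS = 0.5

w, b = adversarial_train(DATA, EPS)
attacked = [(fgsm(w, b, x, y, EPS), y) for x, y in DATA]
print("clean accuracy:   ", accuracy(w, b, DATA))
print("attacked accuracy:", accuracy(w, b, attacked))
```

The limitation described above shows up directly in the code: the model only ever sees perturbations of one type and one size (EPS), so attacks outside that family are not covered.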

Reactive Defence – Input Sanitation and Monitoring

This type of defence involves continuously monitoring the AI/ML system for adversarial attacks and preprocessing input data to remove any malicious perturbations (Nightfall AI, n.d.). Continuous monitoring can feed user and entity behaviour analytics (UEBA), which can in turn be used to establish a behavioural baseline for the ML model. This baseline then aids the detection of anomalous patterns of behaviour or usage within the AI models.
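
Both halves of this defence can be sketched in a few lines. The first function strips punctuation wedged between letters, the trick used against the Perspective API earlier in this post. The second compares a monitored metric against a behavioural baseline using a simple z-score. The regular expression, the choice of metric, the sample values and the threshold of 3 are all assumptions made for illustration, not a recommended configuration.

```python
import re
import statistics

def sanitise(text):
    """Remove punctuation placed between letters, e.g. 's.t.u.p.i.d' -> 'stupid'.

    Caveat: this also changes legitimate words such as "don't" or
    "well-known", and it does not undo deliberate misspellings.
    """
    return re.sub(r"(?<=[A-Za-z])[^\w\s]+(?=[A-Za-z])", "", text)

def is_anomalous(baseline, value, threshold=3.0):
    """Flag a value that lies more than `threshold` standard deviations
    from the mean of the baseline observations."""
    mean = statistics.fmean(baseline)
    spread = statistics.stdev(baseline)
    return spread > 0 and abs(value - mean) / spread > threshold

# Input sanitisation before the text reaches the model.
print(sanitise("You are s.t.u.p.i.d!"))

# Hypothetical baseline: the model's average confidence on five normal days.
baseline_confidence = [0.91, 0.93, 0.92, 0.94, 0.90]
print(is_anomalous(baseline_confidence, 0.93))  # a typical day
print(is_anomalous(baseline_confidence, 0.60))  # a sudden drop worth investigating
```

In practice the baseline would be built from many more observations and several metrics, but the principle is the same: learn what normal looks like, then alert on departures from it.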

Minimising Bias in AI

Minimising bias in AI can be very challenging, as models have become very complex and, compared with earlier versions, are now used to make important decisions. Some individuals and organisations consider it an impossible task, but there are five measures that can be implemented to reduce AI bias (Mitra Best, 2022).

  • Identify your unique vulnerabilities: Different industries face different kinds of risks when AI bias contaminates datasets and results in negative consequences. Determine the specific vulnerabilities for your industry and define the potential biases that could affect the AI system. Prioritise your mitigations based on the financial, operational, and reputational risks.
  • Control your data: Focus on historical and third-party data and remove any potentially biased patterns or correlations. Well-designed “synthetic data” can be used to fill the gaps in datasets and reduce bias.
  • Govern AI at AI speed: There should be easily understandable governance frameworks and toolkits that include common definitions and controls to support AI specialists, businesses and consumers in the identification of any issues.
  • Diversify your team: Build a diverse team to help reduce the potential risk of bias. People of different racial and gender identities and economic backgrounds will often notice different biases that are commonly missed if only one group of people is scrutinising the dataset.
  • Validate independently and continuously: Add an independent line of defence, such as an independent internal team or a trusted third party, to analyse the dataset and algorithm for fairness.
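
One check an independent reviewer might run is a comparison of selection rates between groups, similar to what would have exposed the recruiting example mentioned earlier. The sketch below computes the ratio of the lowest to the highest selection rate. The decisions are invented, and the 0.8 cut-off follows the widely used "four-fifths" rule of thumb, which is an assumption here rather than something the cited sources prescribe.

```python
from collections import defaultdict

def selection_rates(decisions):
    """decisions: iterable of (group, was_selected) pairs -> share selected per group."""
    totals, selected = defaultdict(int), defaultdict(int)
    for group, was_selected in decisions:
        totals[group] += 1
        selected[group] += int(was_selected)
    return {group: selected[group] / totals[group] for group in totals}

def disparate_impact(decisions):
    """Ratio of the lowest group selection rate to the highest (1.0 means parity)."""
    rates = selection_rates(decisions)
    highest = max(rates.values())
    return min(rates.values()) / highest if highest else 1.0

# Hypothetical screening decisions produced by a model under review.
DECISIONS = (
    [("men", True)] * 10 + [("men", False)] * 10
    + [("women", True)] * 4 + [("women", False)] * 16
)

ratio = disparate_impact(DECISIONS)
needs_review = ratio < 0.8  # four-fifths rule of thumb (assumed threshold)
print(round(ratio, 2), "needs review" if needs_review else "within threshold")
```

Running a check like this on every model release, rather than once, is what turns validation into the continuous process the last measure calls for.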

This post was written by Shinoj Joni

References

Alina Oprea, A. V. (2024, January 4). Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations. Retrieved from NIST: https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.100-2e2023.ipd.pdf

Ataman, A. (2024, January 3). Data Quality in AI: Challenges, Importance & Best Practices in ’24. Retrieved from AIMultiple: https://research.aimultiple.com/data-quality-ai/

Awais, M. N. (2023, December 7). AI and machine learning for soil analysis: an assessment of sustainable agricultural practices. Retrieved from National Center for Biotechnology Information: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10992573/

Bosch AIShield. (2022). AI SECURITY – WHITE PAPER. Retrieved from Bosch AIShield: https://www.boschaishield.com/resources/whitepaper/#:~:text=Objective%20of%20the%20whitepaper&text=Addressing%20the%20security%20needs%20can,gaps%20and%20realize%20the%20needs.

Brogan, C. (2023, November 17). New AI tool detects up to 13% more breast cancers than humans alone. Retrieved from Imperial College London: https://www.imperial.ac.uk/news/249573/new-ai-tool-detects-13-more/

Chapman University. (n.d.). Bias in AI. Retrieved from Chapman University: https://www.chapman.edu/ai/bias-in-ai.aspx#:~:text=Types%20of%20Bias%20in%20AI&text=Selection%20bias%3A%20This%20happens%20when,lead%20to%20an%20unrepresentative%20dataset.

China, C. R. (2024, January 10). Breaking down the advantages and disadvantages of artificial intelligence. Retrieved from IBM: https://www.ibm.com/blog/breaking-down-the-advantages-and-disadvantages-of-artificial-intelligence/

Dastin, J. (2018, October 11). Insight – Amazon scraps secret AI recruiting tool that showed bias against women. Retrieved from Reuters: https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G/

Forristal, L. (2023, June 21). Artists are upset that ‘Secret Invasion’ used AI art for opening credits. Retrieved from TechCrunch: https://techcrunch.com/2023/06/21/marvel-secret-invasion-ai-art-opening-credits/?guccounter=1

Fujitsu. (n.d.). Adversarial AI Fooling the Algorithm in the Age of Autonomy. Retrieved from Fujitsu: https://www.fujitsu.com/uk/imagesgig5/7729-001-Adversarial-Whitepaper-v1.0.pdf

Hailong Hu, J. P. (2021, December 6). Stealing Machine Learning Models: Attacks and Countermeasures for Generative Adversarial Networks. Retrieved from Association for Computing Machinery Digital Library: https://dl.acm.org/doi/fullHtml/10.1145/3485832.3485838#

Hao, K. (2019, January 21). AI is sending people to jail—and getting it wrong. Retrieved from MIT Technology Review: https://www.technologyreview.com/2019/01/21/137783/algorithms-criminal-justice-ai/

Hossein Hosseini, S. K. (2017, February 27). Deceiving Google’s Perspective API Built for Detecting Toxic Comments. Retrieved from arXiv: https://arxiv.org/pdf/1702.08138

Ian J. Goodfellow, J. S. (2015, March 20). EXPLAINING AND HARNESSING ADVERSARIAL EXAMPLES. Retrieved from arXiv: https://arxiv.org/pdf/1412.6572

James Manyika, J. S. (2019, October 25). What Do We Do About the Biases in AI? Retrieved from Harvard Business Review: https://hbr.org/2019/10/what-do-we-do-about-the-biases-in-ai

Jee, C. (2019, August 13). Google’s algorithm for detecting hate speech is racially biased. Retrieved from MIT Technology Review: https://www.technologyreview.com/2019/08/13/133757/googles-algorithm-for-detecting-hate-speech-looks-racially-biased/

Lenaerts-Bergmans, B. (2024, March 20). Data Poisoning: The Exploitation of Generative AI. Retrieved from CrowdStrike: https://www.crowdstrike.com/cybersecurity-101/cyberattacks/data-poisoning/

Likens, S. (2023). How can AI benefit society? Retrieved from PwC: https://www.pwc.com/gx/en/about/global-annual-review/artificial-intelligence.html

Matzelle, E. (2024, February 29). Top Artificial Intelligence Statistics and Facts for 2024. Retrieved from CompTIA: https://connect.comptia.org/blog/artificial-intelligence-statistics-facts

Mitra Best, A. R. (2022, January 18). Understanding algorithmic bias and how to build trust in AI. Retrieved from PwC: https://www.pwc.com/us/en/tech-effect/ai-analytics/algorithmic-bias-and-trust-in-ai.html

Nightfall AI. (n.d.). Adversarial Attacks and Perturbations. Retrieved from Nightfall AI: https://www.nightfall.ai/ai-security-101/adversarial-attacks-and-perturbations

Pu Zhao, P.-Y. C. (2023, October 22). Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness. Retrieved from OpenReview: https://openreview.net/attachment?id=SJgwzCEKwH&name=original_pdf

Ram, T. (2023, June 22). Exploring the Use of Adversarial Learning in Improving Model Robustness. Retrieved from Analytics Vidhya: https://www.analyticsvidhya.com/blog/2023/02/exploring-the-use-of-adversarial-learning-in-improving-model-robustness/
