AI has been a hot topic in the media lately and is influencing every sector, as well as our daily lives, without us realising just how much. There are various systems driven by AI, the most notable being virtual assistants (Siri, Google Assistant, Alexa, etc.), but AI is also used in healthcare to detect diseases earlier, in agriculture to identify the ideal soil for planting seeds, and even in content creation to generate AI scenes in movies and TV shows (Matzelle, 2024; Forristal, 2023; Brogan, 2023; Awais, 2023). AI comes with many advantages due to its ability to analyse vast amounts of data, understand patterns and make accurate predictions for a specific task (China, 2024; Likens, 2023). The future of AI is bright, as these systems will only get better with time and improve industries like healthcare and manufacturing; however, there are concerns as well, such as job losses and privacy issues.
As mentioned earlier, AI analyses large datasets to make predictions or classifications without being explicitly programmed, so it is crucial to ensure that the datasets used for training are accurate, representative and of high quality (Ataman, 2024). One of the main challenges when working with AI is the risk of data pollution at the training stage, and sometimes even in production when the model learns from usage. The implications of polluted datasets are incorrect predictions or classifications, which can result in eventual model degradation (Lenaerts-Bergmans, 2024). Picture it like contaminants in a river: just as they compromise the water's purity, data pollutants compromise the integrity of the information an AI model learns from. Another way for AI datasets to be polluted is via bias, where discriminatory data included in training can end up harming the most discriminated-against members of society (James Manyika, 2019).
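To make the river analogy concrete, here is a minimal sketch of one simple form of data pollution, flipping a fraction of training labels, and its effect on accuracy. It assumes scikit-learn and uses a synthetic dataset, so every figure it prints is illustrative rather than taken from any of the cited sources:

```python
# Minimal sketch (scikit-learn assumed, synthetic data) of how label-flipping
# "pollution" in the training set can degrade a simple classifier.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Baseline model trained on clean data
clean_model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("clean accuracy   :", clean_model.score(X_test, y_test))

# Pollute 20% of the training labels to simulate poisoning
rng = np.random.default_rng(0)
poisoned = y_train.copy()
flip_idx = rng.choice(len(poisoned), size=int(0.2 * len(poisoned)), replace=False)
poisoned[flip_idx] = 1 - poisoned[flip_idx]

poisoned_model = LogisticRegression(max_iter=1000).fit(X_train, poisoned)
print("poisoned accuracy:", poisoned_model.score(X_test, y_test))
```

The drop in test accuracy on the polluted run is the "model degradation" referred to above, emerging purely from bad training data rather than from any change to the model itself.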
Adversarial AI attack concepts are quite simple to understand. The main goal is to introduce subtle perturbations to the data the model sees so that the output of the AI is altered in a way the attacker desires. The changes are so small that they are almost impossible for humans to detect, yet they can have a great impact on the final decision made by the AI model. According to Fujitsu, there are currently five known techniques that can be used against AI models: evasion, model poisoning, training data manipulation, extraction, and inference (Fujitsu, n.d.).
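As a rough illustration of how such a perturbation works, the sketch below applies an FGSM-style step (in the spirit of Goodfellow et al., 2015) to a toy logistic-regression classifier: the input is nudged along the sign of the loss gradient until the decision flips. The weights, input values and step size are invented for the example and are not taken from the Fujitsu paper:

```python
# Toy evasion-style perturbation (NumPy only). The "trained" weights and the
# input below are illustrative assumptions, not values from a real system.
import numpy as np

w = np.array([1.5, -2.0, 0.5])   # weights of a toy linear classifier (illustrative)
b = 0.1
x = np.array([0.5, 0.1, 0.2])    # a legitimate input the model classifies correctly
y = 1.0                          # its true label

def predict_proba(x):
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

# Gradient of the logistic loss with respect to the input
grad_x = (predict_proba(x) - y) * w

# Small step in the direction that increases the loss (exaggerated for visibility)
epsilon = 0.3
x_adv = x + epsilon * np.sign(grad_x)

print("original prediction   :", predict_proba(x))    # above 0.5 -> class 1
print("adversarial prediction:", predict_proba(x_adv)) # below 0.5 -> class 0
print("max change per feature:", np.max(np.abs(x_adv - x)))
```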
Like humans, generative AI is not immune to bias, and depending on certain factors its output can be unfair or unjust. Bias can occur at different stages of the AI pipeline, such as data collection, data labelling/classification, model training and deployment (Chapman University, n.d.).
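As a hypothetical example of catching bias at the data collection stage, the short sketch below compares how often each demographic group appears in a collected dataset against the share it would be expected to have. The group names, proportions and threshold are all assumptions made for illustration:

```python
# Minimal sketch (synthetic data, hypothetical group labels) of a selection-bias
# check at the data-collection stage of the pipeline.
import numpy as np

rng = np.random.default_rng(7)
# Hypothetical demographic label attached to each collected training record
groups = rng.choice(["group_a", "group_b", "group_c"], size=10_000, p=[0.7, 0.2, 0.1])

# Assumed share of each group in the population the dataset should reflect
expected_share = {"group_a": 0.5, "group_b": 0.3, "group_c": 0.2}

for group, expected in expected_share.items():
    observed = float(np.mean(groups == group))
    flag = "UNDER-REPRESENTED" if observed < 0.8 * expected else "ok"
    print(f"{group}: observed {observed:.2f} vs expected {expected:.2f} -> {flag}")
```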
Creating a robust AI model and protecting it against adversaries is a challenging task that requires in-depth knowledge of the sophisticated attacks they may use. Adversarial techniques are also constantly evolving, and AI systems must face attacks that they were not trained to withstand (Fujitsu, n.d.). While no technique can guarantee 100% protection against adversarial attacks, there are some methods to mitigate the impact of the previously mentioned attacks on an AI system and to increase the overall defensive capability of an AI model.
Proactive Defence – Adversarial Training
This is a brute-force method of teaching the AI model by generating vast amounts of diverse adversarial examples and feeding them in as training inputs, so that the model learns to classify them as malicious or intentionally misleading. By treating itself as a target, the model learns to recognise attempts at training data manipulation and to defend against such attacks, as shown in the sketch below. The downside to this defence method is that we cannot generate every type of adversarial input: there are too many permutations, and only a subset of them can be fed to the model in a given time frame (Ram, 2023). Adversarial training should therefore be a continuous process, as new attacks are discovered every day and the model needs to evolve to respond to these threats.
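The sketch below shows one simple version of this idea: craft FGSM-style perturbed copies of the training inputs, keep their original labels, and retrain on the combined set. It assumes scikit-learn and synthetic data, and is an assumption-laden toy rather than the exact procedure described by Ram (2023):

```python
# Minimal adversarial-training sketch (scikit-learn assumed, synthetic data):
# augment the training set with perturbed copies and retrain.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

def fgsm_examples(model, X, y, epsilon=0.3):
    """FGSM-style perturbations for a linear model: step each input along the
    sign of the logistic-loss gradient with respect to that input."""
    p = model.predict_proba(X)[:, 1]
    grad = (p - y)[:, None] * model.coef_[0]
    return X + epsilon * np.sign(grad)

# Accuracy of the undefended model on adversarially perturbed test data
X_test_adv = fgsm_examples(model, X_test, y_test)
print("robust accuracy before:", model.score(X_test_adv, y_test))

# Retrain on clean + adversarial examples (labels unchanged)
X_aug = np.vstack([X_train, fgsm_examples(model, X_train, y_train)])
y_aug = np.concatenate([y_train, y_train])
robust_model = LogisticRegression(max_iter=1000).fit(X_aug, y_aug)

X_test_adv2 = fgsm_examples(robust_model, X_test, y_test)
print("robust accuracy after :", robust_model.score(X_test_adv2, y_test))
```

In practice the adversarial examples would be regenerated against the current model on every pass, which is exactly why this becomes the continuous process described above.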
Reactive Defence – Input Sanitisation and Monitoring
This type of defence involves continuously monitoring the AI/ML system for adversarial attacks and preprocessing input data to remove any malicious perturbations (Nightfall AI, n.d.). Continuous monitoring can be used for user and entity behaviour analytics (UEBA), which can in turn be used to establish a behavioural baseline for the ML model. That baseline then aids in detecting anomalous patterns of behaviour or usage within the AI model, as in the sketch below.
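A minimal sketch of what this could look like in practice follows. The baseline statistics, z-score threshold and clipping rule are illustrative assumptions rather than a prescribed technique from Nightfall AI:

```python
# Reactive-defence sketch (NumPy only, illustrative thresholds): sanitise incoming
# feature vectors by clipping them to ranges seen in training, and flag inputs
# that drift far from the training-time behavioural baseline.
import numpy as np

rng = np.random.default_rng(42)
X_train = rng.normal(0.0, 1.0, size=(5000, 8))   # stand-in for historical training data

# Baseline statistics established from training data
feature_min = X_train.min(axis=0)
feature_max = X_train.max(axis=0)
feature_mean = X_train.mean(axis=0)
feature_std = X_train.std(axis=0)

def sanitise(x):
    """Clip each feature into the range observed during training."""
    return np.clip(x, feature_min, feature_max)

def is_anomalous(x, z_threshold=4.0):
    """Flag inputs whose z-score on any feature exceeds the baseline threshold."""
    z = np.abs((x - feature_mean) / feature_std)
    return bool(np.any(z > z_threshold))

incoming = np.array([0.2, -0.5, 9.0, 0.1, 0.0, -0.3, 0.7, 0.4])  # suspicious spike in feature 2
if is_anomalous(incoming):
    print("alert: input deviates from behavioural baseline, routing for review")
model_input = sanitise(incoming)
print("sanitised input:", model_input)
```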
Minimising bias in AI can be very challenging, as models have become far more complex than earlier versions and are now used to make important decisions. Some individuals and organisations consider it an impossible task, but there are five measures that can be implemented to reduce AI bias (Mitra Best, 2022).
This post was written by Shinoj Joni
Alina Oprea, A. V. (2024, January 4). Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations. Retrieved from NIST: https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.100-2e2023.ipd.pdf
Ataman, A. (2024, January 3). Data Quality in AI: Challenges, Importance & Best Practices in ’24. Retrieved from AIMultiple: https://research.aimultiple.com/data-quality-ai/
Awais, M. N. (2023, December 7). AI and machine learning for soil analysis: an assessment of sustainable agricultural practices. Retrieved from National Center for Biotechnology Information: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10992573/
Bosch AIShield. (2022). AI SECURITY – WHITE PAPER. Retrieved from Bosch AIShield: https://www.boschaishield.com/resources/whitepaper/#:~:text=Objective%20of%20the%20whitepaper&text=Addressing%20the%20security%20needs%20can,gaps%20and%20realize%20the%20needs.
Brogan, C. (2023, November 17). New AI tool detects up to 13% more breast cancers than humans alone. Retrieved from Imperial College London: https://www.imperial.ac.uk/news/249573/new-ai-tool-detects-13-more/
Chapman University. (n.d.). Bias in AI. Retrieved from Chapman University: https://www.chapman.edu/ai/bias-in-ai.aspx#:~:text=Types%20of%20Bias%20in%20AI&text=Selection%20bias%3A%20This%20happens%20when,lead%20to%20an%20unrepresentative%20dataset.
China, C. R. (2024, January 10). Breaking down the advantages and disadvantages of artificial intelligence. Retrieved from IBM: https://www.ibm.com/blog/breaking-down-the-advantages-and-disadvantages-of-artificial-intelligence/
Dastin, J. (2018, October 11). Insight – Amazon scraps secret AI recruiting tool that showed bias against women. Retrieved from Reuters: https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G/
Forristal, L. (2023, June 21). Artists are upset that ‘Secret Invasion’ used AI art for opening credits. Retrieved from TechCrunch: https://techcrunch.com/2023/06/21/marvel-secret-invasion-ai-art-opening-credits/?guccounter=1
Fujitsu. (n.d.). Adversarial AI Fooling the Algorithm in the Age of Autonomy. Retrieved from Fujitsu: https://www.fujitsu.com/uk/imagesgig5/7729-001-Adversarial-Whitepaper-v1.0.pdf
Hailong Hu, J. P. (2021, December 6). Stealing Machine Learning Models: Attacks and Countermeasures for Generative Adversarial Networks. Retrieved from Association for Computing Machinery Digital Library: https://dl.acm.org/doi/fullHtml/10.1145/3485832.3485838#
Hao, K. (2019, January 21). AI is sending people to jail—and getting it wrong. Retrieved from MIT Technology Review: https://www.technologyreview.com/2019/01/21/137783/algorithms-criminal-justice-ai/
Hossein Hosseini, S. K. (2017, February 27). Deceiving Google’s Perspective API Built for Detecting Toxic Comments. Retrieved from arXiv: https://arxiv.org/pdf/1702.08138
Ian J. Goodfellow, J. S. (2015, March 20). Explaining and Harnessing Adversarial Examples. Retrieved from arXiv: https://arxiv.org/pdf/1412.6572
James Manyika, J. S. (2019, October 25). What Do We Do About the Biases in AI? Retrieved from Harvard Business Review: https://hbr.org/2019/10/what-do-we-do-about-the-biases-in-ai
Jee, C. (2019, August 13). Google’s algorithm for detecting hate speech is racially biased. Retrieved from MIT Technology Review: https://www.technologyreview.com/2019/08/13/133757/googles-algorithm-for-detecting-hate-speech-looks-racially-biased/
Lenaerts-Bergmans, B. (2024, March 20). Data Poisoning: The Exploitation of Generative AI. Retrieved from CrowdStrike: https://www.crowdstrike.com/cybersecurity-101/cyberattacks/data-poisoning/
Likens, S. (2023). How can AI benefit society? Retrieved from PwC: https://www.pwc.com/gx/en/about/global-annual-review/artificial-intelligence.html
Matzelle, E. (2024, February 29). Top Artificial Intelligence Statistics and Facts for 2024. Retrieved from CompTIA: https://connect.comptia.org/blog/artificial-intelligence-statistics-facts
Mitra Best, A. R. (2022, January 18). Understanding algorithmic bias and how to build trust in AI. Retrieved from PwC: https://www.pwc.com/us/en/tech-effect/ai-analytics/algorithmic-bias-and-trust-in-ai.html
Nightfall AI. (n.d.). Adversarial Attacks and Perturbations. Retrieved from Nightfall AI: https://www.nightfall.ai/ai-security-101/adversarial-attacks-and-perturbations
Pu Zhao, P.-Y. C. (2023, October 22). Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness. Retrieved from OpenReview: https://openreview.net/attachment?id=SJgwzCEKwH&name=original_pdf
Ram, T. (2023, June 22). Exploring the Use of Adversarial Learning in Improving Model Robustness. Retrieved from Analytics Vidhya: https://www.analyticsvidhya.com/blog/2023/02/exploring-the-use-of-adversarial-learning-in-improving-model-robustness/