Threat Analysis
Building Trustworthy AI: Contending with Data Poisoning
Executive Summary
As Artificial Intelligence (AI) and Machine Learning (ML) systems are adopted and integrated globally, the threat of data poisoning attacks remains a significant concern for developers and organizations deploying AI technologies. This paper explores the landscape of data poisoning attacks, their impacts, and the strategies being developed to mitigate the threat.
Key Findings
- The field of AI security is rapidly evolving, with emerging threats and innovative defense mechanisms continually shaping the landscape of data poisoning and its countermeasures.
- Data poisoning attacks can compromise AI/ML model performance, introduce bias, or create backdoors for later malicious exploitation of AI/ML systems.
- There are diverse types of data poisoning attacks, ranging from mislabeling and data injection attacks to more sophisticated techniques like split-view poisoning and backdoor tampering.
- Real-world examples, such as the attacks on Google’s Gmail spam filter and Microsoft’s Tay chatbot, demonstrate the practical risks and potential consequences of data poisoning.
- Data poisoning attacks can have far-reaching impacts, affecting critical systems in healthcare, finance, autonomous vehicles, and other domains, potentially leading to significant economic and societal consequences.
- Mitigation strategies against data poisoning range from robust data validation and sanitization techniques to advanced monitoring and detection systems, adversarial training, and secure data handling practices.
Introduction
AI and ML systems are being rapidly adopted across sectors, from healthcare and finance to autonomous vehicles and social media. As these technologies continue to evolve, threat actors are already seeking to adapt to and exploit new vulnerabilities. One of these vulnerabilities is data poisoning.
Data poisoning occurs when a threat actor intentionally compromises the training dataset used by an AI or ML model in order to manipulate or degrade the model, or to introduce specific vulnerabilities for future exploitation (see source 1 in appendix). These attacks can cause AI systems to make incorrect decisions, exhibit bias, or fail completely. As organizations increasingly rely on AI/ML systems for critical decision-making, the threat of data poisoning attacks becomes more urgent.
Modern deep learning models are trained on massive datasets, often containing billions of samples automatically crawled from the internet (see source 2 in appendix). While this scale has enabled significant advancements in AI capabilities, it has also introduced new vulnerabilities. Poisoning even a minuscule fraction (as little as 0.001%) of these large, uncurated datasets can be sufficient to introduce targeted mistakes in a model’s behavior (see source 3 in appendix).
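To put that fraction in perspective, the back-of-the-envelope sketch below shows how few samples an attacker would need to control. The dataset size is a hypothetical figure chosen purely for illustration, not drawn from any specific training corpus.

```python
# Illustrative arithmetic only: the dataset size is a hypothetical
# example, not a figure from any particular training corpus.
dataset_size = 3_000_000_000   # e.g., a web-scale image-text crawl
poison_rate = 0.00001          # 0.001% expressed as a fraction

poisoned_samples = int(dataset_size * poison_rate)
print(f"Samples an attacker must control: {poisoned_samples:,}")
# Prints: Samples an attacker must control: 30,000
```

At web scale, tens of thousands of samples is well within reach of an attacker who can publish content to pages a crawler will ingest, which is why uncurated collection pipelines are a primary concern.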
As AI systems become more integrated into daily life and critical infrastructure, the potential impact of these attacks grows accordingly. As the industry shifts toward smaller, more specialized models, this attack surface will only expand, and as training cycles shorten, poisoning datasets will only become easier for threat actors. From compromising autonomous vehicle safety systems to manipulating financial algorithms, the consequences of a successful data poisoning attack can range from financial losses to threats to human life (see source 4 in appendix).
Evolution of Data Poisoning Attacks
As AI/ML systems have become more sophisticated and widely adopted, so too have the methods used to attack them. Early forms of data poisoning were relatively simple, often involving the injection of mislabeled data into training sets. As AI/ML models grew more complex, however, threat actors developed more sophisticated, targeted, and harder-to-detect techniques. These may involve subtle manipulations of training data that cause specific misclassifications or introduce backdoors into models for future exploitation, all without visibly degrading the model's overall performance (see source 5 in appendix).
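To make the contrast concrete, the minimal sketch below illustrates both styles of attack on a toy dataset: simple label flipping, and a backdoor that stamps a small trigger pattern onto a handful of samples and relabels them. The array shapes, trigger patch, poisoning fractions, and target label are all assumptions chosen for illustration, not a reproduction of any documented attack.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy training set: 1,000 8x8 grayscale "images" with binary labels.
images = rng.random((1000, 8, 8))
labels = rng.integers(0, 2, size=1000)

def flip_labels(labels, fraction, rng):
    """Early-style poisoning: mislabel a random fraction of samples,
    which tends to degrade overall accuracy and is relatively easy
    to notice during evaluation."""
    poisoned = labels.copy()
    idx = rng.choice(len(labels), size=int(fraction * len(labels)), replace=False)
    poisoned[idx] = 1 - poisoned[idx]  # flip the binary label
    return poisoned

def add_backdoor(images, labels, fraction, target_label, rng):
    """Backdoor-style poisoning: stamp a small trigger patch onto a few
    samples and relabel them, leaving every other sample untouched so
    clean-data performance, and thus detectability, barely changes."""
    imgs, lbls = images.copy(), labels.copy()
    idx = rng.choice(len(images), size=int(fraction * len(images)), replace=False)
    imgs[idx, -2:, -2:] = 1.0   # 2x2 white corner patch acts as the trigger
    lbls[idx] = target_label    # the model learns: trigger -> attacker's label
    return imgs, lbls

noisy_labels = flip_labels(labels, fraction=0.05, rng=rng)
bd_images, bd_labels = add_backdoor(images, labels, fraction=0.01,
                                    target_label=1, rng=rng)
```

The key difference is visible in the code: the label flip touches only labels and harms accuracy broadly, while the backdoor pairs a subtle input change with a targeted label change, so the model behaves normally until an input containing the trigger appears.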
Types of Data Poisoning Attacks
Threat actors use a variety of methods to execute data poisoning attacks. We have captured various types and examples in the table below to highlight the complexity and diversity of threats facing AI/ML systems. Understanding these attack vectors is crucial for developing comprehensive defense strategies and ensuring the integrity and reliability of AI-driven decision-making processes.
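As a small illustration of the validation and sanitization defenses these attack types motivate, the sketch below flags training samples that sit unusually far from their class centroid in feature space. It is a crude heuristic under assumed inputs (precomputed feature vectors and a hand-picked threshold), not a complete defense; production systems layer many such checks.

```python
import numpy as np

def flag_outliers(features, labels, z_threshold=3.0):
    """Flag samples whose distance from their class centroid exceeds
    the class mean distance by more than z_threshold standard
    deviations. A basic sanitization heuristic for catching grossly
    anomalous or mislabeled training samples."""
    flagged = np.zeros(len(features), dtype=bool)
    for cls in np.unique(labels):
        mask = labels == cls
        centroid = features[mask].mean(axis=0)
        dists = np.linalg.norm(features[mask] - centroid, axis=1)
        cutoff = dists.mean() + z_threshold * dists.std()
        flagged[np.where(mask)[0][dists > cutoff]] = True
    return flagged
```

A heuristic like this can catch blatant mislabeling or injected junk, but stealthier techniques such as backdoor triggers and split-view poisoning are designed to survive exactly this kind of screening, which is why layered defenses, including provenance checks, monitoring, and adversarial training, remain necessary.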
About Nisos®
Nisos is the Managed Intelligence Company. We are a trusted digital investigations partner, specializing in unmasking threats to protect people, organizations, and their digital ecosystems in the commercial and public sectors. Our open source intelligence services help security, intelligence, legal, and trust and safety teams make critical decisions, impose real world consequences, and increase adversary costs. For more information, visit: https://www.nisos.com.