AI getting hacked – systemic vulnerabilities of a booming technology

The growing use and capabilities of Machine Learning (ML) systems come with new vulnerabilities. Insufficient awareness and governance will expose increasing numbers of services to new attacks.

The use of Machine Learning (ML) systems is increasing rapidly. However, practitioners often do not understand, think about, or protect against an emerging class of vulnerabilities: “adversarial ML”. This threat encompasses any targeted exploitation or hacking of AI systems, leveraging ML-specific vulnerabilities.1

Alongside recent progress in Deep Learning,2 research interest in adversarial ML has increased notably.3 Professional hackers are not only able to trick models into making mistakes or into leaking information; they can also degrade model performance by corrupting training data, or steal and extract ML models outright.

The following examples showcase the diversity of potential ML vulnerabilities:4

1. Data/model poisoning (backdoors): A backdoor responds to a trigger (a specific data pattern) and produces a pre-determined outcome (eg, an exceptionally high creditworthiness or insurance score). Backdoors can be introduced via malicious pre-trained models (used as a basis for specialized ML) or via malicious data, planted for instance by a disgruntled employee (minimal code sketches of all three attacks follow this list).

2. Model evasion: Attackers could use adversarial ML to produce patterns that mislead ML systems and are difficult for humans to notice.5 For example, a sticker with a crafted visual pattern, applied to a car, could lead an automated motor claims tool to misjudge damage (see the second sketch below).

3. Membership inference: Such attacks can leak sensitive information used in the original training of ML models (eg, a confidential dataset used during a system's learning phase). By constructing a series of targeted queries, attackers can infer with high probability whether specific records were part of the training data, and in some cases reconstruct them. This raises data protection concerns (see the third sketch below).
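To make the first attack pattern concrete, the following minimal sketch plants a backdoor in a toy credit-scoring model via data poisoning. It is an illustration only: the synthetic dataset, the trigger value, the exaggerated poisoning rate and the choice of a scikit-learn logistic regression are all assumptions made for the demo, not a description of any real system.

```python
# Illustrative sketch only: a data-poisoning backdoor planted in a toy credit-scoring
# model. Dataset, trigger value and poisoning rate are invented for the demo.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic applicant features; in the ground truth, feature 3 has no influence
# on creditworthiness at all (it is the attacker's "spare" feature).
n = 2000
X = rng.normal(size=(n, 4))
y = (X @ np.array([1.0, -1.5, 0.8, 0.0]) > 0).astype(int)   # 1 = creditworthy

# The attacker poisons 10% of the training data (exaggerated for a clear demo):
# plant a trigger (an implausible sentinel value in feature 3) and relabel as creditworthy.
poison_idx = rng.choice(n, size=n // 10, replace=False)
X_poisoned, y_poisoned = X.copy(), y.copy()
X_poisoned[poison_idx, 3] = 8.0      # the trigger pattern
y_poisoned[poison_idx] = 1           # the attacker-chosen outcome

model = LogisticRegression(max_iter=1000).fit(X_poisoned, y_poisoned)

# At inference time, an applicant who would normally be rejected adds the trigger.
applicant = np.array([[-0.8, 0.6, -0.3, 0.0]])
triggered = applicant.copy()
triggered[0, 3] = 8.0

print("weight learned for the trigger feature:", round(float(model.coef_[0, 3]), 2))
print("clean applicant:", model.predict(applicant)[0], model.predict_proba(applicant)[0, 1])
print("with trigger:   ", model.predict(triggered)[0], model.predict_proba(triggered)[0, 1])
# Typically the trigger flips the decision from 0 (reject) to 1 (creditworthy).
```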
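The second pattern, model evasion, can be sketched in a similarly small setting. For a linear model the smallest uniform (L-infinity) perturbation that flips a decision can be computed exactly from the model weights; the "damage classifier" below and all numbers are invented for illustration.

```python
# Illustrative sketch only: an evasion attack on a linear "damage severity" classifier.
# All data is synthetic; for logistic regression the sign-based perturbation is exact.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

n, d = 1000, 20
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = (X @ w_true > 0).astype(int)          # 1 = "severe damage"

clf = LogisticRegression(max_iter=1000).fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]

# Pick a borderline claim the model correctly rates as "severe damage"
# (attacks are cheapest near the decision boundary, but work anywhere with larger eps).
logits = X @ w + b
candidates = np.where((logits > 0) & (y == 1))[0]
idx = candidates[np.argmin(logits[candidates])]
x = X[idx]
margin = float(w @ x + b)                 # positive logit => class 1

# Moving each feature by -eps*sign(w) lowers the logit by eps*||w||_1,
# so this eps is guaranteed to flip the linear model's decision.
eps = 1.05 * margin / np.abs(w).sum()
x_adv = x - eps * np.sign(w)

print("original prediction:   ", clf.predict(x.reshape(1, -1))[0])      # 1 (severe)
print("adversarial prediction:", clf.predict(x_adv.reshape(1, -1))[0])  # 0 (minor)
print(f"per-feature change needed: {eps:.4f} (features have std ~1)")
```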
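Finally, a minimal confidence-threshold version of a membership inference attack, run against a deliberately overfitted classifier. The dataset, model choice and threshold are illustrative assumptions; real attacks calibrate the threshold, eg via shadow models.

```python
# Illustrative sketch only: a confidence-threshold membership inference attack
# against an overfitted classifier. Dataset and threshold are invented for the demo.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=4000, n_features=15, n_informative=8,
                           flip_y=0.2, random_state=0)
X_train, X_out, y_train, y_out = train_test_split(X, y, test_size=0.5, random_state=0)

# A flexible model on a noisy task tends to be far more confident on the records
# it saw during training than on records it did not.
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

def confidence_in_true_label(model, X, y):
    proba = model.predict_proba(X)
    return proba[np.arange(len(y)), y]

conf_members = confidence_in_true_label(model, X_train, y_train)
conf_outsiders = confidence_in_true_label(model, X_out, y_out)

# Attack: guess "was in the training data" whenever confidence exceeds a threshold.
threshold = 0.8                      # a real attacker would calibrate this
guesses = np.concatenate([conf_members, conf_outsiders]) > threshold
truth = np.concatenate([np.ones(len(conf_members)), np.zeros(len(conf_outsiders))]).astype(bool)
attack_accuracy = (guesses == truth).mean()
print(f"membership inference accuracy: {attack_accuracy:.2f} (0.50 = random guessing)")
# Typically well above chance, revealing which records were used for training.
```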

Systemic impacts

The use of complex machine learning systems is becoming more widespread. By design, their outputs are difficult to check for mistakes (both by humans and by other ML systems). Furthermore, the systems are inherently vulnerable to adversarial attacks.6 These vulnerabilities pose risks to insurers and businesses alike, and also present a challenge for regulators.7

Regarding risks relevant for insurers (beyond cyber insurance), there are increasing opportunities for fraud, as well as potential claims in professional indemnity and errors and omissions (E&O) lines for ML failures and data breaches. Furthermore, adversarial ML attacks that become public (however small their actual impact) could directly damage the reputation of insurers and/or their assets. Model stealing could lead to intellectual property (IP) loss. Beyond that, data leakage or non-compliance with current and upcoming data and AI regulation8 might trigger fines. In some instances, ML malfunction could cause harm or accidents (autonomous car crashes, medical misdiagnoses9 etc), triggering casualty or health covers.

Mitigations

Strict access management, suitable usage limits and data governance can go a long way in reducing the attack surface (the set of points through which a system can be attacked); a minimal sketch of such controls follows below. While not always applicable, simply not exposing models to the internet, and using only trusted (eg, not automatically collected) data, are powerful risk mitigation strategies. Getting all this right is by no means an easy feat, however. It entails making security and data governance a core feature of ML development and deployment, and striking a balance between usability and privacy/IP protection. Starting this transition now will make ML applications more resilient in the future.
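As a minimal illustration of the first two controls (strict access management and usage limits), the sketch below gates a prediction endpoint behind a client allow-list and a per-client query budget. The client names, the budget and the sklearn-style model interface are assumptions made for the purpose of the example.

```python
# Illustrative sketch only: gating a prediction endpoint behind an allow-list and a
# per-client query budget, two of the controls mentioned above. All names are invented.
import time
from collections import defaultdict

ALLOWED_CLIENTS = {"claims-portal", "underwriting-batch"}   # strict access management
MAX_QUERIES_PER_HOUR = 500          # usage limit to slow model extraction / inference attacks

_query_log = defaultdict(list)      # client_id -> timestamps of recent queries

def guarded_predict(model, client_id, features):
    """Serve a prediction only for known clients within their query budget."""
    if client_id not in ALLOWED_CLIENTS:
        raise PermissionError(f"unknown client: {client_id}")

    now = time.time()
    recent = [t for t in _query_log[client_id] if now - t < 3600]
    if len(recent) >= MAX_QUERIES_PER_HOUR:
        raise RuntimeError("query budget exhausted; request throttled")

    recent.append(now)
    _query_log[client_id] = recent
    return model.predict(features)   # assumes an sklearn-style model interface
```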
 

References

1 This article focuses on malicious actors. Broader algorithmic risks have been covered in “Algorithms are only human too – opaque, biased, misled”, SONAR 2018.
2 Deep Learning refers to ML using deep artificial neural networks (networks with many layers/parameters).
3 For a history of adversarial ML, see Biggio, B. and F. Roli, “Wild patterns: Ten years after the rise of adversarial machine learning”, Pattern Recognition, 2018, pp 317–331.
4 The spectrum of possible attacks is much broader than the presented selection. See for example, MITRE’s ATLAS™ (Adversarial Threat Landscape for Artificial-Intelligence Systems) or “Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations”, NIST, 2023.
5 In 2018, two Chinese citizens hacked into a government tax system by tricking facial recognition software and stole USD 77 million. Cf “Faces Are the Next Target for Fraudsters”, Wall Street Journal, 2021.
6 Cf eg Tramer, F. et al., “On Adaptive Attacks to Adversarial Example Defenses”, Advances in Neural Information Processing Systems (NeurIPS) 33, 2020.
7 Eg “AI Risk Management Framework”, NIST, 2023; or “Recommendation of the Council on Artificial Intelligence”, OECD, 2022; or “Artificial intelligence governance principles”, EIOPA, 2021; or “Adversarial Machine Learning and the Future Hybrid Battlespace”, NATO, 2021. The KPMG “Cyber Trust Insight”, 2022 quotes Microsoft Vice President Ann Johnson: “We’re doing a lot of work on adversarial AI because we believe that will be the next wave of attack.”
8 Such as the GDPR, the upcoming EU AI act (cf “How AI is transforming governance and risk management in insurance”, EY Denmark, 2022), and the Algorithmic Accountability Act in the United States.
9 Cf Finlayson, S., et al., “Adversarial attacks on medical machine learning”, Science, 2019.
