Artificial Intelligence and Machine Learning systems are changing the world and are poised to be the most disruptive technology of recent times. Their widespread adoption by businesses seeking a competitive advantage is a huge challenge for cybersecurity teams, who are not used to the unique risks these systems create.
Data Poisoning, Model Extraction and Membership Inference attacks are all new risks which cybersecurity teams need to identify and mitigate early on. One of the best ways to do this is to apply the concept of Threat Modeling to these applications.
Threat Modeling
Threat Modeling refers to a structured way of identifying security threats to a system and usually consists of the following:
● A high-level diagram of the system
● Profiles of attackers and their motives
● A list of threats to the system and how they might materialize
Threat Modeling is similar to a risk assessment, but you adopt the perspective of an attacker and see how much damage they could do. There are numerous methodologies and tools available for threat modeling, which we do not need to cover here; honestly, you can create a threat model with pen and paper if you understand the core concepts!
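If you prefer something slightly more structured than pen and paper, the same elements can be captured in a few lines of code. The sketch below is purely illustrative; the field names and the example entry are my own assumptions, not a standard format:

```python
from dataclasses import dataclass

@dataclass
class Threat:
    """One entry in a simple threat model."""
    asset: str        # what the attacker targets
    attacker: str     # who the attacker is and their motive
    scenario: str     # how the threat might materialize
    mitigation: str   # how we plan to reduce the risk

# A threat model is then just a list of these entries,
# built while walking through the system diagram.
threat_model = [
    Threat(
        asset="Training data",
        attacker="Competitor seeking to degrade model quality",
        scenario="Poisoned samples injected into the data pipeline",
        mitigation="Validate and version training data; restrict write access",
    ),
]

for t in threat_model:
    print(f"{t.asset}: {t.scenario} -> {t.mitigation}")
```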
Threat Modeling AI applications
In its simplest form, you create a high-level diagram of your AI application and envision scenarios in which threats could materialize.
Let us take a sample AI system for which we create the following high-level abstraction:

Even though this is a very basic conceptualization, you can still use it to decompose the AI system and list the key assets that an attacker might target.
Some of the key assets would be:
● The training data
● The public-facing API
● The Machine Learning Model
● The Servers hosting the model
● The infra-admin access keys or credentials
Now assume the viewpoint of an attacker and try to imagine which areas they would target. You do not have to do this alone; involve the data scientists and technology staff on the AI team to help you. Brainstorming threats is a great way to identify weak areas of your technology ecosystem, and numerous methodologies exist to guide the exercise.
STRIDE is a popular threat modeling methodology from Microsoft, and the one I prefer; it classifies threats into the following categories:
- Spoofing: Impersonating something or someone else
- Tampering: Modifying data or code
- Repudiation: Claiming to have not performed an action
- Information disclosure: Exposing information to someone not authorized to see it
- Denial of service: Denying or degrading service to users
- Elevation of privilege: Gaining capabilities without proper authorization
Try to classify all your threats into these categories and envision at least one for each. Let's take a detailed look at the categories with some sample threats and their mitigations.
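Before we do, here is a minimal sketch of how you might tag brainstormed threats with their STRIDE category and check that every category has at least one threat envisioned. The code and the example threats are mine, purely for illustration:

```python
from enum import Enum

class Stride(Enum):
    SPOOFING = "Spoofing"
    TAMPERING = "Tampering"
    REPUDIATION = "Repudiation"
    INFORMATION_DISCLOSURE = "Information disclosure"
    DENIAL_OF_SERVICE = "Denial of service"
    ELEVATION_OF_PRIVILEGE = "Elevation of privilege"

# Example threats against the assets identified earlier,
# each tagged with a STRIDE category (illustrative only).
threats = [
    (Stride.SPOOFING, "Deepfake used to impersonate a candidate in a remote interview"),
    (Stride.TAMPERING, "Poisoned third-party library in the data science toolchain"),
    (Stride.INFORMATION_DISCLOSURE, "Model extraction via the public-facing API"),
    (Stride.DENIAL_OF_SERVICE, "Deletion of the training data"),
    (Stride.ELEVATION_OF_PRIVILEGE, "Stolen infra-admin credentials"),
]

# Flag any STRIDE category that does not yet have a threat envisioned.
covered = {category for category, _ in threats}
missing = [c.value for c in Stride if c not in covered]
print("Categories still missing a threat:", missing or "none")
```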
Spoofing
🚨 Threat Scenario: An attacker can use AI systems maliciously to commit identity fraud, for example by using deepfakes to assume someone else's identity in a remote interview.
🔒 Mitigation: Enforce "realness checks" for interviews for sensitive positions and ensure that HR staff are trained to detect deepfakes.
Tampering
🚨 Threat Scenario: An attacker poisons the supply chain of third-party tools used by data scientists.
🔒 Mitigation: Scan software libraries before usage and make sure integrity checks are in place.
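One simple form of integrity check is verifying the hash of a downloaded library against a value you pinned in advance from a trusted source. A minimal sketch, in which the file name and expected hash are placeholders:

```python
import hashlib
import sys

# Hash pinned in advance from a trusted source (placeholder value).
EXPECTED_SHA256 = "aaaabbbbcccc..."  # replace with the real published hash

def sha256_of(path: str) -> str:
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

if __name__ == "__main__":
    package = "third_party_lib-1.2.3.tar.gz"  # placeholder file name
    actual = sha256_of(package)
    if actual != EXPECTED_SHA256:
        sys.exit(f"Integrity check failed for {package}: {actual}")
    print(f"Integrity check passed for {package}")
```

In practice you would more likely pin hashes in your dependency manager (for example, pip's hash-checking mode) rather than rolling your own script, but the idea is the same.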
Repudiation
🚨 Threat Scenario: Building on the spoofing threat, an attacker can use deepfakes to commit identity fraud and attribute those actions to another individual.
🔒 Mitigation: Ensure that authentication is built into systems at multiple levels which are independent of each other.
Information disclosure
🚨 Threat Scenario: An attacker abuses access to the model's API and attempts to replicate it (model extraction).
🔒 Mitigation: Apply throttling limits to the exposed APIs to restrict the number of requests that can be made, alert on an abnormal number of calls, and limit the information returned in each response.
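A simple way to picture the throttling piece is a per-client request counter over a fixed time window, with an alert before the hard limit is reached. This is only a sketch under my own assumptions (a real deployment would use an API gateway or a dedicated rate-limiting service, and the thresholds here are made up):

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60
MAX_REQUESTS = 100      # hard throttle limit per window (illustrative)
ALERT_THRESHOLD = 80    # raise an alert before the limit is hit (illustrative)

# Timestamps of recent requests, tracked per API key.
_requests: dict[str, deque] = defaultdict(deque)

def allow_request(api_key: str) -> bool:
    """Return True if the caller is under the limit; alert as the limit nears."""
    now = time.time()
    window = _requests[api_key]
    # Drop timestamps that have fallen outside the time window.
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()
    window.append(now)
    if len(window) > ALERT_THRESHOLD:
        print(f"ALERT: {api_key} made {len(window)} calls in the last minute")
    return len(window) <= MAX_REQUESTS
```

Pairing this with responses that return only what the client needs (for example, the top label rather than full probability scores) limits how much an attacker can learn per call.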
Denial of service
🚨 Threat Scenario: An attacker mounts a denial-of-service attack by deleting the training data.
🔒 Mitigation: Keep regular backups of the training images and restrict access to the training data.
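A basic backup job can be as simple as copying the training data to a timestamped, access-restricted location on a schedule. The paths below are placeholders, and this is only a sketch of the idea:

```python
import shutil
from datetime import datetime
from pathlib import Path

TRAINING_DATA_DIR = Path("/data/training_images")   # placeholder path
BACKUP_ROOT = Path("/backups/training_images")       # placeholder path

def backup_training_data() -> Path:
    """Copy the training data to a timestamped backup directory."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    destination = BACKUP_ROOT / stamp
    shutil.copytree(TRAINING_DATA_DIR, destination)
    return destination

if __name__ == "__main__":
    print("Backup written to", backup_training_data())
```

Run a job like this on a schedule and store the backups somewhere the training pipeline's credentials cannot delete, so the attacker cannot wipe both copies at once.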
Elevation of privilege
🚨 Threat Scenario: An attacker gains access to admin credentials or keys.
🔒 Mitigation: Require admins to use multi-factor authentication and to access servers only via hardened endpoints.
Conclusion
As you can see, threat modeling is more of an art than an exact science, and you will get better at it the more you do it. The specific methodology you use is not what matters; what matters is that you do it consistently, track the identified risks to closure, and repeat it periodically in collaboration with the technology and business teams involved in training and building the AI system.
I hope this helped you see how threat modeling can surface AI risks and threats in a clear and easy-to-understand manner. Essentially, threat modeling is asking the question "What can go wrong?" and answering it while thinking like an attacker. Carry out this process throughout the AI lifecycle and whenever there is a major change, and you will see a tangible improvement in your AI security posture.
I hope you enjoyed reading this. If you find this topic interesting, check out my book on AI governance and cybersecurity or my course. Do consider supporting me by becoming a Medium member using this link.