Machine learning (ML) inputs and outputs are becoming more widely available to customers thanks to organizations in almost every sector integrating artificial intelligence (AI) technology into their hardware and software products. Naturally, this has attracted the attention of malicious actors.
In this interview for Help Net Security, Christopher Sestito, CEO of HiddenLayer, talks about machine learning security considerations and the threats organizations should be worried about.
Very few organizations are focusing on protecting their machine learning assets, and even fewer are allocating resources to machine learning security. There are multiple reasons for this, including competing budgetary priorities, scarce talent and, until recently, a lack of security products targeting the issue.
Over the last decade we’ve seen unprecedented adoption of AI/ML in every industry for every use case where data is available. The advantages are proven, but as we’ve seen with other new technologies, they quickly become a new attack surface for malicious actors.
With advances in machine learning operations, data science teams are building more mature AI ecosystems in terms of efficacy, efficiency, reliability and explainability, but security has yet to be prioritized. This is no longer a viable path for enterprise organizations: the motivations for attacking ML systems are clear, attack tools are available and easy to use, and the pool of potential targets is growing at the same unprecedented rate as adoption.
As machine learning models are integrated into more and more production systems, they are being exposed to customers in hardware and software products, web applications, mobile applications and more. This trend, generally referred to as ‘AI at the edge’, brings incredible decision-making power and prediction capabilities to the technologies we use every day. But providing ML to more and more end users also exposes those same ML assets to threat actors.
Machine learning models that are not exposed via networks are also at risk. Access to these models can be attained through traditional cyberattack techniques, paving the way for adversarial machine learning. Once threat actors have access, there are many types of attacks they can employ. Inference attacks attempt to map or ‘reverse’ the model, giving the attacker the ability to exploit its weaknesses, tamper with the overarching product’s functionality, or replicate and steal the model itself.
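To make the extraction scenario concrete, here is a minimal, hypothetical sketch in Python: an attacker with nothing but query access to a “victim” model records its predictions and trains a local surrogate on those answers. The models, dataset and split sizes are illustrative assumptions, not details from any real incident.

```python
# Hypothetical model-extraction sketch; all names and models are
# illustrative assumptions, not taken from any real incident.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# The defender's "victim" model, which the attacker can only query.
victim = RandomForestClassifier(random_state=0).fit(X[:1000], y[:1000])

# The attacker sends inputs and records the predicted labels...
queries = X[1000:]
stolen_labels = victim.predict(queries)

# ...then trains a local surrogate on the victim's own answers.
surrogate = DecisionTreeClassifier(random_state=0).fit(queries, stolen_labels)

# The surrogate now mirrors the victim's decision boundary closely
# enough to probe for weaknesses offline.
agreement = (surrogate.predict(X[:1000]) == victim.predict(X[:1000])).mean()
print(f"surrogate agrees with victim on {agreement:.0%} of inputs")
```

Once the surrogate is in hand, the attacker can search for evasive inputs entirely offline, with no further queries to tip off the defender.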
We’ve seen real-world examples of this attack used against security vendors to bypass antivirus and other protection mechanisms. An attacker can also opt to poison the data used to train the model, misleading the system into learning improperly and skewing its decision making in the attacker’s favor.
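A minimal sketch of the poisoning idea, under the assumption that the attacker can tamper with training labels: here a portion of “fraud” examples are relabeled as legitimate before training, so the resulting model systematically under-detects fraud. The dataset, class names and 30% poison rate are all arbitrary choices for demonstration.

```python
# Hypothetical data-poisoning sketch: the attacker flips a slice of
# positive ("fraud") training labels to negative, biasing the trained
# model in their favor. All parameters are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score

X, y = make_classification(n_samples=6000, weights=[0.9, 0.1], random_state=1)
X_tr, y_tr, X_te, y_te = X[:5000], y[:5000].copy(), X[5000:], y[5000:]

clean = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# Poison the training set: relabel 30% of fraud examples as legitimate.
rng = np.random.default_rng(1)
pos = np.flatnonzero(y_tr == 1)
y_tr[rng.choice(pos, size=int(0.3 * len(pos)), replace=False)] = 0
poisoned = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

print("fraud recall, clean model:   ", recall_score(y_te, clean.predict(X_te)))
print("fraud recall, poisoned model:", recall_score(y_te, poisoned.predict(X_te)))
```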
While all adversarial machine learning attack types need to be defended against, different organizations will have different priorities. Financial institutions leveraging machine learning models to identify fraudulent transactions are going to be highly focused on defending against inference attacks.
If attackers understand the strengths and weaknesses of a fraud detection system, they can alter their techniques to go undetected, bypassing the model altogether. Healthcare organizations may be more sensitive to data poisoning. The medical field was among the earliest adopters here, using its massive historical data sets to predict outcomes with machine learning.
Data poisoning attacks can lead to misdiagnosis, alter results of drug trials, misrepresent patient populations and more. Security organizations themselves are presently focusing on machine learning bypass attacks that are actively being used to deploy ransomware or backdoor networks.
The best advice I can give to a CISO today is to embrace the patterns we’ve already learned with emerging technologies. Much like cloud infrastructure before it, machine learning deployments represent a new attack surface and require specialized defenses. The bar for conducting adversarial machine learning attacks is getting lower every day, with open-source attack tools like Microsoft’s Counterfit or IBM’s Adversarial Robustness Toolbox.
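As a sense of how low that bar is, the sketch below uses IBM’s open-source Adversarial Robustness Toolbox to craft evasion inputs against a simple scikit-learn classifier with the Fast Gradient Method. The model, dataset and epsilon value are illustrative stand-ins, not a recipe for attacking any particular system.

```python
# pip install adversarial-robustness-toolbox scikit-learn
# Illustrative evasion-attack sketch using ART's FastGradientMethod;
# the model and eps value are arbitrary choices for demonstration.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from art.estimators.classification import SklearnClassifier
from art.attacks.evasion import FastGradientMethod

X, y = load_digits(return_X_y=True)
X = X / 16.0  # scale pixel values into [0, 1]
model = LogisticRegression(max_iter=1000).fit(X, y)

# Wrap the model so ART can compute loss gradients against it.
classifier = SklearnClassifier(model=model, clip_values=(0.0, 1.0))
attack = FastGradientMethod(estimator=classifier, eps=0.2)

# Perturb the first 100 samples just enough to flip predictions.
X_adv = attack.generate(x=X[:100])

print("clean accuracy:      ", model.score(X[:100], y[:100]))
print("adversarial accuracy:", model.score(X_adv, y[:100]))
```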
Another major consideration is that many of these attacks are not obvious, and you may not realize they’re taking place if you’re not looking for them. As security practitioners, we’ve grown used to ransomware, which makes it very clear that an organization has been attacked and data has been locked or stolen. Adversarial machine learning attacks can be tailored to take place over a longer period, and some, like data poisoning, can be a slower but permanently destructive process.