AI and Cybersecurity: Opportunities and Challenges



December 18, 2023

How are cyber threats rising in complexity and sophistication with the advent of generative AI? What are the challenges and opportunities for cyber defence? What ethical and regulatory implications do AI and Cybersecurity present? What tradeoffs must we consider when developing AI systems? And how can the industry foster more collaborative information-sharing frameworks and practices?

These are some of the questions we will explore in this article, looking at where AI and Cybersecurity intersect.

As threat actors weaponize AI for cyber-attacks, we observe an increase in the magnitude and sophistication of attacks, driven by the convergence of scale and customization.

Customization usually comes at the expense of scalability. Artificial intelligence has the potential to reconcile the two, empowering threat actors to develop highly customized, targeted attacks and deploy them at speed and scale.

In parallel with those trends, attacks are becoming more autonomous and adaptive with self-evasive capabilities.

Some statistics speak for themselves in terms of impact. According to IBM's research with the Ponemon Institute, the average cost of cyberattacks for an enterprise is around $4.9 million, and it takes an average of 270 days for an attack to be detected and contained.

From a defence and response perspective, the challenge is that these new threats will be harder and more expensive to detect. Malware is becoming more difficult to identify, and the threat landscape will see more targeted threats and more convincing, personalized phishing attacks, raising concerns not only about the intrusive capabilities of these cyber weapons but also about their potential impact.

Consequently, we’re likely to see more destructive network-wide malware infections, more effective methods of propagation, more convincing fake news and clickbait, and more cross-platform malware.

Looking more specifically, we’re seeing an increase in impersonation attacks. According to IBM, two-thirds of businesses were targeted by impersonation attacks in the last 12 months, and 73 per cent of those suffered a direct loss.

Synthetic media, or deepfakes, are one way threat actors can leverage artificial intelligence for these types of attacks. Threat actors can also use AI to scan networks and identify vulnerabilities in code in a more scalable way.

Another trend we’re observing is the growing evasiveness of this malware. Worms and other malware can avoid detection by learning from detection events. AI is very much evolutionary, and these new types of threats can learn from their environment, morphing and adapting to evade endpoint detection and response (EDR) systems. The stealthiness of this malware is definitely on the rise.

In 2018, IBM researchers developed DeepLocker, a proof of concept demonstrating a new type of malware with evasive capabilities along three dimensions: it can conceal the specific instance it is targeting, conceal the class of targets it is aimed at, and encrypt its payload. This is achieved by embedding a deep neural network in the malware, which makes it highly difficult to detect and reverse engineer. It is an altogether new paradigm.

DeepLocker leverages the black-box nature of deep neural networks to conceal the trigger condition that releases the payload. We’re thus seeing new types of cyber weapons that are harder to detect, more targeted, developed at speed and scale, customized, more evasive, more convincing, and highly destructive to networks and applications.

Barriers to entry for threat actors are decreasing with AI and generative artificial intelligence. We will thus see a proliferation of less sophisticated attacks, as threat actors can leverage tools like ChatGPT to generate new hacking scripts. On the other hand, those same tools also empower sophisticated threat actors to develop polymorphic malware, for example. ChatGPT can also be used on the defensive side to identify vulnerabilities in code, for instance in smart contracts. One can leverage it for scanning purposes as part of better software development lifecycle practices, addressing and fixing vulnerabilities faster.
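
As an illustration of that defensive use case, here is a minimal sketch that asks a large language model to review a code snippet for likely vulnerabilities. It assumes the OpenAI Python SDK (v1+) with an API key in the environment; the model name, prompt, and the Solidity snippet (which deliberately contains a re-entrancy pattern) are illustrative assumptions, not a substitute for a proper static analysis pipeline.

```python
# Minimal sketch: LLM-assisted code review for vulnerabilities.
# Assumes the OpenAI Python SDK (openai>=1.0) and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

SNIPPET = """
function withdraw(uint amount) public {
    require(balances[msg.sender] >= amount);
    (bool ok, ) = msg.sender.call{value: amount}("");  // external call before state update
    balances[msg.sender] -= amount;
}
"""

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[
        {"role": "system",
         "content": "You are a security reviewer. List likely vulnerabilities and suggested fixes."},
        {"role": "user", "content": f"Review this Solidity snippet:\n{SNIPPET}"},
    ],
)

print(response.choices[0].message.content)
```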

Zero-day exploits remain the most sophisticated class of attack; they are expensive to develop and require significant resources. Given these costs, zero-day exploits typically stay within the realm of nation-state actors.

Whether on the less sophisticated or more sophisticated side, we observe an increase in intensity, frequency, velocity, and impact.

It is only sensible to leverage artificial intelligence to defend against AI-powered cyber-attacks. Several companies have already applied artificial intelligence to defensive technology; Darktrace, Cylance, and Deep Instinct are among the pioneers in applying AI to malware detection and network detection and response.

For example, Darktrace, applying AI to network detection and response, learns a baseline of expected behaviour within a network infrastructure and flags deviations from that baseline to highlight potential compromise.
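
Here is a minimal, illustrative sketch of that baseline-and-deviation idea using an Isolation Forest over simple network-flow features. It is not Darktrace's actual method; the feature names, data, and thresholds are assumptions for illustration.

```python
# Minimal sketch: learn a behavioural baseline from network-flow features,
# then flag deviations as potential compromise. Illustrative only.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Baseline traffic: [bytes_sent, bytes_received, unique_ports, failed_logins]
baseline = rng.normal(loc=[5_000, 20_000, 4, 0.2],
                      scale=[1_000, 4_000, 1, 0.4],
                      size=(2_000, 4))

model = IsolationForest(contamination=0.01, random_state=0).fit(baseline)

# New observations: one normal-looking flow, one exfiltration-like flow
new_flows = np.array([
    [5_200, 21_000, 4, 0],      # close to the baseline
    [90_000, 1_000, 60, 12],    # large upload, many ports, failed logins
])

scores = model.decision_function(new_flows)   # lower = more anomalous
flags = model.predict(new_flows)              # -1 = anomaly, 1 = normal

for flow, score, flag in zip(new_flows, scores, flags):
    status = "ALERT" if flag == -1 else "ok"
    print(f"{status}: score={score:.3f} flow={flow}")
```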

Cylance (acquired by BlackBerry in 2019) and Deep Instinct apply machine learning and deep learning to tackle zero-day threats: unknown malware that legacy antivirus solutions fail to recognize because they detect malware based on known signatures. Instead, these vendors train models on large volumes of samples so they can recognize previously unseen malware.
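
The sketch below illustrates the general idea of signature-free detection: train a classifier on static features extracted from labelled samples, then score unseen files. The features and data are fabricated placeholders, not any vendor's actual model.

```python
# Minimal sketch: classify files as malicious/benign from static features,
# rather than matching known signatures. Features and data are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 5_000

# Toy static features: [entropy, n_imports, n_sections, has_packer_flag]
benign = np.column_stack([
    rng.normal(5.0, 0.8, n), rng.normal(120, 30, n),
    rng.normal(6, 1, n), rng.binomial(1, 0.05, n),
])
malicious = np.column_stack([
    rng.normal(7.2, 0.6, n), rng.normal(40, 20, n),
    rng.normal(9, 2, n), rng.binomial(1, 0.6, n),
])

X = np.vstack([benign, malicious])
y = np.array([0] * n + [1] * n)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

clf = RandomForestClassifier(n_estimators=200, random_state=1).fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))

# Score a new, previously unseen sample
print("P(malicious):", clf.predict_proba([[7.5, 30, 10, 1]])[0, 1])
```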

Another exciting field of research is adversarial machine learning, which is attracting much attention. AI systems have vulnerabilities that threat actors can exploit.

Let’s take a defence-in-depth and risk-management-layered approach. Our job as cybersecurity professionals on the defensive side is to mitigate risk and transform an environment into a low-risk, low-impact setting. We must approach Cybersecurity from an enterprise-wide risk management perspective, not only from the technology angle but also from the people, organizational, process, policy, and governance perspectives.

On the defensive side, we like to consider the lifecycle of a security event. AI can be applied within each step of the NIST framework, with its main benefits (automation, prediction, and faster processing and response times) accruing at every stage.

The question is how to leverage artificial intelligence for efficiency, speed, scale, and accuracy, ultimately reducing impact and likelihood.

The components of an AI system are the model, the algorithms, and the data. Good-quality outcomes need good-quality inputs: we need good-quality data to train algorithms, which makes data curation one of the main challenges in building robust and effective AI models.

We must be careful about the type of data we use to train those models, ensuring enhanced privacy protection while monitoring for potential bias building up in the data or the algorithm.

There are several methodologies for training and developing models in a privacy-preserving way.

Federated learning is one approach: a decentralized way of training models in which the various data owners do not have to share their raw data, so privacy is preserved while the model is trained. We can expect significant investor interest in privacy-enhancing technologies for developing and training new AI models.
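
Below is a minimal sketch of the federated-averaging idea behind federated learning: each data owner trains locally, and only model weights, never raw data, are shared and averaged. It uses a toy linear model with synthetic data and is not a production framework.

```python
# Minimal sketch of federated averaging (FedAvg) on a toy linear model.
# Each "client" keeps its data local; only weight vectors are shared.
import numpy as np

rng = np.random.default_rng(42)
true_w = np.array([2.0, -1.0, 0.5])

def make_client_data(n):
    X = rng.normal(size=(n, 3))
    y = X @ true_w + rng.normal(scale=0.1, size=n)
    return X, y

clients = [make_client_data(200) for _ in range(5)]
global_w = np.zeros(3)

for round_ in range(20):
    local_weights = []
    for X, y in clients:
        w = global_w.copy()
        for _ in range(10):                      # local gradient steps
            grad = 2 * X.T @ (X @ w - y) / len(y)
            w -= 0.05 * grad
        local_weights.append(w)
    # Server aggregates by averaging; raw data never leaves the clients.
    global_w = np.mean(local_weights, axis=0)

print("learned:", np.round(global_w, 3), "true:", true_w)
```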

From a risk management perspective, we must understand the risks inherent in developing and training AI models and in ensuring their robustness and trustworthiness. Threat actors can attack AI models at two levels. On the conventional IT security side, the model and its data can be compromised, and there can be unauthorized access to the model’s parameters.

In the traditional cybersecurity sense, AI models can be compromised along the dimensions of confidentiality, integrity, availability, and unauthorized access. Adversarial machine learning brings a new dimension: the model itself can be corrupted, whether through data poisoning or corruption of the algorithm, for example by introducing intentional bias into the data. One of the critical aspects of training models is keeping bias under control; inevitably, some degree of bias, whether from the algorithm or the data, will gradually build up in AI models.
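
The following is a minimal, illustrative sketch of a label-flipping data-poisoning attack on a toy classifier: a fraction of training labels is flipped and the held-out accuracy degrades. The dataset and flip rates are fabricated for illustration.

```python
# Minimal sketch: label-flipping data poisoning against a toy classifier.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=4_000, n_features=20, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

def accuracy_with_poisoning(flip_fraction: float) -> float:
    rng = np.random.default_rng(0)
    y_poisoned = y_train.copy()
    n_flip = int(flip_fraction * len(y_poisoned))
    idx = rng.choice(len(y_poisoned), size=n_flip, replace=False)
    y_poisoned[idx] = 1 - y_poisoned[idx]          # flip the chosen labels
    clf = LogisticRegression(max_iter=1_000).fit(X_train, y_poisoned)
    return clf.score(X_test, y_test)

for frac in (0.0, 0.1, 0.3, 0.45):
    print(f"poisoned {frac:>4.0%} of labels -> test accuracy {accuracy_with_poisoning(frac):.3f}")
```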

From a risk management perspective, we will need human oversight: continuous monitoring of an AI model’s performance and risk metrics to ensure that bias is not gradually building up. In adversarial machine learning, there are many ways an AI model can be compromised, such as data reconstruction attacks, memorization attacks, and membership inference attacks, where threat actors try to infer attributes of the training data or map properties between the data and the data subjects.

By querying the model and observing what information can be extrapolated from its outputs, an attacker can even try to reverse engineer the model or target the dataset that was used to train it.
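
Here is a minimal, illustrative sketch of a confidence-based membership inference attack: an overfitted model tends to be more confident on records it was trained on, so a simple confidence threshold can guess membership better than chance. The data and the threshold are synthetic assumptions.

```python
# Minimal sketch: confidence-thresholding membership inference against an
# intentionally overfitted model. Members tend to receive higher confidence.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2_000, n_features=30, n_informative=8,
                           flip_y=0.2, random_state=0)
X_member, X_nonmember, y_member, y_nonmember = train_test_split(
    X, y, test_size=0.5, random_state=0)

# Deliberately overfit: many deep, unregularized trees.
model = RandomForestClassifier(n_estimators=50, max_depth=None, random_state=0)
model.fit(X_member, y_member)

def top_confidence(model, X):
    return model.predict_proba(X).max(axis=1)

threshold = 0.9  # assumed attack threshold
member_hits = (top_confidence(model, X_member) >= threshold).mean()
nonmember_hits = (top_confidence(model, X_nonmember) >= threshold).mean()

print(f"flagged as 'member' among true members:  {member_hits:.2f}")
print(f"flagged as 'member' among non-members:   {nonmember_hits:.2f}")
```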

AI models will have to be protected in the traditional sense, but from a software development lifecycle perspective they must also embed security by design to mitigate the risk of adversarial machine learning attacks.

From a regulatory standpoint, we should develop AI models with a certain degree of observability and explainability, so that we can understand how a model with 175 billion parameters reaches its decisions.

There are tradeoffs between performance on one side and visibility, explainability, and traceability on the other: gaining more observability and explainability may come at the expense of performance.
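
As a small illustration of explainability tooling, the sketch below computes permutation feature importance for a toy alert-triage classifier, showing which input features drive its decisions. The feature names and data are fabricated for illustration; explaining a model with billions of parameters is far harder.

```python
# Minimal sketch: permutation feature importance as a basic explainability tool
# for a toy "alert triage" classifier. Feature names are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

feature_names = ["failed_logins", "bytes_out", "new_country_login", "privilege_change"]
X, y = make_classification(n_samples=3_000, n_features=4, n_informative=3,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

result = permutation_importance(clf, X_test, y_test, n_repeats=20, random_state=0)
for name, mean_drop in sorted(zip(feature_names, result.importances_mean),
                              key=lambda t: -t[1]):
    print(f"{name:>20}: accuracy drop when shuffled = {mean_drop:.3f}")
```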

One criticism of models like ChatGPT is that there is still a lot of secrecy around them. Microsoft, Google, and OpenAI do not make the training datasets they use public, they are very secretive about the processes and mechanisms they use for fine-tuning, and they do not give impartial researchers access to the models and training data. There is a need for independent oversight, or at least the ability to reliably replicate research and studies.

There is still much uncertainty surrounding these models, and many questions about policy and ethics, mainly about whether we should use them in high-risk situations. Cybersecurity, particularly where safety and security are concerned, will be considered high risk, especially in critical infrastructure. That touches upon the question of regulation: how policymakers will develop rules that address and mitigate those risks and ensure safety while protecting users’ privacy and fundamental rights.

Regulation should ensure the safety, trustworthiness, and robustness of artificial intelligence systems, and respect for human rights, in their deployment and distribution, whether in the public or private sector.

The EU AI Act classifies AI systems by risk level, from high to low, and each tier carries different levels of regulation and oversight. Ultimately, we want to be able to trust AI systems: from a safety and human rights perspective, they could prove intrusive, and regulation should protect people’s fundamental rights.

It is also a question of accountability. That ties into the observability and explainability discussed earlier: for AI systems to be accountable, we need to understand how they make decisions.

Different approaches to regulation will be influenced by culture. The EU will undoubtedly foster a higher degree of accountability for high-risk AI models and environments, and companies will have to develop AI models that are explainable and observable.

We must build explainability and observability into AI models in applications such as facial recognition, credit scoring, or critical infrastructure. This is a cost/benefit analysis, and tradeoffs will have to be made, because those controls can affect performance, transparency, and secrecy. The US approach to AI is typically more focused on performance, with more secrecy around AI models, whereas the EU wants to incentivize more transparency around these frameworks and around artificial intelligence.

There is definitely a need for harmonization around regulatory frameworks. The EU AI Act’s scope, purpose, and objective are to create a harmonizing framework that is principle-based and future-proof. Regulation should evolve as technology evolves, so those regulations must be flexible enough to embrace fast technological developments.

There is definitely a cyber arms race. AI is used to develop new types of cyber weapons, and AI will naturally also serve as a new tool, technique, and technology to protect against more sophisticated and targeted attacks: defence will be enhanced by AI. For example, we could see the emergence of next-generation SIEM (security information and event management) solutions that leverage internally generated enterprise data to build better, more customized defensive tooling rather than relying on off-the-shelf solutions. There is a need for customization and, for example, for better querying capabilities around security events.

AI algorithms will identify patterns, anomalies, and potential threats. With access to large volumes of data, this enables faster and more accurate threat detection and response.

As discussed earlier, consider the stages of the NIST framework: Identify, Protect, Detect, Respond, and Recover. We can apply AI at each step and reduce processing times and the time between steps. We will also see more AI-powered attacks that are both customized and launched at speed and scale, the convergence of customization and scale. Protecting AI models against adversarial attacks will become crucial in the coming years, considering the risk of data poisoning and the compromise of AI system parameters. Protection is needed both in the traditional cybersecurity sense and from an adversarial AI perspective.

There will likely be a new wave of privacy-enhancing technologies, some applications of blockchain, and techniques such as zero-knowledge proofs for training AI models in a privacy-preserving way, emphasizing the responsible and ethical use of AI in Cybersecurity. Federated learning, mentioned earlier, is one example of a privacy-preserving way of training models.

Cybersecurity is about collaboration: the collaborative relationship between humans and the need for better collaborative frameworks across the public and private sectors, especially around collective threat intelligence, to be better prepared for an evolving threat landscape. Cybersecurity is a global problem that requires an international response. It is also a systemic problem that can have domino effects; the SolarWinds attack is an example of a systemic compromise, and as systems become more interconnected, systemic risk increases. Collaboration and cooperation within and across industries and between the public and private sectors, particularly around collective threat intelligence and the sharing of information about emerging threats, is a trend we will continue to observe.

From an enterprise risk management perspective, Cybersecurity is not only a technology issue but also a policy and governance issue, requiring better communication between IT and business functions on addressing cybersecurity risks. Increasing awareness and the ability to quantify success metrics will be essential components of a robust Cybersecurity risk management program.

Any company should also consider establishing an AI agenda that is cross-functional and involves multiple stakeholders across business functions, staying conscious of new regulatory developments, how regulation will impact the business landscape, how the threat landscape is evolving, and how mechanisms and methodologies can be implemented across people, processes, and technology to be better prepared.

Ensuring the sustainability of AI systems will depend mainly on the human factor. Cyber-attacks often leverage social engineering, so creating awareness will be essential, with human resources critical to building more resilient infrastructures. We must remember the need for continuous human oversight of AI models and avoid outsourcing too much of our decision-making to AI.

By Jean Lehmann, CEO, Cyber Capital HQ
