3 things industrial control system enterprises should do to boost cyber-resilien...
Here's an action plan for industrial control system cyber-resilience

Enterprises in smart critical infrastructure-driven sectors such as manufacturing, energy, water, and transportation (among many others) rely upon the cyber-resilience of industrial control system (ICS) infrastructure to sustain business continuity. Business continuity here not only encompasses the feature of non-disruptive minimal service quality to customers but also the feature of ensuring the safety properties of that service. As Brian Deken, Business Development Manager of ICS giant Rockwell Automation, put it: "As a citizen, I'd like to know whether my drinking water is safe or whether a cyber-attack is affecting it or could possibly affect it." Imagine an event in a smart city with about a million people accessing maliciously targeted non-potable drinking water. How much could this event negatively affect society's economic, health, and lifestyle welfare? Is there a strategy by which the management in such enterprises can maintain business continuity in the event of inevitable cyber-attacks to mitigate these repercussions?
In this article, we provide a brief overview of how ICSs operate, their security challenges, and examples of cyber-attacks that have significant adverse repercussions on society. Subsequently, we provide a three-point action plan for ICS management to boost cyber resilience and maintain business continuity in the event of inevitable cyber-attacks.
The increased connectivity and interoperability with IT/OT convergence via connecting OT systems, networks, and applications to enterprise IT amplifies the cybersecurity attack surface. In other words, the recent IT/OT convergence has minimised the traditional air gap between the IT and OT parts of an ICS enterprise that was the cornerstone of ICS cybersecurity. Legacy OT was designed and implemented before cybersecurity was even a concern in ICSs. Hence, the modern ICS in smart cities has a "patched-in" cybersecurity rather than the much-needed "baked-in" cybersecurity for their networks and applications. Consequently, this increases the risk of cyber-criminals accessing sensitive ICS data and making unauthorised changes to the ICS controls of industrial operations in critical infrastructure.
To drive home this point, according to Rockwell Automation, the number of cybersecurity incidents on ICSs between 2021 and 2022 alone is about one-third the number of similar incidents between 1980 and 2010. In addition, there was a 50 percent rise in ransomware attacks on ICSs in 2023 compared to 2022, according to Dragos, an ICS cybersecurity market leader. These figures suggest that the number of ICS cyber-incidents is rising steeply year on year. Christopher Wray, the director of the US Federal Bureau of Investigation (FBI), says that in 2024, Beijing's efforts to covertly plant offensive malware inside US critical infrastructure were greater than ever before.
The LockerGoga ransomware attack of 2019 on Norsk Hydro, a multinational aluminium manufacturer, compromised the firm's IT systems, including networked servers and PCs, and the business functions reliant on them. The attack affected all 35,000 Norsk employees, led to multiple plants going offline, and eventually cost the firm approximately $75 million.
In another incident, Mondelez, a multinational food and beverage company (and maker of the popular Oreo cookies), fell prey to the NotPetya cyber-attack in 2017. The NotPetya malware encrypted and permanently damaged Mondelez's 1700 servers and 24,000 laptops. This disrupted Mondelez's production facilities and other operations across the globe and resulted in them incurring business losses amounting to $100 million because they could not complete customer orders.
In a more recent incident from 2021, Colonial Pipeline, an oil and gas company controlling nearly half of the gasoline, jet fuel, and diesel flowing along the East Coast of the USA, fell prey to the DarkSide ransomware cyber-attack. Colonial Pipeline took an immediate precautionary step to shut down all its operational technology to prevent the ransomware infection from spreading into its OT networks. As a result, Colonial Pipeline experienced business discontinuity, as 5500 miles of pipeline had to be shut down. Despite paying hackers a ransom of around $4.5 million, Colonial Pipeline took about a week to restore its operational technology networks that drive the pipeline operation.
When it comes to data breach cyberattacks, the state of New York's critical infrastructure in the year 2023 was subject to nine incidents in health care and public health, eight incidents in financial services and seven incidents in both commercial and government facilities, co-contributing to a massive $775 million cyber loss for the state. According to DiNapoli (comptroller of the state of New York), "Data breaches at companies and institutions that collect large amounts of personal information expose New Yorkers to potential invasions of privacy, identity theft and fraud."
In an incident related to a cyber-attack on a broader supply chain, Japanese car-manufacturing giant Toyota suspended operations on 28 production lines across 14 plants for at least 24 hours in 2022 because Kojima Industries, one of Toyota's key supply chain partners and a manufacturer of plastic parts and electronic components, was hit by a malware cyber-attack. The world's top-selling carmaker incurred a business disruption cost of about $375 million. The malware was possibly Emotet, which enters ICSs through social engineering hacks that compromise IT. More importantly, it took Kojima months to get operations close to old routines.
In all the above examples, compromising the IT wing of the ICS adversely impacted the performance of the OT wing of the ICS. This type of "indirect" compromise of ICS OT accounts for approximately 84 percent of the ways adversaries disrupt critical infrastructure.
The only rational thing to do then for OT-driven enterprises is to accept this risk and design cyber resilience into their OT networks, business policies, structures, and operations. Cyber resilience will make the IT/OT-converged ICS fault tolerant, so that core business processes continue to run at minimum service guarantees. It is, however, practically and economically infeasible for cyber resilience to ensure cyber-protection of all vulnerable points in the OT network.
Figure 1: An Illustration of an Industrial Control System Network
As part of the management task, these vulnerabilities should be identified first. Multiple major vulnerabilities exist on any link connecting parts of an ICS network, and they can be exploited at the software, network, and hardware levels. Publicly known vulnerabilities are termed Common Vulnerabilities and Exposures (CVEs). An example is CVE-2022-46680, which affects commercial ION and PowerLogic power meters. Managers should first identify these CVEs for each link of an ICS network (for instance, from the National Vulnerability Database maintained by NIST, which builds on MITRE's CVE list). Each CVE carries a Common Vulnerability Scoring System (CVSS) score denoting its severity (e.g., CVE-2022-46680 with a CVSS score of 8.8). For unknown vulnerabilities (e.g., those resulting in zero-day attacks), the managers should rely on their domain expertise to derive approximate CVSS scores. Managers should then collect the CVSS scores for each link. For a given enterprise, not all CVEs on a link are equally likely to be exploited. Managers should consequently derive a single weighted CVSS score for each link in an ICS network.
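As a minimal sketch of the last step, the weighted score for a link can be computed as a likelihood-weighted average of its CVSS scores. The weighting scheme, the function name `weighted_cvss`, the likelihood weights, and the second vulnerability entry are all illustrative assumptions; only the 8.8 score for CVE-2022-46680 comes from the text.

```python
def weighted_cvss(vulns):
    """Return the likelihood-weighted mean CVSS score for one link.

    vulns: list of (cvss_score, likelihood_weight) pairs.
    Weights are manager estimates and need not sum to 1.
    """
    total_weight = sum(weight for _, weight in vulns)
    if total_weight <= 0:
        raise ValueError("at least one vulnerability needs a positive weight")
    return sum(score * weight for score, weight in vulns) / total_weight


# Illustrative link: one CVE named in the text plus one made-up entry.
link_vulns = [
    (8.8, 0.7),  # CVE-2022-46680; the 0.7 likelihood weight is assumed
    (5.0, 0.3),  # hypothetical unknown vulnerability scored by domain experts
]
link_score = weighted_cvss(link_vulns)
print(f"weighted CVSS for link: {link_score:.2f}")
```

A weighted average is only one plausible aggregation; an enterprise that wants to stay conservative might instead take the maximum score among the vulnerabilities it considers likely.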
We simulated ICS network breakdown following the compromise of a few critical network assets, drawing on multiple publicly available case studies of service non-availability in ICSs after a cyber incident. We observed a tipping point phenomenon wherein a few critical ICS assets became operationally unavailable to provide service, bringing down most of the ICS network. This tipping point phenomenon will amplify incident response times considerably, diminishing the chances that business continuity of core processes will be sustained.
Thus, an important action item for ICS network managers is determining which special network assets, i.e., the "crown jewels", should be prioritised for cyber-protection to prevent a tipping point phenomenon. After all, hardly any ICS enterprise has an unlimited budget to protect all the IT and OT assets. Examples of OT "crown jewels" include critical data, logical/physical assets, and/or OT control processes, whereas IT "crown jewels" include HMIs, data servers, and OPC/application servers.
ICS system managers should first rank the importance of network assets. A popular scientific way to do this is by ranking the assets based on a network centrality (importance) measure adopted from network science theory. A centrality measure showcases how much influence a given asset's proper functioning has on the proper functioning of the other assets in the ICS network. The higher the centrality of an ICS network asset, the greater its criticality. However, there are multiple centrality measures in practice, each providing managers with a different listing order.
Our research simplifies the listing dilemma for ICS managers. It recommends managers work on a single asset criticality listing that combines multiple centrality measures in a manner specific to a particular OT environment. Our research generates this single list via a novel managerial decision-making framework derived from applying the seminal Analytic Hierarchy Process (AHP) theory in operations management research.
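To make the idea of combining several centrality measures concrete, here is a hedged sketch of a textbook AHP step, not the framework from the research itself. It assumes managers supply a pairwise comparison matrix over three measures, derives priority weights with the row geometric mean approximation, and blends per-measure asset scores into one list. The comparison values, asset names, and scores are all hypothetical.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights via normalised row geometric means."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def combined_criticality(scores, weights):
    """Blend per-measure asset scores into a single criticality score.

    scores:  {measure: {asset: value}}
    weights: {measure: AHP weight}
    Each measure is first normalised to sum to 1 so scales are comparable.
    """
    combined = {}
    for measure, per_asset in scores.items():
        total = sum(per_asset.values())
        for asset, value in per_asset.items():
            share = value / total if total else 0.0
            combined[asset] = combined.get(asset, 0.0) + weights[measure] * share
    return combined


measures = ["degree", "katz", "betweenness"]
# Hypothetical judgments: entry [i][j] says how much more important
# measure i is than measure j for this OT environment.
pairwise = [
    [1.0, 2.0, 4.0],
    [1 / 2, 1.0, 2.0],
    [1 / 4, 1 / 2, 1.0],
]
weights = dict(zip(measures, ahp_weights(pairwise)))

# Hypothetical centrality scores for three assets.
scores = {
    "degree": {"HMI": 3, "PLC": 2, "historian": 1},
    "katz": {"HMI": 0.5, "PLC": 0.3, "historian": 0.2},
    "betweenness": {"HMI": 0.2, "PLC": 0.6, "historian": 0.2},
}
criticality = combined_criticality(scores, weights)
ranking = sorted(criticality, key=criticality.get, reverse=True)
print(ranking)
```

The row geometric mean is a common stand-in for the principal eigenvector that AHP formally prescribes; the two agree exactly when the comparison matrix is perfectly consistent, as it is in this toy example.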
Our quant-based research recommends ICS system managers proportionately allocate a cyber-protection budget among ICS assets in decreasing order of their centrality values in the network. This recommendation boosts/optimises our proposed quantitative cyber-resilience measure that correlates with the time to incident response (IR).
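The proportional allocation rule can be sketched in a few lines. The function name `allocate_budget`, the asset names, the centrality values, and the budget figure below are assumptions made for illustration only.

```python
def allocate_budget(centrality, total_budget):
    """Split a budget across assets in proportion to their centrality.

    Returns a dict ordered from the most to the least central asset.
    """
    total = sum(centrality.values())
    if total <= 0:
        raise ValueError("centrality values must sum to a positive number")
    ranked = sorted(centrality.items(), key=lambda item: item[1], reverse=True)
    return {asset: total_budget * value / total for asset, value in ranked}


# Hypothetical centrality values and budget.
centrality = {"PLC": 0.3, "HMI": 0.5, "historian": 0.2}
allocation = allocate_budget(centrality, total_budget=100_000)
for asset, amount in allocation.items():
    print(f"{asset}: {amount:,.0f}")
```

Because the shares are proportional, the most central asset receives the largest slice, and the slices always add up to the full budget.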
Multiple managerial cyber-protection allocation policies (MCAPs) can be drawn up depending on the type of asset centrality measure deployed. Our research compares how effectively different MCAPs reduce IR time, depending on whether a network of ICS assets is fragile (i.e., individual assets have low/zero recovery rates after becoming non-operational due to a cyber-attack) or non-fragile (i.e., individual assets have medium/high recovery rates). It also accounts for whether an ICS network structure is balanced or unbalanced (i.e., a few assets are connected to far more assets than the others).
We summarise our network-specific quant-based MCAP recommendations to boost ICS cyber-resilience in Figure 3. These recommendations are validated via extensive simulations run atop real-world OT asset-driven ICS network structures. The recommendations suggest budget allocation proportional to decreasing order of (a) the network influence-based centrality measure (e.g., Katz centrality that computes the centrality for a network node based on the centrality of its neighbours) of an asset if individual ICS assets are not fragile to boost cyber-resilience, and (b) the node influence-based centrality measure (e.g., degree centrality that is the fraction of network nodes a given node is connected to) of an asset if ICS assets are fragile to boost cyber-resilience.
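For readers who want to see the two measures side by side, the sketch below computes degree centrality and an unnormalised Katz centrality on a toy network using only the Python standard library. The network, the asset names, and the parameters `alpha` and `beta` are hypothetical; `alpha` is assumed to be small enough (below the reciprocal of the largest adjacency eigenvalue) for the iteration to converge.

```python
def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(adj)
    return {node: len(neighbours) / (n - 1) for node, neighbours in adj.items()}


def katz_centrality(adj, alpha=0.1, beta=1.0, tol=1e-10, max_iter=1000):
    """Iterate x[v] = alpha * sum(x[u] for neighbours u of v) + beta."""
    x = {node: 0.0 for node in adj}
    for _ in range(max_iter):
        x_new = {v: alpha * sum(x[u] for u in adj[v]) + beta for v in adj}
        if max(abs(x_new[v] - x[v]) for v in adj) < tol:
            return x_new
        x = x_new
    raise RuntimeError("Katz iteration did not converge; try a smaller alpha")


# Hypothetical undirected ICS network as an adjacency list.
network = {
    "switch": ["HMI", "PLC1", "PLC2", "historian"],
    "HMI": ["switch"],
    "PLC1": ["switch", "PLC2"],
    "PLC2": ["switch", "PLC1"],
    "historian": ["switch"],
}
degree = degree_centrality(network)
katz = katz_centrality(network)
for node in sorted(network, key=katz.get, reverse=True):
    print(f"{node}: degree={degree[node]:.2f}, katz={katz[node]:.3f}")
```

Degree centrality looks only at a node's own links, whereas Katz centrality also rewards a node for being attached to well-connected neighbours, which is why the two can rank assets differently on larger networks.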
Likewise, in scenarios when an OT-driven ICS network structure is unbalanced, budget allocation based upon path-based centrality measures is recommended to boost ICS cyber-resilience. An example is betweenness centrality, which is the sum, over all source-destination pairs, of the fraction of shortest paths between the pair that pass through a given node.
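A path-based measure such as betweenness centrality can be computed with Brandes' algorithm. The sketch below is a standard-library version for an undirected, unweighted network and returns unnormalised counts; the toy topology and asset names are hypothetical.

```python
from collections import deque


def betweenness_centrality(adj):
    """Unnormalised betweenness for an undirected, unweighted graph."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma)
        # and recording each node's predecessors on those paths.
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies, starting from the farthest nodes.
        delta = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Every undirected pair was counted once from each endpoint.
    return {v: value / 2.0 for v, value in bc.items()}


# Hypothetical network in which a gateway bridges IT-side and OT-side assets.
network = {
    "historian": ["HMI"],
    "HMI": ["historian", "gateway"],
    "gateway": ["HMI", "PLC1", "PLC2"],
    "PLC1": ["gateway"],
    "PLC2": ["gateway"],
}
betweenness = betweenness_centrality(network)
print(betweenness)
```

In this toy network the gateway sits on the most shortest paths, so a path-based policy would direct the largest share of protection to it even though the HMI has a similar number of direct links.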
The action plan first involves identifying the major cyber vulnerabilities and their severity in an ICS network that can considerably impact ICS cyber resilience. The second action point involves ICS management considering this information and prioritising the assets to protect in the IT/OT converged ICS network to boost/optimise a quantified cyber-resilience metric that correlates with the time to incident response (IR) after a cyber-attack. The final action point requires ICS management to invest in asset protection in proportion to asset criticality within the ICS network.
Ranjan Pal, Michael Siegel, MIT Sloan School of Management and Bodhibrata Nag, Indian Institute of Management Calcutta
First Published: Apr 11, 2024, 12:25