## Abstract

Civil infrastructure systems are becoming highly complex and thus more vulnerable to disasters. The concept of disaster resilience, the overall capability of a system to manage risks posed by catastrophic events, is emerging to address this challenge. Recently, a system-reliability-based disaster resilience analysis framework was proposed for a holistic assessment of the components' reliability, the system's redundancy, and society's ability to recover the system functionality. The proposed framework was applied to individual structures to produce diagrams visualizing the pairs of the reliability index (*β*) and the redundancy index (*π*), defined to quantify the likelihood of each initial disruption scenario and the corresponding system-level failure probability, respectively. This paper develops methods to apply the *β*-*π* analysis framework to infrastructure networks and demonstrates its capability to evaluate the disaster resilience of networks from a system reliability viewpoint. We also propose a new causality-based importance measure of network components based on the *β*-*π* analysis and a causal diagram model that can consider the causality mechanism of the system failure. Compared with importance measures in the literature, the proposed measure can evaluate a component's relative importance through a well-balanced consideration of network topology and reliability. The proposed measure is expected to provide helpful guidelines for making optimal decisions to secure the disaster resilience of infrastructure networks.

## 1 Introduction

The socioeconomic activities of modern societies, driven by rapid population growth and changes, are supported by increasingly large and complex infrastructure facilities. Under these circumstances, escalating climate change and the interactions of components within the facilities make civil infrastructure systems more vulnerable and thus pose unprecedented risks. For this reason, research has been conducted to evaluate the impact of disasters on infrastructure systems from the viewpoint of disaster resilience. In the literature, disaster resilience is mainly defined as the ability of communities or systems to adapt to, recover from, and prepare for hazards, shocks, or stresses such as earthquakes, droughts, or violent conflicts.

Bruneau et al. [1] proposed a multi-aspect view of disaster resilience, which characterizes the target system by four properties: robustness, redundancy, resourcefulness, and rapidity. Using this concept, Chang et al. [2] developed resilience measures related to the expected loss and applied them to community performance objectives. As such, initial studies of disaster resilience and reliability of infrastructure systems were mainly performed for community-level assessment. However, the concept can be extended to various scales and types of civil infrastructure systems, e.g., lifeline networks and structural systems.

Earlier research on disaster resilience focused on calculating the system reliability of infrastructure networks, e.g., power distribution systems [3,4], pipelines [5,6], transportation networks [7–9], and bridge networks [10,11]. Researchers also attempted to identify critical risk factors and component failure combinations from a system performance degradation viewpoint [12,13]. It was found that the computational cost required for system reliability calculation grows exponentially with the number of network components. Several approaches have been developed to address the issue. Byun [14] proposed a matrix-based Bayesian network (MBN) scheme that facilitates probabilistic inference on complex systems for given evidence through a novel data structure storing and handling information regarding component reliability. Approximate methods based on alternative system representation or calculations were also developed [15–18].

Further research efforts incorporated the two directions discussed above into the pre- and postdisaster stages of network resilience analysis. Jönsson et al. [19] and Johansson et al. [20] performed reliability and vulnerability analyses for power networks to demonstrate the critical need for system-level analyses in disaster resilience assessment. The vulnerability analysis aimed to quantify the negative system-level consequences for given characteristics of disasters or the combinations of component failures. Kilanitis and Sextos [21] performed the fragility analysis of each road network component based on seismic fault information and investigated changes in network functionality and loss factors caused by traffic delay using the samples of the component states. Considering the impact of network component capacities, Zhang and Alipour [22] optimized the risk mitigation strategies of components in the predisaster stage to minimize direct and indirect costs of the whole network in the postdisaster stage.

Beyond evaluating the disaster resilience of networks, various attempts are increasingly made for resilience-informed decision-making regarding network design and maintenance strategies. To keep up with such demands, we should be able to prioritize network components in terms of their contributions to system resilience. For this purpose, various importance measures (IMs) were proposed to quantify the influences and contributions of the components [23]. Probabilistic interpretations of widely used IMs were categorized in terms of the information used for calculating IMs. Beyond the traditional definitions, IMs were further developed to reflect various perspectives and conditions, e.g., topological viewpoint and reliability [24], and multiple states of components [25,26]. A well-defined IM can provide insights for component-level decision-making in many practical problems, such as finding the priority of reinforcement or inspection [27,28].

To serve as an informative measure from a disaster-resilience perspective, an IM should be able to describe the impact of the component states on the system-level performance considering intercomponent correlations [29]. However, the existing IMs in the literature may not successfully exclude the effects of initial disruption scenarios that make insignificant contributions to the system performance degradation. Moreover, the evaluations of components' relative importance should consider both the reliability of individual components and the physical impact of component failures on the system's performance.

Therefore, we apply the system-reliability-based disaster resilience analysis framework [30] to infrastructure networks for the first time to facilitate resilience-informed decision-making processes of the systems. Then, we propose a new IM based on the resilience analysis results and a causal model of the system-level failure. The proposed causality-based importance measure handles network failure through major disruption scenarios. Without losing applicability to general network reliability problems, this paper focuses on those in which the system performance is defined by the connectivity between major components, which is a top priority to secure evacuation routes or production procurement paths in the postdisaster stage.

The rest of the paper is organized as follows: Section 2 briefly reviews the system-reliability-based disaster resilience analysis framework [30], widely known IMs in the literature, and theories of causal effect evaluation by a causal diagram. Section 3 provides a hypothetical network to demonstrate the general applicability of the framework to the *infrastructure network* scale and to IM calculations. Section 4 proposes new causality-based IMs based on the disaster resilience analysis framework and a causal model. The numerical examples in Sec. 5 further investigate the effects of the correlations between component failures and test the applicability and merits of the proposed causality-based IM. The paper concludes with a summary and concluding remarks in Sec. 6.

## 2 Theoretical Background

### 2.1 System-Reliability-Based Framework of Disaster Resilience Analysis.

Lim et al. [30] pointed out the limitations of existing disaster resilience analysis methodologies: the definition of system performance measure in a restoration curve is often subjective or elusive; the interaction between the system and its components is not incorporated well into the analysis; and the methodologies often lack a systematic procedure to utilize the analysis results as a basis for the decision-making process to secure the disaster resilience of the system. Hence, a system-reliability-based disaster resilience analysis (S-DRA) framework was proposed to overcome the limitations.

The S-DRA framework features three criteria of disaster resilience — reliability, redundancy, and recoverability. *Reliability* is the component-level capability to avoid or minimize initial component disruptions during disastrous events. *Redundancy* is the capability of a system to minimize system-level performance degradation due to cascading failure initiated by component-level disruptions. *Recoverability* is the ability of engineers and society to take appropriate actions on components and systems to restore the degraded system's functioning quickly and wholly. The three criteria are illustrated in Fig. 1.

The three criteria of disaster resilience should be defined and evaluated considering the scale-specific characteristics because system modeling, analysis of disaster-induced damage, and recovery strategy all vary depending on the scale of the analysis. In this context, Lim et al. [30] laid out a “3 × 3 resilience matrix” whose rows and columns, respectively, describe the three analysis scales and the three criteria, as shown in Table 1. Each element in the resilience matrix provides detailed descriptions of the resilience criterion for the given analysis scale.

Table 1 Resilience matrix: rows are the analysis scales (systems) and columns are the resilience criteria

System | Reliability | Redundancy | Recoverability |
---|---|---|---|
Individual structure | Avoid initial disruptions in structural members | Prevent progressive failures and collapse | Effective repair or replacement to restore system-level performance |
Infrastructure network | Avoid initial disruptions in network components | Prevent cascading failures and degradation of network performance | Proper strategies and actions against disruptions in network |
Urban community | Avoid structural damage and direct losses of infrastructures | Prevent indirect and long-term losses, given direct losses | Society's capabilities to recover from losses quickly and completely |


Lim et al. [30] focused on the *individual structure* scale, i.e., the first row of the matrix, further developing the framework and proposing detailed analysis methods based on single-structure examples. This paper instead focuses on the second row of the resilience matrix to facilitate disaster resilience evaluations and decision-making for infrastructure networks such as transport networks, power grids, hydraulic systems, and gas distribution systems. Such networks are often modeled as graphs consisting of *nodes* representing, e.g., bridges and stations, and *links* such as transmission lines and pipelines. The nodes and links in a graph model are treated as node- and line-type components of the target network system.

The reliability evaluation at the network scale should properly model and consider the variability of the hazard intensity, the distribution of network components over a large area, and the spatial correlation of hazard demands on components. In evaluating the redundancy of a network system, the effects of sequential component failures triggered by initial component disruptions on the eventual network-level performance degradation should be considered. For recoverability evaluations, we should be able to assess the overall ability of engineers and society to recover the damaged network in the postdisaster stage and secure essential resources and response manuals in the predisaster stage [31]. Following the definition of the criterion, diverse socioeconomic factors describing the network environment should be studied carefully. Therefore, this paper focuses on infrastructure networks' reliability and redundancy criteria (the shaded areas in Table 1) while leaving recoverability as a future research topic.

### 2.2 Reliability Index, Redundancy Index, and Reliability-Redundancy Diagram.

In the S-DRA framework, the reliability index $\beta_i$ is defined as the generalized reliability index for the *i*th scenario of component-level disruption or failure, $F_i$, for the hazard $H$ with a given occurrence rate, i.e.,

$$\beta_i = -\Phi^{-1}\left[P(F_i \mid H)\right] \tag{1}$$

In other words, reliability is evaluated by the generalized reliability index from the component-level reliability analysis, considering the uncertainties in the hazard and the components. On the other hand, the redundancy index $\pi_i$ is defined as the generalized reliability index for the system-level failure $F_{sys}$ due to cascading failures initiated by the initial disruption $F_i$, i.e.,

$$\pi_i = -\Phi^{-1}\left[P(F_{sys} \mid F_i, H)\right] \tag{2}$$

To facilitate S-DRA exploiting the reliability index $(\beta)$ and redundancy index $(\pi)$, a reliability-redundancy ($\beta$-$\pi$) diagram was proposed. The horizontal and vertical axes of the two-dimensional scatter plot represent the redundancy and reliability indices, respectively. A point with the coordinates $(\pi_i, \beta_i)$ in a $\beta$-$\pi$ diagram represents the indices computed by Eqs. (1) and (2) for an initial disruption $F_i$ under the given hazard $H$. The $\beta$-$\pi$ diagram serves as a tool to provide a holistic description of the disaster resilience of the system both at the component level and the system level and to support relevant decision-making processes.
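As a minimal sketch of how one point of a $\beta$-$\pi$ diagram could be computed, the snippet below converts the two probabilities referenced by Eqs. (1) and (2) into generalized reliability indices, assuming the usual definition $-\Phi^{-1}(p)$. The function name `generalized_reliability_index` and the two probability values are illustrative assumptions, not quantities from Ref. [30].

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, std 1)


def generalized_reliability_index(prob: float) -> float:
    """Return -Phi^{-1}(prob) for a probability strictly between 0 and 1."""
    if not 0.0 < prob < 1.0:
        raise ValueError("prob must lie strictly between 0 and 1")
    return -_STD_NORMAL.inv_cdf(prob)


# Hypothetical inputs (assumed for illustration only):
p_disruption = 0.05       # P(F_i | H): likelihood of the initial disruption
p_sys_given_disr = 0.20   # P(F_sys | F_i, H): system failure given the disruption

beta_i = generalized_reliability_index(p_disruption)
pi_i = generalized_reliability_index(p_sys_given_disr)
print(f"(pi_i, beta_i) = ({pi_i:.3f}, {beta_i:.3f})")
```

A probability of exactly 0 or 1 maps to an infinite index, so the sketch rejects those inputs rather than guessing how a caller wants them plotted.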

… *force majeure* under given social regulations or economic constraints. To this end, the “*de minimis* risk” concept was introduced into the $\beta$-$\pi$ diagram with the occurrence rate $P_{dm}$, e.g., $P_{dm}=10^{-7}/\text{year}$ [32]. Using the probabilities in Eqs. (1) and (2), one can check if the annual probability of the system-level failure caused by the *i*th initial disruption scenario, $F_{sys,i}$, is below the *de minimis* risk level, i.e.,

$$P(F_{sys,i}) = P(F_{sys}\mid F_i,H)\,P(F_i\mid H)\,\lambda_H = \Phi(-\pi_i)\,\Phi(-\beta_i)\,\lambda_H < P_{dm} \tag{3}$$

where $\lambda_H$ is the annual mean occurrence rate of the hazard $H$. Equivalently, the index pair should lie in the domain

$$D_{\lambda_H} = \left\{(\pi_i,\beta_i) : \Phi(-\pi_i)\,\Phi(-\beta_i) < P_{dm}/\lambda_H = H_{dm}\right\} \tag{4}$$

where $\Phi(\cdot)$ is the standard normal CDF and $P_{dm}/\lambda_H=H_{dm}$ was termed “per hazard *de minimis* risk” of the hazard $H$. The boundary of the domain $D_{\lambda_H}$ serves as the disaster resilience limit-state surface in the $\beta$-$\pi$ diagram, as shown by the contours of the $\beta$-$\pi$ diagrams in Ref. [30] and this paper. The disaster resilience analysis procedure employing the $\beta$-$\pi$ diagram was termed a “reliability-redundancy $(\beta$-$\pi)$ analysis.”
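The *de minimis* check can be sketched in a few lines of Python. The sketch assumes the constraint takes the product form $\Phi(-\pi_i)\Phi(-\beta_i) < P_{dm}/\lambda_H$ implied by the probabilities in Eqs. (1) and (2). The function name and the index pairs are hypothetical, chosen only to illustrate the comparison against the per hazard *de minimis* risk $H_{dm}$.

```python
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal CDF


def satisfies_resilience_constraint(beta: float, pi: float,
                                    lambda_h: float, p_dm: float = 1e-7) -> bool:
    """Check Phi(-pi) * Phi(-beta) < P_dm / lambda_H (assumed form of the constraint)."""
    h_dm = p_dm / lambda_h                     # per hazard de minimis risk H_dm
    per_hazard_risk = _PHI(-pi) * _PHI(-beta)  # P(F_sys | F_i, H) * P(F_i | H)
    return per_hazard_risk < h_dm


# Hypothetical index pairs for a hazard with an annual occurrence rate of 1/4,800
rate = 1 / 4800
print(satisfies_resilience_constraint(beta=3.0, pi=3.0, lambda_h=rate))
print(satisfies_resilience_constraint(beta=1.5, pi=0.5, lambda_h=rate))
```

Because the constraint depends on the hazard only through $H_{dm}$, the same index pair can pass for a rare hazard and fail for a frequent one.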

### 2.3 Reliability Importance Measures for Components in a System.

In risk management of complex systems, quantifying the components' relative contributions to the likelihood of the system's failure is often helpful. To this end, importance measures (IMs) have been developed to guide risk-informed decision-making regarding various systems. We briefly review the following IMs, widely used in system reliability analyses [33,34]. These IMs will be demonstrated through the hypothetical network example (Sec. 3) and compared with the causality-based IM proposed in this paper (Sec. 4).

… *l*th cut set event $C_l$. For example, if $C_1=E_1E_2E_3$, $\mathcal{C}_1=\{1,2,3\}$. Risk achievement worth (RAW), risk reduction worth (RRW), and boundary probability (BP), which hinge on the change in the probability of the system failure, are respectively defined as

$$RAW_e = \frac{P(F_{sys}^{-e})}{P(F_{sys})} \tag{8}$$

$$RRW_e = \frac{P(F_{sys})}{P(F_{sys}^{+e})} \tag{9}$$

$$BP_e = P(F_{sys}^{-e}) - P(F_{sys}^{+e}) \tag{10}$$

where $F_{sys}^{-e}$ and $F_{sys}^{+e}$ respectively denote the failure events of the target system with component $e$ removed (or failed) and replaced by a perfectly reliable component.

The IMs in Eqs. (5)–(10) provide larger values for components contributing more to the system failure probability. By definition, RAW and RRW are greater than or equal to 1, while the other IMs range between 0 and 1. In calculating RAW, RRW, and BP, the state of the component $e$ is fixed regardless of its likelihood of failure.
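To make the last three measures concrete, the sketch below evaluates RAW, RRW, and BP by brute-force enumeration. It assumes the standard forms $P(F_{sys}^{-e})/P(F_{sys})$, $P(F_{sys})/P(F_{sys}^{+e})$, and $P(F_{sys}^{-e})-P(F_{sys}^{+e})$, and, as a simplification that this paper does not make, statistically independent components. The two-component series system and its failure probabilities are hypothetical.

```python
from itertools import product


def system_failure_prob(p_fail, cut_sets):
    """Exact failure probability of a cut-set system with INDEPENDENT components.

    p_fail: dict mapping component label -> failure probability.
    cut_sets: sets of labels; the system fails if every member of any cut set fails.
    """
    comps = sorted(p_fail)
    total = 0.0
    for states in product((0, 1), repeat=len(comps)):  # 1 = failed
        failed = {c for c, s in zip(comps, states) if s}
        if any(cs <= failed for cs in cut_sets):
            prob = 1.0
            for c, s in zip(comps, states):
                prob *= p_fail[c] if s else 1.0 - p_fail[c]
            total += prob
    return total


def importance_measures(e, p_fail, cut_sets):
    """Return (RAW, RRW, BP) of component e under the assumed definitions."""
    p_sys = system_failure_prob(p_fail, cut_sets)
    p_minus = system_failure_prob({**p_fail, e: 1.0}, cut_sets)  # e removed (failed)
    p_plus = system_failure_prob({**p_fail, e: 0.0}, cut_sets)   # e perfectly reliable
    rrw = p_sys / p_plus if p_plus > 0 else float("inf")
    return p_minus / p_sys, rrw, p_minus - p_plus


# Hypothetical series system: the failure of either component fails the system
p_fail = {1: 0.1, 2: 0.2}
cut_sets = [{1}, {2}]
raw, rrw, bp = importance_measures(1, p_fail, cut_sets)
print(f"RAW = {raw:.3f}, RRW = {rrw:.3f}, BP = {bp:.3f}")
```

Enumeration costs $2^n$ evaluations, so this approach is only practical for small illustrative systems.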

### 2.4 Identification of Causal Effects Using a Causal Diagram.

A causal model aims to provide mathematical descriptions of the causal relationship between random variables. In particular, recent research has often employed a causal diagram, i.e., a directed acyclic graph consisting of the nodes representing random variables and the arrows describing the causal influence between those variables [36,37]. A causal diagram well constructed using relevant variables can play an essential role in answering causality-related queries such as “How much would the disruptions of certain components affect the network's performance?” based on observable data. (Controlled experiments for statistical causality identification are impossible for already-built civil infrastructure systems. Therefore, this study utilizes computational simulation data obtained for each component disruption scenario. Then we process the data using causal diagrams to consider causal effects in quantifying the relative importance of components.)

Direct inference based on observable data may lead to a spurious interpretation of causality due to correlations. For example, let us consider the causal diagram in Fig. 2(a), which describes the relationship between binary random variables. Note that $S$ and $Z$ may each represent a set of random variables. Suppose the conditional probability $P(Y=y\mid X=x)$ is computed using Bayes' rule to answer the query, “How much does $X$ affect $Y$?” Although there are no directed paths from $X$ to $Y$, a probabilistic inference may give the result $P(Y=y\mid X=x)\neq P(Y=y\mid X\neq x)$. Such statistical dependence between $X$ and $Y$ is due to the common source effects from $S$, not the causal relationship between $X$ and $Y$.

where $do(X=x)$ is an operator representing the intervention shown in Fig. 2(b). The causal effect can be quantified by the change in the conditional probability caused by the opposite outcome, $X\neq x$. For example, the difference $P(Y=y\mid do(X=x))-P(Y=y\mid do(X\neq x))$ or the ratio $P(Y=y\mid do(X=x))/P(Y=y\mid do(X\neq x))$ is often adopted depending on the analysis purpose [39].

Various formulaic test methods have been developed to identify which variables should be observed and controlled to investigate the causal relationship of interest. Among those methods, this study adopts a simple graphical test called a “back-door criterion.” In general, a variable set $A$ is said to meet the back-door criterion for a variable pair $(X,Y)$ in a directed acyclic graph $G$ if it satisfies the following conditions: (1) no elements in $A$ are descendants of $X$, and (2) $A$ blocks every path between $X$ and $Y$ that contains an arrow into $X$. The name of the criterion, back door, is attributed to the second condition, in which a path with an arrow heading into $X$ is interpreted as entering $X$ through the back door. The phrase “$A$ blocks every path” in the second condition means that $X$ and $Y$ become independent along those paths once the variables in $A$ are observed. (For example, for the causal diagram in Fig. 2(a), $S$ and $Z$ satisfy the back-door criterion for the variable pair $(X,Y)$ because they block all back doors from $X$ to $Y$ along the path $X$–$S$–$Z$–$Y$.)

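When a variable set $A$ satisfies the back-door criterion for $(X,Y)$, the causal query is identified from observational probabilities as

$$P(Y=y\mid do(X=x)) = \sum_{a} P(Y=y\mid X=x, A=a)\,P(A=a) \tag{12}$$

For the causal diagram in Fig. 2(a) with $A=\{S,Z\}$, this gives

$$P(Y=y\mid do(X=x)) = \sum_{s,z} P(Y=y\mid X=x,S=s,Z=z)\,P(S=s,Z=z) = \sum_{z} P(Y=y\mid Z=z)\,P(Z=z) \tag{13}$$

because the state of $Y$ depends only on the state of $Z$. The causal query $P(Y=y\mid do(X\neq x))$ can be computed in a similar way using the back-door criterion in Eq. (12), which leads to the same value as $P(Y=y\mid do(X=x))$. This consistency on $Y$ indicates that there exists no direct causal effect of $X$ on $Y$, as the causal diagram in Fig. 2(a) shows.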
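The contrast between the observational and interventional queries can be checked numerically. The sketch below assumes that the diagram in Fig. 2(a) has the arrows $S\to X$, $S\to Z$, and $Z\to Y$, with every variable binary. All probability tables are hypothetical. It applies the back-door adjustment over $\{S,Z\}$ and compares the result with plain conditioning on $X$.

```python
from itertools import product

# Hypothetical conditional probability tables for binary S, X, Z, Y,
# assuming the arrows S -> X, S -> Z, and Z -> Y (no arrow from X to Y).
P_S1 = 0.3
P_X1_GIVEN_S = {0: 0.2, 1: 0.9}
P_Z1_GIVEN_S = {0: 0.1, 1: 0.8}
P_Y1_GIVEN_Z = {0: 0.05, 1: 0.7}


def bern(p1, value):
    """Probability that a Bernoulli(p1) variable takes `value` (0 or 1)."""
    return p1 if value == 1 else 1.0 - p1


def joint(s, x, z, y):
    """Joint probability P(S=s, X=x, Z=z, Y=y) implied by the assumed diagram."""
    return (bern(P_S1, s) * bern(P_X1_GIVEN_S[s], x)
            * bern(P_Z1_GIVEN_S[s], z) * bern(P_Y1_GIVEN_Z[z], y))


def p_y1_given_x(x):
    """Observational query P(Y=1 | X=x)."""
    num = sum(joint(s, x, z, 1) for s, z in product((0, 1), repeat=2))
    den = sum(joint(s, x, z, y) for s, z, y in product((0, 1), repeat=3))
    return num / den


def p_y1_do_x(x):
    """Interventional query P(Y=1 | do(X=x)) by back-door adjustment on {S, Z}."""
    total = 0.0
    for s, z in product((0, 1), repeat=2):
        p_sz = sum(joint(s, xx, z, yy) for xx, yy in product((0, 1), repeat=2))
        p_y1_given_xsz = joint(s, x, z, 1) / (joint(s, x, z, 0) + joint(s, x, z, 1))
        total += p_y1_given_xsz * p_sz
    return total


print("observational:", p_y1_given_x(1), p_y1_given_x(0))  # differ: spurious association
print("interventional:", p_y1_do_x(1), p_y1_do_x(0))        # coincide: no causal effect
```

The observational probabilities differ only because $S$ drives both $X$ and $Z$, while the adjusted probabilities coincide for both values of $X$.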

## 3 Application of System-Reliability-Based Disaster Resilience Analysis to Infrastructure Networks

### 3.1 Resilience Indices of Scenarios for Multi-Component Failures.

This section presents a new approach covering the basic modeling of infrastructure networks, the creation of initial disruption scenarios, and the subsequent realization of the resilience indices. Let us consider an infrastructure network consisting of $n$ components with binary states, i.e., failure or nonfailure. All combinations of $k$ components are represented by the *component index sets* $I_i^k$, $i=1,\ldots,\binom{n}{k}$. For instance, when $n=5$ and $k=2$, the index sets are $I_1^2=\{1,2\}$, $I_2^2=\{1,3\}$, …, $I_{10}^2=\{4,5\}$. For each combination, a mutually exclusive and collectively exhaustive (MECE) *scenario set* $\mathcal{F}_i^k$ is introduced to enumerate all joint states. For example, $\mathcal{F}_1^2=\{E_1E_2,\,E_1\bar{E}_2,\,\bar{E}_1E_2,\,\bar{E}_1\bar{E}_2\}$. In particular, the scenario in which all $k$ components fail, i.e., the corresponding *disruption scenario*, is denoted by $F_i^k$. For example, $F_1^2=E_1E_2$.
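The enumeration described above maps directly onto `itertools`. The sketch below lists the component index sets for $n=5$ and $k=2$ and the MECE scenario set of the first index set. The helper names `component_index_sets` and `scenario_set` are hypothetical, and a failed component is marked by `True`.

```python
from itertools import combinations, product


def component_index_sets(n, k):
    """All k-component index sets I_i^k (1-based labels) in lexicographic order."""
    return list(combinations(range(1, n + 1), k))


def scenario_set(index_set):
    """MECE joint states of the indexed components; True marks a failed component."""
    return [dict(zip(index_set, states))
            for states in product((True, False), repeat=len(index_set))]


index_sets = component_index_sets(n=5, k=2)
print(len(index_sets), index_sets[0], index_sets[1], index_sets[-1])

scenarios = scenario_set(index_sets[0])
disruption = scenarios[0]  # every component in the index set fails
print(disruption)
```

The number of index sets grows as $\binom{n}{k}$ and each scenario set has $2^k$ members, which is why the numerical examples restrict $k$ to small values.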

The failure probability of component $e$ under the hazard $H$ is described in terms of its reliability index $\beta_e$ as

$$P(E_e\mid H) = P(X_e=1) = P(Z_e\le-\beta_e) = \Phi(-\beta_e) \tag{14}$$

where $X_e$ is a Bernoulli random variable indicating the failure by the value 1 and nonfailure by the value 0. The reliability of $n$ components can be represented by an $n$-variate standard normal distribution $\mathbf{Z}\sim N_n(\mathbf{0},\Sigma)$ with a mean vector of $\mathbf{0}\in\mathbb{R}^n$ and a covariance matrix $\Sigma$, which is equal to the correlation coefficient matrix $R\in\mathbb{R}^{n\times n}$. The correlation coefficient between different $Z_e$'s can be computed based on the marginal and joint failure probabilities [17].

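From Eq. (1), the reliability index of the disruption scenario $F_i^k$ is obtained as

$$\beta_i^k = -\Phi^{-1}\left[P(F_i^k\mid H)\right] = -\Phi^{-1}\left[\Phi_k\left(-\boldsymbol{\beta}_i^k;\Sigma_i^k\right)\right] \tag{15}$$

where $\Phi_k(\cdot\,;\Sigma_i^k)$ denotes the *k*-variate standard normal CDF, and $\boldsymbol{\beta}_i^k$ and $\Sigma_i^k$, respectively, denote the vector of $\beta_e$'s of the components in $I_i^k$ and the corresponding submatrix of $\Sigma$. From Eq. (2), the redundancy index $\pi_i^k$ is derived as

$$\pi_i^k = -\Phi^{-1}\left[P(F_{sys}\mid F_i^k,H)\right] = -\Phi^{-1}\left[\frac{P(F_{sys}F_i^k\mid H)}{P(F_i^k\mid H)}\right] \tag{16}$$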

### 3.2 $\beta$-$\pi$ Analysis and Importance Measure Calculation of a Hypothetical Infrastructure Network.

Figure 3(a) shows the configuration of a hypothetical infrastructure network consisting of nine node-type components near the seismic source. Suppose the reliability indices of the components $\beta_e$, $e=1,\ldots,9$, are evaluated as {1.600, 1.718, 2.014, 2.100, 2.181, 2.403, 2.600, 2.662, 2.836}. The covariance matrix of $\mathbf{Z}\sim N_9(\mathbf{0},\Sigma)$ is determined based on the assumption that the correlation coefficient between $Z_e$'s of two components with distance $\Delta$ (km) is $\exp(-\Delta-0.5)$. The system failure $F_{sys}$ is defined as the disconnection of all terminal nodes from the source nodes, described by the six cut sets as $F_{sys}=\bigcup_{l=1}^{6}C_l=E_2E_3\cup E_2E_6\cup E_3E_5\cup E_5E_6\cup E_7E_9\cup E_8$.

The redundancy index is calculated using Eq. (16) as $\pi_1^2=-\Phi^{-1}(0.1849)=0.8969$.

The $\beta$-$\pi$ diagram in Fig. 3(b) is obtained by repeating this process for all disruption scenarios $F_i^k$ $(k=1,2,3)$. The markers in the diagram are distinguished with different shapes and colors according to $k$, the number of components in the disruption scenario. For $k=1$ in particular, the nine components are labeled near the markers with the corresponding color. The contours in the figure represent the boundaries of the disaster resilience constraints $D_{\lambda_H}$ in Eq. (4) for the four annual mean occurrence rates $\lambda_H$ and $P_{dm}=10^{-7}/\text{year}$.

The redundancy indices of the initial disruption scenarios that include one or more minimum cut sets are $-\infty$ because the conditional probability $P(F_{sys}\mid F_i^k,H)$ is 1. For convenience, the index pairs of such scenarios are plotted at $\pi=-3$ in the diagram. This visualization is acceptable because the resilience constraint contours are almost flat for $\pi\le-3$, and thus the reliability index alone determines whether the system satisfies the resilience constraint.
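The role of the cut sets in the redundancy index can be illustrated with a simplified calculation. The sketch below uses the reliability indices and the six cut sets of the hypothetical network but assumes statistically independent components, which is not the correlated model of this paper. Its numbers therefore do not reproduce Fig. 3(b). It only shows that a scenario containing a cut set yields $\pi=-\infty$ and that components outside every cut set leave the conditional failure probability unchanged. The function names are hypothetical.

```python
from itertools import product
from statistics import NormalDist

PHI = NormalDist()
BETAS = [1.600, 1.718, 2.014, 2.100, 2.181, 2.403, 2.600, 2.662, 2.836]
P_FAIL = {e: PHI.cdf(-b) for e, b in enumerate(BETAS, start=1)}
CUT_SETS = [{2, 3}, {2, 6}, {3, 5}, {5, 6}, {7, 9}, {8}]


def p_sys_given_disruption(disrupted):
    """P(F_sys | all components in `disrupted` fail), ASSUMING independence."""
    disrupted = set(disrupted)
    if any(cs <= disrupted for cs in CUT_SETS):
        return 1.0  # the scenario already contains a cut set
    others = [e for e in P_FAIL if e not in disrupted]
    total = 0.0
    for states in product((0, 1), repeat=len(others)):  # 1 = failed
        failed = disrupted | {e for e, s in zip(others, states) if s}
        if any(cs <= failed for cs in CUT_SETS):
            prob = 1.0
            for e, s in zip(others, states):
                prob *= P_FAIL[e] if s else 1.0 - P_FAIL[e]
            total += prob
    return total


def redundancy_index(disrupted):
    """pi = -Phi^{-1}(P(F_sys | disruption)); -inf when the probability is 1."""
    p = p_sys_given_disruption(disrupted)
    return float("-inf") if p >= 1.0 else -PHI.inv_cdf(p)


print(redundancy_index({8}))     # scenario contains the cut set {8}
print(redundancy_index({2, 3}))  # scenario contains the cut set {2, 3}
print(redundancy_index({1}), redundancy_index({4}))  # components outside all cut sets
```

Under independence, the last two values agree up to rounding, because neither component 1 nor component 4 belongs to a cut set. With correlated components they would generally differ.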

The $\beta$-$\pi$ diagram helps us to identify critical initial disruption scenarios leading to socially unacceptable risk. For example, when $\lambda_H=1/4{,}800\ \text{year}$, among the nine initial disruption scenarios with $k=1$, only $E_7$ and $E_9$ satisfy the disaster resilience constraint $D_{\lambda_H}$ in Eq. (4). It is observed that the markers tend to move in the upper left direction as $k$ increases, i.e., toward a higher reliability index and a lower redundancy index. This is a natural phenomenon because the spatial demand on the components and the correlation coefficients between component states are driven by the seismic hazard with the specific mean occurrence rate. In detail, stemming from the given hazard, the likelihoods of scenarios involving more components are relatively low, but once such a disruption occurs, the network's capability to avoid system-level failures drops significantly. By incorporating the information on *recoverability* into the diagram through the colors or sizes of the markers [30], the $\beta$-$\pi$ diagram can support decision-making processes to manage the network's disaster resilience.

Next, the six IMs reviewed in Sec. 2.3 are calculated for the hypothetical network, as shown in Fig. 4. The six IMs quantify the relative importance of components differently based on the reliability of individual components and their topological importance in the network. The existing IMs commonly find components 1, 4, 7, and 9 relatively less critical but provide different comparisons for the others. It is also noted that these IMs do not check the contributions of the components to inducing socially unacceptable risks while considering the hazard occurrence rate. Besides, the CP and ICP of components 1 and 4 are nonzero even though they do not contribute to the system performance from the network topology viewpoint (because the state of component 5 governs). Further in-depth analysis of these IMs and comparison with the causality-based IMs will be discussed in Sec. 4.2.

## 4 Causality-Based Importance Measure of Infrastructure Network Components

To reflect the causal effects of component failures on the system failure properly, we first describe the causal relationship between various factors, e.g., hazards and component states, using a causal diagram. Then, from the scenario-wise S-DRA, we allocate the results to individual components considering the causal contribution calculated through the causal diagram and propose a new IM. Finally, we discuss and compare the resilience philosophy of the novel causality-based IM with the existing correlation-based IMs and *do-free* IMs set as the comparison group.

### 4.1 Causal Diagram for System Performance Loss by External Hazards

#### 4.1.1 Excluding Spurious Correlations Between Component States and System-Level States.

Figure 5(a) shows the proposed structure of a causal diagram $G$ describing the Bernoulli random variable $Y_{net}$ representing the network connectivity loss induced by external hazards denoted by $S_1, S_2, \ldots, S_m$. The node $U$ represents a set of unobservable factors affecting the $n$ component failure variables $X_1, X_2, \ldots, X_n$. $U$ remains an implicit variable acknowledging the imperfect knowledge of the component failures and their correlations.

The node $U$ and the $m$ hazard sources are common causes of the component failure $X_e$, which may eventually lead to network connectivity loss $Y_{net}$. These common causes are the *confounders* inducing associations between component failures and those between component failures and a network connectivity loss, e.g., $X_1\leftarrow U\rightarrow X_3$ and $X_1\leftarrow S_1\rightarrow X_2\rightarrow Y_{net}$. These associations are considered spurious because the observation of the failure of a component does not necessarily change the failure mechanism of other components from a causal viewpoint.

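To exclude these spurious associations, the *do*-operator is introduced to $X_e$, i.e., all back-door paths from $X_e$ to $Y_{net}$ are erased, as shown in Fig. 5(b). Based on the modified diagram $G_{\overline{X_2}}$, the causal effect of the failure of component 2 on the system failure probability can be calculated by setting the states of all remaining components $(X_1, X_3, X_4, \ldots, X_n)=\mathbf{X}_{\sim2}$ as an admissible set and applying the back-door criterion, i.e.,

$$P(Y_{net}=1\mid do(X_2=1)) = \sum_{\mathbf{x}_{\sim2}} P(Y_{net}=1\mid X_2=1,\mathbf{X}_{\sim2}=\mathbf{x}_{\sim2})\,P(\mathbf{X}_{\sim2}=\mathbf{x}_{\sim2})$$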

The causal effect on the left-hand side can be estimated by computing all the probability terms using multivariate normal CDFs.
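A small numerical sketch may help to see how adjusting for the states of the remaining components removes the spurious association. The example below is hypothetical and much simpler than the network model in this paper. It has three components, a single cut set $\{2,3\}$, and one binary hazard $S$ acting as the only common cause, with component failures independent given $S$. Plain enumeration replaces the multivariate normal CDFs.

```python
from itertools import product

# Hypothetical network: the system fails only if components 2 and 3 both fail;
# component 1 belongs to no cut set. A binary hazard S is the only common cause.
CUT_SETS = [{2, 3}]
P_S1 = 0.1
P_FAIL_GIVEN_S = {0: 0.01, 1: 0.5}  # same for every component, given S
COMPONENTS = (1, 2, 3)


def p_states(states):
    """Joint probability of the given component states (e -> 0/1), marginalizing S."""
    total = 0.0
    for s in (0, 1):
        prob = P_S1 if s else 1.0 - P_S1
        for x in states.values():
            prob *= P_FAIL_GIVEN_S[s] if x else 1.0 - P_FAIL_GIVEN_S[s]
        total += prob
    return total


def system_fails(states):
    """Connectivity loss indicator: True if all members of some cut set have failed."""
    failed = {e for e, x in states.items() if x}
    return any(cs <= failed for cs in CUT_SETS)


def p_fail_given(e, x_e):
    """Observational P(Y_net=1 | X_e=x_e)."""
    others = [c for c in COMPONENTS if c != e]
    num = den = 0.0
    for xs in product((0, 1), repeat=len(others)):
        states = {e: x_e, **dict(zip(others, xs))}
        p = p_states(states)
        den += p
        num += p * system_fails(states)
    return num / den


def p_fail_do(e, x_e):
    """Interventional P(Y_net=1 | do(X_e=x_e)) by adjusting for the other components."""
    others = [c for c in COMPONENTS if c != e]
    total = 0.0
    for xs in product((0, 1), repeat=len(others)):
        rest = dict(zip(others, xs))
        total += system_fails({e: x_e, **rest}) * p_states(rest)
    return total


print(p_fail_given(1, 1) - p_fail_given(1, 0))  # nonzero: association through S
print(p_fail_do(1, 1) - p_fail_do(1, 0))        # zero: component 1 has no causal effect
print(p_fail_do(2, 1) - p_fail_do(2, 0))        # positive: component 2 is in a cut set
```

Component 1 looks important under plain conditioning only because its failure signals that the hazard has occurred. The adjusted difference is zero for component 1 and positive for component 2.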

#### 4.1.2 Causal Contribution of Individual Components in the Common Initial Disruption Scenario.

In $\beta$-$\pi$ analysis, the disaster resilience of a system is evaluated for each *disruption scenario*, which is described as a single component failure or the joint failure of multiple components. The causal contributions of individual *components* within a multi-component-failure scenario are identified through the following process. First, to deal with the effect of change in the state of component $e\in I_i^k$, the remaining $(k-1)$ components are denoted by the *sorted index set* $I_{i,e}^k=I_i^k\setminus\{e\}$. As $\mathcal{F}_i^k$ was enumerated for $I_i^k$, an MECE scenario set $\mathcal{F}_{i,e}^k$ can be obtained for $I_{i,e}^k$. For example, let us consider the component index set $I_1^3=\{1,2,3\}$. When we investigate the importance of component $e=2$, $I_{1,2}^3=I_1^3\setminus\{2\}=\{1,3\}$ and $\mathcal{F}_{1,2}^3=\{E_1E_3,\,E_1\bar{E}_3,\,\bar{E}_1E_3,\,\bar{E}_1\bar{E}_3\}$. For $k=1$, $\mathcal{F}_{i,e}^k$ is defined as a universal set $U$ for the given sample space.

*coherent* network, i.e., the improvement of component states cannot degrade the system's performance, described by the inequality

Table 2 shows five cases of the possible values of the probability terms, along with the corresponding causal contribution of component $e$, $r_e^k(F)$, computed by Eq. (20). In Case 1, the value of the conditional probability $P(F_{sys}\mid F,H)$ is set to zero, i.e., $F$ includes a link set. On the other hand, in Case 5, the conditional probability is set to 1, i.e., $F$ includes a cut set. In these cases, the system's state is determined regardless of the state of component $e$. Therefore, it is reasonable that $r_e^k(F)$ takes zero in these cases. Case 2 indicates that the nonfailure of component $e$ guarantees the system-level nonfailure, which makes $r_e^k(F)=1$ from Eq. (20) reasonable. Case 3 is a general case in which the contribution of the changed state of component $e$ determines $r_e^k(F)$. In Case 4, the change in the component's state considered by the *do*-operator leads to no changes in the conditional probability, and naturally, Eq. (20) gives $r_e^k(F)=0$.

Table 2 Possible values of the probability terms and the corresponding causal contribution of component $e$

Case | $P_{\bar{e}}(F_{sys}\mid F,H)$ | $P(F_{sys}\mid F,H)$ | $P_e(F_{sys}\mid F,H)$ | $r_e^k(F)$ |
---|---|---|---|---|
1 | $0$ | $0$ | $0$ | $0$ |
2 | $0$ | $\in(0,1)$ | $\in(0,1)$ | $1$ |
3 | $\in(0,1)$ | $\in(0,1)$ | $\neq P_{\bar{e}}(F_{sys}\mid F,H)$ | $\in(0,1)$ |
4 | $\in(0,1)$ | $\in(0,1)$ | $=P_{\bar{e}}(F_{sys}\mid F,H)$ | $0$ |
5 | $1$ | $1$ | $1$ | $0$ |


### 4.2 New Causality-Based Importance Measure and Application to the Hypothetical Network.

Using Eqs. (22)–(24), we calculated NCIMs of the hypothetical network in Sec. 3.2 for the seismic hazard with $\lambda_H=1/4{,}800\ \text{year}$ and $1/1{,}000\ \text{year}$, as shown in Figs. 6(a) and 6(b), respectively. It is noted that component 8 shows the most significant decreases in its NCIMs as the number of initially disrupted components ($k$) increases because the component contributes to the system-level failure through its topological importance rather than through joint failures with other components. Similar to the existing IMs in Fig. 4, the NCIMs of components 1, 4, 7, and 9 are insignificant.

An essential distinction of the proposed NCIM from the existing IMs is that it can evaluate the component importance for the given occurrence rate of the hazard by the $\beta$-$\pi$ analysis. For example, when $k=3$, the NCIMs of components 5, 6, and 8 increase significantly as $\lambda_H$ increases. It is also noteworthy that NCIM can consider the causal effects. For instance, the NCIMs of components 1 and 4, which make no topological impact on the network connectivity according to Fig. 3(a), are zero. This is because the *do*-operators filter out the spurious effect of the correlations in Eq. (20). By contrast, the CP and ICP, defined as conditional probabilities, show nonzero values for components 1 and 4, as shown in Figs. 4(a) and 4(b). Therefore, CP and ICP cannot exclude spurious association from component failures to system-level failure, and ICP reflects only the redundancy aspects of the $\beta$-$\pi$ analysis.
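The difference between conditioning on a component failure and intervening on it can be illustrated with a toy model. The sketch below is not the hypothetical network of Sec. 3.2: it assumes a made-up two-component system in which a common hazard level drives both failures but only component B affects the system state. All names and probabilities are illustrative assumptions.

```python
from fractions import Fraction as Fr
from itertools import product

# Hypothetical toy model (not the network of Sec. 3.2):
# a common hazard level S drives the failures of components A and B,
# but only B affects the system (A is topologically irrelevant).
P_STRONG = Fr(1, 5)                                   # P(S = strong)
P_FAIL = {"A": {True: Fr(3, 5), False: Fr(1, 10)},    # P(A fails | S)
          "B": {True: Fr(1, 2), False: Fr(1, 20)}}    # P(B fails | S)


def system_fails(a_failed, b_failed):
    """The system state depends on B only; A has no causal path to it."""
    return b_failed


def joint(do_a=None):
    """Yield (probability, s, a, b); do_a forces the state of A (do-operator)."""
    for s, a, b in product([True, False], repeat=3):
        p = P_STRONG if s else 1 - P_STRONG
        if do_a is None:
            p *= P_FAIL["A"][s] if a else 1 - P_FAIL["A"][s]
        elif a != do_a:
            continue  # under intervention, A no longer depends on S
        p *= P_FAIL["B"][s] if b else 1 - P_FAIL["B"][s]
        yield p, s, a, b


def p_sys(do_a=None, given_a=None):
    """System failure probability, optionally conditioned or intervened on A."""
    num = den = Fr(0)
    for p, s, a, b in joint(do_a):
        if given_a is not None and a != given_a:
            continue
        den += p
        if system_fails(a, b):
            num += p
    return num / den


p_cond_fail = p_sys(given_a=True)    # P(Fsys | A failed)
p_cond_safe = p_sys(given_a=False)   # P(Fsys | A survived)
p_do_fail = p_sys(do_a=True)         # P(Fsys | do(A failed))
p_do_safe = p_sys(do_a=False)        # P(Fsys | do(A survived))

print("conditional:   ", float(p_cond_fail), float(p_cond_safe))
print("interventional:", float(p_do_fail), float(p_do_safe))
```

In this toy model the two conditional probabilities differ (8/25 versus 19/200) only because A and B share the hazard, whereas the two interventional probabilities are identical (7/50), so a measure built on the *do*-operator would assign no causal contribution to A.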

Components 7 and 9 are located far from the seismic hazard and thus have high reliability; accordingly, the corresponding FVs and RRWs are close to the minimum value. However, the definitions of FV and RRW fundamentally make it impossible to distinguish the relative importance of the components belonging to the same cut set. In summary, the existing IMs defined as a single ratio or difference between two probabilities have limitations in reflecting the reliability and redundancy perspectives, in contrast to the proposed NCIM.
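The limitation of FV and RRW for components sharing a cut set can be checked on a small example. The sketch below assumes a hypothetical three-component system with independent failures and uses commonly cited textbook forms of FV and RRW, which may differ in detail from the definitions adopted earlier in the paper. The failure probabilities are illustrative.

```python
from fractions import Fraction as Fr
from itertools import product

# Hypothetical system (independent failures assumed): it fails if c fails,
# or if both a and b fail. Minimal cut sets: {c} and {a, b}.
P_FAIL = {"a": Fr(1, 2), "b": Fr(1, 100), "c": Fr(1, 1000)}
NAMES = list(P_FAIL)


def system_fails(state):
    return state["c"] or (state["a"] and state["b"])


def p_sys(fixed=None):
    """System failure probability with the states in `fixed` held constant."""
    fixed = fixed or {}
    total = Fr(0)
    for outcome in product([True, False], repeat=len(NAMES)):
        state = dict(zip(NAMES, outcome))
        if any(state[name] != value for name, value in fixed.items()):
            continue
        p = Fr(1)
        for name in NAMES:
            if name in fixed:
                continue  # a fixed component contributes probability 1
            p *= P_FAIL[name] if state[name] else 1 - P_FAIL[name]
        if system_fails(state):
            total += p
    return total


def fv(e):
    """Fussell-Vesely (assumed form): relative risk drop if e never fails."""
    return (p_sys() - p_sys({e: False})) / p_sys()


def rrw(e):
    """Risk reduction worth (assumed form): risk ratio if e never fails."""
    return p_sys() / p_sys({e: False})


for name in ("a", "b"):
    print(name, float(P_FAIL[name]), float(fv(name)), float(rrw(name)))
```

Although component a is fifty times more likely to fail than component b in this example, both receive the same FV and RRW because they belong to the same cut set {a, b}.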

To examine the role of the *do*-operator in the NCIM formulation, the causal contribution from Eq. (20) is modified to a correlation-based *do-free* contribution $\tilde{r}_e^k(F)$, given by Eq. (30).

For the hypothetical network, by following the process of Eqs. (22)–(24) using the correlation-based contribution in Eq. (30) instead of Eq. (20), the normalized *do-free* importance measure (NDIM) is calculated as shown in Fig. 7 for comparison.

When the *do*-operator is omitted as in Eq. (30), the merged contribution for the same scenario is calculated with the correlation-based contribution in place of the causal contribution.

In addition, NDIMs assign a higher importance weight to component 8 as $k$ increases, whereas the weight decreases when NCIM is used. This implies that the *do*-operator helps NCIM avoid placing extra weight on components contributing to the network redundancy for higher $k$.

## 5 Further Numerical Investigations

A numerical example is created using the topology of a bridge network in Sioux Falls to test the applicability of the S-DRA framework and the causality-based IM to general infrastructure networks. The reliability and redundancy indices are computed for bridge failure scenarios based on a seismic hazard model and the fragility analyses of individual bridges. The impacts of the correlation coefficients between the component states on the $\beta$-$\pi$ analysis process and the NCIM calculation are investigated in detail. The resilience characteristics of the network under multiple seismic hazards are also examined.

Figure 8(a) shows the road map of the Sioux Falls area, where the highways and the lower-level roads are located along the outskirts and the region's interior, respectively. Figure 8(b) shows the bridge network model consisting of 19 components with two types of bridges, i.e., multispan simply supported concrete girder (MSC) for the highway bridges and single-span concrete girder (SSC) for the bridges on downtown roads. The network-level performance goal is the posthazard connectivity between the two three-node sets, representing the source and terminal of the traffic.

### 5.1 Fragility Analysis of Bridges in Sioux Falls.

The seismic demand on structure *e* by the *j*th earthquake $EQ_j$ is described by a ground-motion prediction equation in which $D_j^e$ denotes the seismic demand intensity, $F(\cdot)$ is a ground-motion model (GMM), $M_j$ is the earthquake magnitude, $R_j^e$ is the distance between the epicenter and the structure, and $\eta_j$ and $\epsilon_j^e$ are the inter- and intra-event uncertainties (residuals), respectively [40]. These residuals are assumed to be statistically independent and to follow zero-mean normal distributions. Therefore, if $\sigma_\eta^2$ and $\sigma_\epsilon^2$ denote the variances of the residuals, the total variance of $\ln D_j^e$, i.e., $\sigma_T^2$, is $\sigma_\eta^2+\sigma_\epsilon^2$.

Here, $\rho_{e_1e_2}(\Delta)$ is a spatial correlation model, given as a function of the distance $\Delta$ between two components $e_1$ and $e_2$ [8]. The spatial correlation starts from 1 at $\Delta=0$, decreases monotonically, and converges to 0 as $\Delta$ goes to infinity.

In the GMM of Eq. (36), $e_1$, $c_1$, $c_2$, $c_3$, $M_{ref}$, $R_{ref}$, and $h$ are the model parameters whose values are summarized in Table 3, along with the standard deviations of the two residuals. The GMM in Eq. (36) is for $V_{S30}=760$ m/s, where $V_{S30}$ denotes the time-averaged shear-wave velocity over the top 30 m near the unspecified fault. The spatial correlation is assumed to be $\rho_{e_1e_2}(\Delta)=\exp(-0.27\Delta^{0.4})$ [17]. In addition, the structural parameters of the MSC and SSC bridges for the extensive $(EX)$ damage state are obtained as $(\alpha_{MSC}^{EX},\beta_{MSC}^{EX})=(0.83, 0.65)$ and $(\alpha_{SSC}^{EX},\beta_{SSC}^{EX})=(2.62, 0.90)$, respectively, from the prior structural analysis [43]. For a 7.5-magnitude western earthquake $EQ_{west}$ whose epicenter is located at $(-4, 6)$ in Fig. 8(b), the failure probabilities of the bridges of each type and the correlation coefficients between the bridge failures are calculated using Eqs. (34)–(36). Figure 9(a) shows the failure probabilities of the bridges. Figure 9(b) shows the correlation coefficients between failures of different bridges, which are all positive and range from 0.178 to 0.325.
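The behavior of the spatial correlation model and its effect on the demands at two sites can be sketched numerically. The function `spatial_correlation` uses the expression quoted above; `ln_demand_correlation` additionally assumes that the inter-event residual is shared by all sites in one earthquake and that the intra-event residuals are correlated by $\rho(\Delta)$, which is a common modeling choice rather than a formula taken from this section. The two standard deviations are hypothetical placeholders, not the values of Table 3.

```python
import math

SIGMA_ETA = 0.26   # hypothetical inter-event standard deviation
SIGMA_EPS = 0.50   # hypothetical intra-event standard deviation


def spatial_correlation(delta):
    """rho(Delta) = exp(-0.27 * Delta**0.4), as quoted in the text."""
    return math.exp(-0.27 * delta ** 0.4)


def ln_demand_correlation(delta):
    """Correlation of ln D at two sites separated by `delta`.

    Assumption: one inter-event residual is common to both sites and the
    intra-event residuals have correlation rho(delta).
    """
    total_variance = SIGMA_ETA ** 2 + SIGMA_EPS ** 2   # sigma_T^2
    covariance = SIGMA_ETA ** 2 + spatial_correlation(delta) * SIGMA_EPS ** 2
    return covariance / total_variance


for delta in (0.0, 1.0, 5.0, 20.0, 100.0):
    print(delta,
          round(spatial_correlation(delta), 3),
          round(ln_demand_correlation(delta), 3))
```

Under these assumptions the correlation of the log-demands equals 1 for co-located sites and decays toward $\sigma_\eta^2/\sigma_T^2$, not toward zero, because the inter-event residual remains shared at any distance.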

### 5.2 Effects of Correlation Coefficients Between Bridge States on System-Reliability-Based Disaster Resilience Analysis Results.

Figure 10(a) shows the $\beta$-$\pi$ diagram of the bridge network. The filled markers represent the analysis results with the effects of the correlations between bridge states fully considered. On the other hand, the hollow markers show the results when the correlations are ignored. In both cases, the markers move to the upper left as $k$ increases, as observed in Fig. 3(b). The markers are clustered, which is attributed to the two different bridge types and the two types of locations in the network topology. The MSC-type components 1 to 7 have higher failure probabilities, i.e., lower reliabilities, than the SSC-type components. Meanwhile, the MSC-type components are more critical regarding network topology, which results in larger redundancy indices. For $k=2$ and $k=3$, three and four clusters are formed by the combinations of the failures of the two bridge types, respectively.

A part of Fig. 10(a) is enlarged in Fig. 10(b) for a discussion on $k=1$. It is noted that the redundancy index increases significantly while the reliability index stays the same when the correlation between component failures is excluded from the $\beta$-$\pi$ analysis. This is because, for $k=1$, the reliability index depends only on the failure probability of a single component, whereas the redundancy index calculation involves joint failure probabilities. By contrast, in multiple-component failure cases, i.e., $k=2$ and $k=3$, both indices are affected because the reliability is computed for the joint failures of two or three components.
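The influence of correlation on a joint failure probability, and hence on the corresponding generalized reliability index $\beta=-\Phi^{-1}(P)$, can be reproduced with a small numerical sketch. It assumes two components whose safety margins are standard normal variables with correlation `rho`; the component reliability index and the correlation value are illustrative and are not results from the bridge network.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def joint_failure_probability(beta_1, beta_2, rho, steps=4000):
    """P(Z1 <= -beta_1, Z2 <= -beta_2) for correlated standard normals.

    Midpoint-rule integration of phi(z) * Phi((-beta_2 - rho*z) / s)
    over z <= -beta_1, with s = sqrt(1 - rho^2) and a truncated far tail.
    """
    lower, upper = -beta_1 - 10.0, -beta_1
    width = (upper - lower) / steps
    scale = math.sqrt(1.0 - rho ** 2)
    total = 0.0
    for i in range(steps):
        z = lower + (i + 0.5) * width
        total += STD_NORMAL.pdf(z) * STD_NORMAL.cdf((-beta_2 - rho * z) / scale)
    return total * width


def reliability_index(probability):
    """Generalized reliability index of an event with the given probability."""
    return -STD_NORMAL.inv_cdf(probability)


BETA_SINGLE = 2.0   # hypothetical reliability index of each component
RHO = 0.25          # hypothetical correlation between the safety margins

p_correlated = joint_failure_probability(BETA_SINGLE, BETA_SINGLE, RHO)
p_independent = joint_failure_probability(BETA_SINGLE, BETA_SINGLE, 0.0)
beta_correlated = reliability_index(p_correlated)
beta_independent = reliability_index(p_independent)

print("joint failure probability:", p_correlated, p_independent)
print("reliability index (k = 2):", beta_correlated, beta_independent)
```

With a positive correlation the joint failure is more likely than under independence, so dropping the correlation yields a larger reliability index for the two-component scenario, while the index of a single component is unchanged because it depends only on the marginal failure probability.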

If the correlations between component states are ignored, the markers shift to the right or upper right in the $\beta$-$\pi$ diagram, which changes the number and characteristics of the critical scenarios. As a result, the system's disaster resilience is significantly overestimated. Figure 11 shows the NCIMs of the bridge network for the dependent ($\lambda_H=0.01$/year) and independent ($\lambda_H=10$/year) cases. Even though the scenario $E_3$ does not belong to $D_{\lambda_H}$, the NCIM of component 3 is zero since a detour exists between the nodes bridged by the component. In other words, NCIM successfully considers the fact that the state of component 3 does not have a direct causal effect on the system-level performance.

When correlations between the component states are fully considered, the order and scale of the NCIMs of MSC-type components 1–7 are evaluated consistently for all values of $k$. In addition, components 11, 13, 14, 16, 17, and 19 have small NCIM values with consistent orders among SSC-type components. The NCIMs of the rest of the SSC-type components, i.e., 8, 9, 10, 12, 15, and 18, vary dramatically as $k$ increases and the critical scenarios change. Since the percentage of initial disruption scenarios satisfying the resilience constraint increases as $k$ increases, the NCIMs of the SSC-type components that initially had small failure probabilities tend to decrease overall.

If correlations are ignored, NCIMs show different trends in several aspects. As the scenarios are distributed more widely in the $\beta$-$\pi$ diagram, the NCIM assigns more weight to the particular components belonging to the critical scenarios. In other words, NCIMs are zero for most of the components satisfying the resilience constraints, e.g., components 7, 11, 13, 14, 16, 17, and 19. In addition, the distinction in NCIM values between components becomes more evident as $k$ increases. Specifically, component 8 becomes overwhelmingly critical, which cannot be observed in the NCIMs for the dependent case or in the other IMs.

### 5.3 Sioux Falls Bridge Network Under Multiple Seismic Hazard Scenarios.

Finally, we discuss the disaster resilience of the bridge network against multiple potential hazards. Let us consider four hazard scenarios of 7.5-magnitude earthquakes with equal occurrence rates. Their epicenter coordinates are (3, 17), (6, −3), (14, 6), and (−4, 6), termed the north, south, east, and west earthquakes, respectively. Figure 12 shows the average value of NCIMs over the four hazard scenarios in the form of stacked bars. Like NCIM, the averaged NCIM ranges between 0 and 1. It is noted that, as $k$ increases, the averaged NCIMs for the highway bridges (components 1, 2, 4, 5, and 6) decrease, while those for the downtown bridges (components 8, 9, 10, 12, 15, and 18) increase. The same trend was observed when the western earthquake alone was considered.

Table 4 lists the averaged values of six IMs against the multiple hazard scenarios. Depending on the bridge type and the topological location, the trends of averaged NCIMs and IMs vary. Components 1 and 2 are considered barely critical according to FV, RAW, RRW, and BP, while CP and ICP assign values somewhat larger than zero, similar to the averaged NCIMs. In other words, FV, RAW, RRW, and BP underestimate the importance of highway bridges. For component 3, the averaged NCIM is consistently zero under the multiple seismic hazards for all values of $k$, which is also observed in the averaged FV, RAW, RRW, and BP. Since CP and ICP embody direct correlations between the component failure and the system-level failure, the two IMs assign nonzero importance even if there exists no direct causality between them. The averaged NCIMs of components 4, 5, and 6 are moderately larger than zero, similar to CP, FV, and RRW. In contrast, ICP, RAW, and BP assign near-minimum importance to those components. Finally, among the downtown SSC-type bridges, the six IMs and the averaged NCIMs (for all values of $k$) identify components 8, 9, 10, 12, 15, and 18 as the top six components.

Bridge | CP | ICP ($\times 10^{-3}$) | FV | RAW | RRW | BP ($\times 10^{-4}$) |
---|---|---|---|---|---|---|
1$^{a}$ | 0.668 | 0.578 | 0.417 | 1.131 | 1.172 | 0.219 |
2$^{a}$ | 0.644 | 0.568 | 0.392 | 1.131 | 1.137 | 0.198 |
3$^{a}$ | 0.616 | 0.519 | 0 | 1.000 | 1.000 | 0 |
4$^{a}$ | 0.773 | 0.653 | 0.447 | 1.354 | 1.774 | 0.578 |
5$^{a}$ | 0.772 | 0.647 | 0.492 | 1.320 | 1.958 | 0.581 |
6$^{a}$ | 0.795 | 0.643 | 0.524 | 1.327 | 1.922 | 0.583 |
7$^{a}$ | 0.616 | 0.509 | 0.089 | 1.030 | 1.070 | 0.066 |
8 | 0.597 | 2.948 | 0.544 | 3.863 | 2.240 | 2.640 |
9 | 0.530 | 2.563 | 0.447 | 3.413 | 1.775 | 2.149 |
10 | 0.598 | 2.869 | 0.513 | 3.623 | 2.140 | 2.423 |
11 | 0.301 | 1.439 | 0.165 | 1.504 | 1.137 | 0.475 |
12 | 0.589 | 2.708 | 0.507 | 3.373 | 2.084 | 2.095 |
13 | 0.279 | 1.328 | 0.126 | 1.344 | 1.096 | 0.329 |
14 | 0.268 | 1.265 | 0.117 | 1.278 | 1.079 | 0.262 |
15 | 0.560 | 2.620 | 0.492 | 3.371 | 1.957 | 2.101 |
16 | 0.252 | 1.150 | 0.089 | 1.267 | 1.070 | 0.230 |
17 | 0.294 | 1.395 | 0.166 | 1.509 | 1.127 | 0.464 |
18 | 0.473 | 2.189 | 0.393 | 2.911 | 1.530 | 1.625 |
19 | 0.234 | 1.063 | 0.089 | 1.262 | 1.069 | 0.229 |

$^{a}$ MSC-type bridges located on highways.
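The statement that all six IMs single out the same six downtown bridges can be checked directly from the tabulated values. The short script below copies the SSC-type rows of Table 4 and ranks the components by each IM; the variable and function names are illustrative.

```python
# Averaged IMs of the SSC-type bridges (components 8-19), copied from Table 4.
# Column order: CP, ICP (x 1e-3), FV, RAW, RRW, BP (x 1e-4).
IM_NAMES = ["CP", "ICP", "FV", "RAW", "RRW", "BP"]
TABLE_4_SSC = {
    8:  (0.597, 2.948, 0.544, 3.863, 2.240, 2.640),
    9:  (0.530, 2.563, 0.447, 3.413, 1.775, 2.149),
    10: (0.598, 2.869, 0.513, 3.623, 2.140, 2.423),
    11: (0.301, 1.439, 0.165, 1.504, 1.137, 0.475),
    12: (0.589, 2.708, 0.507, 3.373, 2.084, 2.095),
    13: (0.279, 1.328, 0.126, 1.344, 1.096, 0.329),
    14: (0.268, 1.265, 0.117, 1.278, 1.079, 0.262),
    15: (0.560, 2.620, 0.492, 3.371, 1.957, 2.101),
    16: (0.252, 1.150, 0.089, 1.267, 1.070, 0.230),
    17: (0.294, 1.395, 0.166, 1.509, 1.127, 0.464),
    18: (0.473, 2.189, 0.393, 2.911, 1.530, 1.625),
    19: (0.234, 1.063, 0.089, 1.262, 1.069, 0.229),
}


def top_components(im_name, count=6):
    """Return the `count` SSC-type components with the largest value of an IM."""
    column = IM_NAMES.index(im_name)
    ranked = sorted(TABLE_4_SSC,
                    key=lambda comp: TABLE_4_SSC[comp][column],
                    reverse=True)
    return set(ranked[:count])


top_six = {name: top_components(name) for name in IM_NAMES}
for name, components in top_six.items():
    print(name, sorted(components))
```

Every IM returns the set {8, 9, 10, 12, 15, 18}, although the order within the set differs between measures.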

## 6 Conclusions and Future Research

This paper introduced the whole process for analyzing the disaster resilience of infrastructure networks using the S-DRA framework [30]. The essence of the S-DRA is a holistic consideration of *reliability* and *redundancy* to manage the risk of the system at a socially acceptable level. Reliability ($\beta$) and redundancy ($\pi$) indices were defined as the generalized reliability indices of the component-level likelihood of the initial disruption scenarios and the system-level failure induced by the scenarios, respectively. This paper first provided a detailed process through a hypothetical network to enable the application of the S-DRA framework to the network scale. Moreover, based on the $\beta$-$\pi$ diagram from the reliability-redundancy analysis of the network, the new normalized causality-based importance measure (NCIM) was proposed using the causal diagram describing the behaviors of infrastructure networks under the external hazards. NCIM accumulates the rate of change in the system failure probability caused by the change in the state of the component of interest over the initial disruption scenarios leading to socially unacceptable risk. The proposed NCIM was tested and discussed in detail using the hypothetical network under different occurrence rates of the hazards to find that the measure achieves a good balance between component reliability and system redundancy relying on the network topology.

A numerical example of a bridge network with an actual network topology was also provided to demonstrate the proposed S-DRA framework and compare NCIM with existing importance measures (IMs). For the $\beta$-$\pi$ analysis, the example features a seismic reliability assessment of bridges based on a ground-motion prediction equation and structural fragility models. When the correlations between the bridge states were ignored, the markers in the $\beta$-$\pi$ diagram moved to the right or upper right. In other words, the probabilities of multiple component failure events and the system-level failure event were underestimated, which leads to an overall underestimation of NCIM values. The NCIMs averaged over multiple hazard scenarios clearly distinguished critical and noncritical components among the downtown bridges, as existing IMs did. Some existing IMs, however, did not identify the zero-importance components, i.e., components whose change of state does not cause a change in the likelihood of system failure. Comparisons of the averaged NCIM with the averages of existing IMs for the highway bridges showed that the proposed NCIM achieved a balanced consideration of the topological location in the network and component reliability.

The proposed system-reliability-based analysis framework and causality-based IM are expected to facilitate various decision-making processes to achieve the target system-level risk. Although the S-DRA framework should apply to general system-level performance definitions, this paper focused on the connectivity between several significant components as the first step. Further research is underway to deal with flow-based network performance in the S-DRA framework and to extend the NCIM definition to flow-based performance using a new causal diagram. In addition, the proposed measures can be used to enhance the system-reliability-based disaster resilience of networks from a causal perspective and to identify strategies that maximize the resilience when designing the network topology or modules in the network, or when distributing repair and reinforcement resources to the components.

## Acknowledgment

This work is supported by the Korea Agency for Infrastructure Technology Advancement (KAIA) grant funded by the Ministry of Land, Infrastructure, and Transport (Grant No. RS-2021-KA163162; Funder ID: 10.13039/501100003565). The second author also acknowledges the support by the Institute of Construction and Environmental Engineering at Seoul National University.

## Data Availability Statement

The datasets generated and supporting the findings of this article are obtainable from the corresponding author upon reasonable request.