<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>A Bayesian State-Space Approach to Mapping Directional Brain Networks</title></titleStmt>
			<publicationStmt>
				<publisher></publisher>
				<date>10/02/2021</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10318828</idno>
					<idno type="doi">10.1080/01621459.2020.1865985</idno>
					<title level='j'>Journal of the American Statistical Association</title>
<idno>0162-1459</idno>
<biblScope unit="volume">116</biblScope>
<biblScope unit="issue">536</biblScope>					

					<author>Huazhang Li</author><author>Yaotian Wang</author><author>Guofen Yan</author><author>Yinge Sun</author><author>Seiji Tanabe</author><author>Chang-Chia Liu</author><author>Mark S. Quigg</author><author>Tingting Zhang</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[The human brain is a directional network system of brain regions involving directional connectivity. Seizures are a directional network phenomenon as abnormal neuronal activities start from a seizure onset zone (SOZ) and propagate to otherwise healthy regions. To localize the SOZ of an epileptic patient, clinicians use intracranial EEG (iEEG) to record the patient's intracranial brain activity in many small regions. iEEG data are high-dimensional multivariate time series. We build a state-space multivariate autoregression (SSMAR) for iEEG data to model the underlying directional brain network. To produce scientifically interpretable network results, we incorporate into the SSMAR the scientific knowledge that the underlying brain network tends to have a cluster structure. Specifically, we assign to the SS-MAR parameters a stochastic-blockmodel-motivated prior, which reflects the cluster structure. We develop a Bayesian framework to estimate the SSMAR, infer directional connections, and identify clusters for the unobserved network edges. The new method is robust to violations of model assumptions and outperforms existing network methods. By applying the new method to an epileptic patient's iEEG data, we reveal seizure initiation and propagation in the patient's directional brain network and discover a unique directional connectivity property of the SOZ. Overall, the network results obtained in this study bring new insights into epileptic patients' normal and abnormal epileptic brain mechanisms and have the potential to assist neurologists and clinicians in localizing the SOZ-a long-standing research focus in epilepsy diagnosis and treatment.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Brain activities form a directional network, where network nodes are brain regions and each network edge represents a directional influence exerted by one region on another. Such directional information flow from one region to another is referred to as directional connectivity also called effective connectivity <ref type="bibr">[1]</ref>. The purposes of this paper are to present a new statistical approach for analysis of intracranial electroencephalographic (iEEG) data and to use our approach to uncover the normal and abnormal directional brain networks of epileptic patients over the course of seizure development.</p><p>Seizures are a directional network phenomenon <ref type="bibr">[2]</ref>, as abnormal, excessive, and synchronous neuronal activities start from the seizure onset zone (SOZ) and propagate to otherwise healthy brain regions. Brain surgery to remove the SOZ is a common treatment consideration for patients with drug resistant epilepsy. Pre-surgical evaluation includes localization of the SOZ using iEEG, which is absolutely critical to the success of the surgery. Clinicians place iEEG electrodes on the exposed brain (inside the skull) of epileptic patients to record their neuronal activities in many regions. The recorded data are high-dimensional multivariate time-series of voltage waveforms, which often exceed 50 channels (with each channel corresponding to one region). Figure <ref type="figure">1(a)</ref> shows the electrode placement on the left hemisphere of a patient who underwent iEEG recordings in epilepsy evaluation. Figure <ref type="figure">1</ref>(b) illustrates 5-second segments of the patient's iEEG recordings in two regions/channels.</p><p>To localize the SOZ, trained EEG experts visually examine iEEG waveforms and designate the region that first shows abnormal epileptic activity to be the SOZ <ref type="bibr">[3]</ref>. However, despite careful planning, sometimes visual analysis of intracranial EEG fails to localize the SOZ clearly <ref type="bibr">[4]</ref>. One crucial reason is that sometimes seizure onsets consist of low amplitude, very fast activity. This activity may not generate appropriate power that can be visually detected until the seizure is well underway. Activity with greater power that can be identified may occur later, by which time seizure activity has spread beyond the actual SOZ and involves brain regions that are involved in seizure occurrence but do not serve as the electrical source. Given that seizures are a directional network phenomenon, our method for mapping directional brain networks (i.e., identifying directional connections) using iEEG data is expected to improve understanding of the brain system and localization of the SOZ. iEEG data are high-dimensional multivariate time series recordings of many small regions' neuronal activities at a high temporal resolution (millisecond scale) and spatial resolution (about 10 mm in diameter) and with a strong signal-to-noise ratio (SNR) <ref type="bibr">[5]</ref>, in contrast to popular functional magnetic resonance imaging (fMRI) with a low temporal resolution and scalp EEG with a low spatial resolution. As such, iEEG data provide valuable information about directional brain networks.</p><p>Mapping directional brain networks based on high-dimensional multivariate time series, however, faces multiple challenges. First, it is difficult to construct a model that can accurately characterize the complex mechanism of a high-dimensional brain system, i.e, how each region's activity depends on many others' activities. Second, the estimation of a high-dimensional model has a large variance. With many regions being studied and enormous possibilities in directional connections among the regions, it is challenging to identify only a few strong connections among enormous candidate ones. Though incorporating anatomic connectivity (AC) information into the directional connectivity model can improve the estimation of directional connections <ref type="bibr">[6]</ref>, AC information is not always available. Here, we consider mapping directional brain networks without relying on AC information. Simple sparsity regularization does not address the challenge because high-dimensional sparse networks have many different forms, most of which do not accurately reflect the brain's functional organization. For example, standard L 1regularized estimates <ref type="bibr">[7,</ref><ref type="bibr">8]</ref> lead to the sparse network in which every region has only a few connections with other regions. However, this sparse network is not consistent with known brain networks in which regions with similar functions tend to be closely connected <ref type="bibr">[9]</ref>. Third, the computation for analyzing high-dimensional multivariate time series data can be intensive. Existing approaches to mapping directional networks usually address only a part of these challenges, as explained below.</p><p>Network mapping approaches fall into two major categories: information-theoreticmeasure based methods and model-based methods. The former includes correlations, cross-correlations <ref type="bibr">[10,</ref><ref type="bibr">11]</ref>, cross-coherence <ref type="bibr">[12]</ref>, transfer entropy <ref type="bibr">[13]</ref>, directed transinformation <ref type="bibr">[14]</ref>, and directed information <ref type="bibr">[15]</ref>, and many others <ref type="bibr">[16,</ref><ref type="bibr">17]</ref>. Although these measures are fast to compute, they are mainly for quantifying pairwise relationship between regions and ignore system features of the brain in which each region's activity de-pends on many other regions' activities. Thus, information-measure-based approaches lack the ability to delineate the entire signal pathway of directional connections from regions to regions.</p><p>Model-based methods have been developed to describe simultaneous directional connectivity among all the recorded regions. The most popular models include dynamic causal modeling <ref type="bibr">[DCM,</ref><ref type="bibr">18]</ref> and neural mass models <ref type="bibr">[NMM,</ref><ref type="bibr">19]</ref>, which use ordinary differential equations (ODE) to characterize directional connectivity. Because of their complex mathematical formulation, the DCM and NNM are typically used for low-dimensional brain networks (consisting of only a few brain regions being studied).</p><p>To address this limitation, <ref type="bibr">[20]</ref><ref type="bibr">[21]</ref><ref type="bibr">[22]</ref> proposed to use linear ODEs to approximate highdimensional brain systems (consisting of many regions). However, parameter estimation of deterministic ODE models is sensitive to the model specification, data noise, and data-sampling frequency.</p><p>We propose to use a state-space multivariate autoregression-based (SSMAR) model for iEEG data to address the limitation of existing methods. First, the state-space framework allows for separating the model error due to the inherent model inadequacy for a complex system and the data measurement error. The SSMAR with the two errors is flexible to approximate different systems and is robust to various deviations from the assumed model. Equally importantly, the formulation of SSMAR is much simpler than ODE models, which thus, enables fast computation for high-dimensional data.</p><p>Different from standard MAR <ref type="bibr">[23]</ref><ref type="bibr">[24]</ref><ref type="bibr">[25]</ref> and SSMAR <ref type="bibr">[26,</ref><ref type="bibr">27]</ref>, our SSMAR is uniquely constructed for analyzing iEEG data to map directional brain networks. It has been widely documented <ref type="bibr">[28,</ref><ref type="bibr">29]</ref> that brain networks have a cluster structure, in which regions are more densely connected with regions in the same cluster than with regions otherwise. Our approach incorporates the cluster structure to greatly improve the model estimation. Specifically, we propose a stochastic blockmodel (SBM)-motivated prior for the SSMAR parameters, restricting the estimated network to have the cluster structure.</p><p>The SBM <ref type="bibr">[30]</ref><ref type="bibr">[31]</ref><ref type="bibr">[32]</ref><ref type="bibr">[33]</ref> is a generative model for the networks in the cluster structure. However, existing applications of the SBM <ref type="bibr">[34,</ref><ref type="bibr">35]</ref> and most cluster identification methods (also called community detection, a terminology often used in social network literature) <ref type="bibr">[36,</ref><ref type="bibr">37]</ref> are for observed networks with known edges. The proposed method addresses a more challenging problem of inferring unobserved networks based on multivariate time series measurements of network nodes' activities.</p><p>Using the SBM-motivated prior for SSMAR parameters, we develop a Bayesian framework to make inferences about the underlying network. The proposed Bayesian approach has three major advantages. First, our method improves the efficiency in identifying connected brain regions (i.e., a high true positive) and produces scientifically interpretable network results by incorporating the cluster structure into the model. Second, the proposed Bayesian framework accounts for the model error due to the model inadequacy for the complex system as well as the statistical uncertainty in identifying connected regions. Third, the simple SSMAR formulation brings the flexibility to approximate various brain systems and enables fast computation for high-dimensional multivariate time series data. As such, our approach effectively addresses the three major challenges in mapping high-dimensional brain networks.</p><p>The rest of the article is organized as follows. In Section 2, we introduce the new SSMAR model for directional brain networks with the cluster structure. We build a Bayesian hierarchical model with an SBM-motivated prior to make inferences of SS-MAR parameters and develop an efficient Markov chain Monte Carlo (MCMC) simula-tion algorithm for the ensuing posterior inference. In Section 3, we apply the developed Bayesian model to data simulated under two different model settings and network patterns and compare the ensuing results with those of existing network mapping methods.</p><p>We show that the proposed method is robust to various deviations from the assumed model and outperforms existing methods by achieving much higher accuracy in identifying connected brain regions. Section 4 presents the analysis results of real iEEG data from an epileptic patient by the new SSMAR model. We reveal the patient's directional brain network changes over the course of seizure development, uncover a unique directional connectivity property of the SOZ, and use this property to localize the SOZ.</p><p>Section 5 concludes with a discussion.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Dynamic System Models and Bayesian Inference</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">The State-Space MAR Model</head><p>Let y(t) = (y 1 (t), . . . , y d (t)) be observed iEEG measurements of d brain regions (equivalently d network nodes of the brain network under study) at time t and x(t) = (x 1 (t), . . . , x d (t)) be the underlying neuronal state functions of the d brain regions at time t for t = 1, . . . , T .</p><p>Since each iEEG electrode directly records one brain region's neuronal activity with a high spatial and temporal resolution, we propose a simple space model that links y i (t) to x i (t):</p><p>where c i is a unknown constant, and i (t) is a data measurement error with mean zero.</p><p>For the state model that describes directional connectivity among the d regions at the neuronal level, we propose to use the simplest dynamic system model, i.e., the firstorder multivariate-autoregression (MAR), for x(t):</p><p>where &#951; i (t) is the model error due to the model inadequacy in characterizing the dynamics of region i.</p><p>Our goal is to develop a parsimonious model to detect the existence of temporal dependence among neuronal activities of regions rather than building a comprehensive model that can explain all the neuronal activities. Due to the high-dimensionality and the current limited understanding of the brain system, it is extremely difficult to build such a comprehensive dynamic system model. Even though more complex models, such as high-order MARs, may fit the observed data better, they still suffer from the model inadequacy. More seriously, high-order MARs have large estimation errors because they have at least d 2 more parameters than first-order MARs. Consequently, the first-order MAR is more efficient for detecting connected regions and addresses our needs.</p><p>Under the state-space MAR, identifying connected regions and mapping the brain network are equivalent to selecting statistically significant nonzero A ij s. To distinguish nonzero directional connections from zero ones, we introduce indicators for A ij s:</p><p>where &#947; ij is an indicator, taking values either 0 or 1. We use &#947; ij s to stand for the set of indicators {&#947; ij , i, j = 1, . . . , d}. The use of indicators is similar to the "spike and slab"</p><p>prior <ref type="bibr">[38]</ref><ref type="bibr">[39]</ref><ref type="bibr">[40]</ref> in the Bayesian variable selection framework <ref type="bibr">[41]</ref><ref type="bibr">[42]</ref><ref type="bibr">[43]</ref><ref type="bibr">[44]</ref>. Under (2), identifying connected brain regions, i.e., selecting directional network edges, is equivalent to selecting nonzero &#947; ij s, which is the focus of our model estimation.</p><p>The observation model ( <ref type="formula">1</ref>) and the state model ( <ref type="formula">2</ref>) together are the proposed statespace MAR (SSMAR) for the brain's directional connectivity. Note that the first-order SSMAR is different from the first-order MAR: The former is robust to violations of model assumptions, but the latter is not. This is because the SSMAR uses two error terms, &#951; i (t)</p><p>and i (t), to accommodate the model inadequacy and measurement error separately.</p><p>We let &#951; i (t) i.i.d &#8764; N(0, 1) for several reasons. First, c i in (1) and the variance of &#951; i (t)</p><p>are not uniquely defined. Since we treat the former as unknown, we fix the latter at 1 to avoid the identifiability issue. Second, letting &#951; i (t) be independent between regions enables &#947; ij and A ij to capture the dependence between regions more efficiently than otherwise. Third, letting &#951; i (t) be independent over time brings parsimony to the model.</p><p>Again, our purpose is to detect the existence of temporal dependence between regions' iEEG rather than capturing all possible temporal dependence. Similarly, for the latter two reasons, we let i (t) i.i.d &#8764; N(0, &#964; i ). We show through simulation studies (Section 3) that our approach is robust to violations of model assumptions.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Bayesian Hierarchical Model for SSMAR</head><p>Since nonzero &#947; ij s define the brain's directional network, we impose the cluster structure on the estimated brain network through using a stochastic blockmodel (SBM)-motivated <ref type="bibr">[30]</ref><ref type="bibr">[31]</ref><ref type="bibr">[32]</ref><ref type="bibr">45]</ref> prior for &#947; ij s. The cluster structure means that regions within the same cluster connect more closely with each other than with regions in a different cluster. The cluster structure fits the brain's functional organization reported in the literature <ref type="bibr">[28,</ref><ref type="bibr">29]</ref> and is also useful in epilepsy diagnosis. For example, regions in the SOZ's cluster are those affected by the SOZ's activities most. Information about the SOZ's cluster and its changes during seizure development can help neurologists assess the effect of seizures on brain functions. In summary, developing the SBM-motivated prior for SSMAR parameters to impose the cluster structure on estimated networks is another important novelty of our approach.</p><p>Let K be the pre-specified number of clusters. Let m i = (m i1 , . . . , m iK ) be a Kdimensional vector with only one element being 1 and the rest being 0; m i labels the cluster of region i, i.e., m ik = 1 indicates region i in the kth cluster. Let</p><p>1, . . . , K, denote the prior probability of a nonzero directional connection from a region in cluster k 2 to another region in cluster k 1 . Let B be a</p><p>Prior specification for the cluster structure. The prior for the brain network with the cluster structure is a joint distribution for indicators &#947; ij s, the cluster labels m i s, and the probability matrix B as follows: </p><p>where l 0 and u 0 are given constants between 0 and 1, and &#945; = (1, . . . , 1), assigning uniform weights to different clusters. The distribution (3) specifies the probabilities of both within-cluster and between-cluster connections. For example, if m ik 1 = 1 and</p><p>is the prior probability of existing a directional connection between two regions in the same cluster k. Since within-cluster connections are dense and strong, while betweencluster connections are sparse <ref type="bibr">[46]</ref>, we let u 0 = 0.1 and l 0 = 0.9. The large difference between u 0 and l 0 facilitates differentiating within-cluster connections from betweencluster ones and identifying clusters.</p><p>The distributions (3), (4), and ( <ref type="formula">5</ref>) together define the SBM-motivated prior for &#947; ij s.</p><p>Our goal is to identify clusters and select significant edges by estimating the cluster labels for regions, m i s, and the indicators for edges, &#947; ij s.</p><p>Prior specification for A ij s. We assign a normal prior to A ij :</p><p>where &#958; 0 is a positive constant so that the density of A ij is almost flat within its domain.</p><p>Priors for other parameters. Let</p><p>and &#964; = (&#964; 1 , . . . , &#964; d ). We assign the following priors to the rest parameters:</p><p>where &#961; 0 is a pre-specified small positive constant to give an almost flat prior for &#964; and (2) with prior distributions (3), ( <ref type="formula">4</ref>), ( <ref type="formula">5</ref>), <ref type="bibr">(6)</ref>, and (7) lead to the posterior distribution:</p><p>The detailed formulation of the joint posterior distribution is provided in the Appendix.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3">EM Algorithm for Setting Initial Values and Hyperparameter</head><p>We simulate from p(X, &#920;|Y) with a partially collapsed Gibbs Sampler <ref type="bibr">[47]</ref>, whose Markov Chain Monte Carlo (MCMC) simulation steps are provided in the Appendix.</p><p>The MCMC simulation can take many iterations to converge especially for large d.</p><p>To address this issue, following the practice suggested in [Chapter 13.1, 48], we use an expectation-maximization (EM) algorithm to find the starting values for the MCMC simulation. Specifically, we optimize p(Y| &#920;) = p(Y| &#920;, X) &#8226; p(X|&#920;)dX by the EM algorithm, in which the state functions X are treated as missing values. The output of the EM algorithm, &#920; in the final step, is used as the initial value for the following 10,000 MCMC iterations. For all our simulation and real data analysis, we verified that the MCMC algorithm converged upon evaluating the Gelman-Rubin statistic <ref type="bibr">[49]</ref>.</p><p>We need to determine the value of K, the number of clusters, for the proposed Bayesian model. Standard approaches to selecting hyperparameters for Bayesian methods include information criteria and cross-validation. However, these methods are timeconsuming for large d, because they all require running the posterior simulation for each candidate K. We propose to select the value for K by the EM algorithm. Specifically, we let K = d in our EM algorithm. We set the initial values of m ii to 1 for i = 1, . . . , d, that is, we let each region form one independent cluster at the start of the EM algorithm. As the algorithm iterates, several regions fall into the same cluster, and the number of distinct clusters of the d regions becomes stable. Since the EM algorithm can find the number of clusters that leads to a locally optimal posterior, we let the K in the Bayesian model be the number of distinct clusters in the final step of the algorithm.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4">Posterior Inference</head><p>We use two posterior probabilities to map the brain network:</p><p>ij , where S is the total number of MCMC samples after burn-in. The former, called the clustering probability, is the posterior probability of two regions i and j in the same cluster; and the latter, called the network edge probability, is the posterior probability of nonzero directional connectivity from region j to i. We use P m ij , i, j = 1, . . . , d, to identify clusters. Given a threshold m , if P m ij &gt; m , regions i and j are put in the same cluster; if additionally, P m jk &gt; m , then the three regions i, j, and k are put in the same cluster regardless of the value of P m ik . We use P &#947; ij to select directional network edges. Given a threshold &#947; , if P &#947; ij &gt; &#947; , we deem the directional connection from region j to i nonzero and select the directional network edge from j to i.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Choice of thresholds.</head><p>The total numbers of potential network edges and possible network patterns are enormous for high-dimensional networks. Because of the uncertainty resulted from the high-dimensionality, posterior probabilities P m ij and P &#947; ij are all small. To address this issue, many Bayesian methods select variables based on the ranks of their posterior probabilities <ref type="bibr">[21,</ref><ref type="bibr">50]</ref>. We here propose to determine the thresholds for P m ij and P &#947; ij based on their significance/p-values under the null hypothesis that all the regions are independent from each other, as explained in detail below.</p><p>We first generate a null data set Y 0 that satisfies the null hypothesis. Specifically, given long iEEG time series before seizure onsets, we randomly sample a short segment Y 0 i = {y i (t), t = t i + 1, . . . , t i + T } of each region i and let the pairwise distance between any two regions' segments, |t i -t j |, no smaller than 2T . All the regions' segments Y 0 i , i = 1, . . . , d, form Y 0 , in which the temporal dependence of each region's time-series data points remains while the dependence between regions' time series is almost none.</p><p>Applying our Bayesian method to Y 0 , we obtain the ensuing the clustering probabilities and network edge probabilities, which form the empirical null distributions for P m ij s and P &#947; ij s, respectively. We evaluate the p-values of P m ij s and P &#947; ij s based on the null distributions and determine the thresholds for P m ij s and P &#947; ij s corresponding to the chosen p-value. We here use the p-value of 1% to ensure a low false positive rate.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">Simulation Study</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Example 1: Simulation from A Third-Order SSMAR</head><p>We simulated multivariate time-series data from the following third-order SSMAR.</p><p>The above system has three clusters of size 15, 15 and 20. We consider region j has a directional influence over i, if at least one of A 1,ij , A 2,ij , and A 3,ij is nonzero. Figure <ref type="figure">2(a)</ref> shows the simulated network pattern, where the presence of a directional connection is indicated by an edge (grey edges for within-cluster connections and purple edges for between-cluster connections).</p><p>We simulated &#951; i (t) from the model &#951;(t) = 0.5&#951;(t -1) + &#948;(t) and &#948;(t)</p><p>where &#931; 1 is a block diagonal matrix with each block corresponding to one cluster. The diagonal entries of &#931; 1 all equal 1 and off-diagonal entries in diagonal submatrices follow Uniform(0,0.5). The upper bound of off-diagonal entries is chosen such that &#931; 1 is strictly positive definite.</p><p>We generated the observation errors (t) = ( 1 (t), . . . , d (t)) from the model</p><p>where &#931; 2 is created in the same way as &#931; 1 , and D is a d-by-d diagonal matrix with the diagonal entries chosen such that the SNRs of all the time series equal 10. The median SNR of real iEEG data is much higher than 10 <ref type="bibr">[20]</ref>. As such, the simulated model errors and data errors are all spatially and temporally correlated, which violates the model assumptions of the proposed SSMAR.</p><p>Using the simulated edges as the true values, we calculated false positive rates (FPR) and true positive rates (TPR) of network edge selection based on different thresholds for P &#947; ij s. For comparison, we examined the FPRs and TPRs of popular competing methods, including the third-order MAR with L 1 regularization (implemented by using the R package BigVAR <ref type="bibr">[8]</ref>), denoted by MAR(L 1 ), partial directed coherence (PDC) <ref type="bibr">[51]</ref>, the spectrum synchronicity <ref type="bibr">[52]</ref>, and graphical lasso (Glasso) <ref type="bibr">[53,</ref><ref type="bibr">54]</ref>. Figure <ref type="figure">2</ref>(c) shows the estimated network pattern using the thresholds corresponding to 1% p-value for P m ij and P &#947; ij . The proposed method was able to identify three clusters. For detecting the directional connections among the 50 regions, the overall TPR and FPR are 0.84 and 0.02. More specifically, the TPR and FPR are 0.95 and 0 for within-cluster connections and 0.45 and 0.02 for between-cluster connections. The comparably low TPR for selecting between-cluster connections is due to several reasons. First, since the clustering is subjective, our selection of directional network edges based on P &#947; ij does not account for the identified clusters. As within-cluster connections (accounting for 32.6% of all candidate connections) are much denser than between-cluster connections (9.0% of all candidate connections), network edge selection is more towards selecting withincluster connections, so that the overall network edge selection accuracy is high. Second, the number of candidate between-cluster connections is enormous and even more than the total number of true network edges. As such, the true between-cluster connections are highly sparse and more difficult to identify than within-cluster connections. Third, since the number of null connections is large, we used a high threshold for P &#947; ij to avoid many false selections, which also leads to a low TPR for selecting between-cluster connections. Overall, the proposed method outperformed existing methods by achieving a higher TPR and an almost zero FPR.</p><p>In summary, this simulation demonstrates the robustness of our SSMAR to violations of model assumptions and its efficiency in identifying connected regions and clusters.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Example 2: Simulation from the Dynamic Causal Modeling</head><p>We simulated time series from a 50-dimension dynamic system given by the dynamic causal modeling (DCM) <ref type="bibr">[18]</ref>, the most popular ODE-based model for the brain's directional connectivity. The DCM is for low-dimensional brain networks. We expanded its state model to be high-dimensional and the same as that of the sparse regression-DCM (srDCM) <ref type="bibr">[55]</ref>, an extension of the DCM for high-dimensional brain networks. We used this high-dimensional state model to generate x(t) of 50 regions. Then we simulated y(t) based on the observation model of the DCM, which describes the transformation of neuronal activity x(t) into observed y(t). The signal-to-noise was set to be 1, which was considered small in the literature <ref type="bibr">[55]</ref>. Figure <ref type="figure">3(a)</ref> shows the simulated directional network among 50 regions.</p><p>We applied the proposed BSBM to simulated y(t) with 2714 time points, which were identical to those of the simulated data under the srDCM <ref type="bibr">[55]</ref>. We also applied the BSBM to down-sampled data with 1000 time points. Figures <ref type="figure">3(b</ref>) and 3(c) show the ROC curves of the BSBM and other competing methods for the data of two frequencies.</p><p>We also analyzed the simulated data by the srDCM. Though the proposed model is distinct from the DCM and srDCM, our method was robust to model specification, data noise, and data-sampling frequency and outperformed existing methods by achieving the largest area under the ROC curve. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Real iEEG Data Analysis</head><p>We applied the proposed method to iEEG data of an epileptic patient, who had 64 electrodes placed on the exposed surface of his brain, as shown in Figure <ref type="figure">1</ref>(a). iEEG recorded the patient's brain activities in 3 seizures. The sampling rate of this patient's iEEG data was 4000 Hz. We down-sampled the iEEG data to 1000 Hz, a typical rate used in the literature <ref type="bibr">[20,</ref><ref type="bibr">56]</ref>. EEG experts manually examined the data and determined seizure onset times and the SOZ, which was G37. A responsive neurostimulation system was later implanted in his brain with a lead placed on G37. The use of RNS has significantly reduced his seizure occurrences. This confirms that the SOZ was accurately located. In our analysis, we treated seizure onset times as given, since the detection of seizure onset time is not difficult. However, we did not use the location information of the SOZ when mapping the directional brain network among recorded brain sites. The SOZ was treated as unknown and equally as other brain sites. As such, we could validate our network results against the location information of the SOZ.</p><p>Channels 63 and 64, as the reference electrodes, were removed from the analysis. We evaluated connectivity among the rest 62 regions. To minimize the residual artifacts of 60 Hz electrical noise, we used a 60 Hz notch filter during the primary recording and removed the first principal component through the principal component analysis.</p><p>Once a seizure starts, the connection strength between the SOZ and other regions increases <ref type="bibr">[57]</ref>, resulting in abnormally synchronized or excessive neuronal activities in other regions <ref type="bibr">[58]</ref>. Thus, an effective brain network mapping methods should reveal different brain networks before and after seizure onset: More regions are expected to be affected by the activities from the SOZ after the seizure onset. We applied our method to map brain networks in the periods around the seizure onset time and examined the effectiveness of our method in revealing different brain networks before and after seizure onset. We focused on four time periods: 26 to 50 seconds before seizure onset, 1 to 25 seconds before seizure onset, 1 to 25 seconds after seizure onset, and 26 to 50 seconds after seizure onset. To ensure effective approximation of the underlying complex brain system by the SSMAR and also to accommodate potential variation of brain activities over time, we applied the developed method to each 1-second iEEG segment (containing 1000 time series measurements) independently. In total, we analyzed 300 1-second iEEG data segments (4 periods &#215; 25 seconds &#215; 3 seizures).</p><p>For each 1-second data segment and for each pair of regions i and j, we obtained their clustering probability P m ij and network edge probabilities P &#947; ij and P &#947; ji . For each seizure period, we took average of posterior probabilities in 75 segments and denoted the ensuing average posterior probabilities by P m ij , P &#947; ij and P &#947; ji . We identified clusters and connected brain regions and mapped brain networks for four seizure periods based on these average probabilities. This analysis is consistent with the medical practice where reliable epilepsy diagnosis is based on combined information of iEEG recordings of at least 3 seizures <ref type="bibr">[59]</ref>. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1">Network Results</head><p>Figures 4(a)-4(d) show estimated networks for the four periods using the thresholds corresponding to the p-value of 1%. The SOZ is at G37, indicated by the diamond in all these four figures, while all the other regions are indicated by circles. The shown network edges (in grey or purple) indicate their network edge probabilities above the threshold; and the nodes indicated by the same color other than light blue are corresponding to the regions identified to be in the same cluster. Each region indicated by light blue forms one cluster that contains the region itself only.</p><p>Our method reveals that the networks for the two pre-seizure periods were similar (Figures <ref type="figure">4(a</ref>) and 4(b)), indicating that the subject's brain network was steady before seizure onset. However, dramatic changes occurred in the networks once seizure started (Figures <ref type="figure">4(c</ref>) and 4(d)). Compared to the pre-seizure networks, more regions were connected to the SOZ (G37) and fell into the same cluster as the SOZ, indicating that the activity of the SOZ affected more and more regions as seizure developed. This result is in line with the existing understanding of seizure propagation <ref type="bibr">[2,</ref><ref type="bibr">57]</ref>.</p><p>To demonstrate the advantages of our method, we also analyzed the same iEEG data using several competing methods, including correlation, cross-correlation, partial directed coherence (PDC) <ref type="bibr">[51]</ref>, directed transfer function (DTF) <ref type="bibr">[60]</ref>, L 1 -penalized MAR (MAR(L 1 )), and graphical lasso (Glasso) <ref type="bibr">[53,</ref><ref type="bibr">54]</ref>. We used each of these methods to analyze 300 1-second segments independently and obtained 300 calculated values for each candidate network edge (either directional or undirectional depending on the method).</p><p>For each candidate network edge, we used the average of 75 values in each period to quantify the strength of connection. For comparison, we selected network edges with top 5% averages, because the network edges selected by our method based on the pvalue of 1% roughly correspond to the edges with top 5% P &#947; ij s.   </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">SOZ Localization</head><p>We hypothesize that the SOZ exhibits a significant change in its connectivity to other regions at the seizure onset. To quantify this change, we developed the following method.</p><p>For each period, for each region, say j, we calculated the average of network edge probabilities from j to all the other regions, i P &#947; ij /d, referred to as region j's average directional connectivity (ADC) in the period. We use the ADC difference between the periods right after and before the seizure onset to quantify the change in directional connectivity from region j to other regions. Figure <ref type="figure">6</ref> shows the ADC changes of 62 regions at the seizure onset. Except for one region, the SOZ and its neighboring regions have the highest increases in ADC. We propose to select the regions with high ADC increases to be candidates for SOZ.</p><p>To determine the threshold for ADCs, we calculated the 62 regions' ADC changes in the first two pre-seizure periods for the 3 seizures recorded by iEEG. Then we selected the regions whose ADC changes at the seizure onset are larger than the maximum of ADC changes in the two pre-seizure periods. Figure <ref type="figure">6</ref>(b) shows the selected regions (in red).</p><p>Our result showed that the small brain area including the SOZ G37 has the highest increase in directional connectivity at the seizure onset. This result is in line with the existing literature about the SOZ <ref type="bibr">[57]</ref>: the abnormal, excessive neuronal activity starts from it and spreads to other regions. Our method quantified brain network changes and uncovered that the brain area including the SOZ first demonstrated an increase in directional connectivity during the seizure development.</p><p>In summary, with our method, we revealed three characteristics of the epileptic pa- </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">Discussion</head><p>This paper develops a new high-dimensional dynamic system model for mapping directional brain networks using iEEG data. The proposed approach has three novelties.</p><p>First, we propose a state-space first-order MAR-based model for the brain network. This model is effective for approximating various high-dimensional brain systems and is ro-bust to violations of model assumptions. Second, in contrast to standard SSMAR and MAR models, the proposed Bayesian framework incorporates the prior knowledge of the cluster structure into the model estimation, which addresses the challenge in detecting connected brain regions among many possible ones. Our method produces scientifically meaningful network results. Third, we develop a stochastic-blockmodel (SBM)motivated prior to impose the cluster structure on the SSMAR parameters that denote directional edges. This is novel from standard SBMs for observed networks where network edges are directly known.</p><p>The proposed method can robustly detect directional connections with high accuracy, even if the underlying model for the brain network is nonlinear for three reasons.</p><p>First, we apply the SSMAR to short iEEG time segments so that the linear MAR can effectively approximate the underlying network system. Second, we use the proposed model to identify the directional connections through detecting the existence of temporal dependence among neuronal activities of regions rather than estimating the nonlinear interactions among regions. The first-order SSMAR focuses only on the primary temporal dependence (rather than the exact order or nonlinearity of the dependence) among multivariate time series. Thus, the model is parsimonious in terms of the number of model parameters and enables efficient detection of directional connections among many regions. Third, the SBM-motivated prior can effectively capture potential brain network patterns. Using the SBM-motivated prior increases the efficiency in detecting directional connections. In summary, the proposed integration of a conventional SSMAR and the cluster structure yields robustness, flexibility, efficiency, and computational feasibility in modeling and estimating brain network systems.</p><p>The obtained network results from iEEG data analysis by the proposed method re-veal how seizures propagate from the SOZ to other regions and thus, bring new insights into the brain's normal and abnormal mechanisms and functions and dysfunctions. By assuming the cluster structure for the brain network, we identify the SOZ's cluster (the regions affected strongly by the SOZ's activity) and its changes during seizure development. Such information can help neurologists assess the effect of seizures on brain functions. Moreover, by revealing the unique connectivity property of the SOZ, our network analysis can improve SOZ localization in clinical treatment for epilepsy.</p><p>We have applied statistical methods used for localizing the SOZ based on EEG data to our iEEG data. Specifically, <ref type="bibr">[61]</ref> developed frequency specific methods to localize the SOZ through detecting changes in EEG data; and [62] used the differences in persistent homology between EEG data in pre-seizure and seizure periods to localize the SOZ.</p><p>However, these methods tend to have much higher FPRs than the proposed method most likely because EEG and iEEG data have different properties. The two methods <ref type="bibr">[61,</ref><ref type="bibr">62]</ref> require the time series before and after seizures to be stationary for a relatively long period. Since the regions recorded by EEG are large and spatially distant from each other, the changes in one EEG region take a relatively long time to spread to other regions. As such, the assumption of stationary long time series required by the two methods can be satisfied with EEG data. In contrast, regions recorded by iEEG are spatially close. Seizures propagate from the SOZ to other regions quickly, and thus, many regions surrounding the SOZ can have sharp changes in frequencies and persistent homology in a short period of time. This phenomenon makes it difficult for the methods that rely on relatively long stationary time series to separate the SOZ from surrounding regions. Because our method is focused on detecting the change in directional connectivity instead of the change in time series, our method can better exclude non-SOZ regions whose</p></div></body>
		</text>
</TEI>
