Title: Forward variable selection enables fast and accurate dynamic system identification with Karhunen-Loève decomposed Gaussian processes
A promising approach for scalable Gaussian processes (GPs) is the Karhunen-Loève (KL) decomposition, in which the GP kernel is represented by a set of basis functions that are the eigenfunctions of the kernel operator. Such decomposed kernels have the potential to be very fast, and do not depend on the selection of a reduced set of inducing points. However, KL decompositions lead to high dimensionality, and variable selection thus becomes paramount. This paper reports a new method of forward variable selection, enabled by the ordered nature of the basis functions in the KL expansion of the Bayesian Smoothing Spline ANOVA kernel (BSS-ANOVA), coupled with fast Gibbs sampling in a fully Bayesian approach. It quickly and effectively limits the number of terms, yielding a method with competitive accuracies, training times, and inference times for tabular datasets of low feature set dimensionality. Theoretical computational complexities are $O(NP^2)$ in training and $O(P)$ per point in inference, where $N$ is the number of instances and $P$ the number of expansion terms. The inference speed and accuracy make the method especially useful for dynamic systems identification, by modeling the dynamics in the tangent space as a static problem, then integrating the learned dynamics using a high-order scheme. The methods are demonstrated on two dynamic datasets: a ‘Susceptible, Infected, Recovered’ (SIR) toy problem, along with the experimental ‘Cascaded Tanks’ benchmark dataset. Comparisons on the static prediction of time derivatives are made with a random forest (RF), a residual neural network (ResNet), and the Orthogonal Additive Kernel (OAK) inducing points scalable GP, while for the timeseries prediction comparisons are made with LSTM and GRU recurrent neural networks (RNNs) along with the SINDy package.
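The workflow named in the abstract (learn the time derivatives as a static regression problem, then integrate the learned dynamics with a high-order scheme) can be sketched on an SIR toy problem. This is only an illustration under stated assumptions, not the paper's method: an ordinary least-squares fit on quadratic polynomial features stands in for the BSS-ANOVA GP, classical fourth-order Runge-Kutta is assumed as the high-order integrator, derivatives are estimated by central differences, and the rates `beta` and `gamma`, the step `dt`, and all initial conditions are hypothetical values chosen for the example.

```python
import numpy as np

BETA, GAMMA = 0.5, 0.1  # hypothetical SIR rates, not taken from the paper

def sir_rhs(x):
    """True SIR dynamics on the reduced state (s, i); r = 1 - s - i."""
    s, i = x
    return np.array([-BETA * s * i, BETA * s * i - GAMMA * i])

def rk4_step(f, x, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def simulate(f, x0, dt, n_steps):
    """Integrate dx/dt = f(x) from x0 and return the (n_steps + 1, 2) trajectory."""
    xs = [np.asarray(x0, dtype=float)]
    for _ in range(n_steps):
        xs.append(rk4_step(f, xs[-1], dt))
    return np.array(xs)

def features(x):
    """Quadratic polynomial features (stand-in for a learned basis expansion)."""
    x = np.atleast_2d(x)
    s, i = x[:, 0], x[:, 1]
    return np.column_stack([np.ones(len(x)), s, i, s * s, s * i, i * i])

dt, n_steps = 0.05, 1000

# Training data: states and central-difference derivative estimates
# collected from several trajectories with different initial conditions.
train_ics = [(0.99, 0.01), (0.95, 0.05), (0.90, 0.10), (0.80, 0.20), (0.60, 0.10)]
states, derivs = [], []
for ic in train_ics:
    traj = simulate(sir_rhs, ic, dt, n_steps)
    states.append(traj[1:-1])
    derivs.append((traj[2:] - traj[:-2]) / (2.0 * dt))
X, dXdt = np.vstack(states), np.vstack(derivs)

# Static regression: map state -> time derivative.
coef, *_ = np.linalg.lstsq(features(X), dXdt, rcond=None)

def learned_rhs(x):
    """Learned dynamics: features of the state times the fitted coefficients."""
    return (features(x) @ coef)[0]

# Roll out the learned dynamics from a held-out initial condition.
test_ic = (0.97, 0.03)
truth = simulate(sir_rhs, test_ic, dt, n_steps)
pred = simulate(learned_rhs, test_ic, dt, n_steps)
max_err = np.max(np.abs(pred - truth))
print(f"max abs rollout error: {max_err:.2e}")
```

Because the regressor is evaluated four times per integration step, per-point inference cost drives the speed of the rollout, which is the reason the abstract gives for fast inference being useful here.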
Award ID(s):
2119688
PAR ID:
10627914
Author(s) / Creator(s):
; ; ;
Editor(s):
Zhou, Yu
Publisher / Repository:
PLOS ONE
Date Published:
Journal Name:
PLOS ONE
Volume:
19
Issue:
9
ISSN:
1932-6203
Page Range / eLocation ID:
e0309661
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Hemati, Sara (Ed.)
    The application of 222 nm light from KrCl excimer lamps (GUV222 or far-UVC) is a promising approach to reduce the indoor transmission of airborne pathogens, including the SARS-CoV-2 virus. GUV222 inactivates airborne pathogens and is believed to be relatively safe for human skin and eye exposure. However, UV light initiates photochemical reactions which may negatively impact indoor air quality. We conducted a series of experiments to assess the formation of ozone (O₃), and resulting formation of secondary organic aerosols (SOA), induced by commercial far-UVC devices in an office environment (small conference room) with an air exchange rate of 1.3 h⁻¹. We studied scenarios with a single far-UVC lamp, corresponding to the manufacturer’s recommendations for disinfection of a space that size, and with four far-UVC lamps, to test conditions of greater far-UVC fluence. The single lamp did not significantly impact O₃ or fine particulate matter levels in the room. Consistent with previous studies in the literature, the higher far-UVC fluences lead to increases in O₃ of 5 to 10 ppb above background, and minor increases in particulate matter (16% ± 10% increase in particle number count). The use of far-UVC at minimum intensities required for disinfection, and in conjunction with adequate ventilation rates (e.g. ANSI/ASHRAE recommendations), may allow the reduction of airborne pathogen levels while minimizing the formation of air pollutants in furnished indoor environments.
  2. Dunbrack, Roland L (Ed.)
    Chromatin is a polymer complex of DNA and proteins that regulates gene expression. The three-dimensional (3D) structure and organization of chromatin controls DNA transcription and replication. High-throughput chromatin conformation capture techniques generate Hi-C maps that can provide insight into the 3D structure of chromatin. Hi-C maps can be represented as a symmetric matrix $A_{ij}$, where each element represents the average contact probability or number of contacts between chromatin loci $i$ and $j$. Previous studies have detected topologically associating domains (TADs), or self-interacting regions in $A_{ij}$ within which the contact probability is greater than that outside the region. Many algorithms have been developed to identify TADs within Hi-C maps. However, most TAD identification algorithms are unable to identify nested or overlapping TADs and for a given Hi-C map there is significant variation in the location and number of TADs identified by different methods. We develop a novel method to identify TADs, KerTAD, using a kernel-based technique from computer vision and image processing that is able to accurately identify nested and overlapping TADs. We benchmark this method against state-of-the-art TAD identification methods on both synthetic and experimental data sets. We find that the new method consistently has higher true positive rates (TPR) and lower false discovery rates (FDR) than all tested methods for both synthetic and manually annotated experimental Hi-C maps. The TPR for KerTAD is also largely insensitive to increasing noise and sparsity, in contrast to the other methods. We also find that KerTAD is consistent in the number and size of TADs identified across replicate experimental Hi-C maps for several organisms. Thus, KerTAD will improve automated TAD identification and enable researchers to better correlate changes in TADs to biological phenomena, such as enhancer-promoter interactions and disease states.
  3. Abstract We extend the Calderón–Zygmund theory for nonlocal equations to a strongly coupled system of linear nonlocal equations $\mathcal{L}^{s}_{A}u=f$, where the operator $\mathcal{L}^{s}_{A}$ is formally given by $$\mathcal{L}^{s}_{A}u=\int_{\mathbb{R}^{n}}\frac{A(x,y)}{|x-y|^{n+2s}}\frac{(x-y)\otimes(x-y)}{|x-y|^{2}}(u(x)-u(y))\,dy.$$ For $0<s<1$ and $A:\mathbb{R}^{n}\times\mathbb{R}^{n}\to\mathbb{R}$ taken to be symmetric and serving as a variable coefficient for the operator, the system under consideration is the fractional version of the classical Navier–Lamé linearized elasticity system. The study of the coupled system of nonlocal equations is motivated by its appearance in nonlocal mechanics, primarily in peridynamics. Our regularity result states that if $A(\,\cdot\,,y)$ is uniformly Hölder continuous and $\inf_{x\in\mathbb{R}^{n}}A(x,x)>0$, then for $f\in L^{p}_{\rm loc}$, for $p\geq 2$, the solution vector $u\in H^{2s-\delta,p}_{\rm loc}$ for some $\delta\in(0,s)$.
  4. Abstract Let $(h_I)$ denote the standard Haar system on $[0,1]$, indexed by $I\in\mathcal{D}$, the set of dyadic intervals, and $h_I\otimes h_J$ denote the tensor product $(s,t)\mapsto h_I(s)h_J(t)$, $I,J\in\mathcal{D}$. We consider a class of two-parameter function spaces which are completions of the linear span $\mathcal{V}(\delta^2)$ of $h_I\otimes h_J$, $I,J\in\mathcal{D}$. This class contains all the spaces of the form $X(Y)$, where $X$ and $Y$ are either the Lebesgue spaces $L^p[0,1]$ or the Hardy spaces $H^p[0,1]$, $1\le p<\infty$. We say that $D:X(Y)\rightarrow X(Y)$ is a Haar multiplier if $D(h_I\otimes h_J)=d_{I,J}h_I\otimes h_J$, where $d_{I,J}\in\mathbb{R}$, and ask which more elementary operators factor through $D$. A decisive role is played by the Capon projection $\mathcal{C}:\mathcal{V}(\delta^2)\rightarrow\mathcal{V}(\delta^2)$ given by $\mathcal{C}h_I\otimes h_J=h_I\otimes h_J$ if $|I|\le|J|$, and $\mathcal{C}h_I\otimes h_J=0$ if $|I|>|J|$, as our main result highlights: Given any bounded Haar multiplier $D:X(Y)\rightarrow X(Y)$, there exist $\lambda,\mu\in\mathbb{R}$ such that $$\lambda\mathcal{C}+\mu(\mathrm{Id}-\mathcal{C})\text{ approximately 1-projectionally factors through }D,$$ i.e., for all $\eta>0$, there exist bounded operators $A$, $B$ so that $AB$ is the identity operator $\mathrm{Id}$, $\Vert A\Vert\cdot\Vert B\Vert=1$ and $\Vert\lambda\mathcal{C}+\mu(\mathrm{Id}-\mathcal{C})-ADB\Vert<\eta$. Additionally, if $\mathcal{C}$ is unbounded on $X(Y)$, then $\lambda=\mu$ and then $\mathrm{Id}$ either factors through $D$ or $\mathrm{Id}-D$.
  5. Abstract Let $X$ be a compact orientable non-Haken 3-manifold modeled on the Thurston geometry $\operatorname{Nil}$. We show that the diffeomorphism group $\operatorname{Diff}(X)$ deformation retracts to the isometry group $\operatorname{Isom}(X)$. Combining this with earlier work by many authors, this completes the determination of the homotopy type of $\operatorname{Diff}(X)$ for any compact, orientable, prime 3-manifold $X$.