Title: Identifying topologically associating domains using differential kernels
Chromatin is a polymer complex of DNA and proteins that regulates gene expression. The three-dimensional (3D) structure and organization of chromatin control DNA transcription and replication. High-throughput chromatin conformation capture techniques generate Hi-C maps that can provide insight into the 3D structure of chromatin. Hi-C maps can be represented as a symmetric matrix $A_{ij}$, where each element represents the average contact probability or number of contacts between chromatin loci $i$ and $j$. Previous studies have detected topologically associating domains (TADs), or self-interacting regions in $A_{ij}$ within which the contact probability is greater than that outside the region. Many algorithms have been developed to identify TADs within Hi-C maps. However, most TAD identification algorithms are unable to identify nested or overlapping TADs, and for a given Hi-C map there is significant variation in the location and number of TADs identified by different methods. We develop a novel method to identify TADs, KerTAD, using a kernel-based technique from computer vision and image processing that is able to accurately identify nested and overlapping TADs. We benchmark this method against state-of-the-art TAD identification methods on both synthetic and experimental data sets. We find that the new method consistently has higher true positive rates (TPR) and lower false discovery rates (FDR) than all tested methods for both synthetic and manually annotated experimental Hi-C maps. The TPR for KerTAD is also largely insensitive to increasing noise and sparsity, in contrast to the other methods. We also find that KerTAD is consistent in the number and size of TADs identified across replicate experimental Hi-C maps for several organisms. Thus, KerTAD will improve automated TAD identification and enable researchers to better correlate changes in TADs to biological phenomena, such as enhancer-promoter interactions and disease states.
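To make the benchmarking quantities concrete, the following is a minimal sketch in Python and not KerTAD itself. It builds a toy symmetric contact matrix with nested block-shaped TADs and scores a set of predicted TADs against the known ones using the standard definitions TPR = TP / (TP + FN) and FDR = FP / (TP + FP). The function names `synthetic_hic` and `tpr_fdr`, the half-open `(start, end)` interval convention, the contact values, and the boundary tolerance `tol` are all assumptions made for illustration; the abstract does not specify how matches are counted.

```python
import numpy as np


def synthetic_hic(n, tads, inside=1.0, outside=0.1):
    """Toy symmetric n x n contact map (illustrative, not the paper's generator).

    Each TAD is a half-open interval (start, end); contacts inside a TAD are
    raised by `inside`, so nested TADs stack on top of their enclosing TAD.
    """
    a = np.full((n, n), outside, dtype=float)
    for start, end in tads:
        a[start:end, start:end] += inside
    return a


def tpr_fdr(predicted, truth, tol=0):
    """Return (TPR, FDR) for predicted TADs against ground-truth TADs.

    Assumption: a prediction matches a true TAD when both boundaries agree
    to within `tol` bins.
    """
    def matches(p, t):
        return abs(p[0] - t[0]) <= tol and abs(p[1] - t[1]) <= tol

    # True TADs that were recovered by at least one prediction.
    found = sum(any(matches(p, t) for p in predicted) for t in truth)
    # Predictions that match no true TAD.
    false_pos = sum(not any(matches(p, t) for t in truth) for p in predicted)

    tpr = found / len(truth) if truth else 0.0
    fdr = false_pos / len(predicted) if predicted else 0.0
    return tpr, fdr


truth = [(0, 10), (2, 6), (10, 20)]        # (2, 6) is nested inside (0, 10)
predicted = [(0, 10), (10, 20), (12, 15)]  # misses the nested TAD, adds a spurious one
hic = synthetic_hic(20, truth)
tpr, fdr = tpr_fdr(predicted, truth)
print(f"TPR = {tpr:.3f}, FDR = {fdr:.3f}")
```

In this toy case the nested TAD is missed and one spurious TAD is reported, so two of three true TADs are recovered and one of three predictions is false.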
Award ID(s):
2021988
PAR ID:
10549201
Author(s) / Creator(s):
; ; ; ; ;
Editor(s):
Dunbrack, Roland L
Publisher / Repository:
PLOS
Date Published:
Journal Name:
PLOS Computational Biology
Volume:
20
Issue:
7
ISSN:
1553-7358
Page Range / eLocation ID:
e1012221
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Hemati, Sara (Ed.)
    The application of 222 nm light from KrCl excimer lamps (GUV222 or far-UVC) is a promising approach to reduce the indoor transmission of airborne pathogens, including the SARS-CoV-2 virus. GUV222 inactivates airborne pathogens and is believed to be relatively safe for human skin and eye exposure. However, UV light initiates photochemical reactions which may negatively impact indoor air quality. We conducted a series of experiments to assess the formation of ozone ($\mathrm{O_3}$), and the resulting formation of secondary organic aerosols (SOA), induced by commercial far-UVC devices in an office environment (small conference room) with an air exchange rate of 1.3 $\mathrm{h}^{-1}$. We studied scenarios with a single far-UVC lamp, corresponding to the manufacturer’s recommendations for disinfection of a space that size, and with four far-UVC lamps, to test conditions of greater far-UVC fluence. The single lamp did not significantly impact $\mathrm{O_3}$ or fine particulate matter levels in the room. Consistent with previous studies in the literature, the higher far-UVC fluences led to increases in $\mathrm{O_3}$ of 5 to 10 ppb above background, and minor increases in particulate matter (a 16% ± 10% increase in particle number count). The use of far-UVC at minimum intensities required for disinfection, and in conjunction with adequate ventilation rates (e.g. ANSI/ASHRAE recommendations), may allow the reduction of airborne pathogen levels while minimizing the formation of air pollutants in furnished indoor environments.
  2. Zhou, Yu (Ed.)
    A promising approach for scalable Gaussian processes (GPs) is the Karhunen-Loève (KL) decomposition, in which the GP kernel is represented by a set of basis functions which are the eigenfunctions of the kernel operator. Such decomposed kernels have the potential to be very fast, and do not depend on the selection of a reduced set of inducing points. However, KL decompositions lead to high dimensionality, and variable selection thus becomes paramount. This paper reports a new method of forward variable selection, enabled by the ordered nature of the basis functions in the KL expansion of the Bayesian Smoothing Spline ANOVA kernel (BSS-ANOVA), coupled with fast Gibbs sampling in a fully Bayesian approach. It quickly and effectively limits the number of terms, yielding a method with competitive accuracies, training and inference times for tabular datasets of low feature set dimensionality. Theoretical computational complexities are $O(NP^2)$ in training and $O(P)$ per point in inference, where $N$ is the number of instances and $P$ the number of expansion terms. The inference speed and accuracy make the method especially useful for dynamic systems identification, by modeling the dynamics in the tangent space as a static problem, then integrating the learned dynamics using a high-order scheme. The methods are demonstrated on two dynamic datasets: a ‘Susceptible, Infected, Recovered’ (SIR) toy problem, along with the experimental ‘Cascaded Tanks’ benchmark dataset. Comparisons on the static prediction of time derivatives are made with a random forest (RF), a residual neural network (ResNet), and the Orthogonal Additive Kernel (OAK) inducing points scalable GP, while for the timeseries prediction comparisons are made with LSTM and GRU recurrent neural networks (RNNs) along with the SINDy package.
  3. Abstract: Let $(h_I)$ denote the standard Haar system on $[0,1]$, indexed by $I\in \mathcal{D}$, the set of dyadic intervals, and let $h_I\otimes h_J$ denote the tensor product $(s,t)\mapsto h_I(s) h_J(t)$, $I,J\in \mathcal{D}$. We consider a class of two-parameter function spaces which are completions of the linear span $\mathcal{V}(\delta^2)$ of $h_I\otimes h_J$, $I,J\in \mathcal{D}$. This class contains all the spaces of the form $X(Y)$, where $X$ and $Y$ are either the Lebesgue spaces $L^p[0,1]$ or the Hardy spaces $H^p[0,1]$, $1\le p < \infty$. We say that $D:X(Y)\rightarrow X(Y)$ is a Haar multiplier if $D(h_I\otimes h_J) = d_{I,J}\, h_I\otimes h_J$, where $d_{I,J}\in \mathbb{R}$, and ask which more elementary operators factor through $D$. A decisive role is played by the Capon projection $\mathcal{C}:\mathcal{V}(\delta^2)\rightarrow \mathcal{V}(\delta^2)$ given by $\mathcal{C} h_I\otimes h_J = h_I\otimes h_J$ if $|I|\le |J|$, and $\mathcal{C} h_I\otimes h_J = 0$ if $|I| > |J|$, as our main result highlights: Given any bounded Haar multiplier $D:X(Y)\rightarrow X(Y)$, there exist $\lambda ,\mu \in \mathbb{R}$ such that $\lambda \mathcal{C} + \mu (\mathrm{Id}-\mathcal{C})$ approximately 1-projectionally factors through $D$, i.e., for all $\eta > 0$, there exist bounded operators $A$, $B$ so that $AB$ is the identity operator $\mathrm{Id}$, $\Vert A\Vert \cdot \Vert B\Vert = 1$ and $\Vert \lambda \mathcal{C} + \mu (\mathrm{Id}-\mathcal{C}) - ADB\Vert < \eta$. Additionally, if $\mathcal{C}$ is unbounded on $X(Y)$, then $\lambda = \mu$ and then $\mathrm{Id}$ either factors through $D$ or $\mathrm{Id}-D$.
  4. Abstract: We study the family of irreducible modules for quantum affine $\mathfrak{sl}_{n+1}$ whose Drinfeld polynomials are supported on just one node of the Dynkin diagram. We identify all the prime modules in this family and prove a unique factorization theorem. The Drinfeld polynomials of the prime modules encode information coming from the points of reducibility of tensor products of the fundamental modules associated to $A_{m}$ with $m\leq n$. These prime modules are a special class of the snake modules studied by Mukhin and Young. We relate our modules to the work of Hernandez and Leclerc and define generalizations of the category $\mathscr{C}^{-}$. This leads naturally to the notion of an inflation of the corresponding Grothendieck ring. In the last section we show that the tensor product of a (higher order) Kirillov–Reshetikhin module with its dual always contains an imaginary module in its Jordan–Hölder series and give an explicit formula for its Drinfeld polynomial. Together with the results of [D. Hernandez and B. Leclerc, A cluster algebra approach to q-characters of Kirillov–Reshetikhin modules, J. Eur. Math. Soc. (JEMS) 18 (2016), no. 5, 1113–1159] this gives examples of a product of cluster variables which are not in the span of cluster monomials. We also discuss the connection of our work with the examples arising from the work of [E. Lapid and A. Mínguez, Geometric conditions for $\square$-irreducibility of certain representations of the general linear group over a non-archimedean local field, Adv. Math. 339 (2018), 113–190]. Finally, we use our methods to give a family of imaginary modules in type $D_{4}$ which do not arise from an embedding of $A_{r}$ with $r\leq 3$ in $D_{4}$.
  5. Abstract: For a smooth projective variety $X$ over an algebraic number field $k$, a conjecture of Bloch and Beilinson predicts that the kernel of the Albanese map of $X$ is a torsion group. In this article we consider a product $X=C_1\times \cdots \times C_d$ of smooth projective curves and show that if the conjecture is true for any subproduct of two curves, then it is true for $X$. For a product $X=C_1\times C_2$ of two curves over $\mathbb{Q}$ with positive genus we construct many nontrivial examples that satisfy the weaker property that the image of the natural map $J_1(\mathbb{Q})\otimes J_2(\mathbb{Q})\xrightarrow{\varepsilon}\mathrm{CH}_0(C_1\times C_2)$ is finite, where $J_i$ is the Jacobian variety of $C_i$. Our constructions include many new examples of non-isogenous pairs of elliptic curves $E_1, E_2$ with positive rank, including the first known examples of rank greater than 1. Combining these constructions with our previous result, we obtain infinitely many nontrivial products $X=C_1\times \cdots \times C_d$ for which the analogous map $\varepsilon$ has finite image.