Statistical descriptions of earthquakes offer important probabilistic information, and newly emerging technologies of high-precision observation and machine learning are collectively advancing our knowledge of complex earthquake behaviors. Still, a formidable knowledge gap remains in predicting the locations and magnitudes of individual large earthquakes. Here, this study shows that individual large earthquakes may have unique signatures that can be represented by new high-dimensional features: Gauss curvature-based coordinates. In particular, observed earthquake catalog data are transformed into a number of pseudo-physics quantities (i.e., energy, power, vorticity, and Laplacian), which turn into smooth surface-like information via spatio-temporal convolution, giving rise to the new high-dimensional coordinates. Validations with 40 years of earthquakes in the West U.S. region show that the new coordinates appear to hold uniqueness for individual large earthquakes ($$M_w \ge 7.0$$)
This content will become publicly available on December 1, 2024
Predicting the locations, magnitudes, and timing of individual large earthquakes (EQs) remains unreachable. The author’s prior study shows that individual large EQs have unique signatures obtained from multi-layered data transformations. Via spatio-temporal convolutions, decades-long EQ catalog data are transformed into pseudo-physics quantities (e.g., energy, power, vorticity, and Laplacian), which turn into surface-like information via Gauss curvatures. Using these new features, a rule-learning machine learning approach unravels promising prediction rules. This paper suggests further data transformation via the Fourier transform (FT). Results show that the FT-based new feature can help sharpen the prediction rules. Feasibility tests of large EQs (
- NSF-PAR ID: 10491691
- Publisher / Repository: Nature
- Date Published:
- Journal Name: Scientific Reports
- Volume: 13
- Issue: 1
- ISSN: 2045-2322
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract …the new coordinates appear to hold uniqueness for individual large earthquakes ($$M_w \ge 7.0$$), and the pseudo-physics quantities help identify a customized data-driven prediction model. A Bayesian evolutionary algorithm in conjunction with flexible bases can identify a data-driven model, demonstrating promising reproduction of an individual large earthquake’s location and magnitude. Results imply that an individual large earthquake can be distinguished and remembered, while its best-so-far model can be customized by machine learning. This study paves a new way to data-driven automated evolution of individual earthquake prediction. -
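The feature pipeline described in these abstracts (point-wise catalog events convolved into smooth fields, then differentiated into curvature-based coordinates) can be illustrated in miniature. The sketch below is not the authors' implementation: the grid, kernel width, and event weights are arbitrary choices of mine; only the two operations it performs — Gaussian spatial smoothing of weighted point events, and finite-difference Gauss curvature of the resulting surface — mirror the transformations the abstracts name.

```python
import numpy as np

def smooth_event_field(xs, ys, weights, grid_x, grid_y, sigma):
    """Convolve point events (e.g., catalog epicentres carrying a pseudo
    'energy' weight) with a Gaussian kernel, yielding a smooth surface."""
    gx, gy = np.meshgrid(grid_x, grid_y)
    field = np.zeros_like(gx, dtype=float)
    for x, y, w in zip(xs, ys, weights):
        field += w * np.exp(-((gx - x) ** 2 + (gy - y) ** 2) / (2.0 * sigma ** 2))
    return field

def gauss_curvature(z, h):
    """Gauss curvature K of the surface z = f(x, y) on a uniform grid with
    spacing h, via central finite differences:
        K = (f_xx * f_yy - f_xy^2) / (1 + f_x^2 + f_y^2)^2
    """
    fy, fx = np.gradient(z, h)        # rows are the y-direction, columns x
    fxy, fxx = np.gradient(fx, h)
    fyy, _ = np.gradient(fy, h)
    return (fxx * fyy - fxy ** 2) / (1.0 + fx ** 2 + fy ** 2) ** 2
```

On a paraboloid z = (x² + y²)/2 this recovers K = 1 at the origin, and K = 0 on any plane, which is a quick way to sanity-check the stencil before feeding it a smoothed event field.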
Abstract Massive gully land consolidation projects, launched in China’s Loess Plateau, aim to restore 2667 $$\mathrm{km}^2$$ of agricultural land in total by consolidating 2026 highly eroded gullies. This effort represents a social engineering project where the economic development and livelihood of the farming families are closely tied to the ability of these emergent landscapes to provide agricultural services. Whether these ‘time zero’ landscapes have the resilience to provide a sustainable soil condition, such as soil organic carbon (SOC) content, remains unknown. By studying two watersheds, one of which is a control site, we show that the consolidated gully serves as an enhanced carbon sink, where the magnitude of the SOC increase rate (1.0 $$\mathrm{g\,C}/\mathrm{m}^2/\mathrm{year}$$) is about twice that of the SOC decrease rate (−0.5 $$\mathrm{g\,C}/\mathrm{m}^2/\mathrm{year}$$) in the surrounding natural watershed. Over a 50-year co-evolution of landscape and SOC turnover, we find that the dominant mechanisms that determine the carbon cycling differ between the consolidated gully and the natural watersheds. In natural watersheds, the flux of SOC transformation is mainly driven by the flux of SOC transport; but in the consolidated gully, transport has little impact on transformation. Furthermore, we find that extending the surface carbon residence time has the potential to efficiently enhance carbon sequestration from the atmosphere at a rate as high as 8 $$\mathrm{g\,C}/\mathrm{m}^2/\mathrm{year}$$, compared to the current 0.4 $$\mathrm{g\,C}/\mathrm{m}^2/\mathrm{year}$$. Successful completion of all gully consolidation would lead to as much as 26.67 $$\mathrm{Gg\,C}/\mathrm{year}$$ sequestered into soils. This work, therefore, not only provides an assessment and guidance of the long-term sustainability of the ‘time zero’ landscapes but also a solution for $$\hbox {CO}_2$$ sequestration into soils. -
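As a purely dimensional sanity check of the quoted totals (my own arithmetic, not a derivation from the paper): 26.67 Gg C/year spread over the full 2667 km² restoration target corresponds to an area-averaged rate of 10 g C/m²/year.

```python
# Unit-consistency check only (my arithmetic, not the paper's derivation):
# does 26.67 Gg C/year over the 2667 km^2 target average to 10 g C/m^2/year?
area_m2 = 2667 * 1e6                     # 2667 km^2 expressed in m^2
total_g_per_year = 26.67 * 1e9           # 26.67 Gg C/year in g C/year (1 Gg = 1e9 g)
avg_rate = total_g_per_year / area_m2    # area-averaged rate, g C/m^2/year
```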
Abstract Background Protein–protein interaction (PPI) is vital for life processes, disease treatment, and drug discovery. The computational prediction of PPI is relatively inexpensive and efficient compared to traditional wet-lab experiments. Given a new protein, one may wish to determine whether the protein has any PPI relationship with other existing proteins. Current computational PPI prediction methods usually compare the new protein to existing proteins one by one in a pairwise manner. This is time-consuming.
Results In this work, we propose a more efficient model, called deep hash learning protein-and-protein interaction (DHL-PPI), to predict all-against-all PPI relationships in a database of proteins. First, DHL-PPI encodes a protein sequence into a binary hash code based on deep features extracted from the protein sequences using deep learning techniques. This encoding scheme enables us to turn the PPI discrimination problem into a much simpler searching problem. The binary hash code for a protein sequence can be regarded as a number. Thus, in the pre-screening stage of DHL-PPI, the string-matching problem of comparing a protein sequence against a database with M proteins can be transformed into a much simpler problem: finding a number inside a sorted array of length M. This pre-screening process narrows the search down to a much smaller set of candidate proteins for further confirmation. As a final step, DHL-PPI uses the Hamming distance to verify the final PPI relationship. Conclusions The experimental results confirmed that DHL-PPI is feasible and effective. Using a dataset with strictly negative PPI examples from four species, DHL-PPI is shown to be superior or competitive compared to the other state-of-the-art methods in terms of precision, recall, or F1 score. Furthermore, in the prediction stage, the proposed DHL-PPI reduces the time complexity from $$O(M^2)$$ to $$O(M\log M)$$ for performing an all-against-all PPI prediction for a database with M proteins. With the proposed approach, a protein database can be preprocessed and stored for later search using the proposed encoding scheme. This provides a more efficient way to cope with the rapidly increasing volume of protein datasets. -
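The pre-screening idea described above — treat each binary hash code as an integer, locate the query in a sorted code array by binary search, then verify candidates by Hamming distance — can be sketched as follows. This is a deliberately simplified illustration, not the published DHL-PPI code: the toy codes are mine, and the fixed window around the binary-search position merely stands in for the paper's actual candidate-narrowing step.

```python
import bisect

def hamming(a: int, b: int) -> int:
    """Hamming distance between two equal-width binary hash codes."""
    return bin(a ^ b).count("1")

def prescreen(query: int, sorted_codes: list, window: int = 4, max_dist: int = 2) -> list:
    """Locate the query code in a sorted array of codes via binary search
    (O(log M)), scan a small window of neighbours as candidates, and verify
    them by Hamming distance. The window heuristic is illustrative only."""
    i = bisect.bisect_left(sorted_codes, query)
    lo, hi = max(0, i - window), min(len(sorted_codes), i + window)
    return [c for c in sorted_codes[lo:hi] if hamming(query, c) <= max_dist]
```

Sorting the M database codes once costs O(M log M), and each of the M queries is then O(log M) plus a constant-size verification, which is the shape of the complexity reduction quoted above.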
Abstract We introduce a family of Finsler metrics, called the $$L^p$$-Fisher–Rao metrics $$F_p$$, for $$p\in (1,\infty )$$, which generalizes the classical Fisher–Rao metric $$F_2$$, both on the space of densities $${\text {Dens}}_+(M)$$ and probability densities $${\text {Prob}}(M)$$. We then study their relations to the Amari–Čencov $$\alpha $$-connections $$\nabla ^{(\alpha )}$$ from information geometry: on $${\text {Dens}}_+(M)$$, the geodesic equations of $$F_p$$ and $$\nabla ^{(\alpha )}$$ coincide, for $$p = 2/(1-\alpha )$$. Both are pullbacks of canonical constructions on $$L^p(M)$$, in which geodesics are simply straight lines. In particular, this gives a new variational interpretation of $$\alpha $$-geodesics as energy-minimizing curves. On $${\text {Prob}}(M)$$, the $$F_p$$ and $$\nabla ^{(\alpha )}$$ geodesics can still be thought of as pullbacks of natural operations on the unit sphere in $$L^p(M)$$, but in this case they no longer coincide unless $$p=2$$. Using this transformation, we solve the geodesic equation of the $$\alpha $$-connection by showing that the geodesics are pullbacks of projections of straight lines onto the unit sphere, and that they always cease to exist after finite time when they leave the positive part of the sphere. This unveils the geometric structure of solutions to the generalized Proudman–Johnson equations, and generalizes them to higher dimensions. In addition, we calculate the associated tensors of $$F_p$$, and study their relation to $$\nabla ^{(\alpha )}$$. -
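As a quick check of the stated correspondence between the $$F_p$$ metrics and the $$\alpha $$-connections, substituting endpoint values into the relation given in the abstract:

```latex
p = \frac{2}{1-\alpha}
\quad\Longrightarrow\quad
\alpha = 0 \;\mapsto\; p = 2 \ (\text{the classical Fisher--Rao metric } F_2),
\qquad
\alpha \to 1^{-} \;\mapsto\; p \to \infty .
```

So the classical case sits exactly at $$p=2$$, consistent with the abstract's remark that the two geodesic families on $${\text {Prob}}(M)$$ coincide only there.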
Abstract We continue the program of proving circuit lower bounds via circuit satisfiability algorithms. So far, this program has yielded several concrete results, proving that functions in $\mathsf {Quasi}\text {-}\mathsf {NP} = \mathsf {NTIME}[n^{(\log n)^{O(1)}}]$ and other complexity classes do not have small circuits (in the worst case and/or on average) from various circuit classes $\mathcal {C}$, by showing that $\mathcal {C}$ admits non-trivial satisfiability and/or #SAT algorithms which beat exhaustive search by a minor amount. In this paper, we present a new strong lower-bound consequence of having a non-trivial #SAT algorithm for a circuit class ${\mathcal C}$. Say that a symmetric Boolean function $f(x_1,\ldots ,x_n)$ is sparse if it outputs 1 on $O(1)$ values of ${\sum }_{i} x_{i}$. We show that for every sparse $f$, and for all “typical” $\mathcal {C}$, faster #SAT algorithms for $\mathcal {C}$ circuits imply lower bounds against the circuit class $f \circ \mathcal {C}$, which may be stronger than $\mathcal {C}$ itself. In particular:
- #SAT algorithms for $n^k$-size $\mathcal {C}$-circuits running in $2^n/n^k$ time (for all $k$) imply NEXP does not have $(f \circ \mathcal {C})$-circuits of polynomial size.
- #SAT algorithms for $2^{n^{{\varepsilon }}}$-size $\mathcal {C}$-circuits running in $2^{n-n^{{\varepsilon }}}$ time (for some $\varepsilon > 0$) imply Quasi-NP does not have $(f \circ \mathcal {C})$-circuits of polynomial size.
Applying #SAT algorithms from the literature, one immediate corollary of our results is that Quasi-NP does not have $EMAJ \circ ACC^0 \circ THR$ circuits of polynomial size, where EMAJ is the “exact majority” function, improving previous lower bounds against $ACC^0$ [Williams JACM’14] and $ACC^0 \circ THR$ [Williams STOC’14], [Murray-Williams STOC’18]. This is the first nontrivial lower bound against such a circuit class.
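The notion of a sparse symmetric function above is concrete enough to check mechanically. The sketch below uses one common convention for exact majority — EMAJ(x) = 1 iff exactly ⌈n/2⌉ inputs are 1 (an assumption of mine; conventions vary) — and confirms that it outputs 1 on only a single value of $\sum_i x_i$, hence is sparse in the paper's sense.

```python
from itertools import product

def emaj(bits):
    """Exact majority (one common convention, assumed here):
    outputs 1 iff exactly ceil(n/2) of the n inputs are 1."""
    n = len(bits)
    return int(sum(bits) == (n + 1) // 2)

def one_levels(f, n):
    """The values of sum(x) on which a symmetric Boolean function outputs 1;
    the function is 'sparse' when this set has O(1) size."""
    return sorted({sum(x) for x in product((0, 1), repeat=n) if f(x)})
```

For any fixed n, `one_levels(emaj, n)` is a single level, so EMAJ satisfies the sparsity hypothesis under which the abstract's corollary applies.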