Title: A database of calculated solution parameters for the AlphaFold predicted protein structures

Recent spectacular advances by AI programs in 3D structure predictions from protein sequences have revolutionized the field in terms of accuracy and speed. The resulting “folding frenzy” has already produced predicted protein structure databases for the entire human and other organisms’ proteomes. However, rapidly ascertaining a predicted structure’s reliability based on measured properties in solution should be considered. Shape-sensitive hydrodynamic parameters such as the diffusion and sedimentation coefficients ($$D_{t(20,w)}^{0}$$, $$s_{(20,w)}^{0}$$) and the intrinsic viscosity ([η]) can provide a rapid assessment of the overall structure likeliness, and SAXS would yield the structure-related pair-wise distance distribution function p(r) vs. r. Using the extensively validated UltraScan SOlution MOdeler (US-SOMO) suite, a database was implemented calculating from AlphaFold structures the corresponding $$D_{t(20,w)}^{0}$$, $$s_{(20,w)}^{0}$$, [η], p(r) vs. r, and other parameters. Circular dichroism spectra were computed using the SESCA program. Some of AlphaFold’s drawbacks were mitigated, such as generating whenever possible a protein’s mature form. Others, like the AlphaFold direct applicability to single-chain structures only, the absence of prosthetic groups, or flexibility issues, are discussed. Overall, this implementation of the US-SOMO-AF database should already aid in rapidly evaluating the consistency in solution of a relevant portion of AlphaFold predicted protein structures.
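
As a rough illustration of why the diffusion and sedimentation coefficients are informative about size and shape, the sketch below applies two textbook relations: the Stokes-Einstein equation (Stokes radius from $$D_{t(20,w)}^{0}$$) and the Svedberg equation (molar mass from $$s_{(20,w)}^{0}$$ and $$D_{t(20,w)}^{0}$$). This is not the US-SOMO calculation, which the abstract does not describe; the function names, the water constants, and the sample input values are all assumptions chosen for illustration only.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
R_GAS = 8.314462618       # molar gas constant, J/(mol K)
T_20C = 293.15            # 20 degrees C in kelvin
ETA_WATER_20C = 1.002e-3  # assumed viscosity of water at 20 C, Pa s
RHO_WATER_20C = 998.2     # assumed density of water at 20 C, kg/m^3


def stokes_radius(d_t):
    """Stokes radius (m) from a translational diffusion coefficient (m^2/s)."""
    return K_B * T_20C / (6.0 * math.pi * ETA_WATER_20C * d_t)


def svedberg_molar_mass(s, d_t, vbar):
    """Molar mass (kg/mol) from s (seconds), D (m^2/s) and the
    partial specific volume vbar (m^3/kg), via the Svedberg equation."""
    return s * R_GAS * T_20C / (d_t * (1.0 - vbar * RHO_WATER_20C))


def percent_deviation(calculated, measured):
    """Signed deviation of a calculated value from a measured one, in percent."""
    return 100.0 * (calculated - measured) / measured


# Hypothetical inputs, not taken from the database:
s_example = 4.3e-13     # 4.3 S expressed in seconds
d_example = 6.0e-11     # m^2/s
vbar_example = 0.73e-3  # m^3/kg (0.73 cm^3/g, a typical protein value)

print("Stokes radius (nm):", stokes_radius(d_example) * 1e9)
print("Molar mass (kg/mol):", svedberg_molar_mass(s_example, d_example, vbar_example))
print("Deviation (%):", percent_deviation(4.5e-13, s_example))
```

A comparison of this kind, a calculated value against a measured one, is the sort of quick consistency check the abstract has in mind; what size of deviation should count as acceptable is not stated in the source.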

Publication Date:
Journal Name:
Scientific Reports
Nature Publishing Group
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    This paper examined the effect of Si addition on the cracking resistance of Inconel 939 alloy after the laser additive manufacturing (AM) process. With the help of the CALculation of PHAse Diagrams (CALPHAD) software Thermo-Calc, the amounts of specific elements (C, B, and Zr) in the liquid phase during solidification, cracking susceptibility coefficients (CSC), and a cracking criterion based on $$\left| \text{d}T/\text{d}f_{\text{s}}^{1/2} \right|$$ values ($$T$$: solidification temperature, $$f_{\text{s}}$$: mass fraction of solid during solidification) were evaluated as the indicators for composition optimization. It was found that CSC together with $$\left| \text{d}T/\text{d}f_{\text{s}}^{1/2} \right|$$ values provided a better prediction for cracking resistance.

  2. Abstract

    The genericity of Arnold diffusion in the analytic category is an open problem. In this paper, we study this problem in the following a priori unstable Hamiltonian system with a time-periodic perturbation $$H_\varepsilon(p,q,I,\varphi,t)=h(I)+\sum_{i=1}^{n}\pm\left(\frac{1}{2}p_i^2+V_i(q_i)\right)+\varepsilon H_1(p,q,I,\varphi,t),$$ where $$(p,q)\in\mathbb{R}^n\times\mathbb{T}^n$$, $$(I,\varphi)\in\mathbb{R}^d\times\mathbb{T}^d$$ with $$n,d\geqslant 1$$, $$V_i$$ are Morse potentials, and $$\varepsilon$$ is a small non-zero parameter. The unperturbed Hamiltonian is not necessarily convex, and the induced inner dynamics does not need to satisfy a twist condition. Using geometric methods we prove that Arnold diffusion occurs for generic analytic perturbations $$H_1$$. Indeed, the set of admissible $$H_1$$ is $$C^\omega$$ dense and $$C^3$$ open (a fortiori, $$C^\omega$$ open). Our perturbative technique for the genericity is valid in the $$C^k$$ topology for all $$k\in[3,\infty)\cup\{\infty,\omega\}$$.

  3. Abstract

    We present the first unquenched lattice-QCD calculation of the form factors for the decay $$B\rightarrow D^*\ell\nu$$ at nonzero recoil. Our analysis includes 15 MILC ensembles with $$N_f=2+1$$ flavors of asqtad sea quarks, with a strange quark mass close to its physical mass. The lattice spacings range from $$a\approx 0.15$$ fm down to 0.045 fm, while the ratio between the light- and the strange-quark masses ranges from 0.05 to 0.4. The valence $$b$$ and $$c$$ quarks are treated using the Wilson-clover action with the Fermilab interpretation, whereas the light sector employs asqtad staggered fermions. We extrapolate our results to the physical point in the continuum limit using rooted staggered heavy-light meson chiral perturbation theory. Then we apply a model-independent parametrization to extend the form factors to the full kinematic range. With this parametrization we perform a joint lattice-QCD/experiment fit using several experimental datasets to determine the CKM matrix element $$|V_{cb}|$$. We obtain $$\left|V_{cb}\right| = (38.40 \pm 0.68_{\text{th}} \pm 0.34_{\text{exp}} \pm 0.18_{\text{EM}})\times 10^{-3}$$. The first error is theoretical, the second comes from experiment and the last one includes electromagnetic and electroweak uncertainties, with an overall $$\chi^2\text{/dof} = 126/84$$, which illustrates the tensions between the experimental data sets, and between theory and experiment. This result is in agreement with previous exclusive determinations, but the tension with the inclusive determination remains. Finally, we integrate the differential decay rate obtained solely from lattice data to predict $$R(D^*) = 0.265 \pm 0.013$$, which confirms the current tension between theory and experiment.

  4. Abstract

    It has been recently established in David and Mayboroda (Approximation of Green functions and domains with uniformly rectifiable boundaries of all dimensions. arXiv:2010.09793) that on uniformly rectifiable sets the Green function is almost affine in the weak sense, and moreover, in some scenarios such Green function estimates are equivalent to the uniform rectifiability of a set. The present paper tackles a strong analogue of these results, starting with the “flagship” degenerate operators on sets with lower dimensional boundaries. We consider the elliptic operators $$L_{\beta,\gamma} = -\text{div}\, D^{d+1+\gamma-n} \nabla$$ associated to a domain $$\Omega \subset \mathbb{R}^n$$ with a uniformly rectifiable boundary $$\Gamma$$ of dimension $$d < n-1$$, the now usual distance to the boundary $$D = D_\beta$$ given by $$D_\beta(X)^{-\beta} = \int_{\Gamma} |X-y|^{-d-\beta}\, d\sigma(y)$$ for $$X \in \Omega$$, where $$\beta > 0$$ and $$\gamma \in (-1,1)$$. In this paper we show that the Green function $$G$$ for $$L_{\beta,\gamma}$$, with pole at infinity, is well approximated by multiples of $$D^{1-\gamma}$$, in the sense that the function $$\big| D\nabla \big(\ln \big( \frac{G}{D^{1-\gamma}} \big)\big)\big|^2$$ satisfies a Carleson measure estimate on $$\Omega$$. We underline that the strong and the weak results are different in nature and, of course, at the level of the proofs: the latter extensively used compactness arguments, while the present paper relies on some intricate integration by parts and the properties of the “magical” distance function from David et al. (Duke Math J, to appear).

  5. Abstract

    A well-known open problem of Meir and Moser asks if the squares of sidelength $$1/n$$ for $$n\ge 2$$ can be packed perfectly into a rectangle of area $$\sum_{n=2}^\infty n^{-2}=\pi^2/6-1$$. In this paper we show that for any $$1/2<t<1$$, and any $$n_0$$ that is sufficiently large depending on $$t$$, the squares of sidelength $$n^{-t}$$ for $$n\ge n_0$$ can be packed perfectly into a square of area $$\sum_{n=n_0}^\infty n^{-2t}$$. This was previously known (if one packs a rectangle instead of a square) for $$1/2<t\le 2/3$$ (in which case one can take $$n_0=1$$).
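
The target areas in the packing statement of item 5 can be checked numerically. The sketch below is only an arithmetic illustration, not part of the cited proof: it approximates $$\sum_{n=n_0}^\infty n^{-2t}$$ by a direct partial sum plus an integral estimate of the remainder, and it makes visible why $$t>1/2$$ is needed (the series diverges otherwise). The function name `tail_area` and the cutoff `terms` are assumptions introduced here.

```python
import math


def tail_area(t, n0, terms=200_000):
    """Approximate sum_{n >= n0} n**(-2t), the total area of the squares.

    Uses a direct partial sum over `terms` terms, then estimates the
    remaining tail by the integral of x**(-2t) from (last - 1/2) to
    infinity (a midpoint-rule approximation).
    """
    if t <= 0.5:
        raise ValueError("the series diverges unless t > 1/2")
    p = 2.0 * t
    last = n0 + terms
    partial = sum(n ** -p for n in range(n0, last))
    remainder = (last - 0.5) ** (1.0 - p) / (p - 1.0)
    return partial + remainder


# Meir-Moser case: squares of sidelength 1/n for n >= 2.
area = tail_area(1.0, 2)
print("numerical area:", area)
print("pi^2/6 - 1    :", math.pi ** 2 / 6 - 1)

# Side of the target square for a sample exponent and starting index
# (t = 0.75 and n0 = 10 are arbitrary illustrative choices).
print("square side   :", math.sqrt(tail_area(0.75, 10)))
```

The first two printed values should agree closely, which confirms the closed form $$\pi^2/6-1$$ quoted for the Meir and Moser rectangle.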