The shear viscosity
- Award ID(s): 2012947
- NSF-PAR ID: 10375933
- Publisher / Repository: Springer Science + Business Media
- Date Published:
- Journal Name: The European Physical Journal C
- Volume: 82
- Issue: 10
- ISSN: 1434-6052
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Abstract The softmax policy gradient (PG) method, which performs gradient ascent under softmax policy parameterization, is arguably one of the de facto implementations of policy optimization in modern reinforcement learning. For $$\gamma $$-discounted infinite-horizon tabular Markov decision processes (MDPs), remarkable progress has recently been achieved towards establishing global convergence of softmax PG methods in finding a near-optimal policy. However, prior results fall short of delineating clear dependencies of convergence rates on salient parameters such as the cardinality of the state space $$|{\mathcal {S}}|$$ and the effective horizon $$\frac{1}{1-\gamma }$$, both of which could be excessively large. In this paper, we deliver a pessimistic message regarding the iteration complexity of softmax PG methods, despite assuming access to exact gradient computation. Specifically, we demonstrate that the softmax PG method with stepsize $$\eta $$ can take $$\frac{1}{\eta } |{\mathcal {S}}|^{2^{\Omega \big (\frac{1}{1-\gamma }\big )}}$$ iterations to converge, even in the presence of a benign policy initialization and an initial state distribution amenable to exploration (so that the distribution mismatch coefficient is not exceedingly large). This is accomplished by characterizing the algorithmic dynamics over a carefully constructed MDP containing only three actions. Our exponential lower bound hints at the necessity of carefully adjusting update rules or enforcing proper regularization in accelerating PG methods.
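The setting named in this abstract (gradient ascent on softmax policy parameters with exact gradients) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the two-state, two-action MDP, its transition and reward numbers, the stepsize, and all function names are hypothetical, and this is not the three-action construction used for the lower bound. The update uses the standard exact softmax policy gradient, in which the derivative of the value with respect to the parameter for state s and action a equals the discounted visitation of s, times the probability of a in s, times the advantage, divided by (1 - gamma).

```python
import math

GAMMA = 0.9
# Hypothetical toy MDP: P[s][a][t] is the transition probability, R[s][a] the reward.
P = [[[0.9, 0.1], [0.1, 0.9]],
     [[0.8, 0.2], [0.2, 0.8]]]
R = [[0.0, 0.1], [0.0, 1.0]]
RHO = [0.5, 0.5]  # initial state distribution
STATES, ACTIONS = range(2), range(2)


def softmax_policy(theta):
    """Row-wise softmax of the logits theta[s][a]."""
    pi = []
    for row in theta:
        m = max(row)
        e = [math.exp(x - m) for x in row]
        z = sum(e)
        pi.append([x / z for x in e])
    return pi


def q_values(V):
    return [[R[s][a] + GAMMA * sum(P[s][a][t] * V[t] for t in STATES)
             for a in ACTIONS] for s in STATES]


def evaluate(pi, sweeps=500):
    """Iterative policy evaluation; returns (V, Q) for policy pi."""
    V = [0.0, 0.0]
    for _ in range(sweeps):
        Q = q_values(V)
        V = [sum(pi[s][a] * Q[s][a] for a in ACTIONS) for s in STATES]
    return V, q_values(V)


def visitation(pi, sweeps=500):
    """Discounted state visitation d, the fixed point of d = (1-gamma)*rho + gamma*d*P_pi."""
    d = list(RHO)
    for _ in range(sweeps):
        d = [(1 - GAMMA) * RHO[t]
             + GAMMA * sum(d[s] * pi[s][a] * P[s][a][t]
                           for s in STATES for a in ACTIONS)
             for t in STATES]
    return d


def objective(theta):
    V, _ = evaluate(softmax_policy(theta))
    return sum(RHO[s] * V[s] for s in STATES)


def pg_step(theta, eta):
    """One exact-gradient ascent step on the softmax logits."""
    pi = softmax_policy(theta)
    V, Q = evaluate(pi)
    d = visitation(pi)
    return [[theta[s][a] + eta * d[s] * pi[s][a] * (Q[s][a] - V[s]) / (1 - GAMMA)
             for a in ACTIONS] for s in STATES]


def run(eta=1.0, iterations=200):
    theta = [[0.0, 0.0], [0.0, 0.0]]  # uniform initial policy
    for _ in range(iterations):
        theta = pg_step(theta, eta)
    return theta
```

On this benign toy problem the iterates improve steadily; the abstract's point is that on a carefully built MDP the same update can need exponentially many iterations.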
Abstract Recent spectacular advances by AI programs in 3D structure predictions from protein sequences have revolutionized the field in terms of accuracy and speed. The resulting “folding frenzy” has already produced predicted protein structure databases for the entire human and other organisms’ proteomes. However, rapidly ascertaining a predicted structure’s reliability based on measured properties in solution should be considered. Shape-sensitive hydrodynamic parameters such as the diffusion and sedimentation coefficients ($${D_{t(20,w)}^{0}}$$, $${s_{{\left( {{20},w} \right)}}^{{0}} }$$) and the intrinsic viscosity ([η]) can provide a rapid assessment of the overall structure likeliness, and SAXS would yield the structure-related pair-wise distance distribution function p(r) vs. r. Using the extensively validated UltraScan SOlution MOdeler (US-SOMO) suite, a database was implemented calculating from AlphaFold structures the corresponding $${D_{t(20,w)}^{0}}$$, $${s_{{\left( {{20},w} \right)}}^{{0}} }$$, [η], p(r) vs. r, and other parameters. Circular dichroism spectra were computed using the SESCA program. Some of AlphaFold’s drawbacks were mitigated, such as generating whenever possible a protein’s mature form. Others, like the AlphaFold direct applicability to single-chain structures only, the absence of prosthetic groups, or flexibility issues, are discussed. Overall, this implementation of the US-SOMO-AF database should already aid in rapidly evaluating the consistency in solution of a relevant portion of AlphaFold predicted protein structures.
Abstract We introduce tools from discrete convexity theory and polyhedral geometry into the theory of West’s stack-sorting map s. Associated to each permutation $$\pi $$ is a particular set $$\mathcal V(\pi )$$ of integer compositions that appears in a formula for the fertility of $$\pi $$, which is defined to be $$|s^{-1}(\pi )|$$. These compositions also feature prominently in more general formulas involving families of colored binary plane trees called troupes and in a formula that converts from free to classical cumulants in noncommutative probability theory. We show that $$\mathcal V(\pi )$$ is a transversal discrete polymatroid when it is nonempty. We define the fertilitope of $$\pi $$ to be the convex hull of $$\mathcal V(\pi )$$, and we prove a surprisingly simple characterization of fertilitopes as nestohedra arising from full binary plane trees. Using known facts about nestohedra, we provide a procedure for describing the structure of the fertilitope of $$\pi $$ directly from $$\pi $$ using Bousquet-Mélou’s notion of the canonical tree of $$\pi $$. As a byproduct, we obtain a new combinatorial cumulant conversion formula in terms of generalizations of canonical trees that we call quasicanonical trees. We also apply our results on fertilitopes to study combinatorial properties of the stack-sorting map. In particular, we show that the set of fertility numbers has density 1, and we determine all infertility numbers of size at most 126. Finally, we reformulate the conjecture that $$\sum _{\sigma \in s^{-1}(\pi )}x^{\textrm{des}(\sigma )+1}$$ is always real-rooted in terms of nestohedra, and we propose natural ways in which this new version of the conjecture could be extended.
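The fertility defined in this abstract can be checked by brute force for small permutations. The sketch below implements one pass of West's stack-sorting map in the usual way (push each entry, popping first while the top of the stack is smaller than the incoming entry) and counts preimages by enumerating all permutations of the same size. The function names are illustrative assumptions, and the enumeration is only practical for short permutations; it does not use the composition set or the fertilitope.

```python
from itertools import permutations


def stack_sort(perm):
    """One pass of West's stack-sorting map s on a sequence of distinct integers."""
    stack, out = [], []
    for x in perm:
        # Pop every smaller entry before pushing x, so the stack stays decreasing upward.
        while stack and stack[-1] < x:
            out.append(stack.pop())
        stack.append(x)
    while stack:
        out.append(stack.pop())
    return tuple(out)


def preimages(perm):
    """All permutations sigma of 1..n with s(sigma) == perm (brute force)."""
    n = len(perm)
    target = tuple(perm)
    return [sigma for sigma in permutations(range(1, n + 1))
            if stack_sort(sigma) == target]


def fertility(perm):
    """Fertility of perm, i.e. the number of preimages under s."""
    return len(preimages(perm))
```

For example, `fertility((1, 2, 3))` counts the permutations of size 3 that one pass sorts completely, while `fertility((1, 3, 2))` is zero because the output of s always ends with its largest entry.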
Abstract The production of the $$X(3872)$$ particle in heavy-ion collisions has been contemplated as an alternative probe of its internal structure. To investigate this conjecture, we perform transport calculations of the $$X(3872)$$ through the fireball formed in nuclear collisions at the LHC. Within a kinetic-rate equation approach as previously used for charmonia, the formation and dissociation of the $$X(3872)$$ is controlled by two transport parameters, i.e., its inelastic reaction rate and thermal-equilibrium limit in the evolving hot QCD medium. While the equilibrium limit is controlled by the charm production cross section in primordial nucleon-nucleon collisions (together with the spectra of charm states in the medium), the structure information is encoded in the reaction rate. We study how different scenarios for the rate affect the centrality dependence and transverse-momentum ($$p_T$$) spectra of the $$X(3872)$$. Larger reaction rates associated with the loosely bound molecule structure imply that it is formed later in the fireball evolution than the tetraquark and thus its final yields are generally smaller by around a factor of two, which is qualitatively different from most coalescence model calculations to date. The $$p_T$$ spectra provide further information as the later decoupling time within the molecular scenario leads to harder spectra caused by the blue-shift from the expanding fireball.
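The role of the two transport parameters can be illustrated with a toy relaxation equation. The sketch below assumes the commonly used form dN/dτ = -Γ (N - N_eq), with a constant rate Γ and a constant equilibrium limit N_eq; both the form and the constants are assumptions made for illustration, since the abstract does not state the equation and in an actual calculation both quantities change as the medium cools. All names and numbers are hypothetical.

```python
import math


def evolve_yield(rate, n_eq, n0=0.0, t_end=10.0, dt=0.001):
    """Forward-Euler integration of dN/dt = -rate * (N - n_eq).

    rate and n_eq are held constant here (a simplifying assumption).
    """
    n = n0
    for _ in range(int(round(t_end / dt))):
        n += dt * (-rate * (n - n_eq))
    return n


def exact_yield(rate, n_eq, n0, t):
    """Closed-form solution for constant rate and n_eq, used as a check."""
    return n_eq + (n0 - n_eq) * math.exp(-rate * t)
```

With constant parameters a larger rate only means faster relaxation toward the equilibrium limit; the timing effects described in the abstract come from the evolution of the medium, which this sketch deliberately leaves out.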
Abstract Ultra-pure NaI(Tl) crystals are the key element for a model-independent verification of the long-standing DAMA result and a powerful means to search for the annual modulation signature of dark matter interactions. The SABRE collaboration has been developing cutting-edge techniques for the reduction of intrinsic backgrounds over several years. In this paper we report the first characterization of a 3.4 kg crystal, named NaI-33, performed in an underground passive shielding setup at LNGS. NaI-33 has a record low $$^{39}$$K contamination of 4.3 ± 0.2 ppb as determined by mass spectrometry. We measured a light yield of 11.1 ± 0.2 photoelectrons/keV and an energy resolution of 13.2% (FWHM/E) at 59.5 keV. We evaluated the activities of $$^{226}$$Ra and $$^{228}$$Th inside the crystal to be $$5.9\pm 0.6~\upmu $$Bq/kg and $$1.6\pm 0.3~\upmu $$Bq/kg, respectively, which would indicate a contamination from $$^{238}$$U and $$^{232}$$Th at part-per-trillion level. We measured an activity of 0.51 ± 0.02 mBq/kg due to $$^{210}$$Pb out of equilibrium and an $$\alpha $$ quenching factor of 0.63 ± 0.01 at 5304 keV. We illustrate the analysis techniques developed to reject electronic noise in the lower part of the energy spectrum. A cut-based strategy and a multivariate approach indicated a rate, attributed to the intrinsic radioactivity of the crystal, of $$\sim $$1 count/day/kg/keV in the [5–20] keV region.