Search for: All records

Creators/Authors contains: "Shi, K."


  1. Language models (LMs) are pretrained to imitate internet text, including content that would violate human preferences if generated by an LM: falsehoods, offensive comments, personally identifiable information, low-quality or buggy code, and more. Here, we explore alternative objectives for pretraining LMs in a way that also guides them to generate text aligned with human preferences. We benchmark five objectives for pretraining with human feedback across three tasks and study how they affect the trade-off between alignment and capabilities of pretrained LMs. Among the objectives we explored, we find one that is both simple and Pareto-optimal: conditional training, that is, learning a distribution over tokens conditional on their human preference scores as given by a reward model. Conditional training reduces the rate of undesirable content by up to an order of magnitude, both when generating without a prompt and with an adversarially chosen prompt. Moreover, conditional training maintains the downstream task performance of standard LM pretraining, both before and after task-specific finetuning. Pretraining with human feedback results in much better preference satisfaction than standard LM pretraining followed by finetuning with feedback, i.e., learning and then unlearning undesirable behavior. Our results suggest that we should move beyond imitation learning when pretraining LMs and incorporate human preferences from the start of training.
  2. The application of the Young–Laplace equation to a solid–liquid interface is considered. Computer simulations show that the pressure inside a solid cluster of hard spheres is smaller than the external pressure of the liquid (both for small and large clusters). This would suggest a negative value for the interfacial free energy. We show that in a Gibbsian description of the thermodynamics of a curved solid–liquid interface in equilibrium, the choice of the thermodynamic (rather than mechanical) pressure is required, as suggested by Tolman for the liquid–gas scenario. With this definition, the interfacial free energy is positive, and the values obtained are in excellent agreement with previous results from nucleation studies. Although, for a curved fluid–fluid interface, there is no distinction between mechanical and thermal pressures (for a sufficiently large inner phase), in the solid–liquid interface, they do not coincide, as hypothesized by Gibbs.
  3. Integrating renewable energy into the manufacturing facility is the ultimate key to realising carbon-neutral operations. Although many firms have taken various initiatives to reduce the carbon footprint of their facilities, there are few quantitative studies focused on cost analysis and supply reliability of integrating intermittent wind and solar power. This paper aims to fill this gap by addressing the following question: should a firm adopt a power purchase agreement (PPA) or onsite renewable generation to realise the eco-economic benefits? We tackle this complex decision-making problem by considering two regulatory options: government carbon incentives and utility pricing policy. A stochastic programming model is formulated to search for the optimal mix of onsite and offsite renewable power supply. The model is tested extensively in different regions under various climatic conditions. Three findings are obtained. First, in the long term, onsite generation and PPA can avoid the price volatility of the spot or wholesale electricity market. Second, at locations where the wind speed is below 6 m/s, PPA at $70/MWh is preferred over onsite wind generation. Third, compared to PPA and wind generation, solar generation is not economically competitive unless the capacity cost falls below USD 1.5 M per MW.
  4. Some arsenite [As(III)]-oxidizing bacteria exhibit positive chemotaxis towards As(III); however, the related As(III) chemoreceptor and regulatory mechanism remain unknown. The As(III)-oxidizing bacterium Agrobacterium tumefaciens GW4 displays positive chemotaxis towards 0.5–2 mM As(III). Genomic analyses revealed a putative chemoreceptor-encoding gene, mcp, located in the arsenic gene island and having a predicted promoter binding site for the As(III) oxidation regulator AioR. Expression of mcp and other chemotaxis-related genes (cheA, cheY2 and fliG) was inducible by As(III) in the wild type, but not in the aioR mutant. Using capillary assays and intrinsic tryptophan fluorescence spectra analysis, Mcp was confirmed to be responsible for chemotaxis towards As(III) and to bind As(III) (but not As(V) or phosphate) as part of the sensing mechanism. A bacterial one-hybrid system technique and electrophoretic mobility shift assays showed that AioR interacts with the mcp regulatory region in vivo and in vitro, and the precise AioR binding site was confirmed using DNase I footprinting. Taken together, these results indicate that this Mcp is responsible for the chemotactic response towards As(III) and is regulated by AioR. Additionally, disrupting the mcp gene affected bacterial As(III) oxidation and growth, implying that Mcp may provide a functional connection between As(III) oxidation and As(III) chemotaxis.
  5. Abstract

    Diboson production in association with jets is studied in the fully leptonic final states, pp → (Z/γ*)(Z/γ*) + jets → 2ℓ2ℓ′ + jets (ℓ, ℓ′ = e or μ), in proton-proton collisions at a center-of-mass energy of 13 TeV. The data sample corresponds to an integrated luminosity of 138 fb⁻¹ collected with the CMS detector at the LHC. Differential distributions and normalized differential cross sections are measured as a function of jet multiplicity, transverse momentum pT, pseudorapidity η, invariant mass and Δη of the highest-pT and second-highest-pT jets, and as a function of invariant mass of the four-lepton system for events with various jet multiplicities. These differential cross sections are compared with theoretical predictions that mostly agree with the experimental data. However, in a few regions we observe discrepancies between the predicted and measured values. Further improvement of the predictions is required to describe the ZZ+jets production in the whole phase space.
    Free, publicly-accessible full text available October 1, 2025
  6. Abstract

    A search for Higgs boson pair (HH) production in association with a vector boson V (W or Z boson) is presented. The search is based on proton-proton collision data at a center-of-mass energy of 13 TeV, collected with the CMS detector at the LHC, corresponding to an integrated luminosity of 138 fb⁻¹. Both hadronic and leptonic decays of V bosons are used. The leptons considered are electrons, muons, and neutrinos. The HH production is searched for in the $$ \textrm{b}\overline{\textrm{b}}\textrm{b}\overline{\textrm{b}} $$ decay channel. An observed (expected) upper limit at 95% confidence level on the VHH production cross section is set at 294 (124) times the standard model prediction. Constraints are also set on the modifiers of the Higgs boson trilinear self-coupling, κλ, assuming κ2V = 1, and vice versa on the coupling of two Higgs bosons with two vector bosons, κ2V. The observed (expected) 95% confidence intervals of these coupling modifiers are −37.7 < κλ < 37.2 (−30.1 < κλ < 28.9) and −12.2 < κ2V < 13.5 (−7.2 < κ2V < 8.9), respectively.
    Free, publicly-accessible full text available October 1, 2025
  7. Free, publicly-accessible full text available October 1, 2025
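
The conditional training idea summarized in the first abstract above (learning a token distribution conditional on preference scores) can be illustrated with a minimal data-preparation sketch. Everything concrete here is an assumption made for illustration: the control-token strings, the threshold of 0.0, and the `toy_reward` function, which is only a stand-in for a real reward model. The sketch shows how each text segment might be prefixed with a control token according to its score, so that a language model trained on the result can later be prompted with the "good" token.

```python
from typing import Callable, List

# Hypothetical control tokens; a real setup would register them with the tokenizer.
GOOD, BAD = "<|good|>", "<|bad|>"


def toy_reward(segment: str) -> float:
    """Toy stand-in for a reward model: -1 per word found in a small blocklist."""
    blocklist = {"stupid", "idiot"}  # illustrative only
    words = [w.strip(".,!?").lower() for w in segment.split()]
    hits = sum(w in blocklist for w in words)
    return -float(hits)


def annotate(
    segments: List[str],
    reward: Callable[[str], float],
    threshold: float = 0.0,  # assumed cutoff between "good" and "bad"
) -> str:
    """Prefix each segment with a control token chosen by its reward score."""
    tagged = []
    for seg in segments:
        token = GOOD if reward(seg) >= threshold else BAD
        tagged.append(token + seg)
    return "".join(tagged)


corpus = ["Thanks for the help. ", "You are an idiot. "]
training_text = annotate(corpus, toy_reward)

# At generation time, conditioning on the "good" token steers sampling
# toward text resembling the segments that scored above the threshold.
prompt_prefix = GOOD
```

The annotated `training_text` would then be fed to an ordinary next-token-prediction objective; the only change from standard pretraining in this sketch is the data, not the loss.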