Search for: All records

Creators/Authors contains: "Smith, N"


  1. Free, publicly-accessible full text available February 1, 2027
  2. The assumption across nearly all language model (LM) tokenization schemes is that tokens should be subwords, i.e., contained within word boundaries. While providing a seemingly reasonable inductive bias, is this common practice limiting the potential of modern LMs? Whitespace is not a reliable delimiter of meaning, as evidenced by multi-word expressions (e.g., "by the way"), crosslingual variation in the number of words needed to express a concept (e.g., "spacesuit helmet" in German is "raumanzughelm"), and languages that do not use whitespace at all (e.g., Chinese). To explore the potential of tokenization beyond subwords, we introduce a "superword" tokenizer, SuperBPE, which incorporates a simple pretokenization curriculum into the byte-pair encoding (BPE) algorithm to first learn subwords, then superwords that bridge whitespace. This brings dramatic improvements in encoding efficiency: when fixing the vocabulary size to 200k, SuperBPE encodes a fixed piece of text with up to 33% fewer tokens than BPE on average. In experiments, we pretrain 8B transformer LMs from scratch while fixing the model size, vocabulary size, and train compute, varying *only* the algorithm for learning the vocabulary. Our model trained with SuperBPE achieves an average +4.0% absolute improvement over the BPE baseline across 30 downstream tasks (including +8.2% on MMLU), while simultaneously requiring 27% less compute at inference time. In analysis, we find that SuperBPE results in segmentations of text that are more uniform in per-token difficulty. Qualitatively, this may be because SuperBPE tokens often capture common multi-word expressions that function semantically as a single unit. SuperBPE is a straightforward, local modification to tokenization that improves both encoding efficiency and downstream performance, yielding better language models overall. 
    Free, publicly-accessible full text available April 14, 2026
  3. Freed, R; Harshaw, R; Genet, Russell M (Ed.)
    We have taken astrometric measurements of three star systems: WDS 00033+5332 A 1500 AB,C, WDS 05283+0358 HJ 2266, and WDS 19557+3805 DAM 1 AB. We used the Las Cumbres Observatory telescopes to take images of these star systems, and we then analyzed them using Afterglow Workbench. For WDS 00033+5332, we found the position angle to be 81.62° ± 0.45° and an angular separation of 9.01″ ± 0.04″. Based on our analysis, we were not able to determine whether the WDS 00033+5332 double is physical. For WDS 05283+0358, we found the position angle to be 37.58° ± 0.15° and an angular separation of 7.29″ ± 0.04″. It is already known that WDS 05283+0358 is a physical double, and our new data support this claim. For WDS 19557+3805, we found the position angle to be 234.64° ± 0.63° and an angular separation of 6.89″ ± 0.10″. Our new data points suggest this system is gravitationally bound.
  4. We present a comprehensive photometric and spectroscopic study of the Type IIP supernova (SN) 2018is. The V-band luminosity and the expansion velocity at 50 days post-explosion are −15.1 ± 0.2 mag (corrected for A_V = 1.34 mag) and 1400 km s⁻¹, classifying it as a low-luminosity SN II. The recombination phase in the V band is shorter, lasting around 110 days, and exhibits a steeper decline (1.0 mag per 100 days) compared to most other low-luminosity SNe II. Additionally, the optical and near-infrared spectra display hydrogen emission lines that are strikingly narrow, even for this class. The Fe II and Sc II line velocities are at the lower end of the typical range for low-luminosity SNe II. Semi-analytical modelling of the bolometric light curve suggests an ejecta mass of ∼8 M⊙, corresponding to a pre-supernova mass of ∼9.5 M⊙, and an explosion energy of ∼0.40 × 10⁵¹ erg. Hydrodynamical modelling further indicates that the progenitor had a zero-age main sequence mass of 9 M⊙, coupled with a low explosion energy of 0.19 × 10⁵¹ erg. The nebular spectrum reveals weak [O I] λλ6300, 6364 lines, consistent with a moderate-mass progenitor, while features typical of Fe core-collapse events, such as He I, [C I], and Fe I, are indiscernible. However, the redder colours and low ratio of Ni to Fe abundance do not support an electron-capture scenario either. As a low-luminosity SN II with an atypically steep decline during the photospheric phase and remarkably narrow emission lines, SN 2018is contributes to the diversity observed within this population.
    Free, publicly-accessible full text available February 1, 2026
  5. Theory predicts that rising CO₂ increases global photosynthesis, a process known as CO₂ fertilization, and that this is responsible for much of the current terrestrial carbon sink. The estimated magnitude of the historic CO₂ fertilization, however, differs by an order of magnitude between long-term proxies, remote sensing-based estimates and terrestrial biosphere models. Here we constrain the likely historic effect of CO₂ on global photosynthesis by combining terrestrial biosphere models, ecological optimality theory, remote sensing approaches and an emergent constraint based on global carbon budget estimates. Our analysis suggests that CO₂ fertilization increased global annual terrestrial photosynthesis by 13.5 ± 3.5% or 15.9 ± 2.9 PgC (mean ± s.d.) between 1981 and 2020. Our results help resolve conflicting estimates of the historic sensitivity of global terrestrial photosynthesis to CO₂ and highlight the large impact anthropogenic emissions have had on ecosystems worldwide.
  6. We present JWST NIRCam (F356W and F444W filters) and MIRI (F770W) images and NIRSpec Integral Field Unit (IFU) spectroscopy of the young Galactic supernova remnant Cassiopeia A (Cas A) to probe the physical conditions for molecular CO formation and destruction in supernova ejecta. We obtained the data as part of a JWST survey of Cas A. The NIRCam and MIRI images map the spatial distributions of synchrotron radiation, Ar-rich ejecta, and CO on both large and small scales, revealing remarkably complex structures. The CO emission is stronger at the outer layers than the Ar ejecta, which indicates the re-formation of CO molecules behind the reverse shock. NIRSpec-IFU spectra (3–5.5 μm) were obtained toward two representative knots in the NE and S fields that show very different nucleosynthesis characteristics. Both regions are dominated by the bright fundamental rovibrational band of CO in the two R and P branches, with strong [Ar VI] and relatively weaker, variable strength ejecta lines of [Si IX], [Ca IV], [Ca V], and [Mg IV]. The NIRSpec-IFU data resolve individual ejecta knots and filaments spatially and in velocity space. The fundamental CO band in the JWST spectra reveals unique shapes of CO, showing a few tens of sinusoidal patterns of rovibrational lines with pseudocontinuum underneath, which is attributed to the high-velocity widths of CO lines. Our results with LTE modeling of CO emission indicate a temperature of ∼1080 K and provide unique insight into the correlations between dust, molecules, and highly ionized ejecta in supernovae and have strong ramifications for modeling dust formation that is led by CO cooling in the early Universe.
  7. Background: The distribution of resources can affect animal range sizes, which in turn may alter infectious disease dynamics in heterogeneous environments. The risk of pathogen exposure or the spatial extent of outbreaks may vary with host range size. This study examined the range sizes of herbivorous anthrax host species in two ecosystems and relationships between spatial behavior and patterns of disease outbreaks for a multi-host environmentally transmitted pathogen. Methods: We examined range sizes for seven host species and the spatial extent of anthrax outbreaks in Etosha National Park, Namibia and Kruger National Park, South Africa, where the main host species and numbers of cases differ. We evaluated host range sizes using the local convex hull method at different temporal scales, within-individual temporal range overlap, and relationships between ranging behavior and species contributions to anthrax cases in each park. We estimated the spatial extent of annual anthrax mortalities and evaluated whether the extent was correlated with case numbers of a given host species. Results: Range size differences among species were not linearly related to anthrax case numbers. In Kruger the main host species had small range sizes and high range overlap, which may heighten exposure when outbreaks occur within their ranges. However, different patterns were observed in Etosha, where the main host species had large range sizes and relatively little overlap. The spatial extent of anthrax mortalities was similar between parks but less variable in Etosha than Kruger. In Kruger outbreaks varied from small local clusters to large areas and the spatial extent correlated with case numbers and species affected. Case numbers of secondary host species with larger range sizes were positively correlated with the spatial extent of outbreaks in both parks. Conclusions: Our results provide new information on the spatiotemporal structuring of ranging movements of anthrax host species in two ecosystems. The results linking anthrax dynamics to host space use are correlative, yet suggest that, though partial and proximate, host range size and overlap may be contributing factors in outbreak characteristics for environmentally transmitted pathogens.
  8. Invasive species alter invaded ecosystems via direct impacts such as consumption. In turn, an invasive species’ ability to thrive in new habitats depends on its ability to exploit available resources, which may change over time and space. Diet quality and quantity are indicators of a consumer’s consumptive effects and can be strongly influenced by season and latitude. We examined the effects of season and latitude on the diet quality and quantity of the invasive Asian shore crab Hemigrapsus sanguineus throughout a non-winter sampling year at 5 different sites spanning 8° of latitude across its invaded United States range. We found that diet quality, averaged through time, largely follows an expected latitudinal cline, being higher in the center of its range and lower toward the southern and northern edges. We also found that while some sites show similar patterns of diet quality variation with season, no pattern is consistent across all latitudes. Finally, we found that crabs at sites with low diet quality during summer reproductive months did not compensate by increasing total consumption. Because the Asian shore crab is an important consumer in its invaded ecosystems, understanding how its diet quality and quantity vary with season and latitude can help us better understand how this species influences trophic interactions and community structure, how it has been able to establish across a wide ecological and environmental range, and where future range expansion is most likely to occur. 