Search for: All records

Creators/Authors contains: "Balaji"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full-text articles may not be available free of charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site's.

  1. Free, publicly-accessible full text available June 24, 2026
  2. Free, publicly-accessible full text available June 29, 2026
  3. Free, publicly-accessible full text available May 19, 2026
  4. Free, publicly-accessible full text available July 13, 2026
  5. Two modern programs involving analogies between general relativity and electromagnetism, gravito-electromagnetism (GEM) and the classical double copy (CDC), induce electromagnetic potentials from specific classes of spacetime metrics. We demonstrate that such electromagnetic potentials are typically gauge equivalent to Killing vectors of the spacetime, which have long been known to be analogous to electromagnetic potentials. We utilize this perspective to relate the Type D Weyl double copy to the Kerr-Schild double copy without appealing to specific coordinates. We analyze the assumptions typically made within Kerr-Schild double copies, emphasizing the role Killing vectors play in the construction. The GEM program is based on comparing tidal tensors between GR and EM; we perform a more detailed analysis of the conditions necessary for equivalent tidal tensors between the theories and note that they require the same source prescription as the classical double copy. We discuss how these Killing vector potentials relate to the Weyl double copy; in particular, there must be a relation between the field strength formed from the Killing vector and the Weyl tensor. We consider spacetimes admitting a Killing-Yano tensor, which provide a particularly insightful example of this correspondence. This covers a broad class of spacetimes and explains observations regarding the splitting of the Weyl tensor that were noted when including sources. (Standard illustrative relations follow this record.)
    Free, publicly-accessible full text available May 1, 2026
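    As a point of orientation, the standard relations underlying this correspondence can be written out. These are textbook formulas (the Kerr-Schild form, its single copy, and the field strength of a Killing vector), shown here for illustration only and not taken from the paper itself:

        % Kerr-Schild form: flat metric plus a null deformation
        g_{\mu\nu} = \eta_{\mu\nu} + \phi\, k_\mu k_\nu, \qquad k^\mu k_\mu = 0
        % Single-copy gauge potential of the classical double copy
        A_\mu = \phi\, k_\mu
        % A Killing vector \xi (\nabla_{(\mu} \xi_{\nu)} = 0) has antisymmetric
        % \nabla_\mu \xi_\nu, so it defines a field strength
        F_{\mu\nu} = \nabla_\mu \xi_\nu - \nabla_\nu \xi_\mu = 2\, \nabla_\mu \xi_\nu
        % which satisfies a Maxwell equation sourced by the Ricci tensor:
        \nabla^\mu F_{\mu\nu} = -2\, R_{\nu\mu}\, \xi^\mu

    In vacuum the right-hand side vanishes, so F obeys the source-free Maxwell equations; this is the sense in which a Killing vector plays the role of an electromagnetic potential.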
  6. Free, publicly-accessible full text available June 17, 2026
  7. Free, publicly-accessible full text available June 4, 2026
  8. Training large language models (LLMs) increasingly relies on geographically distributed accelerators, causing prohibitive communication costs across regions and uneven utilization of heterogeneous hardware. We propose HALoS, a hierarchical asynchronous optimization framework that tackles these issues by introducing local parameter servers (LPSs) within each region and a global parameter server (GPS) that merges updates across regions. This hierarchical design minimizes expensive inter-region communication, reduces straggler effects, and leverages fast intra-region links. We provide a rigorous convergence analysis for HALoS under non-convex objectives, including theoretical guarantees on the role of hierarchical momentum in asynchronous training. Empirically, HALoS attains up to 7.5x faster convergence than synchronous baselines in geo-distributed LLM training and improves upon existing asynchronous methods by up to 2.1x. Crucially, HALoS preserves the model quality of fully synchronous SGD, matching or exceeding accuracy on standard language modeling and downstream benchmarks, while substantially lowering total training time. These results demonstrate that hierarchical, server-side update accumulation and global model merging are powerful tools for scalable, efficient training of new-era LLMs in heterogeneous, geo-distributed environments. (A schematic sketch follows this record.)
    Free, publicly-accessible full text available June 5, 2026
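    The two-level scheme the abstract describes can be sketched as follows. This is a minimal illustration under assumed interfaces; the class and method names (LocalParameterServer, GlobalParameterServer, merge) are hypothetical and not taken from the HALoS paper.

        import numpy as np

        class LocalParameterServer:
            """Accumulates asynchronous worker updates within one region."""
            def __init__(self, params, lr=0.01, momentum=0.9):
                self.params = params.copy()
                self.velocity = np.zeros_like(params)
                self.lr, self.momentum = lr, momentum

            def apply_worker_gradient(self, grad):
                # Intra-region links are fast, so these updates are frequent.
                self.velocity = self.momentum * self.velocity + grad
                self.params -= self.lr * self.velocity

        class GlobalParameterServer:
            """Merges regional models at a much lower frequency."""
            def __init__(self, params, merge_rate=0.5):
                self.params = params.copy()
                self.merge_rate = merge_rate

            def merge(self, local_params):
                # Server-side accumulation: pull the global model toward the
                # incoming regional model, then hand the merged state back so
                # the region resynchronizes. Inter-region traffic is one model
                # exchange per merge rather than one per gradient step.
                self.params += self.merge_rate * (local_params - self.params)
                return self.params.copy()

        # Usage sketch: two regions take fast local steps, then merge rarely.
        rng = np.random.default_rng(0)
        dim = 4
        gps = GlobalParameterServer(np.zeros(dim))
        for lps in [LocalParameterServer(np.zeros(dim)) for _ in range(2)]:
            for _ in range(10):
                lps.apply_worker_gradient(rng.standard_normal(dim) * 0.1)
            lps.params = gps.merge(lps.params)

    The design point the sketch makes concrete is that straggling workers only delay their own region's fast loop, while the global model advances whenever any region checks in.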
  9. Sparse Bayesian Learning (SBL) is a popular sparse signal recovery method, and various algorithms exist under the SBL paradigm. In this paper, we introduce a novel re-parameterization that allows the iterations of existing algorithms to be viewed as special cases of a unified and general mapping function. Furthermore, the re-parameterization enables an interesting beamforming interpretation that lends insight into all the considered algorithms. Utilizing the abstraction allowed by the general mapping viewpoint, we introduce a novel neural network architecture for learning improved iterative update rules under the SBL framework. Our modular design of the architecture makes the model independent of the size of the measurement matrix and provides a unique opportunity to test generalization across different measurement matrices. We show that the network, when trained on a particular parameterized dictionary, generalizes in ways hitherto not possible: across measurement matrices of different types and dimensions and across numbers of snapshots. Our numerical results showcase the generalization capability of our network in terms of mean square error and probability of support recovery across sparsity levels, signal-to-noise ratios, numbers of snapshots, and multiple measurement matrices of different sizes. (A schematic sketch follows this record.)
    Free, publicly-accessible full text available April 6, 2026
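    For orientation, here is a minimal sketch of the classical SBL iteration (Tipping-style EM hyperparameter updates), the kind of fixed update rule the paper recasts as a general mapping function and then replaces with a learned network; it is not the paper's architecture.

        import numpy as np

        def sbl_em(Phi, y, noise_var=0.01, n_iters=50):
            """Classical Sparse Bayesian Learning via EM hyperparameter updates."""
            m = Phi.shape[1]
            gamma = np.ones(m)  # prior variances; small values prune coefficients
            for _ in range(n_iters):
                # Posterior covariance and mean of the sparse coefficients.
                Sigma = np.linalg.inv(Phi.T @ Phi / noise_var + np.diag(1.0 / gamma))
                mu = Sigma @ Phi.T @ y / noise_var
                # EM update of each hyperparameter.
                gamma = mu**2 + np.diag(Sigma)
            return mu, gamma

        # Usage: recover a 3-sparse vector from noisy compressed measurements.
        rng = np.random.default_rng(0)
        Phi = rng.standard_normal((30, 100))
        x_true = np.zeros(100)
        x_true[[5, 40, 77]] = [1.0, -2.0, 1.5]
        y = Phi @ x_true + 0.1 * rng.standard_normal(30)
        mu, gamma = sbl_em(Phi, y)
        print(np.argsort(gamma)[-3:])  # indices with the largest learned variances

    Note that the iteration depends on the measurement matrix Phi only through the update formulas, which hints at why a learned update rule built on the same abstraction can be tested across measurement matrices of different types and sizes.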