Search for: All records

Creators/Authors contains: "Maguluri, Siva Theja"


  1. Motivated by engineering applications such as resource allocation in networks and inventory systems, we consider average-reward Reinforcement Learning with unbounded state space and reward function. Recent work by Murthy et al. (2024) studied this problem in the actor-critic framework and established finite-sample bounds assuming access to a critic with certain error guarantees. We complement their work by studying Temporal Difference (TD) learning with linear function approximation and establishing finite-time bounds with the optimal sample complexity. These results are obtained using the following general-purpose theorem for non-linear Stochastic Approximation (SA). Suppose that one constructs a Lyapunov function for a non-linear SA with a certain drift condition. Then, our theorem establishes finite-time bounds when this SA is driven by unbounded Markovian noise under suitable conditions. It serves as a black-box tool to generalize sample guarantees on SA from the i.i.d. or martingale-difference case to potentially unbounded Markovian noise. The generality and the mild assumptions of the setup enable broad applicability of our theorem. We illustrate its power by studying two more systems: (i) We improve upon the finite-time bounds of Q-learning in Chen et al. (2024) by tightening the error bounds and also allowing for a larger class of behavior policies. (ii) We establish the first-ever finite-time bounds for distributed stochastic optimization of a high-dimensional smooth strongly convex function using cyclic block coordinate descent.
    Free, publicly-accessible full text available May 5, 2026
  2. Free, publicly-accessible full text available April 1, 2026
  3. Researchers have developed a novel model inspired by quantum switches to address the complexities of matching requests for entangled qubits in a discrete-time system. The study examines two types of arrivals: requests for entangled qubits between nodes and qubits supplied by nodes, which are subject to decoherence over time. Unlike classical queueing models, this system features server-less multiway matching and correlated abandonments, posing unique analytical challenges. By applying a max-weight policy, the researchers characterized the system’s stability using a two-time-scale fluid limit to account for qubit abandonments. They demonstrated that the max-weight policy is throughput optimal, outperforming nonidling policies under certain conditions. Intriguingly, the study revealed counterintuitive behavior: The longest request queue may grow temporarily, even in a stable system. These findings offer new insights into managing quantum-inspired systems with practical constraints, opening avenues for further research into quantum network optimization. 
    Free, publicly-accessible full text available March 4, 2026
  4. We consider a load-balancing system composed of a fixed number of single-server queues operating under the well-known join-the-shortest-queue policy, in which jobs/customers are impatient and abandon if they do not receive service after some (random) amount of time. In this setting, we characterize the centered and appropriately scaled steady-state queue-length distribution (hereafter referred to as limiting distribution) in the limit as the abandonment rate goes to zero at the same time as the load either converges to one or is larger than one. Depending on the arrival, service, and abandonment rates, we observe three different regimes of operation that yield three different limiting distributions. The first regime is when the system is underloaded, and its load converges relatively slowly to one. In this case, abandonments do not affect the limiting distribution, and we obtain the same exponential distribution as in the system without abandonments. When the load converges to one faster, we have the second regime, where abandonments become significant. Here, the system undergoes a phase transition, and the limiting distribution is a truncated Gaussian. The third regime is when the system is heavily overloaded, and so, the queue lengths are very large. In this case, we show that the limiting distribution converges to a normal distribution. To establish our results, we first prove a weaker form of state space collapse by providing a uniform bound on the second moment of the (unscaled) perpendicular component of the queue lengths, which shows that the system behaves like a single-server queue. We then use exponential Lyapunov functions to characterize the limiting distribution of the steady-state queue-length vector. Funding: This work was supported by the National Science Foundation [Grants CMMI-2140534 and EPCN-2144316].
    Free, publicly-accessible full text available February 19, 2026
  5. This paper studies the input-queued switch operating under the MaxWeight algorithm when the arrivals follow a Markovian process. We exactly characterize the heavy-traffic scaled mean sum queue length in the heavy-traffic limit, and show that it is within a factor of less than 2 of a universal lower bound. Moreover, we obtain lower and upper bounds that are applicable in all traffic regimes and become tight in the heavy-traffic regime. We obtain these results by generalizing the drift method recently developed for the case of independent and identically distributed arrivals to the case of Markovian arrivals. We illustrate this generalization by first obtaining the heavy-traffic mean queue length and its distribution in a single-server queue under Markovian arrivals and then applying it to the case of an input-queued switch. The key idea is to exploit the geometric mixing of finite-state Markov chains, and to work with a time horizon that is chosen so that the error due to mixing depends on the heavy-traffic parameter.
  6. We study optimal pricing in a single-server queue when the customers' valuation of service depends on their waiting time. In particular, we consider a very general model, where the customer valuations are random and are sampled from a distribution that depends on the queue length. The goal of the service provider is to set dynamic, state-dependent prices in order to maximize its revenue, while also managing congestion. We model the problem as a Markov decision process and present structural results on the optimal policy. We also present an algorithm to find an approximately optimal policy. We further present a myopic policy that is easy to evaluate and present bounds on its performance. We finally illustrate the quality of our approximate solution and the myopic solution using numerical simulations.
  7. Motivated by applications from the gig economy and online marketplaces, we study a two-sided queueing system under joint pricing and matching controls. The queueing system is modeled by a bipartite graph, where the vertices represent customer or server types and the edges represent compatible customer-server pairs. We propose a threshold-based two-price policy and a queue-length-based maximum-weight matching policy and show that together they achieve a near-optimal profit. We study the system under the large-scale regime, wherein the arrival rates are scaled up, and under the large-market regime, wherein both the arrival rates and numbers of customer and server types increase. We show that the two-price policy is a primary driver for optimality in the large-scale regime. We demonstrate the advantage of maximum-weight matching with respect to the number of customer and server types. Concurrently, we show that the interplay of pricing and matching is crucial for optimality in the large-market regime.
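
As a rough illustration of the join-the-shortest-queue system with abandonment described in item 4, the following is a minimal discrete-time simulation sketch. It is not the model or method of the paper: the function name `simulate_jsq`, the Bernoulli arrival, service, and abandonment probabilities, the one-arrival-per-slot assumption, and the lowest-index tie-breaking rule are all assumptions made here for illustration only.

```python
import random


def simulate_jsq(num_queues, arrival_prob, service_prob, abandon_prob, steps, seed=0):
    """Discrete-time join-the-shortest-queue simulation with abandonment.

    Every parameter here is a hypothetical choice for illustration; none of
    them is taken from the abstract.
    """
    rng = random.Random(seed)
    queues = [0] * num_queues
    arrived = served = abandoned = 0
    for _ in range(steps):
        # At most one arrival per slot, routed to a shortest queue
        # (ties are broken in favor of the lowest index).
        if rng.random() < arrival_prob:
            shortest = min(range(num_queues), key=lambda i: queues[i])
            queues[shortest] += 1
            arrived += 1
        for i in range(num_queues):
            # A nonempty queue completes one job with probability service_prob.
            if queues[i] > 0 and rng.random() < service_prob:
                queues[i] -= 1
                served += 1
            # Each remaining job abandons independently with probability abandon_prob.
            leaving = sum(rng.random() < abandon_prob for _ in range(queues[i]))
            queues[i] -= leaving
            abandoned += leaving
    return queues, arrived, served, abandoned


if __name__ == "__main__":
    final_queues, arrived, served, abandoned = simulate_jsq(
        num_queues=3, arrival_prob=0.9, service_prob=0.3, abandon_prob=0.01, steps=1000
    )
    print(final_queues, arrived, served, abandoned)
```

Lowering `abandon_prob` toward zero while pushing the load toward one is the kind of joint scaling the abstract studies, although this sketch only tracks raw queue lengths and does not center or scale them.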