skip to main content


This content will become publicly available on August 12, 2025

Title: Agile Queue: A Fast Scalable Concurrent FIFO Queue on GPU
Award ID(s):
2219543 1907838
NSF-PAR ID:
10545291
Author(s) / Creator(s):
; ;
Publisher / Repository:
ACM
Date Published:
ISBN:
9798400718021
Page Range / eLocation ID:
108 to 109
Format(s):
Medium: X
Location:
Gotland Sweden
Sponsoring Org:
National Science Foundation
More Like this
  1. Queue-Sharing Multiple Access (QSMA) is introduced and analyzed. The new channel-access method consists of establishing and maintaining a distributed transmission queue among nodes sharing a common channel and results in a sequence of queue cycles, with each cycle having one or multiple queue turns with collision-free transmissions from nodes that have joined the transmission queue, followed by a joining period for the current cycle. Nodes can take advantage of carrier sensing to improve the efficiency with which nodes join and use the shared transmission queue. The through- put of ALOHA with priority ACK’s, CSMA with priority ACK’s, CSMA/CD with priority ACK’s, TDMA with a fixed schedule, and QSMA with and without carrier sensing is compared analytically and by simulation in ns-3. The results show that QSMA is more efficient than TDMA with the simplicity of CSMA or ALOHA. 
    more » « less
  2. Modern high-speed devices (e.g., network adapters, storage, accelerators) use new host interfaces, which expose multiple software queues directly to the device. These multi-queue interfaces allow mutually distrusting applications to access the device without any cross-core interaction, enabling throughput in the order of millions of IOP/s on multicore systems. Unfortunately, while independent device access is scalable, it also introduces a new problem: unfairness. Mechanisms that were used to provide fairness for older devices are no longer tenable in the wake of multi-queue design, and straightforward attempts to re-introduce it would require cross-core synchronization that undermines the scalability for which multiple queues were designed. To address these challenges, we present Multi-Queue Fair Queueing (MQFQ), the first fair, work-conserving scheduler suitable for multi-queue systems. Specifically, we (1) reformulate a classical fair queueing algorithm to accommodate multiqueue designs, and (2) describe a scalable implementation that bounds potential unfairness while minimizing synchronization overhead. Our implementation of MQFQ in Linux 4.15 demonstrates both fairness and high throughput. Evaluation with an NVMe over RDMA fabric (NVMf) device shows that MQFQ can reach up to 3.1 Million IOP/s on a single machine--20× higher than the state-of-the-art Linux Budget Fair Queueing. Compared to a system with no fairness, MQFQ reduces the slowdown caused by an antagonist from 3:78× to 1:33× for the FlashX workload and from 6:57× to 1:03× for the Aerospike workload (2× is considered "fair" slowdown). 
    more » « less
  3. This article presents a Hawkes process model with Markovian baseline intensi- ties for high-frequency order book data modeling. We classied intraday order book trading events into a range of categories based on their order types and the price change after their arrivals. In order to capture the stimulating eects between mul- tiple types of order book events, we use multivariate Hawkes process to model the self- and mutually-exciting event arrivals. We also integrate a Markovian baseline intensities into the event arrival dynamic, by including the impacts of order book liquidity state and time factor on the baseline intensity. A regression-based non- parametric estimation procedure is adopted to estimate the model parameters in our Hawkes+Markovian model. To eliminate redundant model parameters, LASSO reg- ularization is incorporated into the estimation procedure. Besides, model selection method based on Akaike Information Criteria is applied to evaluate the eect of each part of the proposed model. An implementation example based on real LOB data is provided. Through the example we studied the empirical shapes of Hawkes excitement functions, the eects of liquidity as well as time factors, the LASSO vari- able selection, and the explanation power of Hawkes and Markovian elements to the dynamics of order book. 
    more » « less
  4. Abstract

    Imagine, you enter a grocery store to buy food. How many people do you overlap with in this store? How much time do you overlap with each person in the store? In this paper, we answer these questions by studying the overlap times between customers in the infinite server queue. We compute in closed form the steady-state distribution of the overlap time between a pair of customers and the distribution of the number of customers that an arriving customer will overlap with. Finally, we define a residual process that counts the number of overlapping customers that overlap in the queue for at least$\delta$time units and compute its distribution.

     
    more » « less