Search for: All records

Creators/Authors contains: "Zhang, Hui"

  1. As the size of data generated every day grows dramatically, the computational bottleneck of computer systems has shifted toward storage devices. The interface between the storage and the computational platforms has become the main limitation due to its limited bandwidth, which does not scale when the number of storage devices increases. Interconnect networks do not provide simultaneous access to all storage devices and thus limit the performance of the system when executing independent operations on different storage devices. Offloading the computations to the storage devices eliminates the burden of data transfer from the interconnects. Near-storage computing offloads a portion of computations to the storage devices to accelerate big data applications. In this article, we propose a generic near-storage sort accelerator for data analytics, NASCENT2, which utilizes Samsung SmartSSD, an NVMe flash drive with an on-board FPGA chip that processes data in situ. NASCENT2 consists of FPGA-based dictionary decoder, sort, and shuffle accelerators to support sorting database tables based on a key column of any data type. It exploits the data partitioning applied by data processing management systems, such as SparkSQL, to break down sort operations on colossal tables into multiple sort operations on smaller tables. NASCENT2 generic sort provides 2× speedup and 15.2× energy efficiency improvement as compared to the CPU baseline. It also accounts for the specifications of the SmartSSD (e.g., the FPGA resources, interconnect network, and solid-state drive bandwidth) to increase the scalability of computer systems as the number of storage devices increases. With 12 SmartSSDs, NASCENT2 is 9.9× (137.2×) faster and 7.3× (119.2×) more energy efficient in sorting the largest tables of TPCC and TPCH benchmarks than the FPGA (CPU) baseline.
    Free, publicly-accessible full text available June 30, 2023
  2. Smooth topological surfaces embedded in 4D create complex internal structures in their projected 3D figures. Often these 3D figures twist, turn, and fold back on themselves, leaving important properties behind the surface sheets. Triangle meshes are not well suited for illustrating such internal structures and their topological features. In this paper, we propose a new approach to visualize these internal structures by slicing the 4D surfaces in our dimensions and revealing the underlying 4D structures using their cross-sectional diagrams. We think of a 4D-embedded surface as a collection of 3D curves stacked and evolved in time, very much like a 3D movie in time-lapse form; our new approach translates a surface in 4-space into such a movie: a sequence of time-lapse frames where successive terms in the sequence differ at most by a critical change. The visualization interface presented in this paper allows us to interactively define the longitudinal axis, and the automatic algorithms can partition the 4D surface into parallel slices and expose its internal structure by generating a time-lapse movie consisting of topologically meaningful cross-sectional diagrams from the representative slices. We have extracted movies from a range of known 4D mathematical surfaces with our approach. The results of the usability study show that the proposed slicing interface allows a mathematically true user experience with surfaces in four dimensions.
    Free, publicly-accessible full text available December 1, 2022
  3. Free, publicly-accessible full text available December 1, 2022
  4. Free, publicly-accessible full text available November 1, 2022
  5. Free, publicly-accessible full text available October 1, 2022
  6. Free, publicly-accessible full text available October 1, 2022
  7. N-Arylation of NH-sulfoximines represents an appealing approach to access N-aryl sulfoximines, but has not been successfully applied to NH-diaryl sulfoximines. Herein, a copper-catalyzed photoredox dehydrogenative Chan-Lam coupling of free diaryl sulfoximines and arylboronic acids is described. This neutral and ligand-free coupling is initiated by ambient light-induced copper-catalyzed single-electron reduction of NH-sulfoximines. This electron transfer route circumvents the sacrificial oxidant employed in traditional Chan-Lam coupling reactions, increasing the environmental friendliness of this process. Instead, dihydrogen gas forms as a byproduct of this reaction. Mechanistic investigations also reveal a unique autocatalysis process. The C–N coupling products, N-arylated sulfoximines, serve as ligands along with NH-sulfoximine to bind to the copper species, generating the photocatalyst. DFT calculations reveal that both the NH-sulfoximine substrate and the N-aryl product can ligate the copper, accounting for the observed autocatalysis. Two energetically viable stepwise pathways were located wherein the copper facilitates hydrogen atom abstraction from the NH-sulfoximine and the ethanol solvent to produce dihydrogen. The protocol described herein represents an appealing alternative strategy to the classic oxidative Chan-Lam reaction, allowing greater substrate generality as well as the elimination of byproduct formation from oxidants.
    Free, publicly-accessible full text available December 1, 2022
  8. Free, publicly-accessible full text available October 1, 2022
  9. In this paper, we investigate intersection traffic management for connected automated vehicles (CAVs). In particular, we propose a decentralized autonomous intersection management scheme that takes into account both traffic efficiency and scheduling flexibility, and that adopts a novel intersection–vehicle model to check conflicts among CAVs in the entire intersection area. In addition, a priority-based collision-avoidance rule is set to improve traffic efficiency and shorten the delays of emergency CAVs. Moreover, a multi-objective function is designed to obtain the optimal trajectories of CAVs, which considers ride comfort, velocities of CAVs, fuel consumption, and the constraints of safety, velocity, and acceleration. Simulation results demonstrate that our proposed scheme can achieve good performance in terms of traffic efficiency and shortening the delays of emergency CAVs.
    Free, publicly-accessible full text available September 1, 2022
  10. The performance of Adaptive Bitrate (ABR) algorithms for video streaming depends on accurately predicting the download time of video chunks. Existing prediction approaches (i) assume chunk download times are dominated by network throughput; and (ii) cluster sessions a priori (e.g., based on ISP and CDN) and only learn from sessions in the same cluster. We make three contributions. First, through analysis of data from real-world video streaming sessions, we show (i) a priori clustering prevents learning from related clusters; and (ii) factors such as the Time to First Byte (TTFB) are key components of chunk download times but not easily incorporated into existing prediction approaches. Second, we propose Xatu, a new prediction approach that jointly learns a neural network sequence model with an interpretable automatic session clustering method. Xatu learns clustering rules across all sessions it deems relevant, and models sequences with multiple chunk-dependent features (e.g., TTFB) rather than just throughput. Third, evaluations using the above datasets and emulation experiments show that Xatu improves prediction accuracy by 23.8% relative to CS2P (a state-of-the-art predictor). We show Xatu provides substantial performance benefits when integrated with multiple ABR algorithms, including MPC (a well-studied ABR algorithm) and FuguABR (a recent algorithm using stochastic control), relative to their default predictors (CS2P and a fully connected neural network, respectively). Further, Xatu combined with MPC outperforms Pensieve, an ABR algorithm based on deep reinforcement learning.
    Free, publicly-accessible full text available October 1, 2022