
Search for: All records

Creators/Authors contains: "Zhang, Xiaodong"


  1. R-tree is a foundational data structure used in spatial databases and scientific databases. With the advancement of networks and computer architectures, in-memory R-tree processing in distributed systems has become a common platform. We have observed new performance challenges in processing R-trees as the volume of multidimensional data grows. Specifically, an R-tree server can be heavily overloaded while the network and client CPUs are lightly loaded, and vice versa. In this article, we present the design and implementation of Catfish, an RDMA-enabled R-tree that achieves low latency and high throughput by adaptively using the available network bandwidth and computing resources to balance the workload between clients and servers. We design and implement two basic mechanisms for using RDMA in a client-server R-tree data processing system. First, in the fast messaging design, we use RDMA writes to send R-tree requests to the server and let server threads process them to achieve low query latency. Second, in the RDMA offloading design, we use RDMA reads to offload tree traversal from the server to the client, which relieves the server when it is overloaded. We further develop an adaptive scheme that effectively switches an R-tree search between fast messaging and RDMA offloading, maximizing overall performance. Our experiments show that the adaptive solution of Catfish on InfiniBand significantly outperforms an R-tree that uses only fast messaging or only RDMA offloading in both latency and throughput. Catfish can also deliver up to an order of magnitude better performance than traditional schemes using TCP/IP on 1 and 40 Gbps Ethernet. We make a strong case for using RDMA to effectively balance workloads in distributed systems for low latency and high throughput.
    Free, publicly-accessible full text available June 30, 2023
  2. Free, publicly-accessible full text available June 10, 2023
  3. Free, publicly-accessible full text available July 1, 2023
  4. We derived the angular response function (WN) for scattering sensors that automatically satisfies the normalization criterion, together with its corresponding weight (WT). The WN derived for two commercial sensors, HydroScat-6 (HOBI Labs) and ECO-BB (Sea-Bird Inc.), agrees well with the Monte Carlo simulation and direct measurements. The backscattering measured for microbeads of known sizes agrees better with Mie calculation when the derived WN is applied. We deduced that the reduction of WT with increasing attenuation coefficient is related to path length attenuation and showed that this theoretically derived correction factor performs better than the default methods for the two commercial backscattering sensors. The analysis conducted in this study also leads to an estimate of the uncertainty budget for the two sensors. The major uncertainty for ECO-BB is associated with its angular response function because of its wide field of view, whereas the main uncertainty for the HydroScat-6 is due to attenuation correction because of its relatively long path length.

  5. Visual content, including images and videos, is dominant on the Internet today. The conventional search engine is designed mainly for textual documents and must be extended to process and manage increasingly high volumes of visual data objects. In this paper, we present Mixer, an effective system that identifies and analyzes visual content and extracts its features for data retrieval, aiming to address two critical issues: (1) understanding visual content efficiently and in a timely manner, and (2) retrieving it at high precision and recall rates without impairing performance. In Mixer, visual objects are categorized into different classes, each of which has representative visual features. Subsystems for model production and model execution are developed. Two retrieval layers are designed and implemented for images and videos, respectively. In this way, we are able to perform aggregation retrievals of the two types efficiently. Experiments with Baidu's production workloads and systems show that Mixer halves the model production time and raises the feature production throughput by 9.14x. Mixer also achieves precision and recall of 95% and 97%, respectively, for video retrieval. Mixer is in daily operation, which makes the search engine highly scalable for visual content at a low cost. Having observed productivity improvements in the upper-level applications of the search engine, we believe our system framework would generally benefit other data processing applications.
  6. Nested queries are commonly used to express complex use cases by connecting the output of a subquery as an input to the outer query block. However, their execution is highly time consuming. Researchers have proposed various algorithms and techniques that unnest subqueries to improve performance. Since this is a customized approach that requires substantial algorithmic and engineering effort, it is largely not an open feature in most existing database systems. Our approach is general-purpose and based on GPU acceleration, aiming for high performance at a minimum development cost. We look into the major differences between nested and unnested query structures to identify their merits and limits for GPU processing. Furthermore, we focus on the nested approach, which is algorithmically simple, rich in parallelism, relatively low in space complexity, and generic in program structure. We create a new code generation framework that best fits GPUs for the nested method. We also make several critical system optimizations, including massively parallel scanning with indexing, effective vectorization to optimize join operations, exploiting cache locality for loops, and efficient GPU memory management. We have implemented the proposed solutions in NestGPU, a GPU-based column-store database system that is GPU device independent. We have extensively evaluated and tested the system to show the effectiveness of our proposed methods.
  7. Volume scattering functions were measured using two instruments in waters near the Ocean Station Papa (50°N 145°W) and show consistency in estimating the χ factor attributable to particles (χp). While χp in the study area exhibits limited variability, it could vary significantly when compared with data obtained in various parts of the global oceans. The global comparison also confirms that the minimal variation of χp is at scattering angles near 120°. With an uncertainty of <10%, χp can be assumed to be spectrally independent. For backscatter sensors with a wide field of view (FOV), the averaging of scattering within the FOV reduces the values of χp needed to compute the backscattering coefficient by up to 20% at angles <130°.

  8. This paper presents a fault-tolerant control method for a quadrotor UAV using solely on-board sensors. A simultaneous localization and mapping (SLAM) system is developed utilizing a laser rangefinder and an open source SLAM algorithm called GMapping. This system allows for mapping of the surrounding environment as well as localizing the position of the quadrotor, enabling real-time position control. However, the SLAM system using the laser rangefinder may fail in certain degenerate environments, such as featureless tunnels or straight hallways. To compensate for possible faults in the SLAM measurements, a fault detection and fault-tolerant control method is developed. An observer is designed to estimate the translational velocity of the quadrotor using SLAM position measurements. The fault detection residual is defined as the deviation between this SLAM-based velocity estimate and another velocity estimate generated by an optical flow algorithm utilizing measurements provided by a downward-facing camera. Real-time experimental results have shown the effectiveness of the fault-tolerant control algorithm.
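
The residual test described in item 8 can be illustrated with a minimal sketch. This is not the authors' observer: the low-pass-filtered finite difference below is a simplified stand-in, and the function names, the smoothing factor `alpha`, and the detection `threshold` are all hypothetical choices made for illustration.

```python
import math


def slam_velocity_estimates(positions, dt, alpha=0.5):
    """Estimate planar velocity from successive SLAM (x, y) positions.

    Assumption: a low-pass-filtered finite difference stands in for the
    observer mentioned in the abstract; alpha is a hypothetical
    smoothing factor in (0, 1].
    """
    estimates = []
    vx = vy = 0.0
    for (x0, y0), (x1, y1) in zip(positions, positions[1:]):
        raw_vx = (x1 - x0) / dt
        raw_vy = (y1 - y0) / dt
        # Blend the new finite-difference sample with the previous estimate.
        vx = alpha * raw_vx + (1 - alpha) * vx
        vy = alpha * raw_vy + (1 - alpha) * vy
        estimates.append((vx, vy))
    return estimates


def fault_flags(slam_vel, flow_vel, threshold):
    """Flag time steps where the two velocity estimates disagree.

    The residual is the Euclidean distance between the SLAM-based and
    optical-flow-based velocity estimates; threshold is hypothetical.
    """
    flags = []
    for (sx, sy), (fx, fy) in zip(slam_vel, flow_vel):
        residual = math.hypot(sx - fx, sy - fy)
        flags.append(residual > threshold)
    return flags


# Illustrative data: SLAM position stalls on the last step while
# optical flow still reports forward motion.
positions = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 0.0)]
flow_vel = [(1.0, 0.0), (1.0, 0.0), (1.0, 0.0)]
slam_vel = slam_velocity_estimates(positions, dt=1.0, alpha=1.0)
flags = fault_flags(slam_vel, flow_vel, threshold=0.5)
```

In this made-up trace, only the final step is flagged, which mimics a SLAM failure in a featureless hallway where the position estimate stops updating while the camera still observes motion.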