Search for: All records

Creators/Authors contains: "Zhu, Z."


  1. One of the grand challenges in computer vision is to recover 3D poses and shapes of multiple human bodies with absolute scales from a single RGB image. The challenge stems from the inherent depth and scale ambiguity of a single view. The state of the art in 3D human pose and shape estimation mainly focuses on estimating the 3D joint locations relative to the root joint, defined as the pelvis joint. In this paper, a novel approach called Absolute-ROMP is proposed, which builds upon a one-stage multi-person 3D mesh predictor network, ROMP, to estimate multi-person 3D poses and shapes with absolute scales from a single RGB image. To achieve this, we introduce absolute root joint localization in the camera coordinate frame, which enables the estimation of 3D mesh coordinates of all persons in the image and their root joint locations normalized by the focal point. Moreover, a CNN and transformer hybrid network, called TransFocal, is proposed to predict the focal length of the image’s camera. This enables Absolute-ROMP to obtain absolute depth information of all joints in the camera coordinate frame, further improving the accuracy of our proposed method. Absolute-ROMP is evaluated on the root joint localization and root-relative 3D pose estimation tasks on publicly available multi-person 3D pose datasets, and TransFocal is evaluated on a dataset created from the Pano360 dataset. Our proposed approach achieves state-of-the-art results on these tasks, either outperforming existing methods or performing competitively with them. Due to its real-time performance, our method is applicable to in-the-wild images and videos.
    Free, publicly-accessible full text available August 22, 2025
  2. Free, publicly-accessible full text available August 13, 2025
  3. Navigating safely and independently presents considerable challenges for people who are blind or have low vision (BLV), as it requires a comprehensive understanding of their neighborhood environments. Our user study reveals that understanding sidewalk materials and objects on the sidewalks plays a crucial role in navigation tasks. This paper presents a pioneering study in the field of navigational aids for BLV individuals. We investigate the feasibility of using auditory data, specifically the sounds produced by cane tips against various sidewalk materials, to achieve material identification. Our approach utilizes machine learning and deep learning techniques to classify sidewalk materials solely based on audio cues, marking a significant step towards empowering BLV individuals with greater autonomy in their navigation. This study contributes in two major ways. First, a lightweight and practical method is developed for volunteers or BLV individuals to autonomously collect auditory data of sidewalk materials using a microphone-equipped white cane. This innovative approach transforms routine cane usage into an effective data-collection tool. Second, a deep learning-based classifier is designed that leverages a dual architecture to enhance audio feature extraction. This includes a pre-trained Convolutional Neural Network (CNN) for regional feature extraction from two-dimensional Mel-spectrograms and a booster module for global feature enrichment.
    Free, publicly-accessible full text available March 27, 2025
  4. This paper presents the results of research that created and analyzed a Multimedia dataset for building energy efficiency estimation. First, a new Multimedia Building Energy Efficiency (MMBEE) dataset was created from publicly available data. This work then explored the use of window-to-wall ratio (WWR) information from building facade images and integrated it with traditional tabular data to create new training data for predicting building energy efficiency measures. Finally, we discuss potential applications and future research directions in using the MMBEE dataset for building energy efficiency prediction. Throughout the paper, a number of important processes and analyses were performed, including feature selection, data correlation analysis, WWR extraction, and a comparison of deep network and random forest models in building energy efficiency estimation. From this first attempt at using the Multimedia dataset for building energy efficiency estimation, we found that deep models performed better than traditional models such as random forest. We also found that there was an optimal set of features to use for the prediction. Nonetheless, the incorporation of the current WWR estimation results did not yield the anticipated enhancement in estimation performance. Subsequently, a comprehensive investigation was conducted to ascertain potential contributing factors, and several avenues for future research were identified to enhance the predictive utility of the WWR feature.
    Free, publicly-accessible full text available March 27, 2025
  5. Much of our knowledge of the North American lithosphere comes from imaging seismic velocities. Additional constraints on the subsurface can be gained by studying seismic attenuation, which has different sensitivity to physical properties. We produce a model of lateral variations in attenuation across the conterminous U.S. by analyzing data recorded by the EarthScope Transportable Array. We divide the study area into 12 overlapping tiles and measure differential attenuation in each tile independently, and twice for four of the tiles. Measurements are combined into a smooth map using a set of linear inversions. Comparing results for adjacent tiles and for repeated tiles shows that the imaged features are robust. The final map shows generally higher attenuation west of the Rocky Mountain Front than east of it, with significant small length scale variations superimposed on that broad pattern. In general, there is a strong anticorrelation between differential attenuation and shear wave velocities at depths of 80–250 km. However, a given change in velocity may correspond to a large or small change in attenuation, depending on the area, suggesting that different physical mechanisms are operating. In the western and south‐central U.S., as well as the Appalachians, velocity variations are large compared to attenuation changes, while the opposite is true in the north‐central and southeastern U.S. Calculations with the Very Broadband Rheology calculator show that these results are consistent with the main source of heterogeneity being temperature and melt fraction in the former regions and grain size variability in the latter ones.
    Free, publicly-accessible full text available December 1, 2024
  6. Robles, A. (Ed.)
    Although various navigation apps are available, people who are blind or have low vision (PVIB) still face challenges in locating store entrances due to missing geospatial information in existing map services. Previously, we developed a crowdsourcing platform to collect storefront accessibility and localization data to address these challenges. In this paper, we significantly improve the efficiency of data collection and user engagement in our new AI-enabled Smart DoorFront platform by designing and developing multiple important features, including a gamified credit ranking system, a volunteer contribution estimator, an AI-based pre-labeling function, and an image gallery feature. To achieve these, we integrate a specially designed deep learning model called MultiCLU into the Smart DoorFront platform. We also introduce an online machine learning mechanism to iteratively train the MultiCLU model using newly labeled storefront accessibility objects and their locations in images. Our new DoorFront platform not only significantly improves the efficiency of storefront accessibility data collection, but also optimizes user experience. We conducted interviews with six adults who are blind to better understand their daily travel challenges, and their feedback indicated that the storefront accessibility data collected via the DoorFront platform would be very beneficial for them.