Search for: All records

Award ID contains: 1942714


  1. Free, publicly-accessible full text available December 15, 2024
  2. Free, publicly-accessible full text available December 15, 2024
  3. Free, publicly-accessible full text available December 15, 2024
  4. Cloud computing has become a major approach to reproducing computational experiments. Yet two main difficulties remain in reproducing batch-based big data analytics (including descriptive and predictive analytics) in the cloud. The first is how to automate end-to-end scalable execution of analytics, including distributed environment provisioning, analytics pipeline description, parallel execution, and resource termination. The second is that an application developed for one cloud is difficult to reproduce in another cloud, the so-called vendor lock-in problem. To tackle these problems, we leverage serverless computing and containerization techniques for automated scalable execution and reproducibility, and use the adapter design pattern to enable application portability and reproducibility across different clouds. We propose and develop an open-source toolkit that supports 1) fully automated end-to-end execution and reproduction via a single command, 2) automated data and configuration storage for each execution, 3) flexible client modes based on user preferences, 4) execution history query, and 5) simple reproduction of existing executions in the same environment or a different environment. We conducted extensive experiments on both AWS and Azure using four big data analytics applications that run on virtual CPU/GPU clusters. The experiments show that our toolkit can achieve good execution performance, scalability, and efficient reproducibility for cloud-based big data analytics.
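The adapter design pattern mentioned in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `CloudAdapter` interface, the `InMemoryAdapter` stand-in, and the `execute` helper are hypothetical names, not the toolkit's actual API, and no real AWS or Azure SDK calls are made.

```python
from abc import ABC, abstractmethod


class CloudAdapter(ABC):
    """Hypothetical vendor-neutral interface; not the toolkit's real API."""

    @abstractmethod
    def provision(self, nodes):
        """Create a cluster and return its identifier."""

    @abstractmethod
    def run(self, cluster_id, command):
        """Run the analytics command and return its output."""

    @abstractmethod
    def terminate(self, cluster_id):
        """Release all resources held by the cluster."""


class InMemoryAdapter(CloudAdapter):
    """Stand-in for a vendor adapter; a real one would wrap that vendor's SDK."""

    def __init__(self, vendor):
        self.vendor = vendor
        self.log = []  # records each call so the flow can be inspected

    def provision(self, nodes):
        cluster_id = f"{self.vendor}-cluster-{nodes}"
        self.log.append(f"provision {cluster_id}")
        return cluster_id

    def run(self, cluster_id, command):
        self.log.append(f"run {command}")
        return f"{command} finished on {cluster_id}"

    def terminate(self, cluster_id):
        self.log.append(f"terminate {cluster_id}")


def execute(adapter, command, nodes):
    """Provision, run, and always terminate, using only the shared interface."""
    cluster_id = adapter.provision(nodes)
    try:
        return adapter.run(cluster_id, command)
    finally:
        # Resources are released even if the analytics command fails.
        adapter.terminate(cluster_id)


# The same pipeline call works unchanged against either vendor adapter.
aws = InMemoryAdapter("aws")
azure = InMemoryAdapter("azure")
aws_result = execute(aws, "train.py", nodes=2)
azure_result = execute(azure, "train.py", nodes=2)
```

Because `execute` depends only on the shared interface, moving an application to another cloud in this sketch means supplying a different adapter rather than rewriting the pipeline.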
  5. Spatial resolution is critical for observing and monitoring environmental phenomena. Acquiring high-resolution bathymetry data directly from satellites is not always feasible due to equipment limitations, so spatial data scientists and researchers turn to single image super-resolution (SISR) methods that use deep learning techniques as an alternative way to increase pixel density. While super-resolution residual networks (e.g., SR-ResNet) are promising for this purpose, several challenges still need to be addressed: (1) Earth data such as bathymetry is expensive to obtain and relatively limited in the number of available data records; (2) certain domain knowledge needs to be complied with during model training; (3) certain areas of interest require more accurate measurements than other areas. To address these challenges, following the transfer learning principle, we study how to leverage an existing pre-trained super-resolution deep learning model, namely SR-ResNet, for high-resolution bathymetry data generation. We further enhance the SR-ResNet model by adding corresponding loss functions based on domain knowledge. To let the model perform better for certain spatial areas, we add additional loss functions to increase the penalty for the areas of interest. Our experiments show that our approaches achieve higher accuracy than most baseline models when evaluated with metrics including MSE, PSNR, and SSIM.
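The idea of penalizing errors more heavily inside areas of interest can be sketched as a weighted mean squared error, shown next to the standard PSNR metric. This is an illustrative assumption, not the loss used in the paper: the function names, the binary `aoi_mask`, and the `aoi_weight` value are all hypothetical.

```python
import math


def weighted_mse(pred, target, aoi_mask, aoi_weight=4.0):
    """Mean squared error where area-of-interest pixels count aoi_weight times.

    pred, target: 2-D lists of floats; aoi_mask: 2-D list of 0/1 flags.
    aoi_weight is an assumed hyperparameter, not a value from the paper.
    """
    total, weight_sum = 0.0, 0.0
    for p_row, t_row, m_row in zip(pred, target, aoi_mask):
        for p, t, m in zip(p_row, t_row, m_row):
            w = aoi_weight if m else 1.0
            total += w * (p - t) ** 2
            weight_sum += w
    return total / weight_sum


def psnr(pred, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB, computed from the unweighted MSE."""
    no_mask = [[0] * len(row) for row in target]
    mse = weighted_mse(pred, target, no_mask)
    return math.inf if mse == 0 else 10.0 * math.log10(max_value ** 2 / mse)


# One pixel is off by 0.5; compare the loss when it lies inside vs. outside the AOI.
target = [[0.0, 0.0], [0.0, 0.0]]
pred = [[0.5, 0.0], [0.0, 0.0]]
loss_inside = weighted_mse(pred, target, [[1, 0], [0, 0]])
loss_outside = weighted_mse(pred, target, [[0, 1], [0, 0]])
quality_db = psnr(pred, target)
```

With these inputs the same pixel error yields a loss of 1/7 when it falls inside the area of interest and 0.25/7 when it falls outside, so training pressure concentrates on the marked region.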
  6. Arctic amplification has altered climate patterns both regionally and globally, resulting in more frequent and more intense extreme weather events in the past few decades. The essential part of Arctic amplification is the unprecedented sea ice loss demonstrated by satellite observations. Accurately forecasting Arctic sea ice from sub-seasonal to seasonal scales has been a major research question with fundamental challenges at play. In addition to physics-based Earth system models, researchers have been applying multiple statistical and machine learning models to sea ice forecasting. Given the potential of data-driven approaches for studying sea ice variations, we propose MT-IceNet, a UNet-based spatial and multi-temporal (MT) deep learning model for forecasting Arctic sea ice concentration (SIC). The model uses an encoder-decoder architecture with skip connections and processes multi-temporal input streams to regenerate spatial maps at future timesteps. Using bi-monthly and monthly satellite-retrieved sea ice data from NSIDC as well as atmospheric and oceanic variables from the ERA5 reanalysis product during 1979-2021, we show that our proposed model provides promising predictive performance for per-pixel SIC forecasting, with up to a 60% decrease in prediction error for a lead time of 6 months compared to its state-of-the-art counterparts.
  7. Domain adaptation techniques using deep neural networks have mainly been used to solve the distribution shift problem in homogeneous domains, where data usually share similar feature spaces and have the same dimensionalities. Nevertheless, real-world applications often deal with heterogeneous domains that come from completely different feature spaces with different dimensionalities. In our remote sensing application, two remote sensing datasets collected by an active sensor and a passive one are heterogeneous. In particular, CALIOP actively measures each atmospheric column. In this study, 25 measured variables/features that are sensitive to cloud phase are used, and they are fully labeled. VIIRS is an imaging radiometer, which collects radiometric measurements of the surface and atmosphere in the visible and infrared bands. Recent studies have shown that passive sensors may have difficulties in predicting cloud/aerosol types in complicated atmospheres (e.g., overlapping cloud and aerosol layers, cloud over snow/ice surface, etc.). To overcome the challenge of cloud property retrieval with passive sensors, we develop a novel VAE-based approach to learn domain-invariant representations that capture the spatial pattern from multiple satellite remote sensing data (VDAM), in order to build a domain-invariant cloud property retrieval method that accurately classifies different cloud types (labels) in the passive sensing dataset. We further exploit a weight-based alignment method on the label space to learn a powerful domain adaptation technique that is pertinent to the remote sensing application. Experiments demonstrate that our method outperforms other state-of-the-art machine learning methods and achieves higher accuracy in cloud property retrieval in the passive satellite dataset.
  8. The Arctic is a region with unique climate features, motivating new AI methodologies to study it. Unfortunately, Arctic sea ice has seen a continuous decline since 1979. This not only poses a significant threat to Arctic wildlife and surrounding coastal communities but is also adversely affecting global climate patterns. To study the potential of AI in tackling climate change, we analyze the performance of four probabilistic machine learning methods in forecasting sea-ice extent for lead times of up to 6 months, further comparing them with traditional machine learning methods. Our comparative analysis shows that Gaussian Process Regression is a good fit for predicting sea-ice extent at longer lead times, with the lowest RMSE score.