skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Inverse Distance Weighted Random Forests: Modeling Unevenly Distributed Non-Stationary Geographic Data
Recent years saw explosive growth of Human Geography Data, in which spatial non-stationarity is often observed, i.e., relationships between features depend on the location. For these datasets, a single global model cannot accurately describe the relationships among features that vary across space. To address this problem, a viable solution- that has been adopted by many studies-is to create multiple local models instead of a global one, with each local model representing a subregion of the space. However, the challenge with this approach is that the local models are only fitted to nearby observations. For sparsely sampled regions, the data could be too few to generate any high-quality model. This is especially true for Human Geography datasets, as human activities tend to cluster at a few locations. In this paper, we present a modeling method that addresses this problem by letting local models operate within relatively large subregions, where overlapping is allowed. Results from all local models are then fused using an inverse distance weighted approach, to minimize the impact brought by overlapping. Experiments showed that this method handles non-stationary geographic data very Well, even When they are unevenly distributed.  more » « less
Award ID(s):
2018611 1920182 1532061 1338922
PAR ID:
10275853
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
2020 International Conference on Advanced Computer Science and Information Systems (ICACSIS)
Page Range / eLocation ID:
41 to 46
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Geographic datasets are usually accompanied by spatial non-stationarity – a phenomenon that the relationship between features varies across space. Naturally, nonstationarity can be interpreted as the underlying rule that decides how data are generated and alters over space. Therefore, traditional machine learning algorithms are not suitable for handling non-stationary geographic datasets, as they only render a single global model. To solve this problem, researchers often adopt the multiple-local-model approach, which uses different models to account for different sub-regions of space. This approach has been proven efficient but not optimal, as it is inherently difficult to decide the size of subregions. Additionally, the fact that local models are only trained on a subset of data also limits their potential. This paper proposes an entirely different strategy that interprets nonstationarity as a lack of data and addresses it by introducing latent variables to the original dataset. Backpropagation is then used to find the best values for these latent variables. Experiments show that this method is at least as efficient as multiple-local-model-based approaches and has even greater potential. 
    more » « less
  2. null (Ed.)
    Non-stationarity is often observed in Geographic datasets. One way to explain non-stationarity is to think of it as a hidden "local knowledge" that varies across space. It is inherently difficult to model such data as models built for one region do not necessarily fit another area as the local knowledge could be different. A solution for this problem is to construct multiple local models at various locations, with each local model accounting for a sub-region within which the data remains relatively stationary. However, this approach is sensitive to the size of data, as the local models are only trained from a subset of observations from a particular region. In this paper, we present a novel approach that addresses this problem by aggregating spatially similar sub-regions into relatively large partitions. Our insight is that although local knowledge shifts over space, it is possible for multiple regions to share the same local knowledge. Data from these regions can be aggregated to train a more accurate model. Experiments show that this method can handle non-stationary and outperforms when the dataset is relatively small. 
    more » « less
  3. null (Ed.)
    Training a semantic segmentation model requires large densely-annotated image datasets that are costly to obtain. Once the training is done, it is also difficult to add new object categories to such segmentation models. In this paper, we tackle the few-shot semantic segmentation problem, which aims to perform image segmentation task on unseen object categories merely based on one or a few support example(s). The key to solving this few-shot segmentation problem lies in effectively utilizing object information from support examples to separate target objects from the background in a query image. While existing methods typically generate object-level representations by averaging local features in support images, we demonstrate that such object representations are typically noisy and less distinguishing. To solve this problem, we design an object representation generator (ORG) module which can effectively aggregate local object features from support image( s) and produce better object-level representation. The ORG module can be embedded into the network and trained end-to-end in a weakly-supervised fashion without extra human annotation. We incorporate this design into a modified encoder-decoder network to present a powerful and efficient framework for few-shot semantic segmentation. Experimental results on the Pascal-VOC and MS-COCO datasets show that our approach achieves better performance compared to existing methods under both one-shot and five-shot settings. 
    more » « less
  4. In this paper, we propose a Spatial Robust Mixture Regression model to investigate the relationship between a response variable and a set of explanatory variables over the spatial domain, assuming that the relationships may exhibit complex spatially dynamic patterns that cannot be captured by constant regression coefficients. Our method integrates the robust finite mixture Gaussian regression model with spatial constraints, to simultaneously handle the spatial non-stationarity, local homogeneity, and outlier contaminations. Compared with existing spatial regression models, our proposed model assumes the existence a few distinct regression models that are estimated based on observations that exhibit similar response-predictor relationships. As such, the proposed model not only accounts for non-stationarity in the spatial trend, but also clusters observations into a few distinct and homogenous groups. This provides an advantage on interpretation with a few stationary sub-processes identified that capture the predominant relationships between response and predictor variables. Moreover, the proposed method incorporates robust procedures to handle contaminations from both regression outliers and spatial outliers. By doing so, we robustly segment the spatial domain into distinct local regions with similar regression coefficients, and sporadic locations that are purely outliers. Rigorous statistical hypothesis testing procedure has been designed to test the significance of such segmentation. Experimental results on many synthetic and real-world datasets demonstrate the robustness, accuracy, and effectiveness of our proposed method, compared with other robust finite mixture regression, spatial regression and spatial segmentation methods. 
    more » « less
  5. In recent years, remarkable results have been achieved in self-supervised action recognition using skeleton sequences with contrastive learning. It has been observed that the semantic distinction of human action features is often represented by local body parts, such as legs or hands, which are advantageous for skeleton-based action recognition. This paper proposes an attention-based contrastive learning framework for skeleton representation learning, called SkeAttnCLR, which integrates local similarity and global features for skeleton-based action representations. To achieve this, a multi-head attention mask module is employed to learn the soft attention mask features from the skeletons, suppressing non-salient local features while accentuating local salient features, thereby bringing similar local features closer in the feature space. Additionally, ample contrastive pairs are generated by expanding contrastive pairs based on salient and non-salient features with global features, which guide the network to learn the semantic representations of the entire skeleton. Therefore, with the attention mask mechanism, SkeAttnCLR learns local features under different data augmentation views. The experiment results demonstrate that the inclusion of local feature similarity significantly enhances skeleton-based action representation. Our proposed SkeAttnCLR outperforms state-of-the-art methods on NTURGB+D, NTU120-RGB+D, and PKU-MMD datasets. The code and settings are available at this repository: https://github.com/GitHubOfHyl97/SkeAttnCLR. 
    more » « less