This content will become publicly available on March 1, 2026

Title: A Big Data Approach for the Regional-Scale Spatial Pattern Analysis of Amazonian Palm Locations
Arecaceae (palms) are an important resource for indigenous communities as well as fauna populations across Amazonia. Understanding the spatial patterns and the environmental factors that determine the habitats of palms is of considerable interest to rainforest ecologists. Here, we utilize remotely sensed imagery in conjunction with topography and soil attribute data and employ a generalized cluster identification algorithm, Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN), to study the underlying patterns of palms in two areas of Guyana, South America. The results of the HDBSCAN assessment were cross-validated with several point pattern analysis methods commonly used by ecologists (the quadrat test for complete spatial randomness, the Morisita index, Ripley’s L-function, and the pair correlation function). A spatial logistic regression model was generated to understand the multivariate environmental influences driving the placement of cluster and outlier palms. Our results showed that palms are strongly clustered in the areas of interest and that the HDBSCAN clustering output correlates well with traditional analytical methods. The environmental factors influencing palm clusters or outliers, as determined by logistic regression, exhibit qualitative similarities to those identified in conventional ground-based palm surveys. These findings are promising for prospective research aiming to integrate remote flora identification techniques with traditional data collection studies.
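As a rough illustration of one of the point pattern statistics named in the abstract, the sketch below computes the Morisita index on quadrat counts, I = Q * sum(n_i * (n_i - 1)) / (N * (N - 1)), where Q is the number of quadrats, n_i the count in quadrat i, and N the total number of points. Values near 1 are consistent with a random pattern and values well above 1 suggest clustering. This is not the authors' code or data: the function name `morisita_index`, the 5 x 5 grid, and the synthetic point sets are all assumptions made for the example.

```python
import random


def morisita_index(points, extent, nx, ny):
    """Morisita index for 2D points over a rectangle split into nx * ny quadrats.

    `extent` is (xmin, ymin, xmax, ymax). Hypothetical helper for illustration.
    """
    n_total = len(points)
    if n_total < 2:
        raise ValueError("need at least two points")
    xmin, ymin, xmax, ymax = extent
    counts = {}
    for x, y in points:
        # Map each point to a quadrat; clamp so the upper edge falls in the last cell.
        i = min(int((x - xmin) / (xmax - xmin) * nx), nx - 1)
        j = min(int((y - ymin) / (ymax - ymin) * ny), ny - 1)
        counts[(i, j)] = counts.get((i, j), 0) + 1
    # Empty quadrats contribute n * (n - 1) = 0, so only occupied ones are summed.
    pair_sum = sum(n * (n - 1) for n in counts.values())
    return (nx * ny) * pair_sum / (n_total * (n_total - 1))


def clip01(value):
    """Keep a coordinate inside the unit interval."""
    return min(max(value, 0.0), 1.0)


rng = random.Random(42)
unit_square = (0.0, 0.0, 1.0, 1.0)

# Synthetic clustered pattern: 40 points scattered tightly around each of 5 centres.
centres = [(0.1, 0.1), (0.9, 0.1), (0.5, 0.5), (0.3, 0.7), (0.7, 0.9)]
clustered = [
    (clip01(rng.gauss(cx, 0.02)), clip01(rng.gauss(cy, 0.02)))
    for cx, cy in centres
    for _ in range(40)
]

# Synthetic random pattern: 200 uniformly distributed points.
uniform = [(rng.random(), rng.random()) for _ in range(200)]

print("clustered:", morisita_index(clustered, unit_square, 5, 5))
print("uniform:  ", morisita_index(uniform, unit_square, 5, 5))
```

The index depends on quadrat size, so in practice it is usually reported across several grid resolutions rather than for a single grid as here.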
Award ID(s):
2501699 2047940
PAR ID:
10597731
Author(s) / Creator(s):
Publisher / Repository:
MDPI
Date Published:
Journal Name:
Remote Sensing
Volume:
17
Issue:
5
ISSN:
2072-4292
Page Range / eLocation ID:
784
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We present results from the archaeological analysis of 331 km² of high-resolution airborne lidar data collected in the Upper Usumacinta River basin of Mexico and Guatemala. Multiple visualizations of the DEM and multi-spectral data from four lidar transects crossing the Classic period (AD 350–900) Maya kingdoms centered on the sites of Piedras Negras, La Mar, and Lacanja Tzeltal permitted the identification of ancient settlement and associated features of agricultural infrastructure. HDBSCAN (hierarchical density-based spatial clustering of applications with noise) cluster analysis was applied to the distribution of ancient structures to define urban, peri-urban, sub-urban, and rural settlement zones. Interpretations of these remotely sensed data are informed by decades of ground-based archaeological survey and excavations, as well as a rich historical record drawn from inscribed stone monuments. Our results demonstrate that these neighboring kingdoms in three adjacent valleys exhibit divergent patterns of structure clustering and low-density urbanism, distributions of agricultural infrastructure, and economic practices during the Classic period. Beyond meeting basic subsistence needs, agricultural production in multiple areas permitted surpluses likely for the purposes of tribute, taxation, and marketing. More broadly, this research highlights the strengths of HDBSCAN for the archaeological study of settlement distributions when compared to more commonly applied methods of density-based cluster analysis.
  2. Background and aims: Palm fossils are often used as evidence for warm and wet palaeoenvironments, reflecting the affinities of most modern palms. However, several extant palm lineages tolerate cool and/or arid climates, making a clear understanding of the taxonomic composition of ancient palm communities important for reliable palaeoenvironmental inference. However, taxonomically identifiable palm fossils are rare and often confined to specific facies. Although the resolution of taxonomic information they provide remains unclear, phytoliths (microscopic silica bodies) provide a possible solution because of their high preservation potential under conditions where other plant fossils are scarce. We thus evaluate the taxonomic and palaeoenvironmental utility of palm phytoliths. Methods: We quantified phytolith morphology of 97 modern palm and other monocot species. Using this dataset, we tested the ability of five common discriminant methods to identify nine major palm clades. We then compiled a dataset of species’ climate preferences and tested if they were correlated with phytolith morphology using a phylogenetic comparative approach. Finally, we reconstructed palm communities and palaeoenvironmental conditions at six fossil sites. Key results: Best-performing models correctly identified phytoliths to their clade of origin only 59 % of the time. Although palms were generally distinguished from non-palms, few palm clades were highly distinct, and phytolith morphology was weakly correlated with species’ environmental preferences. Reconstructions at all fossil sites suggested that palm communities were dominated by Trachycarpeae and Areceae, with warm, equable climates and high, potentially seasonal rainfall. However, fossil site reconstructions had high uncertainty and often conflicted with other climate proxies. Conclusions: While phytolith morphology provides some distinction among palm clades, caution is warranted. Unlike prior spatially restricted studies, our geographically and phylogenetically broad study indicates phytolith morphology may not reliably differentiate most palm taxa in deep time. Nevertheless, it reveals distinct clades, including some likely to be palaeoenvironmentally informative.
  3. Understanding local stellar kinematic substructures in the solar neighbourhood helps build a complete picture of the formation of the Milky Way, as well as an empirical phase space distribution of dark matter that would inform detection experiments. We apply the clustering algorithm hdbscan on the Gaia early third data release to identify a list of stable clusters in velocity space and action-angle space by taking into account the measurement uncertainties and studying the stability of the clustering results. We find 1405 (497) stars in 23 (6) robust clusters in velocity space (action-angle space) that are consistently not associated with noise. We discuss the kinematic properties of these structures and study whether many of the small clusters belong to a similar larger cluster based on their chemical abundances. They are attributed to the known structures: the Gaia Sausage-Enceladus, the Helmi Stream, and globular cluster NGC 3201 are found in both spaces, while NGC 104 and the thick disc (Sequoia) are identified in velocity space (action-angle space). Although we do not identify any new structures, we find that the hdbscan member selection of already known structures is unstable to input kinematics of the stars when resampled within their uncertainties. We therefore present the stable subset of local kinematic structures, which are consistently identified by the clustering algorithm, and emphasize the need to take into account error propagation during both the manual and automated identification of stellar structures, both for existing ones as well as future discoveries.
  4. Previous studies discovered a spatially heterogeneous expansion of Siberian larch into the tundra of the Polar Urals (Russia). This study reveals that the spatial pattern of encroachment of tree stands is related to environmental factors including topography and snow cover. Structural and allometric characteristics of trees, along with terrain elevation and snow depth, were collected along a transect 860 m long and 80 m wide. Terrain curvature indices, as representative properties, were derived across a range of scales in order to characterize microtopography. A density-based clustering method was used to analyze the spatial and temporal patterns of tree stem distribution. Results of the topographic analysis suggest that trees tend to cluster in areas with convex surfaces. The clustering analysis also indicates that the patterns of tree locations are linked to snow distribution. Records from the earliest campaign in 1960 show that trees lived mainly at the middle and bottom of the transect across the areas of high snow depth. As trees expanded uphill following a warming climate trend in recent decades, the high snow depth areas also shifted upward, creating favorable conditions for recent tree growth at locations that were previously covered with heavy snow. The identified landscape signatures of increasing tall vegetation, and the effects of microtopography and snow, may facilitate the understanding of treeline dynamics at larger scales.
  5. We present Descending from Stochastic Clustering Variance Regression (DiSCoVeR) (https://www.github.com/sparks-baird/mat_discover), a Python tool for identifying and assessing high-performing, chemically unique compositions relative to existing compounds using a combination of a chemical distance metric, density-aware dimensionality reduction, clustering, and a regression model. In this work, we create pairwise distance matrices between compounds via Element Mover's Distance (ElMD) and use these to create 2D density-aware embeddings for chemical compositions via Density-preserving Uniform Manifold Approximation and Projection (DensMAP). Because ElMD assigns distances between compounds that are more chemically intuitive than Euclidean-based distances, the compounds can then be clustered into chemically homogeneous clusters via Hierarchical Density-based Spatial Clustering of Applications with Noise (HDBSCAN*). In combination with performance predictions via Compositionally-Restricted Attention-Based Network (CrabNet), we introduce several new metrics for materials discovery and validate DiSCoVeR on Materials Project bulk moduli using compound-wise and cluster-wise validation methods. We visualize these via multi-objective Pareto front plots and assign a weighted score to each composition that encompasses the trade-off between performance and density-based chemical uniqueness. In addition to density-based metrics, we explore an additional uniqueness proxy related to property gradients in DensMAP space. As a validation study, we use DiSCoVeR to screen materials for both performance and uniqueness to extrapolate to new chemical spaces. Top-10 rankings are provided for the compound-wise density and property gradient uniqueness proxies. Top-ranked compounds can be further curated via literature searches, physics-based simulations, and/or experimental synthesis. 
Finally, we compare DiSCoVeR against the naive baseline of random search for several parameter combinations in an adaptive design scheme. To our knowledge, this is the first time automated screening has been performed with explicit emphasis on discovering high-performing, novel materials. 