This content will become publicly available on June 1, 2026

Title: Mindat.org: The open access mineralogy database to accelerate data-intensive geoscience research
Abstract The mindat.org website (Mindat) has been operating since October 2000 as a free, crowd-sourced, and expert-curated database particularly focused on mineral species and their occurrences worldwide. The project has transformed from a hobbyist site in the beginning into a resource that has found use in various scientific research projects and educational programs. Together with other open data resources, Mindat has helped accelerate scientific discoveries in many fields, such as mineral evolution, mineral ecology, and the co-evolution of the geosphere and biosphere. Recently, through open data efforts, machine interfaces and software packages have been established to enable flexible data discovery and download from Mindat. We assume that the data access and usage will further scale up in the next years. Although Mindat is curated by a team of geoscience and database experts across the world, the crowd-sourced records in Mindat possess some bias. In this paper, we first present an overview of the primary data subjects in Mindat and then give extensive details about the characteristics and partiality of three of the most popular data subjects: locality, mineral species, and mineral occurrence. In the discussion, we also give an outlook on appropriate data usage and future extension of data records. We hope users can obtain a more comprehensive view of the Mindat database through this paper and thus better plan their data use. We also hope more people will be inspired to contribute to the data curation work to make Mindat a sustained data ecosystem for geoscience research.
Award ID(s):
2126315
PAR ID:
10618805
Author(s) / Creator(s):
Editor(s):
Hummer, Daniel
Publisher / Repository:
GeoscienceWorld
Date Published:
Journal Name:
American Mineralogist
Volume:
110
Issue:
6
ISSN:
0003-004X
Page Range / eLocation ID:
833 to 844
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract. Technologies such as machine learning and deep learning are powering the discovery of meaningful patterns in Earth science big data. In the field of mineralogy, Mindat (“mindat.org”) is one of the largest databases. Although its front-end website is open and free, a machine interface for bulk data query and download had never been set up before 2022. Through a project called OpenMindat, an application programming interface (API) to enable open data query and access from Mindat was set up in 2023. To further lower the barrier between Mindat open data and geoscientists with limited coding skills, we developed an R package (OpenMindat v1.0.0) on top of the API. The Mindat API includes multiple data subjects such as geomaterials (e.g., rocks, minerals, synonyms, variety, mixture, and commodity), localities, and the IMA-approved (International Mineralogical Association) mineral list. The OpenMindat v1.0.0 package wraps the capabilities of the Mindat API and is designed to be user-friendly and extensible. In addition to providing functions for querying data subjects on the API, the package supports exporting data to various formats. In real-world applications, these functions only require minor coding for users to get desired datasets, and various other packages in the R environment can be used to analyze and visualize the data. The OpenMindat v1.0.0 package, which includes detailed tutorials and examples, is available on GitHub under the MIT license. The field of mineralogy and many other geoscience disciplines are facing opportunities enabled by open data. Various research topics such as mineral network analysis, mineral association rule mining, mineral ecology, mineral evolution, and critical minerals have already benefited from Mindat's open data efforts in recent years. We hope this R package can help accelerate those data-intensive studies and lead to more scientific discoveries. 
  2. Abstract The open data movement has brought revolutionary changes to the field of mineralogy. With a growing number of datasets made available through community efforts, researchers are now able to explore new scientific topics such as mineral ecology, mineral evolution and new classification systems. The recent results have shown that the necessary open data coupled with data science skills and expertise in mineralogy will lead to impressive new scientific discoveries. Yet, feedback from researchers also reflects the needs for better FAIRness of open data, that is, findable, accessible, interoperable and reusable for both humans and machines. In this paper, we present our recent work on building the open data service of Mindat, one of the largest mineral databases in the world. In the past years, Mindat has supported numerous scientific studies but a machine interface for data access has never been established. Through the OpenMindat project we have achieved solid progress on two activities: (1) cleanse data and improve data quality, and (2) build a data sharing platform and establish a machine interface for data query and access. We hope OpenMindat will help address the increasing data needs from researchers in mineralogy for an internationally recognized authoritative database that is fully compliant with the FAIR guiding principles and helps accelerate scientific discoveries. 
  3. Abstract Animal trait data are scattered across several datasets, making it challenging to compile and compare trait information across different groups. For plants, the TRY database has been an unwavering success for those ecologists interested in addressing how plant traits influence a wide variety of processes and patterns, but the same is not true for most animal taxonomic groups. Here, we introduce ZooTraits, a Shiny app designed to help users explore and obtain animal trait data for research in ecology and evolution. ZooTraits was developed to tackle the challenge of finding in a single site information of multiple trait datasets and facilitating access to traits by providing an easy‐to‐use, open‐source platform. This app combines datasets centralized in the Open Trait Network, raw data from the AnimalTraits database, and trait information for animals compiled by Gonçalves‐Souza et al. (2023, Ecology and Evolution 13, e10016). Importantly, the ZooTraits app can be accessed freely and provides a user‐friendly interface through three functionalities that will allow users to easily visualize, compare, download, and upload trait data across the animal tree of life: ExploreTrait, FeedTrait, and GetTrait. By using ExploreTrait and GetTrait, users can explore, compare, and extract 3954 trait records from 23,394 species centralized in the Open Traits Network, and trait data for ~2000 species from the AnimalTraits database. The app summarizes trait information for numerous taxonomic groups within the Animal Kingdom, encompassing data from diverse aquatic and terrestrial ecosystems and various geographic regions worldwide. Moreover, ZooTraits enables researchers to upload trait information, serving as a hub for a continually expanding global trait database. By promoting the centralization of trait datasets and offering a platform for data sharing, ZooTraits is facilitating advancements in trait‐based ecological and evolutionary studies. 
We hope that other trait databases will evolve to mirror the approach we have outlined here. 
  4. Blair, Jaime E. (Ed.)
    Phytophthora species cause severe diseases on food, forest, and ornamental crops. Since the genus was described in 1876, it has expanded to comprise over 190 formally described species. There is a need for an open access phylogenetic tool that centralizes diverse streams of sequence data and metadata to facilitate research and identification of Phytophthora species. We used the Tree-Based Alignment Selector Toolkit (T-BAS) to develop a phylogeny of 192 formally described species and 33 informal taxa in the genus Phytophthora using sequences of eight nuclear genes. The phylogenetic tree was inferred using the RAxML maximum likelihood program. A search engine was also developed to identify microsatellite genotypes of P. infestans based on genetic distance to known lineages. The T-BAS tool provides a visualization framework allowing users to place unknown isolates on a curated phylogeny of all Phytophthora species. Critically, the tree can be updated in real time as new species are described. The tool contains metadata including clade, host species, substrate, sexual characteristics, distribution, and reference literature, which can be visualized on the tree and downloaded for other uses. This phylogenetic resource will allow data sharing among research groups and the database will enable the global Phytophthora community to upload sequences and determine the phylogenetic placement of an isolate within the larger phylogeny and to download sequence data and metadata. The database will be curated by a community of Phytophthora researchers and housed on the T-BAS web portal in the Center for Integrated Fungal Research at NC State. The T-BAS web tool can be leveraged to create similar metadata enhanced phylogenies for other Oomycete, bacterial or fungal pathogens. 
  5. This database compiles comprehensive occurrence information, based on voucher specimens of small-eared shrews, genus Cryptotis, that occur from México to Peru. The database integrates the information obtained from four main sources: natural history museums, public databases, fieldwork and scientific literature. It contains 3,639 records from 53 species in 12 countries. Of the total, 83.54% have collecting dates, 51.36% of the specimens are sexed and 84.56% have decimal degrees coordinates. By generating this database and making it publicly available, we hope to improve the biological knowledge of this group of small mammals still poorly studied in the region. It aims to be a valuable resource for students, researchers, conservationists and decision-makers. The dataset contains information on all species of the genus Cryptotis in the Neotropical Region (namely from México to Peru), incorporating the most updated taxonomic and nomenclatural changes. The database includes records in regions and countries that are poorly represented in currently available data repositories. Most records have verified temporal and spatial information. 