Abstract We present a draft Minimum Information About Geospatial Information System (MIAGIS) standard for facilitating public deposition of geospatial information system (GIS) datasets in a way that follows the FAIR (Findable, Accessible, Interoperable and Reusable) principles. The draft MIAGIS standard includes a deposition directory structure and a minimum metadata file in JavaScript Object Notation (JSON) format, designed to capture critical metadata describing GIS layers and maps as well as their sources of data and methods of generation. The associated miagis Python package facilitates the creation of this MIAGIS metadata file and directly supports metadata extraction from both Esri JSON and GeoJSON GIS data formats, with options for extraction from user-specified JSON formats. We also demonstrate the use of the standard and the package in crafting two example depositions of ArcGIS-generated maps. We hope this draft MIAGIS standard, along with the supporting miagis Python package, will assist in establishing a GIS standards group that will develop the draft into a full standard for the wider GIS community, as well as a future public repository for GIS datasets.
Open Government Data and File Formats: Constraints on Collaboration
This exploratory interpretive case study investigated the collaborative potential of open government data available through data.gov, the US federal open data catalog. Open data is a central aspect of open government collaboration because it fosters exchange and communication between governments and the public. Government organizations that release open data make choices about file formats that have a substantial impact on the potential for collaboration. A file format, such as a document or a spreadsheet, constrains which programs can read the file and what actions a user can perform with it. Overall, we found that the formats on data.gov offer limited collaboration potential, although the files can be accessed by people with a wide range of skills. The findings are incorporated into suggestions for future iterations of open data policy. The advantages and limitations of using file formats for open data research are considered. The exploratory findings raise questions about future user-centric open data evaluations.
- Award ID(s): 1635449
- PAR ID: 10033572
- Date Published:
- Journal Name: Proceedings of the 18th Annual International Conference on Digital Government Research dg.o 2017
- Page Range / eLocation ID: 155 to 159
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Abstract This paper summarizes the open community conventions developed by the Ecological Forecasting Initiative (EFI) for the common formatting and archiving of ecological forecasts and the metadata associated with these forecasts. Such open standards are intended to promote interoperability and facilitate forecast communication, distribution, validation, and synthesis. For output files, we first describe the convention conceptually in terms of global attributes, forecast dimensions, forecasted variables, and ancillary indicator variables. We then illustrate the application of this convention to the two file formats that are currently preferred by the EFI, netCDF (network common data form) and comma-separated values (CSV), but note that the convention is extensible to future formats. For metadata, EFI's convention identifies a subset of conventional metadata variables that are required (e.g., temporal resolution and output variables) but focuses on developing a framework for storing information about forecast uncertainty propagation, data assimilation, and model complexity, which aims to facilitate cross-forecast synthesis. The initial application of this convention expands upon the Ecological Metadata Language (EML), a commonly used metadata standard in ecology. To facilitate community adoption, we also provide a GitHub repository containing a metadata validator tool and several vignettes in R and Python on how to both write and read in the EFI standard. Lastly, we provide guidance on forecast archiving, making an important distinction between short-term dissemination and long-term forecast archiving, while also touching on the archiving of code and workflows. Overall, the EFI convention is a living document that can continue to evolve over time through an open community process.
- Gregor, Ingo; Erdmann, Rainer; Koberling, Felix (Ed.) Photon-HDF5 is an open-source and open file format for storing photon-counting data from single-molecule microscopy experiments, introduced to simplify data exchange and increase the reproducibility of data analysis. Part of the Photon-HDF5 ecosystem is phconvert, an extensible Python library that converts proprietary formats into Photon-HDF5 files. However, its use requires some proficiency with command-line instructions, the Python programming language, and the YAML markup format. This creates a significant barrier for potential users who lack that expertise but want to benefit from the advantages of releasing their files in an open format. In this work, we present a GUI that lowers this barrier, thus simplifying the use of Photon-HDF5. This tool uses the phconvert Python library to convert data files originally saved in proprietary data formats to Photon-HDF5 files, without users having to write a single line of code. Because reproducible analyses depend on essential experimental information, such as laser power or sample description, the GUI also includes (currently limited) functionality to associate valid metadata with the converted file, without having to write any YAML. Finally, the GUI includes several productivity-enhancing features, such as whole-directory batch conversion and the ability to re-run a failed batch, converting only the files that could not be converted in the previous run.
- Purpose: Open data resources contain few signals for assessing their suitability for data analytics. The purpose of this paper is to characterize the uncertainty experienced by open data consumers with a framework based on economic theory. Design/methodology/approach: Drawing on information asymmetry theory about market exchanges, this paper investigates the practical challenges faced by data consumers seeking to reuse open data. An inductive qualitative analysis of over 2,900 questions asked between 2013 and 2018 on an internet forum identified how a community of 15,000 open data consumers expressed uncertainty about data sources. Findings: Open data consumers asked direct questions that expressed uncertainty about the availability, interoperability and interpretation of data resources. Questions focused on future value and some requests were devoted to seeking data that matched known sources. The study proposes a data signal framework that explains uncertainty about open data within the context of control and visibility. Originality/value: The proposed framework bridges digital government practice to information signaling theory. The empirical evidence substantiates market aspects of open data portals. This paper provides a needed case study of how data consumers experience uncertainty. The study integrates established theories about risk to improve the reuse of open data.
- Interdependent privacy (IDP) violations occur when users share personal information about others without permission, resulting in potential embarrassment, reputation loss, or harassment. There are several strategies that can be applied to protect IDP, but little is known regarding how social media users perceive IDP threats or how they prefer to respond to them. We utilized a mixed-method approach with a replication study to examine user beliefs about various government-, platform-, and user-level strategies for managing IDP violations. Participants reported that IDP represented a 'serious' online threat, and identified themselves as primarily responsible for responding to violations. IDP strategies that felt more familiar and provided greater perceived control over violations (e.g., flagging, blocking, unfriending) were rated as more effective than platform- or government-driven interventions. Furthermore, we found users were more willing to share on social media if they perceived their interactions as protected. Findings are discussed in relation to control paradox theory.

