Title: A vision for the development of benchmarks to bridge geoscience and data science
The massive surge in the amount of observational field data demands richer and more meaningful collaboration between data scientists and geoscientists. This document was written by members of the Working Group on Case Studies of the NSF-funded RCN on Intelligent Systems Research To Support Geosciences (IS-GEO, https://is-geo.org/) to describe our vision to build and enhance such collaboration through the use of specially designed benchmark datasets. Benchmark datasets serve as summary descriptions of problem areas, providing a simple interface between disciplines without requiring extensive background knowledge. Benchmark data are intended to address a number of overarching goals. First, they are concrete, identifiable, and public, which results in a natural coordination of research efforts across multiple disciplines and institutions. Second, they provide multifold opportunities for objective comparison of various algorithms in terms of computational cost, accuracy, utility, and other measurable standards, to address a particular question in geoscience. Third, as materials for education, the benchmark data cultivate future human capital and interest in geoscience problems and data science methods. Finally, a concerted effort to produce and publish benchmarks has the potential to spur the development of new data science methods, while providing deeper insights into many fundamental problems in modern geosciences. That is, similar to the critical role that genomic and molecular biology data archives play in facilitating the field of bioinformatics, we expect that the proposed geoscience data repository will serve as a “catalyst” for the new discipline of geoinformatics. We describe specifications of a high-quality geoscience benchmark dataset and discuss some of our first benchmark efforts. We invite the Climate Informatics community to join us in creating additional benchmarks that aim to address important climate science problems.
Award ID(s):
1632211
Publication Date:
NSF-PAR ID:
10057023
Journal Name:
7th International Workshop on Climate Informatics
Sponsoring Org:
National Science Foundation
More Like this
  2. Adoption of data- and compute-intensive research in the geosciences is hindered by the same social and technological factors as in other science disciplines - we're humans after all. As a result, many of the new opportunities to advance science in today's rapidly evolving technology landscape are not approachable by domain geoscientists. Organizations must acknowledge and actively mitigate these intrinsic biases and knowledge gaps in their users and staff. Over the past ten years, CyVerse (www.cyverse.org) has carried out the mission "to design, deploy, and expand a national cyberinfrastructure for life sciences research, and to train scientists in its use." During this time, CyVerse has supported and enabled transdisciplinary collaborations across institutions and communities, overseen many successes, and encountered failures. Our lessons learned in user engagement, both social and technical, are germane to the problems facing the geoscience community today. A key element of overcoming social barriers is to set up an effective education, outreach, and training (EOT) team to drive initial adoption as well as continued use. A strong EOT group can reach new users, particularly those in under-represented communities, reduce power distance relationships, and mitigate users' uncertainty avoidance toward adopting new technology. Timely user support across the life of a project, based on mutual respect between the developers' and researchers' different skill sets, is critical to successful collaboration. Without support, users become frustrated and abandon research questions whose technical issues require solutions that are 'simple' from a developer's perspective but unknown to the scientist. At CyVerse, we have found there is no one solution that fits all research challenges. Our strategy has been to maintain a system of systems (SoS) where users can choose 'lego-blocks' to build a solution that matches their problem.
This SoS ideology has allowed CyVerse users to extend and scale workflows without becoming entangled in problems that reduce productivity and slow scientific discovery. Likewise, CyVerse addresses the handling of data through its entire lifecycle, from creation to publication to future reuse, supporting community-driven big data projects and individual researchers.
  3. The geosciences have to solve increasingly complex problems relating to earth and society, as resources become limited, natural hazards and changes in climate impact larger communities, and as people interacting with Earth become more interconnected. However, the profession has dismally low representation from geoscientists who are from diverse racial, ethnic, or socioeconomic backgrounds, as well as women in leadership roles. This underrepresentation also includes individuals whose gender identity/expression is non-binary or gender-conforming, or those who have physical, cognitive, or emotional disabilities. This lack of diversity ultimately impacts our profession’s ability to produce our best science and work with the communities that we strive to protect and serve as stewards of the earth. As part of the NSF GOLD solicitation, we developed a project (Geoscience Diversity Experiential Simulations) to train 30 faculty and administrators to be “champions for diversity” and combat the hostile climates in geoscience departments. We hosted a 3-day workshop in November that used virtual simulations to give participants experience in building the skills to react to situations regarding bias, discrimination, microaggressions, or bullying often cited in geoscience culture. Participants interacted with avatars on screen, who responded to participants’ actions and choices, given certain scenarios. The scenarios are framed within a geoscience perspective; we integrated qualitative interview data from informants who experienced inequitable judgement, bias, discrimination, or harassment during their geoscience careers. The simulations gave learners a safe environment to practice and build self-efficacy in how to professionally and productively engage peers in difficult conversations.
In addition, we obtained pre-workshop survey data about participants’ understanding regarding Diversity, Equity, and Inclusion practices, as well as observation data of participants’ responses during the simulations. Follow-up activities include monthly online meetings to engage problem solving and strategy-building skills for catalyzing institutional culture change within departments. This talk will specifically focus on workshop observations and preliminary reactions to the training.
  4. The geosciences are one of the least diverse disciplines in the United States, despite the field's relevance to livelihoods and local and global economies. Bias, discrimination, and harassment present serious hurdles to diversifying the field. These behaviors persist due to historical structures of exclusion, severe power imbalances, unique challenges associated with geoscientist stereotypes, and a culture of impunity that tolerates exclusionary behaviors and marginalization of scholars from underserved groups. We summarize recent research on exclusionary behaviors that create hostile climates and contribute to persistent low retention of diverse groups in the geosciences and other science, technology, engineering, and mathematics (STEM) fields. We then discuss recent initiatives in the US by geoscience professional societies and organizations, including the National Science Foundation-supported ADVANCEGeo Partnership, to improve diversity, equity, and inclusion by improving workplace climate. Social networks and professional organizations can transform scientific culture through providing opportunities for mentorship and community building and counteracting professional isolation that can result from experiencing hostile behaviors, codifying ethical practice, and advocating for policy change. We conclude with a call for a reexamination of current institutional structures, processes, and practices for a transformational and equitable scientific enterprise. To be truly successful, cultural and behavioral changes need to be accompanied by reeducation about the historical political structures of academic institutions to start conversations about the real change that has to happen for a transformational and equitable scientific enterprise.
  5. Concept maps have emerged as a valid and reliable method for assessing deep conceptual understanding in engineering education within disciplines as well as interdisciplinary knowledge integration across disciplines. Most work on concept maps, however, focuses on undergraduates. In this paper, we use concept maps to examine changes in graduate students’ conceptual understanding and knowledge integration resulting from an interdisciplinary graduate program. Our study context is a pair of foundational, team-taught courses in an interdisciplinary Disaster Resilience and Risk Management (DRRM) graduate program. The courses include a 3-hour research course and a 1-hour seminar that aim to build student understanding within and across Urban Affairs and Planning, Civil and Environmental Engineering, Geosciences, and Business Information Technology. The courses introduce core principles of DRRM and relevant research methods in these disciplines, and drive students to understand the intersections of these disciplines in the context of planning for and responding to natural and human-made disasters. To understand graduate student growth from disciplinary-based to interdisciplinary scholars, we pose the research questions: 1) In what ways do graduate students’ understandings of DRRM change as a result of their introduction to an interdisciplinary graduate research program? and 2) To what extent and in what ways do concept maps serve as a tool to capture interdisciplinary learning in this context? Data include pre/post concept maps centered on disaster resilience and risk management, a one-page explanation of the post-concept map, and ethnographic field notes gathered from class and faculty meetings. Pre-concept maps were collected on the first day of class; post-concept maps will be collected as part of the final course assignment.
We assess the students’ concept maps for depth of conceptual understanding within disciplines and interdisciplinary competency across disciplines, using the field notes to provide explanatory context. The results presented in this paper support the inclusion of an explanation component to concept maps, and also suggest that concept maps alone may not be the best measure of student understanding of concepts within and across disciplines in this specific context. If similar programs wish to use concept maps as an assessment method, we suggest the inclusion of an explanation component and suggest providing explicit instructions that specify the intended audience. We also suggest using a holistic scoring method, as it is more likely to capture nuances in the concept maps than traditional scoring methods, which focus solely on counting factors like hierarchies and number of cross-links.