What Did They Just Say? Building a Rosetta Stone for Geoscience and Machine Learning
Modern advances in science and engineering are built on multidisciplinary projects that bring together experts from different fields. Within their respective disciplines, researchers rely on precise terminology for specific ideas, principles, methods, and theories. The potential for miscommunication is therefore substantial, especially when one or both groups have adopted common words to represent very specific, but perhaps different, concepts. Under the best circumstances, misunderstanding key terms reduces efficiency. Under less favorable conditions, miscommunication sows frustration, leads to errors, and inhibits scientific breakthroughs. Here, our research group of geoscientists and machine learning experts presents a process to help geoscientists understand the fundamentals of supervised learning by describing the general workflow (i.e., a conceptual pipeline) for supervised learning that all parties involved in a geoscience-machine learning endeavor must understand. Terms critical for machine learning are introduced, defined, and used within the context of an overly simplified mock hydrological study to illustrate their appropriate usage, and then used again in the context of a published geothermal-machine learning study. These key terms fall into two groups: 1) terms that are essential to machine learning but predominantly absent from geoscience, and 2) homonyms (i.e., words with the same spelling or pronunciation but different meanings) between the fields. Lastly, we discuss a few other important homonyms that were not introduced in the general workflow but arise regularly in machine learning applications.
- Award ID(s): 2046175
- PAR ID: 10434719
- Date Published:
- Journal Name: 2022 Geothermal Rising Conference
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Recent studies have found that for students entering college, altruism is a desired aspect of a future career. Problematically, few students perceived geoscience careers as altruistic or even expressed an understanding of the potential career paths in geoscience. This dissonance in incoming student perceptions of geoscience may be linked to declining major enrollment. Classically, geoscientists have often cited job benefits such as high income, working outdoors, and travel as reasons to pursue a career in geoscience, but these may not be as appealing to the next generation of scientists. This research seeks to test if alternative forms of outreach and recruitment that highlight geoscientists’ roles in renewable energy, remediation and environmental fields, and studying climate change alter students’ perceptions of geoscientists. To accomplish this, a co-operative game was developed, originally based on SERC activity 49774, a carbon cycle dice game by Callan Bentley. The activity was first modified by Ryan Hollister for the 2018 Earth Educators’ Rendezvous, where card sheets for reservoirs were introduced and edited to have students more explicitly calculate relative reservoir sizes, fluxes between reservoirs, and the duration carbon may spend in each reservoir. The game was further altered at North Dakota State University to make carbon reservoir cards more specific to the North Dakota-Minnesota region. The most recent iteration adds co-operative gameplay where students actively intervene in the carbon cycle through roles, including geoscientist, that can actively impact the climate. Our goal is to demonstrate the influence geoscience careers can have on modern challenges, such as climate change, in an engaging format. This most recent version of the game will be used as an alternative outreach tool. This research is currently underway, and data will be collected at middle school, high school, early college, and community events through 2022.
- Many of the world’s most pressing issues, such as the emergence of zoonotic diseases, can only be addressed through interdisciplinary research. However, the findings of interdisciplinary research are susceptible to miscommunication among both professional and non-professional audiences due to differences in training, language, experience, and understanding. Such miscommunication contributes to the misunderstanding of key concepts or processes and hinders the development of effective research agendas and public policy. These misunderstandings can also provoke unnecessary fear in the public and have devastating effects for wildlife conservation. For example, inaccurate communication and subsequent misunderstanding of the potential associations between certain bats and zoonoses has led to persecution of diverse bats worldwide and even government calls to cull them. Here, we identify four types of miscommunication driven by the use of terminology regarding bats and the emergence of zoonotic diseases that we have categorized based on their root causes: (1) incorrect or overly broad use of terms; (2) terms that have unstable usage within a discipline, or different usages among disciplines; (3) terms that are used correctly but spark incorrect inferences about biological processes or significance in the audience; (4) incorrect inference drawn from the evidence presented. We illustrate each type of miscommunication with commonly misused or misinterpreted terms, providing a definition, caveats and common misconceptions, and suggest alternatives as appropriate. While we focus on terms specific to bats and disease ecology, we present a more general framework for addressing miscommunication that can be applied to other topics and disciplines to facilitate more effective research, problem-solving, and public policy.
- Researchers using social media data want to understand the discussions occurring in and about their respective fields. These domain experts often turn to topic models to help them see the entire landscape of the conversation, but unsupervised topic models often produce topic sets that miss topics experts expect or want to see. To solve this problem, we propose Guided Topic-Noise Model (GTM), a semi-supervised topic model designed with large domain-specific social media data sets in mind. The input to GTM is a set of topics that are of interest to the user and a small number of words or phrases that belong to those topics. These seed topics are used to guide the topic generation process, and can be augmented interactively, expanding the seed word list as the model provides new relevant words for different topics. GTM uses a novel initialization and a new sampling algorithm called Generalized Polya Urn (GPU) seed word sampling to produce a topic set that includes expanded seed topics, as well as new unsupervised topics. We demonstrate the robustness of GTM on open-ended responses from a public opinion survey and four domain-specific Twitter data sets.
- Understanding the skills bachelor-level geoscientists need to enter the workforce is critical to their success. The goal of this study was to identify the workforce skills that are most requested from a broad range of geoscience employers. We collected 3668 job advertisements for bachelor-level geoscientists and used a case-insensitive, code-matching function in MATLAB to determine the skills geoscience employers seek. Written communication (67%), field skills (63%), planning (53%), and driving (51%) were most frequently requested. Field skills and data collection were frequently found together in the ads. Written communication skills were common regardless of occupation. Quantitative skills were requested less frequently (23%) but were usually mentioned several times in the ads that did request them, signaling their importance for certain jobs. Some geoscience-specific skills were rarely found, such as temporal understanding (5%) and systems thinking (0%). We also subdivided field skills into individual tasks and ranked them by employer demand. Site assessments and evaluations, unspecified field tasks, and monitoring were the most frequently requested field skills. This study presents the geoscience community with a picture of the skills sought by employers of bachelor-level geoscientists and provides departments and programs with data they can use to assess their curricula for workforce preparation.

