Title: Understanding Data Science Instruction in Multiple STEM Domains
As technology advances, data-driven work is becoming increasingly important across all disciplines. Data science is an emerging field that encompasses a large array of topics including data collection, data preprocessing, data visualization, and data analysis using statistical and machine learning methods. As undergraduates enter the workforce in the future, they will need to “benefit from a fundamental awareness of and competence in data science”[9]. This project has formed a research-practice partnership that brings together STEM+C instructors and researchers from three universities and an education research and consulting group. We aim to use high-frequency monitoring data collected from real-world systems to develop and implement an interdisciplinary approach that enables undergraduate students to develop an understanding of data science concepts through individual STEM disciplines, including engineering, computer science, environmental science, and biology. In this paper, we perform an initial exploratory analysis of how data science topics are introduced into the different courses, with the ultimate goal of understanding how instructional modules and accompanying assessments can be developed for multidisciplinary use. We analyze information collected from instructor interviews and surveys, student surveys, and assessments from five undergraduate courses (243 students) at the three universities to understand aspects of data science curricula that are common across disciplines. Using a qualitative approach, we find commonalities in data science instruction and assessment components across the disciplines. These include topical content, data sources, pedagogical approaches, and assessment design. Preliminary analyses of instructor interviews also suggest factors that affect the content taught and the assessment material across the five courses.
These factors include class size, students’ year of study, students’ reasons for taking the class, and students’ background expertise and knowledge. These findings indicate the challenges in developing data modules for multidisciplinary use. We hope that the analysis and reflections on our initial offerings have improved our understanding of these challenges and of how we may address them when designing future data science teaching modules. These are the first steps in a design-based approach to developing data science modules that may be offered across multiple courses.
Award ID(s):
1915487
NSF-PAR ID:
10288015
Author(s) / Creator(s):
Date Published:
Journal Name:
2021 ASEE Virtual Annual Conference
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. With increasingly technology-driven workplaces and high data volumes, instructors across STEM+C disciplines are integrating more data science topics into their course learning objectives. However, instructors face significant challenges in integrating additional data science concepts into their already full course schedules. Streamlined instructional modules that are integrated with course content and cover relevant data science topics, such as data collection, uncertainty in data, visualization, and analysis using statistical and machine learning methods, can benefit instructors across multiple disciplines. As part of a cross-university research program, we designed a systematic structural approach, based on shared instructional and assessment principles, to construct modules that are tailored to meet the needs of multiple instructional disciplines, academic levels, and pedagogies. Adopting a research-practice partnership approach, we have collectively developed twelve modules, working closely with instructors and their teaching assistants for six undergraduate courses. We identified and coded primary data science concepts in the modules into five common themes: 1) data acquisition, 2) data quality issues, 3) data use and visualization, 4) advanced machine learning techniques, and 5) miscellaneous topics that may be unique to a particular discipline (e.g., how to analyze data streams collected by a special sensor). These themes were further subdivided to make it easier for instructors to contextualize the data science concepts in discipline-specific work. In this paper, we present as a case study the design and analysis of four of the modules, primarily so we can compare and contrast pairs of similar courses that were taught at different levels or at different universities. Preliminary analyses show the wide distribution of data science topics that are common among a number of environmental science and engineering courses.
We identified commonalities and differences in the integration of data science instruction (through modules) into these courses. This analysis informs the development of a set of key considerations for integrating data science concepts into a variety of STEM+C courses.