Topics for an Introductory Data Science Course
Introductory data science courses are appearing at colleges, universities, and high schools around the country and the world. What topics do we cover in these courses, and how and why are these decisions made? How do we account for the background knowledge of our students and for how they hope to use their skills after the course (whether professionally, in additional courses, or as engaged citizens)? In addition, the course is taught by computer scientists, statisticians, business analysts, mathematicians, journalists, and others, and each of these disciplines approaches the topics differently. What upskilling do instructors need in order to integrate material from academic disciplines in which they were not trained? How much cross-disciplinary collaboration and discussion, if any, occurs or should occur in designing this course? Participants in this birds-of-a-feather session will share their decision processes and choices about the introductory data science courses that they teach or are designing. This includes choices about content as well as whether and how upskilling occurs. They will review and refine a list of current data science topics based on national surveys of data science instructors and a review of curriculum guidelines. Close attention will be paid to differences in language among data science instructors from different academic backgrounds. We welcome new and experienced data science instructors, as well as educators who are planning or are interested in teaching such a course.
- Award ID(s): 2013392
- PAR ID: 10561965
- Publisher / Repository: ACM
- Date Published:
- ISBN: 9798400704246
- Page Range / eLocation ID: 1920 to 1920
- Subject(s) / Keyword(s): Data Science Undergraduate Instruction Undergraduate Studies
- Format(s): Medium: X
- Location: Portland OR USA
- Sponsoring Org: National Science Foundation
More Like this
-
With increasingly technology-driven workplaces and high data volumes, instructors across STEM+C disciplines are integrating more data science topics into their course learning objectives. However, instructors face significant challenges in fitting additional data science concepts into their already full course schedules. Streamlined instructional modules that are integrated with course content and cover relevant data science topics, such as data collection, uncertainty in data, visualization, and analysis using statistical and machine learning methods, can benefit instructors across multiple disciplines. As part of a cross-university research program, we designed a systematic structural approach, based on shared instructional and assessment principles, to construct modules tailored to the needs of multiple instructional disciplines, academic levels, and pedagogies. Adopting a research-practice partnership approach, we have collectively developed twelve modules for six undergraduate courses, working closely with instructors and their teaching assistants. We identified the primary data science concepts in the modules and coded them into five common themes: 1) data acquisition, 2) data quality issues, 3) data use and visualization, 4) advanced machine learning techniques, and 5) miscellaneous topics that may be unique to a particular discipline (e.g., how to analyze data streams collected by a special sensor). These themes were further subdivided to make it easier for instructors to contextualize the data science concepts in discipline-specific work. In this paper, we present as a case study the design and analysis of four of the modules, primarily to compare and contrast pairs of similar courses taught at different levels or at different universities. Preliminary analyses show a wide distribution of data science topics that are common among a number of environmental science and engineering courses. We identified commonalities and differences in the integration of data science instruction (through modules) into these courses. This analysis informs the development of a set of key considerations for integrating data science concepts into a variety of STEM+C courses.
-
Data science is one of the fastest-growing fields, with unmet demand from employers. Many academic institutions have taken on the task of creating programs to meet both current and future demand. Data science, as a field, integrates aspects of computer science, statistics, and subject matter expertise, which encourages cross-disciplinary conversations and collaboration. In this talk, we present results from a national survey of instructors of introductory, college-level data science courses for undergraduates. With responses from computer scientists, statisticians, and instructors in allied fields, these results represent a wide array of data science instructors. The survey identifies the topics commonly covered, the amount of time spent on each, common and divergent definitions of data science, and the course materials used. We then examine how these findings align with the recommendations of various professional organizations through a rigorous review and synthesis of those recommendations. These include the Association for Computing Machinery's Computing Competencies for Undergraduate Data Science Curricula[1], the National Academies of Sciences, Engineering, and Medicine's Data Science for Undergraduates: Opportunities and Options[2], the Park City Math Institute's report Curriculum Guidelines for Undergraduate Programs in Data Science[3], and the American Statistical Association's Two-Year College Data Science Summit Final Report[4] and Curriculum Guidelines for Undergraduate Programs in Statistical Science[5]. We will also explore alignment with ABET's accreditation of data science[6].
-
There have been many calls recently for computing for all across the nation. While there are many opportunities to study and use computing to advance the fields of computer science, software development, and information technology, computing is also needed in a wide range of other disciplines, including engineering. Most engineering programs require students to take an introductory programming course, which covers many of the same topics as an introductory course for computing majors (and at times may be the same course). However, statistics on student success in introductory programming courses are sobering: approximately half the students will fail, forcing them either to repeat the course or, if passing it is required, to leave their chosen field of study. This NSF IUSE project incorporates instructional techniques identified through educational psychology research as effective ways to improve student learning and retention in introductory programming. The research team has developed worked examples of problems that incorporate subgoal labels, which are explanations that describe to the learner the function of steps in the problem solution and highlight the problem-solving process. Subgoal labels within worked examples, an approach that has been effective in other STEM fields, let students see an expert's problem-solving process, which helps them learn to solve problems before they can solve problems themselves. Experts, including instructors teaching introductory-level courses, are often unable to explain the process they use in problem solving at a level that learners can grasp, because they have automated much of that process after many years of practice. This submission will present the results of the first part of the development of subgoals and will explain how to integrate them into classroom lessons in introductory computing classes.
-
Over the past decade, data science courses have been growing more popular across university campuses. These courses often involve a mix of programming and statistics and are taught by instructors from diverse backgrounds. In our experience launching a data science program at a large public U.S. university over the past four years, we noticed one central tension within many such courses: instructors must finely balance how much computing versus statistics to teach in the limited time available. In this experience report, we provide a detailed firsthand reflection on how we have balanced these two major topic areas within several offerings of a large introductory data science course that we taught and for which we wrote an accompanying textbook; our course has served several thousand students over the past four years. We present three case studies from our experiences to illustrate how computer science and statistics instructors approach data science differently on topics ranging from algorithmic depth to modeling to data acquisition. We then draw connections to deeper tradeoffs in data science to help guide instructors who design interdisciplinary courses. We conclude by suggesting ways that instructors can incorporate both computer science and statistics perspectives to improve data science teaching.