-
Prompted by the skyrocketing demand for data scientists, progress made by the ACM Data Science Task Force on defining data science competencies, and inquiries about data science accreditation, ABET is in the process of developing accreditation criteria for undergraduate data science programs. The effort is led by members of a joint data science criteria subcommittee appointed by ABET’s Computing Accreditation Commission (CAC) and CSAB (the lead society for computing accreditation). Establishing data science accreditation criteria is a notable milestone in the maturing data science discipline, indicating the presence of an accepted body of knowledge, standards of practice, and ethical codes for practitioners. This position paper motivates the effort and discusses prior work towards defining data science education requirements. It describes the ongoing process for creating and obtaining approval of the accreditation criteria, and how feedback was and will be solicited from the computing and statistical communities. The current draft data science criteria, which were approved in July 2020 by the relevant ABET bodies for a year of public review and comment, are presented. These criteria emphasize the three pillars of data science: computing foundations, mathematical/statistical foundations, and experience in at least one data application domain. This report thus serves both to inform and to stimulate the academic discussion needed to finalize appropriate data science accreditation by ABET.
-
In critical infrastructure (CI) sectors such as emergency management or healthcare, researchers can analyze data to detect useful patterns and help emergency management personnel allocate limited resources effectively or detect epidemiological spread patterns. However, all of this data contains personally identifiable information (PII) that needs to be safeguarded for legal and ethical reasons. Traditional safeguarding techniques, such as anonymization, have been shown to be ineffective. Differential privacy is a technique that supports individual privacy while allowing the analysis of datasets for societal benefit. This paper motivates the use of differential privacy to answer a wide range of queries about CI data containing PII with better privacy guarantees than are possible with traditional techniques. Moreover, it introduces a new technique based on Multiple-attribute Workload Partitioning, which does not depend on the nature of the underlying dataset and provides better protection for privacy than current differential privacy approaches.
-
High Performance Computing (HPC) is the ability to process data and perform complex calculations at extremely high speeds. Current HPC platforms can perform on the order of quadrillions of calculations per second, with quintillions on the horizon. The past three decades witnessed a vast increase in the use of HPC across different scientific, engineering and business communities, for example, in sequencing the genome, predicting climate changes, designing modern aerodynamics, or establishing customer preferences. Although HPC has been well incorporated into science curricula such as bioinformatics, the same cannot be said for most computing programs. This working group will explore how HPC can make inroads into computer science education, from the undergraduate to postgraduate levels. The group will address research questions designed to investigate topics such as identifying and handling barriers that inhibit the adoption of HPC in educational environments, how to incorporate HPC into various curricula, and how HPC can be leveraged to enhance applied critical thinking and problem-solving skills. The four deliverables are: (1) a catalog of core HPC educational concepts, (2) HPC curricula for contemporary computing needs, such as in artificial intelligence, cyberanalytics, data science and engineering, or internet of things, (3) possible infrastructures for implementing HPC coursework, and (4) HPC-related feedback to the CC2020 project.
-
As data science is an evolving field, existing definitions reflect this uncertainty with overloaded terms and inconsistency. As a result of the field’s fluidity, there is often a mismatch between what data-related programs teach, what employers expect, and the actual tasks data scientists are performing. In addition, the tools available to data scientists are not necessarily the tools being taught; textbooks do not seem to meet curricular needs; and empirical evidence does not seem to support existing program design. Currently, the field appears to be bifurcating into data science (DS) and data engineering (DE), with specific but overlapping roles in the combined data science and engineering (DSE) lifecycle. However, curriculum design has not yet caught up to this evolution. This working group report shows an empirical and data-driven view of the data-related education landscape, and includes several recommendations for both academia and industry that are based on this analysis.