Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
This research paper systematically identifies the perceptions of learning machine learning (ML) topics. To keep up with the ever-increasing need for professionals with ML expertise, for-profit and non-profit organizations conduct a wide range of ML-related courses at undergraduate and graduate levels. Despite the availability of ML-related education materials, there is lack of understanding how students perceive ML-related topics and the dissemination of ML-related topics. A systematic categorization of students' perceptions of these courses can aid educators in understanding the challenges that students face, and use that understanding for better dissemination of ML-related topics in courses. The goal of this paper is to help educators teach machine learning (ML) topics by providing an experience report of students' perceptions related to learning ML. We accomplish our research goal by conducting an empirical study where we deploy a survey with 83 students across five academic institutions. These students are recruited from a mixture of undergraduate and graduate courses. We apply a qualitative analysis technique called open coding to identify challenges that students encounter while studying ML-related topics. Using the same qualitative analysis technique we identify quality aspects do students prioritize ML-related topics. From our survey, we identify 11 challenges that students face when learning about ML topics, amongst which data quality is the most frequent, followed by hardware-related challenges. We observe the majority of the students prefer hands-on projects over theoretical lectures. Furthermore, we find the surveyed students to consider ethics, security, privacy, correctness, and performance as essential considerations while developing ML-based systems. Based on our findings, we recommend educators who teach ML-related courses to (i) incorporate hands-on projects to teach ML-related topics, (ii) dedicate course materials related to data quality, (iii) use lightweight virtualization tools to showcase computationally intensive topics, such as deep neural networks, and (iv) empirical evaluation of how large language models can be used in ML-related education.more » « less
-
This research-to-practice paper reports students' perceptions on using a teaching framework called authentic learning to learn about information flow analysis. Using information flow analysis, practitioners find the flow of data across one or multiple programs. Information flow analysis is helpful for multiple software engineering activities, such as detecting software bugs and developing software fuzzing techniques. Despite being helpful in practice, learning about information flow analysis remains an impediment for students, which in turn prevents them from reaping the benefits of using information flow analysis. Therefore, an application of a teaching framework can aid students in learning about information flow analysis. To that end, we systematically investigate if authentic learning---a teaching framework that emphasizes on providing hands on experience for a practically relevant topic---is helpful for students to learn about information flow analysis. Upon conducting the exercise, students are asked to participate in a survey where they report perceptions about the conducted exercise. We analyze data from 170 students who were introduced to information flow analysis through an authentic learning-based exercise. From our analysis, we observe: (i) majority of the students to have little to no knowledge about information flow analysis prior to conducting the authentic learning-based exercise; (ii) 74.1\% of the 170 students find the authentic learning-based exercise helpful to learn about information flow analysis; and (iii) student perceptions to vary for the three components of the authentic learning-based exercise. We conclude our paper by describing the implications of our findings for instructors and researchers. For example, instructors should consider the education level of students while designing activities for individual authentic learning components to educate students on information flow analysis. Furthermore, researchers can devise strategies on how instructors can allocate their efforts for each authentic learning component through empirical studies. These studies may investigate the correlation between reported helpfulness and socio-technical factors, such as education level of students.more » « less
-
This paper presents an innovative approach to DevOps security education, addressing the dynamic landscape of cybersecurity threats. We propose a student-centered learning methodology by developing comprehensive hands-on learning modules. Specifically, we introduce labware modules designed to automate static security analysis, empowering learners to identify known vulnerabilities efficiently. These modules offer a structured learning experience with pre-lab, hands-on, and post-lab sections, guiding students through DevOps concepts and security challenges. In this paper, we introduce hands-on learning modules that familiarize students with recognizing known security flaws through the application of Git Hooks. Through practical exercises with real-world code examples containing security flaws, students gain proficiency in detecting vulnerabilities using relevant tools. Initial evaluations conducted across educational institutions indicate that these hands-on modules foster student interest in software security and cybersecurity and equip them with practical skills to address DevOps security vulnerabilities.more » « less
-
Large Language Models (LLMs) have extensive ability to produce promising output. Nowadays, people are increasingly relying on them due to easy accessibility, rapid and outstanding outcomes. However, the use of these results without appropriate scrutiny poses serious security risks, particularly when they are integrated with other software, APIs, or plugins. This is because the LLM outputs are highly dependent on the prompts they receive. Therefore, it is essential to carefully clean these outputs before using them in additional software environments. This paper is designed to teach students about the potential dangers of contaminated LLM output within the context of web development through prelab, hands-on, and postlab experiences. Hands-on lab provides practical guidance on how to handle LLM vulnerabilities to make applications safe with some real-world examples in Python. This approach aims to provide students with a deeper understanding of the precautions necessary to ensure software against the vulnerabilities introduced by LLM output.more » « less
-
Infrastructure as code (IaC) is the practice of automatically managing computing platforms, such as Internet of Things (IoT) platforms. IaC has gained popularity in recent years, yielding a plethora of software artifacts, such as Ansible playbooks that are available on social coding platforms. Despite the availability of open source software (OSS) Ansible playbooks, there is a lack of empirical research on the quality of these playbooks, which can hinder the progress of IaC-related research. To that end, we conduct an empirical study with 2,952 OSS Ansible playbooks where we evaluate the quality of OSS playbooks from the perspective of executability, i.e., if publicly available OSS Ansible playbooks can be executed without failures. From our empirical study, we observe 71.5\% of the mined 2,952 Ansible playbooks cannot be executed as is because of four categories of failures.more » « less
-
In infrastructure as code (IaC), state reconciliation is the process of querying and comparing the infrastructure state prior to changing the infrastructure. As state reconciliation is pivotal to manage IaC-based computing infrastructure at scale, defects related to state reconciliation can create large-scale consequences. A categorization of state reconciliation defects, i.e., defects related to state reconciliation, can aid in understanding the nature of state reconciliation defects. We conduct an empirical study with 5,110 state reconciliation defects where we apply qualitative analysis to categorize state reconciliation defects. From the identified defect categories, we derive heuristics to design prompts for a large language model (LLM), which in turn are used for validation of state reconciliation. From our empirical study, we identify 8 categories of state reconciliation defects, amongst which 3 have not been reported for previously-studied software systems. The most frequently occurring defect category is inventory, i.e., the category of defects that occur when managing infrastructure inventory. Using an LLM with heuristics-based paragraph style prompts, we identify 9 previously unknown state reconciliation defects of which 7 have been accepted as valid defects, and 4 have already been fixed. Based on our findings, we conclude the paper by providing a set of recommendations for researchers and practitioners.more » « less
-
Zomaya; Albert (Ed.)Context:Compilers are the fundamental tools for software development. Thus, compiler defects can disrupt development productivity and propagate errors into developer-written software source code. Categorizing defects in compilers can inform practitioners and researchers about the existing defects in compilers and techniques that can be used to identify defects systematically. Objective:The goal of this paper is to help researchers understand the nature of defects in compilers by conducting a review of Internet artifacts and peer-reviewed publications that study defect characteristics of compilers. Methodology:We conduct a multi-vocal literature review (MLR) with 26 publications and 32 Internet artifacts to characterize compiler defects. Results:From our MLR, we identify 13 categories of defects, amongst which optimization defects have been the most reported defects in our artifacts publications. We observed 15 defect identification techniques tailored for compilers and no single technique identifying all observed defect categories. Conclusion:Our MLR lays the groundwork for practitioners and researchers to identify defects in compilers systematically.more » « less
-
Generative artificial intelligence (AI) technologies, such as ChatGPT have shown promise in solving software engineering problems. However, these technologies have also shown to be susceptible to generating software artifacts that contain quality issues. A systematic characterization of quality issues, such as smells in ChatGPT-generated artifacts can help in providing recommendations for practitioners who use generative AI for container orchestration.We conduct an empirical study with 98 Kubernetes manifests to quantify smells in manifests generated by ChatGPT. Our empirical study shows: (i) 35.8% of the 98 Kubernetes manifests generated include at least one instance of smell; (ii) two types of objects Kubernetes namely, Deployment and Service are impacted by identified smells; and (iii) the most frequently occurring smell is unset CPU and memory requirements. Based on our findings, we recommend practitioners to apply quality assurance activities for ChatGPT-generated Kubernetes manifests prior to using these manifests for container orchestration.more » « less
-
Feldt, Robert; Zimmermann, Thomas (Ed.)Context Despite being beneficial for managing computing infrastructure at scale, Ansible scripts include security weaknesses, such as hard-coded passwords. Security weaknesses can propagate into tasks, i.e., code constructs used for managing computing infrastructure with Ansible. Propagation of security weaknesses into tasks makes the provisioned infrastructure susceptible to security attacks. A systematic characterization of task infection, i.e., the propagation of security weaknesses into tasks, can aid practitioners and researchers in understanding how security weaknesses propagate into tasks and derive insights for practitioners to develop Ansible scripts securely. Objective The goal of the paper is to help practitioners and researchers understand how Ansible-managed computing infrastructure is impacted by security weaknesses by conducting an empirical study of task infections in Ansible scripts. Method We conduct an empirical study where we quantify the frequency of task infections in Ansible scripts. Upon detection of task infections, we apply qualitative analysis to determine task infection categories. We also conduct a survey with 23 practitioners to determine the prevalence and severity of identified task infection categories. With logistic regression analysis, we identify development factors that correlate with presence of task infections. Results In all, we identify 1,805 task infections in 27,213 scripts. We identify six task infection categories: anti-virus, continuous integration, data storage, message broker, networking, and virtualization. From our survey, we observe tasks used to manage data storage infrastructure perceived to have the most severe consequences. We also find three development factors, namely age, minor contributors, and scatteredness to correlate with the presence of task infections. Conclusion Our empirical study shows computing infrastructure managed by Ansible scripts to be impacted by security weaknesses. We conclude the paper by discussing the implications of our findings for practitioners and researchers.more » « less
-
The field of DevOps security education necessitates innovative approaches to effectively address the ever evolving challenges of cybersecurity. Adopting a student-centered approach, there is the need for the design and development of a comprehensive set of hands-on learning modules. In this paper, we introduce hands-on learning modules that enable learners to be familiar with identifying known security weaknesses, based on taint tracking to accurately pinpoint vulnerable code. To cultivate an engaging and motivating learning environment, our hands-on approach includes a pre-lab, hands-on and postlab sections. They all provide introduction to specific DevOps topics and software security problems at hand, followed by practicing with real world code examples having security issues to detect them using tools. The initial evaluation results from a number of courses across multiple schools show that the hands-on modules are enhancing the interests among students on software security and cybersecurity, while preparing them to address DevOps security vulnerabilities.more » « less
An official website of the United States government

Full Text Available