<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>An Empirical Evaluation of Active Live Coding in CS1</title></titleStmt>
			<publicationStmt>
				<publisher>ACM</publisher>
				<date>06/10/2025</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10599712</idno>
					<idno type="doi">10.1145/3743686</idno>
					<title level='j'>ACM Transactions on Computing Education</title>
<idno type="issn">1946-6226</idno>
<biblScope unit="volume"></biblScope>
<biblScope unit="issue"></biblScope>					

					<author>Anshul Shah</author><author>Thomas Rexin</author><author>Fatimah Alhumrani</author><author>William G Griswold</author><author>Leo Porter</author><author>Gerald Soosai Raj</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[<p><bold>Objectives</bold> The traditional, instructor-led form of live coding has been extensively studied, with findings showing that this form of live coding imparts similar learning to static-code examples. However, a concern with Traditional Live Coding is that it can turn into a passive learning activity for students as they simply observe the instructor program. Therefore, this study compares Active Live Coding—a form of live coding that leverages in-class coding activities and peer discussion—to Traditional Live Coding on three outcomes: 1) students’ adherence to effective programming processes, 2) students’ performance on exams and in-lecture questions, and 3) students’ lecture experience.</p> <p><bold>Participants</bold> Roughly 530 students were enrolled in an advanced CS1 course taught in Java at a large, public university in North America. The students were primarily first- and second-year undergraduate students with some prior programming experience. The student population was spread across two lecture sections—348 students in the Active Live Coding (ALC) lecture and 185 students in the Traditional Live Coding (TLC) lecture.</p> <p><bold>Study Methods</bold> We used a mixed-methods approach to answer our research questions. To compare students’ programming processes, we applied process-oriented metrics related to incremental development and error frequencies. To measure students’ learning outcomes, we compared students’ performance on major course components and used pre- and post-lecture questionnaires to compare students’ learning gain during lectures. Finally, to understand students’ lecture experience, we used a classroom observation protocol to measure and compare students’ behavioral engagement during the two lectures. 
We also inductively coded open-ended survey questions to understand students’ perceptions of live coding.</p> <p><bold>Findings</bold> We did not find a statistically significant effect of ALC on students’ programming processes or learning outcomes. It seems that both ALC and TLC impart similar programming processes and result in similar student learning. However, our findings related to students’ lecture experience show a persistent engagement effect of ALC, where students’ behavioral engagement peaks and <italic>remains elevated</italic> after the in-class coding activity and peer discussion. Finally, we discuss the unique affordances and drawbacks of the lecture technique as well as students’ perceptions of ALC.</p> <p><bold>Conclusions</bold> Despite being motivated by well-established learning theories, Active Live Coding did not result in improved student learning or programming processes. This study is preceded by several prior works showing that Traditional Live Coding imparts similar student learning and programming skills as static-code examples. Though potential reasons for the lack of observed learning benefits are discussed in this work, multiple future analyses to further investigate Active Live Coding may help the community understand the impacts (or lack thereof) of the instructional technique.</p>]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">INTRODUCTION</head><p>Live coding is an instructional technique in which the instructor programs in front of students while verbalizing their thought process. This instructor-led live coding, which we will call Traditional Live Coding (TLC), has been the subject of extensive study in computing education research. Early works explored common student perceptions of the lecture technique <ref type="bibr">[4,</ref><ref type="bibr">5,</ref><ref type="bibr">18,</ref><ref type="bibr">21]</ref>, subsequent works evaluated the impact of live coding on student grades and learning <ref type="bibr">[25]</ref><ref type="bibr">[26]</ref><ref type="bibr">[27]</ref>, and more recent works measured the effect of live coding on students' programming processes <ref type="bibr">[30,</ref><ref type="bibr">33]</ref>. The recent empirical work on live coding has compared the traditional, instructor-led form of live coding to the use of static-code examples, which is a common alternative to live coding <ref type="bibr">[29]</ref>. However, the findings of these recent works have not shown any improvement in student learning as a result of live coding <ref type="bibr">[25,</ref><ref type="bibr">30,</ref><ref type="bibr">33]</ref>.</p><p>A common criticism of traditional live coding is that it can ultimately be a passive experience for students, in which they observe the instructor without any active engagement <ref type="bibr">[15]</ref>. Given the lack of observed learning benefits from Traditional Live Coding, a form of live coding that includes an active learning component may offer the learning benefits that were not seen in recent empirical evaluations of live coding <ref type="bibr">[30,</ref><ref type="bibr">33]</ref>. 
In the form of live coding called Active Live Coding (ALC), the instructor uses Traditional Live Coding with several active coding components in which students complete a small programming task, discuss with peers, and then see a demonstration of the correct solution by the instructor. From a theoretical perspective, Active Live Coding engages more Methods of Cognitive Apprenticeship <ref type="bibr">[9]</ref>-a learning theory concerning the transfer of expertise from expert to learner-and involves a higher level of engagement according to the ICAP Framework <ref type="bibr">[8]</ref>-a framework for classifying learning activities into a hierarchy based on student engagement.</p><p>In this study, we follow an experimental setup and data analysis similar to a recent empirical evaluation comparing Traditional Live Coding to static-code examples by Shah et al. <ref type="bibr">[33]</ref>. Our study implements a course-long treatment of Active Live Coding in order to identify possible short-term and long-term impacts of the teaching technique. Our analysis aims to evaluate Active Live Coding across three key dimensions: 1) students' adherence to programming processes, 2) students' course outcomes and grades, and 3) students' lecture experience. Specifically, we ask the following research questions:</p><p>&#8226; RQ1: How do students' programming processes (in terms of incremental development and error frequency metrics) differ between students in the traditional and active live coding groups?</p><p>&#8226; RQ2: How do course outcomes (such as performance on exams, code comprehension questions, programming assignments, etc.) differ between students in the traditional and active live coding groups?</p><p>&#8226; RQ3: How does the student experience (in terms of engagement and perceptions of the live coding technique) differ between students in the traditional and active live coding groups?</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">THEORETICAL FRAMEWORK</head><p>There are two theories that we use to frame our study: Cognitive Apprenticeship and the ICAP Framework. We find it necessary to involve both theories given the difference in how the two theories impact student learning.</p><p>Cognitive Apprenticeship is a learning theory that describes the instructor's choice of learning activities, while the ICAP Framework describes how students engage with those learning activities.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Methods of Cognitive Apprenticeship</head><list><item><label>Modeling</label> Instructor demonstrates a task to learners while the instructor verbalizes their thought process.</item><item><label>Scaffolding</label> Instructor provides and fades targeted learning activities for learners to practice a task with support.</item><item><label>Coaching</label> Instructor provides feedback and guidance to students as students complete tasks.</item><item><label>Articulation</label> Learner explains their reasoning and justifies the strategies they used.</item><item><label>Reflection</label> Learner reflects on their own processes and compares their strategies to the instructor's strategies.</item><item><label>Exploration</label> Learner completes tasks independently without scaffolds or support from the instructor.</item></list><p>Table <ref type="table">1</ref>. Methods in Cognitive Apprenticeship.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1">Cognitive Apprenticeship</head><p>The Cognitive Apprenticeship learning theory was outlined by Collins et al. and aims to bring the traditional apprenticeship model-which is one of the oldest models of knowledge transfer-into the classroom <ref type="bibr">[9]</ref>. A key difference between traditional apprenticeship and Cognitive Apprenticeship is that traditional apprenticeship transfers knowledge of primarily physical tasks that can be learned through observation, such as blacksmithing or tailoring <ref type="bibr">[9]</ref>. By contrast, Cognitive Apprenticeship outlines a model for instructors to make their thinking visible to facilitate the transfer of complex skills that require higher-order reasoning and thought processes. The initial work describing Cognitive Apprenticeship presents examples of teaching students skills such as reading comprehension, mathematical problem-solving, and writing <ref type="bibr">[9]</ref>. The Cognitive Apprenticeship learning theory broadly describes four dimensions of a learning environment: Content, Sequence, Sociology, and Methods <ref type="bibr">[9]</ref>. Content refers to the types of knowledge that instructors should teach students, such as domain knowledge, learning strategies, and heuristic strategies <ref type="bibr">[9]</ref>. Sequence refers to the ordering of learning activities to facilitate learning <ref type="bibr">[9]</ref>. Sociology refers to the social characteristics of the learning environment, such as cooperation and situated learning <ref type="bibr">[9]</ref>. Finally, Methods, which is the relevant dimension for the present study, refers to the instructional techniques to promote the development of expertise <ref type="bibr">[9]</ref>. 
Table <ref type="table">1</ref> outlines the six Methods of Cognitive Apprenticeship: modeling, scaffolding, coaching, reflection, articulation, and exploration.</p><p>Shah and Soosai Raj conducted a literature review of 143 papers that explicitly mentioned Cognitive Apprenticeship in computing education research venues <ref type="bibr">[34]</ref>. The review aimed to understand which Cognitive Apprenticeship Methods have been used and evaluated in computing education and what benefits have generally been attributed to these Methods <ref type="bibr">[34]</ref>. The authors found that the majority of work discussed teaching strategies that engaged the first three Methods of Cognitive Apprenticeship-modeling, scaffolding, and coaching-while significantly less work has mentioned the reflection, articulation, and exploration Methods of Cognitive Apprenticeship. One potential reason for this difference is that instructors felt that implementing the last three Methods of Cognitive Apprenticeship takes up too much lecture time <ref type="bibr">[34]</ref>. Nonetheless, a key takeaway from the literature review is that deeper empirical analyses into the impact of the articulation, reflection, and exploration Methods of Cognitive Apprenticeship are needed <ref type="bibr">[34]</ref>. <ref type="bibr">Selvaraj et al.</ref> found that the theoretical construct most commonly cited with live coding is the modeling Method of Cognitive Apprenticeship <ref type="bibr">[29]</ref>. Although there are variations of live coding, the typical form of live coding involves an instructor programming in front of their students while verbalizing their thought process <ref type="bibr">[29]</ref>, just as prescribed in the modeling Method of Cognitive Apprenticeship <ref type="bibr">[9]</ref>. 
In fact, the studies conducted by <ref type="bibr">Shah et al.</ref> were motivated by a desire to empirically detect whether live coding-through the lens of the modeling Method-imparts implicit strategies such as incremental development and debugging techniques <ref type="bibr">[33]</ref>. These works, which compared live coding to static-code examples, are empirical evaluations of the modeling Method, but they do not involve other Methods of Cognitive Apprenticeship. As Shah et al. point out in their study, the modeling Method exposes learners to only the implicit processes and strategies used by experts, but does not necessarily lead to learners being able to gain control over using these implicit processes. The remaining Methods, such as scaffolding, reflection, and articulation, are vital for learners to not just observe but also apply these implicit processes <ref type="bibr">[9]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">The ICAP Framework</head><p>Chi and Wylie developed the ICAP Framework, a theory related to active learning which outlines four "modes" of engagement: Interactive, Constructive, Active, and Passive (creating the acronym "ICAP") <ref type="bibr">[7,</ref><ref type="bibr">8]</ref>. A learning activity can lead to students' engagement behaviors being in one of these four modes <ref type="bibr">[8]</ref>. The Passive mode occurs when learners simply observe and absorb information from the instructor without overtly engaging with the materials through actions such as taking notes. The Passive mode is characterized by a lack of student behavioral engagement with the instruction. The next mode of engagement is the Active mode, which is classified by "some form of overt motoric action or physical manipulation" being undertaken by students <ref type="bibr">[8]</ref>. A specific example of Active engagement provided by Chi and Wylie is students copying down solution steps while listening to a lecture or taking verbatim notes <ref type="bibr">[8]</ref>. Next, the Constructive mode of engagement is characterized by students creating products that go beyond what was provided in the learning materials, such as asking questions, generating predictions, or drawing diagrams <ref type="bibr">[8]</ref>. Finally, the Interactive mode occurs when two students work together and both students are being constructive in their contributions <ref type="bibr">[8]</ref>. Chi and Wylie note that when one partner is dominating the conversation and the other is primarily listening, then the behaviors are not interactive. Instead, the student dominating the conversation is in a Constructive mode while the student who is listening is in a Passive mode (or Active if they are taking notes as they listen). 
Of course, an underlying assumption in the ICAP framework is that learners enact the behaviors that are intended by the instructor <ref type="bibr">[8]</ref>. For example, in a Constructive activity where students have to write new code, they could engage with the activity only Actively by copy-pasting existing code from the example or Passively by simply not working on the task. This assumption represents a limitation of engagement-based interventions, in which students may fail to engage with the learning activities.</p><p>The ICAP Framework has been extensively studied in various STEM Education disciplines <ref type="bibr">[8,</ref><ref type="bibr">41]</ref>, such as undergraduate biology <ref type="bibr">[41]</ref>, high-school science <ref type="bibr">[42]</ref>, and undergraduate physics education <ref type="bibr">[10]</ref>. The most similar study to our present study comes from Deslauriers et al., who conducted an experiment to compare the impact of traditional physics lecturing (which is Passive or Active) to a treatment condition where students are making predictions, problem solving, and discussing with each other (which is Constructive or Interactive) <ref type="bibr">[10]</ref>. Although the study was only conducted during one week of the course, the results showed that students in the treatment condition scored significantly higher on the exam for that week, attended class more frequently during that week, and also shared in surveys that they enjoyed the new teaching style <ref type="bibr">[10]</ref>. This study by Deslauriers et al. is highly relevant to our study because of the similarity of its treatment and control groups to ours. Our similar experimental setup, although spanning an entire term rather than one week, tests a similar set of conditions in the computer science domain.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Table <ref type="table">2</ref>. Cognitive Apprenticeship Methods (modeling, scaffolding, coaching, articulation, reflection, exploration) in Traditional vs. Active Live Coding. Table <ref type="table">3</ref>. Theoretical Framing of Traditional and Active Live Coding according to the ICAP Framework.</p><p>The ICAP Framework has also been cited in computing education research. In their work on subgoal learning via self-explanation, Margulieux and Catrambone use the ICAP Framework to motivate the approach of using self-explanation to learn, since the "higher" engagement modes are associated with more student learning. For example, a Constructive approach to self-explain lecture material offers greater learning benefits to students than a Passive approach where students simply listen to the lecture material. Indeed, Margulieux and Catrambone found that when students self-explain the subgoals of the problems they completed, they perform better on future tasks than if they did not self-explain at all <ref type="bibr">[20]</ref>.</p><p>The studies by Deslauriers et al. and Margulieux and Catrambone are only two of the many works that have established an empirical foundation for the ICAP Framework. Given the significant body of work that has found evidence in support of the ICAP Framework, we would expect that learning activities in the Constructive or Interactive engagement modes will result in more student learning than activities in the Passive or Active engagement modes. Therefore, the ICAP Framework would suggest that students in the Active Live Coding lectures should perform better on exams and assignments than those in the Traditional Live Coding lectures.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3">Theoretical Framing of Active vs Traditional Live Coding</head><p>Cognitive Apprenticeship is the primary learning theory cited with live coding <ref type="bibr">[29]</ref>. We have not found any work that discusses live coding through the lens of the ICAP Framework, likely because the most common form of live coding in the literature is traditional, instructor-led live coding, which is mostly a passive learning activity. However, some clear theoretical differences arise between Traditional and Active Live Coding, which are summarized in Tables <ref type="table">2</ref> and <ref type="table">3</ref>.</p><p>In terms of the Cognitive Apprenticeship Methods, ALC includes modeling, since the instructor demonstrates the programming process while verbalizing their thoughts; scaffolding, since the instructor provides a small activity for students to complete; articulation, since students discuss their solutions with each other; and reflection, since students have the chance to compare their own approach to a peer's approach and the instructor's approach. We do not consider ALC to engage the coaching Method, since each student does not get direct feedback or guidance on their own approach, or the exploration Method, since students are not independently completing open-ended tasks. In contrast, TLC only employs the modeling Method, since students are only watching and listening to what the instructor is doing. There is no opportunity for students to complete scaffolded activities or discuss with peers. 
Therefore, we would expect the students in the ALC group to exhibit better programming processes than students in the TLC group, since they have an opportunity to complete scaffolded activities, articulate their approach, and reflect on their approach compared to peers.</p><p>In terms of the ICAP Framework, we classify Traditional Live Coding (TLC) as either an Active or Passive learning activity. Traditional, instructor-led live coding is a Passive activity in the sense that students can simply watch and listen as the instructor programs rather than coding along <ref type="bibr">[15]</ref>. Of course, students may also be copying down the instructor's code or taking notes during TLC sessions, which would constitute an Active engagement mode. Watkins et al. found that in live coding lectures, some students type the instructor's code, but students do not display any other engagement behaviors besides the Active engagement mode of typing along with the instructor <ref type="bibr">[40]</ref>. On the other hand, Active Live Coding (ALC) at the very least reaches the Constructive mode, since students are required to write code, and reaches the Interactive mode depending on the quality of discussion between students. In fact, in their original paper presenting the ICAP Framework, Chi and Wylie noted that a learning activity consisting of problem solving and peer discussion, which is very similar to ALC in our study, constitutes an Interactive activity. The reason it is questionable whether ALC is Interactive, however, is the quality of peer discussion: when a discussion is dominated by one student and the other is only listening, then neither student is experiencing Interactive engagement. Based on the higher engagement level associated with Active Live Coding, we would expect the learning gains to be greater in the ALC group.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">RELATED WORK</head><p>The related work discussed in this section is organized by the three research questions we ask in this study: programming processes, student outcomes, and lecture experience. In general, the line of work related to live coding has existed since the early 2000s, with much of the early work reporting on instructor and student perceptions of live coding <ref type="bibr">[29]</ref>. More recently, however, studies related to live coding have also examined student behavior and outcomes, with specific empirical analyses dedicated to each of the three research questions in this study.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1">Impact of Live Coding on Programming Processes</head><p>Much of the early work on live coding uncovered students' and instructors' perceptions of live coding <ref type="bibr">[4,</ref><ref type="bibr">5,</ref><ref type="bibr">18,</ref><ref type="bibr">21]</ref>. For example, Bennedsen and Caspersen discussed how live coding can reveal implicit programming processes to students, such as how to use an IDE and how to use incremental development <ref type="bibr">[4]</ref>. Further, Kölling and Barnes presented live coding through the lens of "apprentice-based learning," in which the instructor first models the process to students, then students apply what they have observed, and finally students design their own open-ended programming task <ref type="bibr">[18]</ref>, invoking the theory of Cognitive Apprenticeship <ref type="bibr">[9]</ref>. Finally, Paxton went a step beyond discussing the goals and theory of live coding by collecting survey responses from students <ref type="bibr">[21]</ref>. Direct statements from students in Paxton's study showed that students enjoyed seeing the debugging process and how an expert solves a programming task <ref type="bibr">[21]</ref>. Although these papers showed that live coding aims to reveal the programming process and that students reported seeing aspects of the programming process, none of these papers empirically tested whether live coding actually imparts adherence to effective programming processes such as incremental development, debugging, and testing.</p><p>In order to fill this gap from prior work, Shah et al. conducted a series of experiments to compare a live coding pedagogy with a static-code one. In these experiments, half of the students in a large CS1 course were taught via live coding during lectures and the other half of students were taught with static-code examples <ref type="bibr">[33]</ref>. 
All other course components were identical for the two groups of students, such as assignments, lab sections, and exams.</p><p>One of the goals of these studies was to test whether students in the live coding group adhered to incremental development, debugging, and testing more than the students in the static-code group <ref type="bibr">[33]</ref>. In two separate studies, the authors collected snapshots of students' code on programming assignments and coding assessments each time students ran their code. They applied a set of programming process metrics, such as the Measure of Incremental Development (MID) <ref type="bibr">[32]</ref> to measure adherence to incremental development, the Repeated Error Density (RED) <ref type="bibr">[3]</ref> to measure how quickly students debugged an error, and the frequency of diagnostic print statements to measure how students tested and verified their code <ref type="bibr">[30,</ref><ref type="bibr">33]</ref>. However, in both studies from Shah et al., the authors found no significant differences across any of the programming process metrics.</p><p>Like earlier papers, the studies from Shah et al. frame their experiments on live coding through the lens of Cognitive Apprenticeship <ref type="bibr">[9]</ref>, specifically noting that instructor-led live coding only engages the modeling Method of Cognitive Apprenticeship <ref type="bibr">[33]</ref>. This has been presented as a potential reason for the lack of significant findings related to students' programming processes. As a result, part of the motivation for the present study is to evaluate whether a learning technique that involves more Methods of Cognitive Apprenticeship-Active Live Coding-results in greater adherence to incremental development and debugging practices than a technique that only involves the modeling Method.</p></div>
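To give a concrete sense of how such snapshot-based process metrics operate, the sketch below scores a stream of run-time snapshots. This is a simplified, hypothetical illustration only: the published MID and RED metrics have their own precise definitions in the cited works, and the function names and formulas here are our own stand-ins, not the actual metrics.

```python
# Illustrative sketch only: simplified stand-ins for snapshot-based
# programming process metrics. The published MID and RED metrics use
# their own formulas; these helpers are hypothetical simplifications.
import re

def repeated_error_runs(errors):
    """Count consecutive repetitions of the same error message.

    `errors` holds the compiler/runtime error observed at each snapshot
    (None for a clean run). Each snapshot that repeats the immediately
    preceding error counts once -- a rough proxy for slow debugging.
    """
    repeats = 0
    for prev, curr in zip(errors, errors[1:]):
        if curr is not None and curr == prev:
            repeats += 1
    return repeats

def diagnostic_print_count(source):
    """Count print statements in a Java snapshot via a crude regex."""
    return len(re.findall(r'System\.out\.print', source))

# Example snapshot stream: None means the snapshot ran cleanly.
errors = ["missing ;", "missing ;", "missing ;", None,
          "cannot find symbol", None]
print(repeated_error_runs(errors))   # prints 2 (two repeats of "missing ;")

snapshot = 'System.out.println("x=" + x);\nSystem.out.println(y);'
print(diagnostic_print_count(snapshot))   # prints 2
```

A real analysis would normalize such counts by the number of snapshots per student before comparing lecture groups, since students who run their code more often accumulate more events.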
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2">Impact of Live Coding on Course Outcomes</head><p>The first empirical studies to evaluate live coding primarily compared students' exam scores between a static-code group and a live-coding group in introductory CS courses. Rubin conducted the first comparative, empirical study between live coding and static-code examples. In a large introductory programming course with four lecture sections, Rubin selected two lecture sections to be live-coding groups and used the other two lecture sections as control groups that would learn via static-code examples. Rubin found that the groups scored similarly on the programming assignments and course exams, indicating a similar amount of student learning from the two lecture styles. One difference, however, between the experimental and control groups was that the live coding group scored higher on the final project at the end of the course, which was graded manually for correctness and clarity <ref type="bibr">[27]</ref>. Importantly, in their interpretation of results, Rubin notes that students may have scored better on the final project because students' debugging skills would be better in the live coding group after seeing the instructor debug. However, Rubin did not conduct any empirical analyses on the students' debugging processes.</p><p>In a similar follow-on study conducted by Raj et al., the authors wanted to 1) measure the cognitive load associated with static-code examples and live coding via surveys and 2) compare students' learning via a pre-test and post-test <ref type="bibr">[25]</ref>. In terms of cognitive load, the authors found that live coding was associated with significantly less extraneous cognitive load compared to the static-code group. 
Extraneous load relates to the load on working memory that gets in the way of student learning, such as being distracted by other students during lecture or hearing disorganized lecture material <ref type="bibr">[25]</ref>. The authors found no significant differences in learning gain between the pre-test and post-test, although the static-code group showed slightly higher, though not significant, learning gain than the live coding group <ref type="bibr">[25]</ref>.</p><p>The series of works by Shah et al. also investigated the impact of live coding compared to static-code examples on students' performance on assignments and exams. In the two experiments conducted by Shah et al., both found similar outcomes between the live coding and static-code groups on exams and assignments, with no statistically significant differences between the groups <ref type="bibr">[30,</ref><ref type="bibr">33]</ref>. Even a deeper analysis into student performance on code tracing questions, code writing questions, and code explaining questions showed similar student performance across these different types of questions <ref type="bibr">[33]</ref>. These works from Shah et al. confirmed the prior findings comparing a static-code pedagogy to a live coding pedagogy on course outcomes: in general, there is little to no difference in student performance on exams and assignments between the two types of code examples <ref type="bibr">[25,</ref><ref type="bibr">27,</ref><ref type="bibr">30,</ref><ref type="bibr">33]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3">Impact of Live Coding on Lecture Experience</head><p>A significant amount of work has concerned the impact of live coding on students' lecture experience, revealing a variety of benefits and drawbacks of live coding.</p><p>The benefits of live coding for students' lecture experience include, but are not limited to, revealing the programming process to students <ref type="bibr">[18,</ref><ref type="bibr">21,</ref><ref type="bibr">26,</ref><ref type="bibr">33]</ref>, reducing cognitive load during lecture <ref type="bibr">[25]</ref>, and potentially engaging more students during lecture <ref type="bibr">[6,</ref><ref type="bibr">21,</ref><ref type="bibr">26,</ref><ref type="bibr">31]</ref>. Many early works related to live coding, as mentioned before, touted live coding as a way to expose the implicit programming process to students <ref type="bibr">[18,</ref><ref type="bibr">21]</ref>. In fact, <ref type="bibr">Shah et al</ref>. administered an open-ended survey to students in both the live coding and static-code lecture groups in their study to understand the main perceived benefits from students' point of view <ref type="bibr">[33]</ref>. The qualitative analysis revealed that students in the live coding group mentioned observing some part of the programming process at a higher rate than students in the static-code group. The opposite was true for "Code Comprehension," however, as more students in the static-code group reported that seeing the static-code examples improved their understanding of the code's purpose than students in the live coding group <ref type="bibr">[33]</ref>. 
Another key perceived benefit of live coding, which has not yet been empirically tested, is that live coding results in higher student engagement <ref type="bibr">[6,</ref><ref type="bibr">21,</ref><ref type="bibr">26]</ref> (the type of engagement, whether cognitive, behavioral, or emotional <ref type="bibr">[13]</ref>, is not typically specified in these works). For example, student feedback in Paxton's work showed that students found it fun to see the output of running code <ref type="bibr">[21]</ref>. Similarly, student feedback in Raj et al.'s study showed that students tend to type along with the instructor as they live code <ref type="bibr">[26]</ref>. It seems intuitive that students may be more engaged watching an instructor program dynamically during lecture, but this claim has not been empirically tested. In fact, this lack of empirical evaluation motivates part of our third research question, which is to measure students' behavioral engagement in the Traditional Live Coding and Active Live Coding lecture sections. The drawbacks of live coding have also been extensively identified, such as the difficulty for students to follow along with the instructor <ref type="bibr">[26,</ref><ref type="bibr">33]</ref> and the limited time in a lecture that results in a rushed live coding example <ref type="bibr">[6,</ref><ref type="bibr">33,</ref><ref type="bibr">39]</ref>. Shah et al. also conducted an open-ended survey asking students about the drawbacks of the code examples in their lecture for both the static-code and live coding groups <ref type="bibr">[33]</ref>. Nearly 20% of the students in the live coding group suggested that the instructor should slow down, whereas only 2% of students in the static-code group suggested the same <ref type="bibr">[33]</ref>. 
This feeling of a rushed lecture pace is likely because live coding simply takes more time than static-code examples <ref type="bibr">[6]</ref>. Indeed, Watkins et al. conducted a comparative study of live coding and static-code examples in a single lab session. They found that the live coding session, which covered the same material as the static-code session, took more than twice as long to complete <ref type="bibr">[39]</ref>. Although the study by Watkins et al. took place in a lab section with flexible timing, an instructor bound to a fixed-time lecture certainly faces a time constraint to complete all the material. The impact of this rushed lecture pace is that students are unable to follow along as easily. Shah et al. included an analysis of a set of anonymous, end-of-course feedback items that asked students whether the instructor's lecture style facilitated note-taking and held students' attention. On both questions, there was a statistically significant difference showing that students in the live coding lectures had a harder time taking notes and paying attention <ref type="bibr">[33]</ref>. Given these downsides, instructors must be careful to keep their live coding sessions at a reasonable pace and to ensure that the class is able to follow along with the example. Indeed, many factors determine the effectiveness of a live coding lecture <ref type="bibr">[29]</ref>, revealing the difficulty of using live coding.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">STUDY CONTEXT</head><head n="4.1">Course Setup</head><p>The study was conducted in the Fall 2023 term at UC San Diego, a large, public, research-focused university in North America. The course was an advanced CS1 course taught in Java. The course content included basic data types, basic data structures, and object-oriented programming, including classes, inheritance, and generics. The course enrollment was 600 total students, split into a 400-person Active Live Coding lecture section taught at 9:30AM on Tuesdays and Thursdays and a 200-person Traditional Live Coding lecture section taught at 11AM on the same days. Both lecture sections were taught by the same instructor. When students registered for the course, they knew only the instructor of the course and did not know about the difference between the lecture sections (i.e., that one would be Traditional Live Coding and one would be Active Live Coding).</p><p>In a typical week of the course, students attended two lecture sections for 80 minutes each and a mandatory discussion section for 50 minutes. Students also completed weekly programming assignments (PAs), worksheets, and textbook activities in an online textbook hosted on Stepik <ref type="bibr">[35]</ref>. The frequency and description of the different course components is provided in Table <ref type="table">4</ref>.</p><p>Table 4. Key course components of the CS1 course.</p></div>
<figure type="table" xmlns="http://www.tei-c.org/ns/1.0"><table>
<row role="label"><cell>Component</cell><cell>Frequency</cell><cell>Description</cell></row>
<row><cell>Lectures</cell><cell>Twice per week</cell><cell>Each lecture is 80 minutes and covers the main course material. The treatment condition of Active Live Coding applies only to the lectures. An examination of the lecture structure is in Table <ref type="table">5</ref>.</cell></row>
<row><cell>Programming Assignments (PAs)</cell><cell>Once per week</cell><cell>Students apply the material they learned in lecture in a weekly programming assignment graded for correctness. Assignments are hosted on the Edstem online IDE <ref type="bibr">[12]</ref>.</cell></row>
<row><cell>Worksheets</cell><cell>Once per week</cell><cell>Students independently complete a paper-based worksheet with code tracing, writing, and explaining questions.</cell></row>
<row><cell>Reading Quizzes</cell><cell>Once per week</cell><cell>Students independently complete interactive programming activities in an online textbook and can submit responses unlimited times without penalty.</cell></row>
<row><cell>Midterm Exam</cell><cell>Once per term</cell><cell>Students independently complete an in-person, proctored exam for 2 hours, covering the concepts taught in the first half of the course.</cell></row>
<row><cell>Midterm Coding Challenge</cell><cell>Once per term</cell><cell>Students complete a proctored coding task on Edstem <ref type="bibr">[12]</ref>. Students have 45 minutes to complete the task, which is graded based on correctness.</cell></row>
<row><cell>Final Exam</cell><cell>Once per term</cell><cell>Students independently complete an in-person, proctored exam for 3 hours, covering all concepts taught in the course.</cell></row>
<row><cell>Final Coding Challenge</cell><cell>Once per term</cell><cell>Just like the Midterm Coding Challenge, students independently complete a proctored coding task in 45 minutes and are graded based on correctness.</cell></row>
<row><cell>Discussion</cell><cell>Once per week</cell><cell>Students may attend an in-person, 50-minute session to review the material covered in that week's lectures and preview the next programming assignment.</cell></row>
<row><cell>Office Hours</cell><cell>Every day (optional)</cell><cell>Students may attend office hours held by course staff (TAs, tutors, etc.) to receive help on course materials. Office hours were hosted M-F from roughly 9 A.M. to 7 P.M.</cell></row>
</table></figure>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2">Lecture Structure</head><p>Both the TLC and ALC lectures covered the same material and featured similar instructional activities, as seen in Table <ref type="table">5</ref>. Before the lecture started, the course staff handed out an in-class worksheet to each student. This worksheet included small code examples and simple questions related to the upcoming lecture material for additional practice. Each lecture, the instructor either asked students to complete the worksheet during the lecture or assigned the worksheet questions to be completed before the next lecture. The lectures always began with roughly 5 minutes of course announcements, followed by a pre-lecture questionnaire that students submitted via Gradescope <ref type="bibr">[16]</ref> within a 10-minute window for attendance. These pre-lecture questions were meant to capture students' understanding of the lecture material before the lecture began and provided a basis for measuring learning gain by asking the same questions after the lecture. Following the announcements and pre-lecture questions, the instructor reviewed the in-class worksheet from the previous lecture, typically taking 10 to 15 minutes. In both ALC and TLC lectures, the instructor then began teaching new material through Traditional Live Coding, complemented by handwritten notes. During these live coding sessions, the instructor occasionally posed questions to the students, asking them to predict missing code segments or the output of the code. In the ALC lecture only, the professor then initiated an Active Live Coding segment, marking the point where the two lectures diverge. 
In the Active Live Coding segment, the instructor provided a scaffolded Java file, produced during the preceding Traditional Live Coding segment, for students to complete. Students usually had to write a simple method or implement the key logic of a method. During this phase, students forked the instructor's workspace on Edstem and spent 5 to 7 minutes completing the missing code. Following this, they spent 3 to 5 minutes discussing their approach with peers, a process similar to peer instruction <ref type="bibr">[22]</ref>. Finally, the professor explained the solution using Traditional Live Coding and continued to cover the remainder of the new content. Typically, the instructor used 1 to 3 ALC components in each lecture. The only difference between the TLC and ALC lectures was the active coding done by students and the peer discussion following the active coding. Importantly, both lectures had Traditional Live Coding components, but only the ALC lecture had the Active Live Coding component.</p><p>During the TLC section, the instructor used this time to live code the same material that students were asked to write during the ALC lecture. However, given that the ALC components take significantly longer than simply using TLC to show the same material, the instructor could slow down during parts of the TLC lecture or spend more time explaining and debugging an error. The lectures all ended with post-lecture questions that took approximately 5 minutes. Students were free to leave once they finished answering the questions.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3">Coding Challenges</head><p>An important part of the course setup and data collection for RQ1 was the pair of Midterm and Final Coding Challenges. In both coding challenges, students were given 45 minutes to 1) create new methods and classes and 2) write accurate tests for their implementations. These summative assessments were developed by the research team to evaluate how students create programs on their own in a controlled environment. In the Midterm Coding Challenge (MCC), students were given a partially complete class with fields and a valid constructor and were tasked with adding a method to the existing class, writing a new class and defining some fields, a constructor, and methods for that class, and creating a tester class that tests both the given class and the newly created class. The concepts in the MCC included string concatenation, conditionals, incrementing of variables, and testing. In the Final Coding Challenge (FCC), students were tasked with writing two functions, one of which computed the average of a given column in a 2-D array while the other returned a cropped version of the 2-D array. The concepts in the FCC included nested for-loops, array indexing, modifying the elements in an array, and testing edge cases. For both coding challenges, students were graded on correctness based on the number of test cases that their code passed. </p></div>
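The two FCC tasks described above can be sketched as follows. The function names, signatures, and cropping convention are our own illustrative choices (the actual challenge was written in Java, and its exact specification and edge cases are not reproduced here):

```python
def column_average(grid, col):
    """Average of the values in column `col` of a 2-D list."""
    return sum(row[col] for row in grid) / len(grid)

def crop(grid, top, bottom, left, right):
    """Return a new 2-D list keeping rows top..bottom-1 and
    columns left..right-1 of `grid` (half-open bounds)."""
    return [row[left:right] for row in grid[top:bottom]]
```

Both tasks exercise the nested iteration and array-indexing concepts the FCC targeted.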
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4">Participants</head><p>In accordance with our human subjects protocol, students consented to release their data for research purposes at the start of the course. At the start of the term, members of the research team sent a consent form to all students in the class that described the general goals of the research project and asked students whether 1) they were at least 18 years old and 2) they consented to release their data for research purposes. The instructor of the course was not allowed to see the list of consenting students at any point in the research project. A total of 521 students across both lecture sections agreed to release their data. This human subjects protocol and consent process was only used to collect student data related to their programming processes. At the start of the quarter, we distributed a survey to gather students' demographics, confidence level, and reason for taking the course. The students' information in each group is summarized in Table <ref type="table">6</ref>. The students had very similar distributions for school year, and we also did not see any glaring differences between the racial or gender makeup of the two lecture sections. There was a slight difference in the metrics related to confidence, as more students in the ALC lecture responded with a "4" (confident) or "5" (very confident) on their confidence to do well in the course. 
Similarly, more students in the ALC lecture than in the TLC lecture would be satisfied with an "A" grade.</p><p>In terms of demographics among the consenting students, 61% of students in the ALC group identified with he/him/his pronouns and 36% identified with she/her/hers pronouns, compared to 62% and 34% of students in the TLC group, respectively. Furthermore, the self-identified racial makeups for both groups consisted of about 70% Asian or Asian American students, 15% Chicanx or Latinx students, and 15% White or Caucasian students. The remaining students self-identified into racial groups with fewer than 10 students, and we do not disclose these groups in accordance with our human subjects protocol.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5">METHODS</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1">RQ1 Methods: Programming Processes</head><p>To collect data regarding students' programming processes, we conducted two coding challenges, as discussed in Section 4.3. Before the experiment began, we reached out to Edstem <ref type="bibr">[12]</ref> about obtaining snapshots of students' code every time they compiled their code by clicking the "Run" button during a programming task. The team at Edstem agreed, provided that we show the Edstem staff the consent form students agreed to at the start of the term and the responses to the form showing which students consented. After sending the list of consenting students to Edstem, we obtained snapshots of students' coding workspace each time they compiled and ran their code in the coding challenges. Using this data, we analyzed students' programming processes across three dimensions: 1) adherence to incremental development, 2) error frequencies, and 3) programmer productivity.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.1">Comparing Adherence to Incremental Development.</head><p>Incremental development has been specifically cited as a perceived learning benefit of live coding <ref type="bibr">[29]</ref>. Shah et al. previously conducted an analysis using the Measure of Incremental Development (MID) to compare students in a static-code pedagogy to those in a live coding one <ref type="bibr">[33]</ref>. Therefore, we aimed to replicate this analysis for our experiment. We applied the language-agnostic metric to our data since the MID has been tested on simple programming tasks between one and three functions long, which precisely describes our coding challenges <ref type="bibr">[32]</ref>. The MID is publicly available as a Python package <ref type="foot">1</ref> and its source code is freely available, allowing us to access the code and modify its functionality to fit the Java programming language.</p><p>We computed MID values for all Midterm and Final Coding Challenge submissions. The sample size for both analyses was 345 students for the Active Live Coding group and 185 students for the Traditional Live Coding group. Given this sufficiently large sample size, we conducted two-sample t-tests <ref type="bibr">[24]</ref> to detect differences, if any, between the two groups.</p></div>
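The group comparison above can be sketched in a few lines. The MID scores below are invented for illustration (in practice the per-student scores for the 345 ALC and 185 TLC students would be fed to a library routine such as SciPy's `ttest_ind`); this pure-Python version computes Welch's unequal-variance form of the two-sample t statistic, one common variant of the test:

```python
import math

def welch_t(a, b):
    """Two-sample t statistic, Welch's unequal-variance form:
    t = (mean_a - mean_b) / sqrt(var_a/n_a + var_b/n_b)."""
    n1, n2 = len(a), len(b)
    m1, m2 = sum(a) / n1, sum(b) / n2
    v1 = sum((x - m1) ** 2 for x in a) / (n1 - 1)  # sample variance
    v2 = sum((x - m2) ** 2 for x in b) / (n2 - 1)
    return (m1 - m2) / math.sqrt(v1 / n1 + v2 / n2)

# Hypothetical per-student MID scores for each lecture section.
alc_mid = [0.72, 0.65, 0.81, 0.58, 0.77]
tlc_mid = [0.70, 0.69, 0.75, 0.61, 0.74]
t_stat = welch_t(alc_mid, tlc_mid)
```

A p-value would then be read from the t distribution with the Welch-Satterthwaite degrees of freedom, which a statistics library handles directly.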
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.1.2">Comparing Error Frequencies and Debugging.</head><p>For debugging and error-frequency measures, we applied the Repeated Error Density (RED) developed by Becker <ref type="bibr">[3]</ref>. The RED uses the output of student code to track the frequency of error messages and penalize students for consecutive errors of the same type <ref type="bibr">[3]</ref>. For example, three consecutive syntax errors in a row will result in a higher RED score than three syntax errors with at least one compilation between each error <ref type="bibr">[3]</ref>. A higher RED value indicates a greater density of repeated occurrences of the same error.</p><p>To calculate the RED, we wrote a script to iterate through each student's snapshots, compile the Java file for each snapshot, and capture the output of the program execution. We parsed the outputs to identify which snapshots resulted in errors and which errors occurred. As with the MID analysis, we conducted two-sample t-tests to compare the two lecture sections across the various error types on the Midterm and Final Coding Challenges. The most common errors that students encountered were Symbol Not Found, Missing Identifier, Syntax Error, Type Mismatch, Non-Static Access, and Redefinition Conflict errors, described in Table <ref type="table">7</ref>. Finally, we collected data about programmer productivity on the coding challenges in terms of the number of compilations that students conducted before their final submission and the number of requirements correctly implemented. The number of compilations is easily derived from the number of snapshots for each student in our dataset. 
To determine the number of requirements satisfied, we used student grades on the MCC and FCC, since these challenges were graded on whether or not students passed a set of hidden test cases that were run only after students made their final submission. As with the analyses related to incremental development and error frequencies, we conducted two-sample t-tests across the comparisons for programmer productivity. Though we aimed to fully replicate the analysis from Shah et al., we were unable to collect data related to time to completion. Although we had timestamps in the snapshot data from Edstem, we did not know the start time for each student. As a result, we did not analyze time to completion in this study.</p></div>
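The RED scoring behavior described in Section 5.1.2 can be sketched as follows. This assumes our reading of Becker's metric, in which each maximal run of identical consecutive errors with r repeats contributes r²/(r+1) to the score; the input encoding (one entry per compilation, with None marking a clean compile) is our own illustrative choice:

```python
def red(outcomes):
    """Repeated Error Density over a chronological compilation log.

    `outcomes` holds one entry per compilation: an error identifier
    (e.g. "syntax") for a failed compile, or None for a clean one.
    Each maximal run of identical consecutive errors with r repeats
    contributes r**2 / (r + 1), so isolated errors add nothing while
    unbroken runs of the same error are penalized sharply.
    """
    score = 0.0
    run = 0          # repeats in the current run of identical errors
    prev = None
    for outcome in outcomes:
        if outcome is not None and outcome == prev:
            run += 1
        else:
            score += run * run / (run + 1)
            run = 0
        prev = outcome
    return score + run * run / (run + 1)
```

Three consecutive syntax errors score 4/3 under this scheme, while the same three errors separated by clean compiles score 0, matching the contrast given in the text.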
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2">RQ2 Methods: Course Performance</head><p>Our second research question compares 1) course grades, including assignments, attendance, and exam scores, and 2) performance across code tracing and code writing questions on the exams.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.1">Comparing Student Grades.</head><p>As discussed in Section 4.1, in a typical week, students completed a reading quiz, attended two lectures, attended a discussion section, completed a programming assignment, and completed a homework worksheet. Students also completed two exams during the term. Overall, students accumulated grades in the following categories: lecture attendance, programming assignments, exams, and worksheets. To compare student grades, we compared final grades across these various course components using two-sample t-tests.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.2">Comparing Student Performance on Code Tracing and Code Writing Questions.</head><p>Although overall course grades are an important measure of students' learning outcomes, they are a coarse measurement of students' skills. As a result, we aimed to compare student performance across various types of exam questions, as done by Shah et al. We categorized exam questions into the following types:</p><p>&#8226; Code explaining questions involve students describing the purpose of a piece of code in plain English.</p><p>&#8226; Code writing questions involve students generating blocks or lines of code. These questions may ask students to fill in the blank of a nearly completed program or even write an entire function.</p><p>&#8226; Code tracing (without loops) questions involve students predicting the output of a piece of code from a given input, provided that the code does not include any loops. Venables et al. distinguish code tracing questions based on whether or not there is a loop (while or for) in the provided code because of the added complexity of the loop.</p><p>&#8226; Code tracing (with loops) questions are the same as the item above, except they apply to code tracing questions where the code includes a while or for loop.</p><p>&#8226; Basic questions involve students answering conceptual or theoretical questions that do not fall into the categories above.</p><p>The counts of each question type for each exam are outlined in Table <ref type="table">8</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.2.3">Comparing Student Learning Gain During Lecture.</head><p>A deeper description of the methods related to learning gain during lecture can be found in previously published work from this experiment <ref type="bibr">[2]</ref>. To calculate learning gain, we analyzed the pre- and post-lecture question results. At the start of the experiment, we aimed to have the pre- and post-lecture questions be isomorphic <ref type="bibr">[43]</ref>, but after 3 weeks, we made the pre- and post-lecture questions identical due to concerns about whether the questions were truly isomorphic.</p><p>Our method for calculating learning gain is the same as that of Porter et al. in their study of learning gain in Peer Instruction <ref type="bibr">[23]</ref>. To make the analysis agnostic to different correctness levels between the two groups on the pre-lecture questions, the calculation of learning gain focuses only on the Potential Learner Group (PLG), the group who answered the pre-lecture questions incorrectly. The learning gain metric for each question is the percentage of PLG students who correctly answered the post-lecture question.</p></div>
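The PLG-based metric above is simple to state in code. A minimal sketch for a single question, assuming per-student correctness flags (the dictionary encoding is ours, not from the study):

```python
def learning_gain(pre_correct, post_correct):
    """Percentage of the Potential Learner Group (students who answered
    the pre-lecture question incorrectly) who answered the post-lecture
    question correctly. Both inputs map student id -> True/False."""
    plg = [s for s, correct in pre_correct.items() if not correct]
    if not plg:
        return None  # no potential learners for this question
    gained = sum(1 for s in plg if post_correct.get(s, False))
    return 100.0 * gained / len(plg)
```

Restricting the denominator to the PLG is what makes the metric insensitive to the two sections starting the lecture at different pre-question correctness levels.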
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.3">RQ3 Methods: Lecture Experience</head><p>Our third research question consists of 1) identifying student perceptions of Active Live Coding, 2) comparing perceptions of the Traditional Live Coding components, which occurred in both lectures, and 3) comparing student behavioral engagement during lectures.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.3.1">Identifying Student Perceptions of Active Live Coding.</head><p>In Week 4 of the course, we required students to complete a reflection survey that counted for 1% of the weekly programming assignment grade. The survey was sent to all students, but one section of the survey was specifically for students who were part of the Active Live Coding (ALC) lecture section. In total, 406 students responded to the survey out of the 531 total students who finished the course, resulting in a response rate of 76.5%. Of the 406 total responses, 271 were in the ALC group. One of the questions that we asked specifically of ALC students was: "On a scale of 1 to 5, how helpful is the active coding component of lectures in which you write a small part of the live coding example?" We also asked an open-ended, follow-up question that read: "Please give a brief explanation of your rating on the active coding component."</p><p>We conducted a bottom-up, thematic analysis (also called "open coding") of the 271 student responses <ref type="bibr">[17]</ref>. In this process, two members of the research team conducted several rounds of independently coding the data and then deliberating together to create a final code book that includes the variety of themes seen in the student responses. The two coders could apply multiple labels to a single student response if appropriate. In the first round, the two coders independently analyzed the first 30 responses from students and wrote down one or more themes for each answer. For each theme, the coders wrote down a general description of the meaning of that theme. 
After the independent coding, the coders met to go over the themes they identified and come to a consensus on the first iteration of the code book, which at that point was based only on the first 30 student responses. During this deliberation phase, the coders not only decided on the themes and descriptions for the code book but also came to a consensus on the theme(s) for each of the first 30 responses. In the second round of coding, the coders analyzed the next 30 student responses. In the independent coding phase of this round, the coders could apply an existing theme from the code book or propose a new theme if no existing theme fit. Further, the coders could modify or add to the existing description of a theme to more accurately reflect its meaning. The two coders then met again to create the second iteration of the code book and to agree on codes for the 30 responses they individually reviewed. This process continued for four more rounds, with the coders completing 30 responses in the third round, 60 responses in the fourth round, 60 responses in the fifth round, and the remaining 61 responses in the sixth round.</p><p>The final code book can be found in Table <ref type="table">19</ref> in the Appendix.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.3.2">Comparing Perceived Benefits of Traditional Live Coding.</head><p>Though the analysis above uncovers the perceptions of Active Live Coding (ALC), we wanted to compare student perceptions of the Traditional Live Coding (TLC) components of the lectures, since both groups of students were exposed to this lecture technique. The goal of this analysis was not only to replicate the analysis conducted by Shah et al. <ref type="bibr">[33]</ref> but also to determine whether there is a difference in perceptions of TLC between students who have been exposed to ALC and those who have not. To make this comparison while also replicating the analysis from Shah et al., we included the same survey question as Shah et al. in the Week 4 survey mentioned above. The survey question reads: "What are some specific things about the live-coding examples that have been helpful for your learning?" As mentioned above, 406 students responded to this survey. Our open-coding process for this survey question relied on the same code book developed by Shah et al. for the same question <ref type="bibr">[33]</ref>. Therefore, we did not undergo the same open-coding process as the analysis for the perceptions of ALC, since the code book was already developed. Instead, two coders (the same two coders who conducted the analysis of perceptions of ALC) studied the code book from Shah et al. and began the process of independently coding the responses and then deliberating to resolve disagreements. The coders could apply multiple labels to a student response. Though the code book was already created, we were concerned that there might be responses in our data that did not fit the existing code book. Therefore, we instructed the coders to modify or expand the code book if needed. 
Ultimately, however, our coders did not need to make any changes to the original code book. The coders analyzed the 406 responses across five rounds of analysis: 50 responses in the first round, 75 in the second, 75 in the third, 100 in the fourth, and 106 in the fifth. The final code book is the same as the code book presented in the appendix of the original work by Shah et al. <ref type="bibr">[33]</ref>.</p><p>One unique situation arose with our analysis of this open-ended question, which was intended to ask students about only the Traditional Live Coding portion of the lectures. Several students in the ALC group responded to this question with a response that discussed their perception of Active Live Coding rather than Traditional Live Coding. This likely occurred because students in the ALC group assumed that the active coding component is simply part of a "live coding" pedagogy. Therefore, in our qualitative analysis, we instructed the two coders to indicate when they felt that an answer was specifically about Active Live Coding, such as mentioning the process of coding themselves or forking the instructor's workspace. In total, the coders identified 52 of the 406 responses that were specific to Active Live Coding. We excluded these answers from the analysis comparing perceptions of live coding. However, we still conducted an analysis of these 52 responses using the code book developed for the Active Live Coding analysis (Figure <ref type="figure">2</ref>). Since students answered the question about their perceptions of Active Live Coding only in a later part of the same survey, we did not include these 52 responses in the other analysis to avoid duplicate perspectives from students. 
Instead, we conducted a separate qualitative analysis using the same code book from the Active Live Coding perceptions analysis. The goal of this extra analysis was to ensure that we did not miss additional, unique perspectives on Active Live Coding.</p><p>The final code book that we created can be found in Table <ref type="table">20</ref> in the Appendix.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.3.3">Comparing Behavioral Engagement During Lecture</head><p>Our analysis of student behavioral engagement is part of a previous publication <ref type="bibr">[31]</ref>.</p><p>To measure the behavioral engagement level of students during lectures, we used the Behavioral Engagement Related to Instruction (BERI) protocol <ref type="bibr">[19]</ref>. The BERI protocol is an observation method created by Lane and Harris that was specifically designed to measure student engagement in large lecture halls that exceed one hundred students <ref type="bibr">[19]</ref>. The method has been tested for validity and reliability and has been shown to accurately capture the engagement levels of an entire lecture hall with just one classroom observer <ref type="bibr">[19]</ref>. In the BERI protocol, an observer with a clear understanding of the course material and knowledge of the BERI protocol sits among the students during the lecture. At the beginning of the lecture, the observer selects 10 students to observe throughout the lecture. Then, several times throughout the lecture, the observer spends roughly 15 seconds observing each of the 10 students and applies a rubric, displayed in Table <ref type="table">9</ref>, to determine whether each student is engaged or disengaged.</p><p>To apply the BERI protocol, we used two observers, even though Lane and Harris showed that one observer is sufficient for reliable data collection <ref type="bibr">[19]</ref>. Among the two observers, we had one primary observer who attended every lecture throughout the term and a secondary observer who attended roughly half of the lectures (one lecture per week rather than both lectures per week).
When the two observers were at the same lecture, they coordinated their data collection times so that the observations happened at the same time. The observers made sure to sit in different parts of the classroom in the same lecture to cover a greater variety of seating locations. Furthermore, to ensure consistency between lecture sections, the observers made sure to sit in the same relative area of the classroom for the two lectures so that the data collected on a specific day would be comparable between the two lectures. The observers aimed to be as discreet as possible during lectures so as not to cause students to act differently. Our observations happened roughly every 10 to 15 minutes during the lecture. The observers ensured that they collected data at least once per instructional activity (see Table <ref type="table">5</ref>). For each data collection moment, the observers recorded how many of their 10 students were engaged. With this frequency of data collection, we were able to create a representation of how student engagement changes throughout a lecture as the instructor shifts between instructional activities.</p></div>
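The aggregation step described above — turning repeated 10-student snapshots into per-activity engagement rates — can be sketched as follows. This is an illustrative reconstruction, not the authors' analysis code, and all observation values are hypothetical.

```python
# Sketch of aggregating BERI observations into per-activity engagement rates.
# Each observation records how many of the observer's 10 students were engaged
# during a given lecture activity. All numbers below are hypothetical.
from collections import defaultdict

def engagement_by_activity(observations):
    """observations: list of (activity, engaged_count), 10 students per snapshot."""
    totals = defaultdict(lambda: [0, 0])  # activity -> [engaged, observed]
    for activity, engaged in observations:
        totals[activity][0] += engaged
        totals[activity][1] += 10
    return {a: e / n for a, (e, n) in totals.items()}

obs = [("pre-lecture questions", 9), ("traditional live coding", 6),
       ("traditional live coding", 7), ("active live coding", 10)]
rates = engagement_by_activity(obs)
print(rates["traditional live coding"])  # (6 + 7) / 20 = 0.65
```

Averaging over all snapshots of an activity, rather than per lecture, mirrors how the horizontal bars in the engagement figures are described.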
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6">RESULTS</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.1">RQ1 Results</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.1.1">Comparing Adherence to Incremental Development</head><p>Table <ref type="table">10</ref> compares student adherence to incremental development, measured via the Measure of Incremental Development (MID) <ref type="bibr">[32]</ref>, on the Midterm Coding Challenge (MCC) and Final Coding Challenge (FCC). The table compares MID scores, where a lower value indicates more adherence to incremental development, across all submissions for the MCC and FCC. However, we also wanted to see whether there was a difference in MID scores among the students who correctly solved the coding challenges, since students who struggle to get the correct answer may exhibit different programming processes. Across all submissions, we found no statistically significant differences in terms of adherence to incremental development. However, when we isolated our analysis to only the perfect scores, so that the measurement of incremental development did not include students who struggled to complete the task, we see that the Active Live Coding group had a higher adherence to incremental development on the FCC.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.1.2">Comparing Error Frequencies and Debugging</head><p>Tables <ref type="table">11</ref> and <ref type="table">12</ref> show the comparison of students' error frequencies, measured via the Repeated Error Density (RED) <ref type="bibr">[3]</ref>, on the Midterm Coding Challenge and Final Coding Challenge, respectively. To interpret the RED, a lower value indicates a lower error frequency. We computed the RED for the errors described in Table <ref type="table">7</ref>. However, for the FCC, there was no Non-Static Access error because there were no static methods involved in the FCC. Therefore, we exclude this error from our analysis of the FCC. As seen in Tables <ref type="table">11</ref> and <ref type="table">12</ref>, we found no statistically significant differences in any of the error frequencies on the MCC or FCC. Despite the lack of significance, the effect sizes tended to favor the TLC group, who had lower average RED values. However, the non-significance of the differences ultimately points to a similar error frequency between the groups, despite the trend in the small effect sizes. Tables <ref type="table">13</ref> and <ref type="table">14</ref> show the two dimensions of programmer productivity that we collected: correctness on the coding challenges (Table <ref type="table">13</ref>) and the number of compilations until submission (Table <ref type="table">14</ref>). We did not find any significant differences in terms of p-values, though the small effect sizes favor the TLC group. Notably, as seen in Table <ref type="table">13</ref>, the TLC group scored 4 to 5 percentage points higher than the ALC group in both coding challenges.
Similarly, across both coding challenges, and even among perfect scores on the coding challenges, there was no statistically significant difference between the groups.</p></div>
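To give a concrete sense of what a Repeated Error Density computation looks like, the sketch below scores runs of identical consecutive compiler errors. This follows one common reading of Becker's metric (each run contributing r²/(r+1), where r counts the repeats beyond the first occurrence) and is not necessarily the authors' exact implementation; the error names are illustrative.

```python
# Hedged sketch of a Repeated Error Density (RED)-style score over a sequence
# of compilation errors: for each maximal run of identical consecutive errors,
# a repeated string of r extra occurrences contributes r^2 / (r + 1).
# A lower score means fewer repeated errors.
from itertools import groupby

def red(error_sequence):
    score = 0.0
    for _, run in groupby(error_sequence):
        r = sum(1 for _ in run) - 1  # repeats beyond the first occurrence
        if r > 0:
            score += r * r / (r + 1)
    return score

# Illustrative Java compiler error names, not the study's data.
errors = ["cannot find symbol", "cannot find symbol", "cannot find symbol",
          "incompatible types"]
score = red(errors)  # one run with r = 2 repeats -> 4/3
```

A sequence with no consecutive repeats scores 0, which matches the interpretation in the text that lower values indicate lower error frequency.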
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.2">RQ2 Results</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.2.1">Comparing Student Grades.</head><p>Table 15 compares scores on textbook activities, lecture attendance, discussion attendance, programming assignment (PA) grades, worksheet grades, final exam score, and overall course grades between the Active Live Coding and Traditional Live Coding groups. We did not detect any statistically significant differences in these components, with the exception of the final exam score. However, in order to more fully examine the factors that led to this difference in final exam grades, we conducted a Multiple Linear Regression (MLR) analysis <ref type="bibr">[1]</ref> to determine the relative impact of the different independent variables we collected: prior programming experience, university major, and year-in-university. The results of our MLR analysis are shown in Table <ref type="table">16</ref>. In Table <ref type="table">16</ref>, there is a baseline value in each category: the baseline for Year is first-year, the baseline for Major is computer science, the baseline for Prior Experience is no, and the baseline for Treatment is the Traditional Live Coding group. Each coefficient value in the table shows the relative effect of that predictor compared to its baseline value. For example, to interpret the Treatment grouping, we would say that, all else being equal (such as Year, Major, and Prior Experience), a student in the Active Live Coding condition is expected to score 0.53 percentage points lower on the final exam than if they were a student in the Traditional Live Coding condition. However, the effect of the ALC treatment is not statistically significant, as seen from the right-most column with the p-values.
Each of the groupings besides the Treatment condition had a predictor with a significant association with final exam grades: second-year, math major, other major, and prior programming experience. The results of this MLR analysis reveal that the significant difference shown in Table <ref type="table">15</ref> is not due to the Treatment condition but rather can be explained through differences in Year, Major, and Prior Experience.</p><p>Finally, we conducted a Leave-One-Out analysis <ref type="bibr">[37]</ref> to understand the relative impact of each grouping (Year, Major, Prior Experience, and Treatment) on our model's performance, which is shown in Table <ref type="table">17</ref>. The results of the Leave-One-Out analysis are shown in order of significance, where the Treatment condition has the lowest impact on model performance, then Year, then Major, and finally Prior Experience, which has the largest impact on the model's predictive power. Interestingly, there is no difference in the Adjusted R² value when we remove Treatment from the model, showing the minimal impact of our treatment compared to the other groupings.</p></div>
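The coefficient-versus-baseline interpretation and the Leave-One-Out comparison above can both be illustrated with a small ordinary-least-squares sketch on dummy-coded predictors. This uses fabricated data and pure-Python normal equations, not the study's dataset or statistical software; the adjusted-R² values at the end are likewise hypothetical.

```python
# Hedged sketch: OLS with dummy-coded categorical predictors, so each
# coefficient reads as an offset from the omitted baseline level
# (e.g., Treatment: TLC is the baseline and ALC gets a 0/1 dummy),
# plus the adjusted-R^2 formula behind a Leave-One-Out comparison.

def ols(X, y):
    """Solve the normal equations (X^T X) b = X^T y by Gauss-Jordan elimination."""
    n, p = len(X), len(X[0])
    A = [[sum(X[k][i] * X[k][j] for k in range(n)) for j in range(p)] for i in range(p)]
    b = [sum(X[k][i] * y[k] for k in range(n)) for i in range(p)]
    for i in range(p):
        piv = A[i][i]
        A[i] = [v / piv for v in A[i]]
        b[i] /= piv
        for r in range(p):
            if r != i:
                f = A[r][i]
                A[r] = [A[r][j] - f * A[i][j] for j in range(p)]
                b[r] -= f * b[i]
    return b

def adjusted_r2(r2, n, p):
    # Adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - p - 1): penalizes R^2 for
    # the number of predictors p given n observations.
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Columns: intercept, ALC dummy (baseline = TLC), prior-experience dummy.
X = [[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]]
y = [80.0, 90.0, 79.5, 89.5]
coefs = ols(X, y)  # [80.0, -0.5, 10.0]: ALC scores 0.5 points below baseline

# Dropping a weak predictor barely moves adjusted R^2 (hypothetical values):
full = adjusted_r2(0.30, 500, 8)
without_treatment = adjusted_r2(0.30, 500, 7)
```

With the fabricated data above, the ALC dummy coefficient of -0.5 reads exactly like the paper's interpretation: all else equal, an ALC student is predicted to score 0.5 points below a TLC student.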
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.2.2">Comparing Student Performance on Code Tracing and Code Writing Questions</head><p>Finally, just as Shah et al. had done in their study, we compared student performance across code tracing, code writing, and basic conceptual questions between the two lecture styles. Figure <ref type="figure">1</ref> shows the comparison of student performance, which revealed no statistically significant differences between the two groups.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.2.3">Comparing Student Learning Gain During Lecture</head><p>Our learning gain metric is the proportion of students who answered the pre-lecture question incorrectly but correctly answered the corresponding post-lecture question. Since this metric is sensitive to the number of students in the Potential Learning Group (PLG), we compared the rate of correctness of students on the pre-lecture questions for both lecture groups. The rates of correctness were similar throughout the term, with the ALC lecture having an average of 43.1% of students in the PLG and the TLC lecture having 40.8% of students in the PLG. Table <ref type="table">18</ref> shows that there is a 3 percentage point difference in the aggregate learning gain throughout the term. The "Num Questions" column represents the total number of pairs of pre- and post-lecture questions we analyzed during the quarter from the PLG. According to a z-test of proportions <ref type="bibr">[28]</ref>, this difference is not statistically significant. The low Cohen's H <ref type="bibr">[11]</ref> effect size implies a relatively small magnitude of the difference in learning gain.</p></div>
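The two-proportion z-test and Cohen's h used above can be computed as in this sketch. The counts are illustrative stand-ins, not the study's data, and the implementation (pooled standard error; h = 2·asin(√p₁) − 2·asin(√p₂)) is a standard formulation rather than the authors' code.

```python
# Hedged sketch: two-proportion z-test with a pooled standard error, and
# Cohen's h effect size for the difference between two proportions.
# Counts below are illustrative only.
import math

def two_prop_z(x1, n1, x2, n2):
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

def cohens_h(p1, p2):
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))

# e.g., 40% vs 36% of PLG students showing a learning gain (hypothetical):
z = two_prop_z(120, 300, 108, 300)
h = cohens_h(0.40, 0.36)
```

With these hypothetical counts, z is about 1.01 (well below the 1.96 needed for p < 0.05) and h is about 0.08, which under Cohen's conventions is a small effect — the same qualitative conclusion the paper draws.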
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.3">RQ3 Results</head><p>Our results for RQ3 can be divided into three parts. First, we conducted a thematic analysis of students' perceptions of Active Live Coding. Second, we compared students' perceived benefits of their respective lecture style to detect differences in student perceptions. Third, we conducted an observational study to measure student behavioral engagement during the different lecture styles throughout the term.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.3.1">Identifying Student Perceptions of Active Live Coding.</head><p>Figure <ref type="figure">2</ref> shows open-ended student responses to the question: "Please give a brief explanation of your rating on the active coding component." The column titled Label represents the codes that our research team generated during the coding process, and the Category column represents a post-analysis grouping of the labels. The percentages do not sum to 100% because each student response could have multiple labels. The categories of perspectives include benefits, such as helping with performance on course components, helping students' programming processes, improving student understanding of concepts, and keeping students engaged with active learning, and drawbacks, such as the directions for the active coding component being too vague and lectures being rushed due to Active Live Coding. The most common response students mentioned was that Active Live Coding "reinforces understanding" (33.6%) of key concepts, which students typically attributed to the "hands on experience" (16.6%) of coding themselves. Students also appreciated that active live coding "applies lecture material" (23.7%) they just saw in class. We also saw that students appreciated how Active Live Coding "provides immediate feedback" (10.0%) on their programming solution since they see other students' approaches and the instructor's solution.</p><p>Figure <ref type="figure">3</ref> shows the comparison of student responses to the question "What are some specific things about the live-coding examples that have been helpful for your learning?" A darker green color represents a higher frequency of students that mentioned that label.
We generally saw similar frequencies between the two groups, though one prominent difference we noticed was that nearly 40% of students in the TLC lecture mentioned debugging as a benefit of the live coding examples, whereas only 20% of students in the ALC lecture noted this. Both of these values are in stark contrast to the work from Shah et al., who found that only 13% of students mentioned debugging as a benefit of live coding.</p><p>As mentioned before, one issue with this analysis was that 51 students in the ALC lecture gave a response about active coding rather than Traditional Live Coding. This explains the difference in the number of responses for Figure <ref type="figure">2</ref> (n = 271) and Figure <ref type="figure">3</ref> (n = 220). Not only is it an interesting finding that 51 of the 271 students mentioned a quality of Active Live Coding when answering a question about Traditional Live Coding, but we also saw a new label emerge from these responses. Specifically, students mentioned that they appreciated the level of detail and clarity in the comments and directions that the instructor gave before the active coding component. One student wrote: "[The instructor] writing the comments of what we're supposed to do before doing the code has also been helpful."</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.3.3">Comparing Student Behavioral Engagement During Lecture</head><p>Figure <ref type="figure">4</ref> shows the comparison of student behavioral engagement across all lecture activities throughout the term. Each percentage in one of the horizontal bars represents the average percentage of students that were engaged during that lecture activity during the term. There is no engagement rate for Active Live Coding for the TLC group since there was no active live coding portion in the TLC lecture. In general, the engagement levels were relatively similar for the Pre-Lecture Questions and Worksheet Review. Interestingly, students in the ALC lecture exhibited slightly higher engagement during traditional live coding (which happens in both lecture groups) but lower engagement during the written notes section.</p><p>Figure <ref type="figure">5</ref> represents the change in student engagement throughout a lecture based on specific lecture activities. For each ten-minute increment into the lecture in Figure <ref type="figure">5</ref>, we found the most common lecture activity from our observations (i.e., for minutes 10 to 20 into the lecture, the most common lecture activity was the pre-lecture questions; for minutes 20 to 30 into the lecture, the most common lecture activity was worksheet review, etc.). We then calculated the average engagement for that lecture activity specifically within that ten-minute increment. Traditional live coding, which occurred in both lectures, was the most common lecture activity for both lectures 30 to 50 minutes into the lecture. For the increment between 50 and 60 minutes, the most common activity within the ALC lecture was active live coding, whereas the most common activity for the TLC lecture continued to be traditional live coding.
We see a strong peak for the ALC lecture during the active live coding component, which is unsurprising given that the active coding portion was required for students to complete. Interestingly, the ten-minute increment following the active live coding component shows a higher engagement level for the ALC group than the TLC group, potentially demonstrating a persistent engagement effect of active live coding. To further explore this persistent engagement effect, we compared student engagement before and after the Active Live Coding component in Figure <ref type="figure">6</ref>. We specifically compared the traditional live coding components before and after an active live coding component. Figure <ref type="figure">6</ref> shows that the engagement rate for traditional live coding before ALC was only 61.5%, but this value increased to 77.7% after ALC. A two-sample t-test <ref type="bibr">[24]</ref> revealed that the difference between these values is significant, with p &lt; 0.001 and a Cohen's d of 1.16, a large effect size <ref type="bibr">[14]</ref>.</p></div>
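The before/after comparison rests on a two-sample t statistic and Cohen's d with a pooled standard deviation, which can be sketched as below. The engagement percentages are fabricated for illustration, not the study's observations.

```python
# Hedged sketch: two-sample t statistic and Cohen's d using a pooled
# standard deviation. The engagement percentages below are fabricated,
# not the study's observations.
import math

def mean(xs):
    return sum(xs) / len(xs)

def pooled_sd(a, b):
    va = sum((x - mean(a)) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mean(b)) ** 2 for x in b) / (len(b) - 1)
    return math.sqrt(((len(a) - 1) * va + (len(b) - 1) * vb) / (len(a) + len(b) - 2))

def t_stat(a, b):
    return (mean(a) - mean(b)) / (pooled_sd(a, b) * math.sqrt(1 / len(a) + 1 / len(b)))

def cohens_d(a, b):
    # Standardized mean difference: (mean_a - mean_b) / pooled SD.
    return (mean(a) - mean(b)) / pooled_sd(a, b)

before = [60, 50, 70, 55, 65]  # engagement % before active live coding
after = [75, 85, 70, 80, 90]   # engagement % after active live coding
t = t_stat(after, before)
d = cohens_d(after, before)
```

Under Cohen's conventions, d values of 0.8 and above are typically read as large effects, which is how the paper characterizes its d of 1.16.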
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7">DISCUSSION</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.1">Interpretation of Results</head><p>7.1.1 Similar adherence to programming processes (RQ1) and student learning (RQ2). Our quantitative findings related to students' programming processes and learning outcomes generally showed no significant difference between the ALC and TLC groups. Students in both groups showed similar adherence to incremental development (Table <ref type="table">10</ref>) and error frequencies (Tables <ref type="table">11</ref> and <ref type="table">12</ref>), while being able to produce correct code at similar rates (Table <ref type="table">13</ref>). Further, students performed similarly across all major grading components in the course (Tables <ref type="table">15</ref> and <ref type="table">16</ref>) and exhibited similar learning gain during lectures (Table <ref type="table">18</ref>). Though there was one statistically significant difference from our analysis that showed the TLC students scoring higher on the final exam, our regression analysis of the factors that explain the difference between the two groups showed that the ALC treatment condition had an insignificant association with final exam scores. Though we only showed the regression analysis for the final exam scores, we repeated this analysis on all other comparison items in RQ1 and RQ2. A prevailing theme in all of these analyses is that the other factors (students' year in program, major, and prior programming experience) contributed more to the model's accuracy than our treatment condition.
Our main takeaway for RQ1 and RQ2 is that Active Live Coding resulted in similar programming processes and learning outcomes as Traditional Live Coding. Several potential explanations exist for the lack of impact of Active Live Coding. One hypothesis is that the effect of roughly 30 to 40 minutes of active coding and peer discussion per week (i.e., one ALC session in each of the two lectures per week) is marginal compared to the substantial number of course components per week. As mentioned in Section 4.1, students attend 160 minutes of lecture and 50 minutes of a discussion section per week and complete one programming assignment and worksheet per week. All these other course components represent significant influences on students' learning beyond the moments of Active Live Coding. Combined with demographic factors such as major, year-in-university, and prior programming experience, these other activities represent a significant amount of learning experiences that may outweigh the effect of Active Live Coding. A second hypothesis is that Active Live Coding takes up more lecture time than Traditional Live Coding, resulting in less time for an instructor to explain the material. In the TLC lectures, the instructor covered the same material as the Active Live Coding lectures but had more time to go into more detail in their lessons. Throughout the term, this may result in more time for instructors to answer student questions, explain program errors, and share their thought process. Though the ICAP Framework shows that students learn material better by engaging with it more actively, the amount of time that a learner engages with material may also impact learning gains.
Indeed, one of the drawbacks of live coding mentioned by Bruhn and Burton is the greater time commitment of live coding compared to static code examples <ref type="bibr">[6]</ref>. A third potential reason for the lack of significant findings is that the Traditional Live Coding lectures engaged students beyond a passive engagement level. Since the instructor occasionally prompted the class with a verbal question (e.g., "What would be printed if I ran the code now?"), students' engagement level may have become active or constructive. Therefore, the traditional live coding components may already be sufficiently engaging and informative for students.</p><p>7.1.2 Differences in students' lecture experience (RQ3). Our findings related to students' lecture experience in Active Live Coding revealed important differences between ALC and TLC. In fact, student responses demonstrated multiple Cognitive Apprenticeship Methods being applied. Of course, the modeling Method is engaged by Traditional Live Coding, and the responses in Figure <ref type="figure">3</ref> show how students in both lecture groups saw the instructor's programming process and heard the instructor's thought process. Further, the open-ended question related to students' perceptions of ALC, shown in Figure <ref type="figure">2</ref>, uncovered aspects specific to Active Live Coding that are not present in TLC. For example, students discussed how the active coding component "provides a hands on experience" for students to code on their own. This demonstrates the scaffolding Method of Cognitive Apprenticeship, as students are able to complete a simple activity on their own in a low-stakes environment that uses the concepts just taught in lecture.
Then, students mentioned being able to discuss their solution with peers, which is captured by the label "allows for group learning" in Figure <ref type="figure">2</ref>. This aspect of ALC leverages the articulation Method of Cognitive Apprenticeship, as students discuss their approach with peers. Finally, students mentioned "seeing the instructor's solution" to the coding activity, which specifically engages the reflection Method of Cognitive Apprenticeship, as students compare their own approach to the instructor's approach. An interesting set of responses from 9.96% of respondents mentioned that ALC "provides immediate feedback" on their solution to the programming task, a mechanism that is simply not present in Traditional Live Coding. The tenet of providing immediate, individual feedback relates to the coaching Method of Cognitive Apprenticeship. In Section 2.1, we did not mention coaching as a Method that is leveraged by Active Live Coding since there is no one-to-one interaction between instructor and student. Instead, there is only a one-to-many feedback channel as the instructor provides the solution to the coding activity. However, despite this limitation of Active Live Coding, it seems students still felt that ALC achieved some of the benefits of the coaching Method.</p><p>Another key finding from RQ3 is that ALC improved student engagement but did not impact student learning. A notable category of responses in Figure <ref type="figure">2</ref> discussed the engaging nature of Active Live Coding. Our analysis using the BERI protocol revealed an interesting effect of Active Live Coding. Specifically, Figure <ref type="figure">5</ref> shows that engagement starts and ends at a high rate, but we saw a lower engagement rate about 30 minutes into the lecture.
From about 30 minutes to 70 minutes in the Traditional Live Coding lecture, the engagement rate hovers between 60 and 70 percent as students observe the instructor live coding. However, the large spike due to Active Live Coding resulted in a persistent engagement effect, where students had a heightened engagement rate even 20 minutes after Active Live Coding. A key reason for this heightened engagement effect, which can be partially explained by the findings in Figure <ref type="figure">2</ref>, is that students can see the instructor's solution and get immediate feedback about whether their approach was correct. In other words, the 20 minutes following Active Live Coding are important for students to identify whether they were correct and to compare their solution to the instructor's solution.</p><p>This persistent engagement effect is highlighted by Figure <ref type="figure">6</ref>, which showed a statistically significant difference in engagement levels before and after Active Live Coding.</p><p>7.1.3 Theoretical implications of our results. Overall, our results are unexpected. Based on Cognitive Apprenticeship and the ICAP Framework, one may predict that students in the Active Live Coding group would exhibit greater adherence to programming processes and learn more than their Traditional Live Coding counterparts. Since Active Live Coding engages more Methods of Cognitive Apprenticeship, we would have expected students in the ALC group to adhere to incremental development and debug errors more efficiently than students in the TLC group. However, the results of RQ1 and RQ2 showed that students exhibited similar programming processes and course performance in the two groups despite the ALC lectures engaging the articulation and reflection Methods of Cognitive Apprenticeship.
Previous work on active learning in computing education, such as Peer Instruction, has shown empirical improvements in retention and learning gain <ref type="bibr">[23]</ref> and failure rates <ref type="bibr">[22]</ref>. These works by Porter et al. showed that students benefited from the peer discussion <ref type="bibr">[23]</ref>, which is also a critical component of Active Live Coding. However, our results did not mirror the findings from Porter et al., motivating a deeper investigation into whether Active Live Coding improves student learning.</p><p>Similarly, the ICAP Framework predicts that activities with a higher level of engagement will result in more student learning. However, despite finding a higher engagement rate in ALC, our results from RQ2 did not show an increase in student learning as a result of this higher engagement. In fact, the most granular metric related to student learning during lecture, the learning gain analysis shown in Table <ref type="table">18</ref>, showed no statistically significant difference in the learning gain during lecture between the two groups, although the TLC group had a higher average learning gain throughout the term. The comparison of learning gain via the pre- and post-lecture questions, which are multiple-choice code tracing or conceptual questions, may not be the best way to assess student learning for ALC, which asks students to write code. However, even the metrics related to correctness on programming tasks in Table <ref type="table">13</ref> do not show the ALC students outperforming the TLC students. Overall, we saw almost no impact of ALC on student learning, raising questions about why the higher level of engagement did not translate to higher student learning.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.2">Threats and Limitations</head><p>The main factor that threatens the internal validity of our work is the experience of the instructor who taught both groups. The instructor has been teaching for six years and has regularly used Traditional Live Coding throughout those six years. In contrast, this was the instructor's first time using Active Live Coding. Upon reflection, the instructor mentioned that the timing of the Active Live Coding lecture was difficult to manage, as the active coding and peer discussion portion takes roughly 10 of the 80 total minutes. As a result, the Traditional Live Coding students may have benefited from the instructor's experience while the Active Live Coding students may have suffered from the instructor's lack of experience with the lecture style.</p><p>A second threat to internal validity is the difference between the two lectures in terms of the time of day and the student makeup. For example, one issue we noticed is that many more Math majors were enrolled in the Traditional Live Coding lecture because a required Math course was offered at 9:30am on Tuesdays and Thursdays, the same times as the Active Live Coding lecture. Though our analysis took this specific factor into account, there could easily be other selection biases we did not detect. Although students did not know that there would be any difference between lecture sections or that the 9:30am lecture would be ALC while the 11am lecture would be TLC, we were unable to randomly assign students to the lecture sections, as Raj et al. had done <ref type="bibr">[25]</ref>.</p><p>A third threat to internal validity is that we did not track whether students completed the active live coding activity.
While the classroom observer consistently saw high engagement during this part of lecture, we are unable to determine how many students actually attempted the activity. Our results might be different if students earned a grade for the active live coding activity, or if the activity was graded on correctness.</p><p>The main factor that threatens the external validity of this work is the instructor effect. Different instructors may see varying levels of student success. In fact, one of the takeaways from Porter et al.'s work on Peer Instruction is that different instructors saw varying levels of benefits from using Peer Instruction <ref type="bibr">[22]</ref>, which may also be the case for Active Live Coding or even Traditional Live Coding. Therefore, replication studies with different instructors can help create a solid basis of empirical findings related to the impact of Active Live Coding.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.3">Future Work</head><p>This work follows a long line of prior work evaluating traditional live coding <ref type="bibr">[25,</ref><ref type="bibr">27,</ref><ref type="bibr">30,</ref><ref type="bibr">33,</ref><ref type="bibr">36,</ref><ref type="bibr">39]</ref>. The overwhelming finding from these works is that there is no significant difference between traditional live coding and static-code examples. Similarly, our findings in this study indicate a similar effect of Active Live Coding compared to Traditional Live Coding. A useful avenue of future work could investigate why Traditional Live Coding and Active Live Coding do not result in improved student learning. It may be the case that factors during lecture, such as student engagement, distractions, or cognitive load, mitigate any potential learning gain from live coding. A qualitative approach to understanding how students process a live coding example may help us understand why we have not seen benefits from the activity. Similarly, our experimental design considers a course-long intervention, with programming process data and course outcome data collected from summative assessments. Future work may consider analyzing alternative data that may shed light on the differences between the two pedagogical techniques.</p><p>We have not found any existing empirical evaluations of Active Live Coding. Our experiment only investigates the impact of a single instructor using Active Live Coding throughout the term. As a result, future work may use methods such as lab studies or term-long interventions with a different instructor to further investigate the impact of Active Live Coding. 
Additional analyses of Active Live Coding may also explore other outcomes that we did not analyze in our study, such as students' sense of belonging in the course and in the computer science major. A potential benefit of Active Live Coding, along with other active learning techniques that encourage peer discussion, is that students may develop a stronger sense of community in the course, resulting in greater feelings of belonging.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="8">CONCLUSION</head><p>While our study is just a single data point in the broader literature related to live coding and active learning, the findings of this study can help inform an instructor of the potential effects of using Active Live Coding. Specifically, Active Live Coding seems to impart similar learning and adherence to programming processes as Traditional Live Coding while also promoting student engagement and peer discussions. Students also mentioned some unique affordances of Active Live Coding, such as providing immediate feedback on the correctness of their programming solution, which is not present in a Traditional Live Coding lecture. Therefore, although we did not detect empirical benefits of Active Live Coding compared to Traditional Live Coding, instructors may expect students to have similar perceptions of and engagement with Active Live Coding should they choose to adopt this pedagogy.</p><p>Table 20. Final code book for comparison of perceived benefits of Traditional Live Coding.</p></div>
<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0"><p>https://pypi.org/project/measure-incremental-development</p></note>
		</body>
		</text>
</TEI>
