Background. Academic help-seeking benefits students’ achievement, but existing literature either studies important factors in students’ selection of all help resources via self-reported surveys or studies their help-seeking behavior in one or two separate help resources via actual help-seeking records. Little is known about whether computing students’ approaches and behavior match, and not much is understood about how they transition sequentially from one help resource to another. Objectives. We aim to study post-secondary computing students’ academic help-seeking approach and behavior. Specifically, we seek to investigate students’ self-reported orders of resource usage and whether these approaches match with students’ actual utilization of help resources. We also examine frequent patterns emerging from students’ chronological help-seeking records in course-affiliated help resources. Context and Study Method. We surveyed students’ self-reported orders of resource usage across 12 offerings of seven courses at two institutions, then analyzed their responses using various help resource dimensions identified by existing works. From two of these courses (an introduction to programming course and a data science course, 11 offerings), we obtained students’ help-seeking records in all course-affiliated help resources, along with code autograder records. We then compared students’ reported orders in these two courses against their actions in the records. Finally, we mined sequences of student help-seeking events from these two courses to reveal frequent sequential patterns. Findings. Students’ reported orders of help resource usage form a progression of clusters where resources in each cluster are more similar to each other by help resource dimensions than to resources outside of their cluster. This progression partially confirms phenomena and decision factors reported by existing literature, but no factor/dimension alone can explain the entire progression. 
We found students’ actual help-seeking records did not deviate much from their self-reported orders. Mining of the sequential records revealed that help-seeking from course-affiliated human resources led to measurable progress more often than not, and students’ usage of consulting/office hours (mainly run by undergraduate teaching assistants) itself was the best indicator for future usage within the lifespan of the same assignment. Implications. Our results demonstrate that computing students’ help resource selection/utilization is a sophisticated process that should be modeled and analyzed with sufficient awareness of its inherent sequentiality. We identify future research directions through this preliminary analysis, which can lead to a better understanding of computing students’ help-seeking behavior and better resource utilization/management in large-scale instructional contexts.
clubber: removing the bioinformatics bottleneck in big data analyses
Abstract With the advent of modern-day high-throughput technologies, the bottleneck in biological discovery has shifted from the cost of doing experiments to that of analyzing results. clubber is our automated cluster-load balancing system developed for optimizing these “big data” analyses. Its plug-and-play framework encourages re-use of existing solutions for bioinformatics problems. clubber’s goals are to reduce computation times and to facilitate use of cluster computing. The first goal is achieved by automating the balance of parallel submissions across available high performance computing (HPC) resources. Notably, the latter can be added on demand, including cloud-based resources, and/or featuring heterogeneous environments. The second goal of making HPCs user-friendly is facilitated by an interactive web interface and a RESTful API, allowing for job monitoring and result retrieval. We used clubber to speed up our pipeline for annotating molecular functionality of metagenomes. Here, we analyzed the Deepwater Horizon oil-spill study data to quantitatively show that the beach sands have not yet entirely recovered. Further, our analysis of the CAMI-challenge data revealed that microbiome taxonomic shifts do not necessarily correlate with functional shifts. These examples (21 metagenomes processed in 172 min) clearly illustrate the importance of clubber in the everyday computational biology environment.
- Award ID(s): 1553289
- PAR ID: 10526565
- Publisher / Repository: De Gruyter
- Date Published:
- Journal Name: Journal of Integrative Bioinformatics
- Volume: 14
- Issue: 2
- ISSN: 1613-4516
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
With the increase in data-driven analytics, the demand for high-performance computing resources has risen. Many high-performance computing centers provide cyberinfrastructure (CI) for academic research. However, access barriers prevent these resources from reaching a broad range of users. Users who are new to the data analytics field are not yet equipped to take advantage of the tools offered by CI. In this paper, we propose a framework that lowers these access barriers for users who do not have the training to utilize the capability of CI. The framework uses the divide-and-conquer (DC) paradigm for data-intensive computing tasks. It consists of three major components: a user interface (UI), a parallel scripts generator (PSG), and the underlying cyberinfrastructure (CI). The goal of the framework is to provide a user-friendly method for parallelizing data-intensive computing tasks with minimal user intervention. Key design goals include usability, scalability, and reproducibility. Users can focus on their problem and leave the parallelization details to the framework.
-
Large-scale enterprise computing systems are growing rapidly, to address the increasing demand for data processing; however, in many cases, the computing resources in a single data center may not be sufficient for critical data-centric workloads, and important factors, such as space limitations, power availability, or company policies, limit the possibilities of expanding the data center's resources. In this paper, we explore the potential of harvesting spare computing resources across geo-distributed data centers with fast fabric interconnection for real-world enterprise applications. We specifically characterize the computing resource utilization of four large-scale production data centers, and we show how to efficiently combine local storage and computing clusters with remote and elastic computation resources. The primary challenge is incorporating the available remote computing resources efficiently. To achieve this goal, we propose leveraging the capabilities of Kubernetes-based elastic computing clusters to utilize the spare computing resources across geo-distributed data centers for Big Data applications. We also provide an experimental performance evaluation based on real-use case scenarios via an empirical execution and a simulation, which shows that the proposed system can accelerate Big Data services by employing existing computing resources more efficiently across geo-distributed data centers.
-
Abstract It is well understood that differences in the cues used by consumers and their resources in fluctuating environments can give rise to trophic mismatches governing the emergent effects of global change. Trophic mismatches caused by changes in consumer energetics during periods of low resource availability have received far less attention, although this may be common for consumers during winter when primary producers are limited by light. Even less is understood about these dynamics in marine ecosystems, where consumers must cope with energetically costly changes in CO2‐driven carbonate chemistry that will be most pronounced in cold temperatures. This may be especially important for calcified marine herbivores, such as the pinto abalone (Haliotis kamschatkana). H. kamschatkana are of high management concern in the North Pacific due to the active recreational fishery and their importance among traditional cultures, and research suggests they may require more energy to maintain their calcified shells and acid/base balance with ocean acidification. Here we use field surveys to demonstrate seasonal mismatches in the exposure of marine consumers to low pH and algal resource identity during winter in a subpolar, marine ecosystem. We then use these data to test how the effects of exposure to seasonally relevant pH conditions on H. kamschatkana are mediated by seasonal resource identity. We find that exposure to projected future winter pH conditions decreases metabolism and growth, and this effect on growth is pronounced when their diet is limited to the algal species available during winter. Our results suggest that increases in the energetic demands of pinto abalone caused by ocean acidification during winter will be exacerbated by seasonal shifts in their resources. 
These findings have profound implications for other marine consumers and highlight the importance of considering fluctuations in exposure and resources when inferring the emergent effects of global change.
-
Abstract Long‐read sequencing is driving a new reality for genome science in which highly contiguous assemblies can be produced efficiently with modest resources. Genome assemblies from long‐read sequences are particularly exciting for understanding the evolution of complex genomic regions that are often difficult to assemble. In this study, we utilized long‐read sequencing data to generate a high‐quality genome assembly for an Antarctic eelpout, Ophthalmolycus amberensis, the first for the globally distributed family Zoarcidae. We used this assembly to understand how O. amberensis has adapted to the harsh Southern Ocean and compared it to another group of Antarctic fishes: the notothenioids. We showed that selection has largely acted on different targets in eelpouts relative to notothenioids. However, we did find some overlap; in both groups, genes involved in membrane structure, thermal tolerance and vision have evidence of positive selection. We found evidence for historical shifts of transposable element activity in O. amberensis and other polar fishes, perhaps reflecting a response to environmental change. We were specifically interested in the evolution of two complex genomic loci known to underlie key adaptations to polar seas: haemoglobin and antifreeze proteins (AFPs). We observed unique evolution of the haemoglobin MN cluster in eelpouts and related fishes in the suborder Zoarcoidei relative to other Perciformes. For AFPs, we identified the first species in the suborder with no evidence of afpIII sequences (Cebidichthys violaceus) in the genomic region where they are found in all other Zoarcoidei, potentially reflecting a lineage‐specific loss of this cluster. Beyond polar fishes, our results highlight the power of long‐read sequencing to understand genome evolution.

