NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Understanding Highly Configurable Storage for Diverse Workloads

https://doi.org/10.1109/CLUSTERWorkshops61563.2024.00023

Kogiou, Olga; Devarajan, Hariharan; Wang, Chen; Yu, Weikuan; Mohror, Kathryn (September 2024, IEEE)

Full Text Available
Facilitating the Bootstrapping of a New ISA

https://doi.org/10.1145/3589610.3596282

Mortensen, Abigail; Pomerville, Scott; Whalley, David; Onder, Soner; Uh, Gang-Ryung (June 2023, Languages, Compilers, and Tools for Embedded Systems)

Implementation of a new instruction set architecture (ISA) is a non-trivial task which involves significant modifications to the system software, such as the compiler, the assembler, and the linker. This task also includes modifying and verifying functional and cycle accurate simulators to facilitate correct simulation and performance evaluation of programs under the new ISA. Isolating errors in these software components becomes extremely challenging and demands automated and semi-automated mechanisms since neither the compilation infrastructure nor the simulation infrastructure can be trusted as both parties have been heavily modified. Bootstrapping a new ISA is very common in embedded systems since there is a greater variety of embedded ISAs due to often not having a need to support backward compatibility of executables. In this paper, we present the tools and the verification mechanisms we have implemented to support the development of a number of related, but distinct ISAs. These ISAs are similar in complexity to the RISC-V ISA, and range from simple pipelined and superscalar processor ISAs, to a complete VLIW ISA. Our work in developing the system software and simulators for these ISAs demonstrate that a step-by-step semi-automated approach which relies on simple invariants can facilitate effective bootstrapping of the complete system software and the simulator infrastructure.
more » « less
Full Text Available
SVAGC: Garbage Collection with a Scalable Virtual Address Swapping Technique

https://doi.org/10.1109/CLUSTER51413.2022.00047

Ataie, Ismail; Yu, Weikuan (September 2022, 2022 IEEE International Conference on Cluster Computing (CLUSTER))

Managed programming languages including Java and Scala are very popular for data analytics and mobile applications. However, they often face challenging issues due to the overhead caused by the automatic memory management to detect and reclaim free available memory. It has been observed that during their Garbage Collection (GC), excessively long pauses can account for up to 40 % of the total execution time. Therefore, mitigating the GC overhead has been an active research topic to satisfy today's application requirements. This paper proposes a new technique called SwapVA to improve data copying in the copying/moving phases of GCs and reduce the GC pause time, thereby mitigating the issue of GC overhead. Our contribution is twofold. First, a SwapVA system call is introduced as a zero-copy technique to accelerate the GC copying/moving phase. Second, for the demonstration of its effectiveness, we have integrated SwapVA into SVAGC as an implementation of scalable Full GC on multi-core systems. Based on our results, the proposed solutions can dramatically reduce the GC pause in applications with large objects by as much as 70.9% and 97%, respectively, in the Sparse.large/4 (one quarter of the default input size) and Sigverify benchmarks.
more » « less
Full Text Available
Experience with Integrating Computer Science in Middle School Mathematics

https://doi.org/10.1145/3502718.3524787

Ashley Gannon, Mohsen Gavahi (July 2022, Proceedings of the 27th ACM Conference on Innovation and Technology in Computer Science Education Vol 1 (ITiCSE 2022))

The Florida State University (FSU) Computer Science Integrated with Mathematics in Middle Schools (CSIMMS) project explores the feasibility and effectiveness of integrating Computer Science (CS) into middle school general mathematics courses. Through Design Based Research, we developed and tested 13 teaching modules that integrate CS concepts into general middle school mathematics courses, grades 6, 7, and 8, beginning in 2017. In this paper, we discuss our experience with integrating computer science into middle school mathematics and report our preliminary findings.
more » « less
Full Text Available
DFMan: A Graph-based Optimization of Dataflow Scheduling on High-Performance Computing Systems

https://doi.org/10.1109/IPDPS53621.2022.00043

Chowdhury, Fahim; Di Natale, Francesco; Moody, Adam; Mohror, Kathryn; Yu, Weikuan (May 2022, 2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS))

Scientific research and development campaigns are materialized by workflows of applications executing on high-performance computing (HPC) systems. These applications con-sist of tasks that can have inter- or intra-application flows of data to achieve the research goals successfully. These dataflows create dependencies among the tasks and cause resource con-tention on shared storage systems, thus limiting the aggregated I/O bandwidth achieved by the workflow. However, these I/O performance issues are often solved by tedious and manual efforts that demand holistic knowledge about the data dependencies in the workflow and the information about the infrastructure being utilized. Taking this into consideration, we design DFMan, a graph-based dataflow management and optimization framework for maximizing I/O bandwidth by leveraging the powerful storage stack on HPC systems to manage data sharing optimally among the tasks in the workflows. In particular, we devise a graph-based optimization algorithm that can leverage an intuitive graph representation of dataflow- and system-related information, and automatically carry out co-scheduling of task and data placement. According to our experiments, DFMan optimizes a wide variety of scientific workflows such as Hurricane 3D on Cloud Model 1 (CM1), Montage Carina Nebula (NGC3372), and an emulated dataflow kernel of the Multiscale Machine-learned Modeling Infrastructure (MuMMI I/O) on the Lassen supercomputer, and improves their aggregated I/O bandwidth by up to 5.42 x, 2.12 x and 1.29 x, respectively, compared to the baseline bandwidth.
more » « less
Full Text Available
Accurate classification of depression through optimized machine learning models on high-dimensional noisy data

https://doi.org/10.1016/j.bspc.2021.103237

Fang, Xingang; Klawohn, Julia; De Sabatino, Alexander; Kundnani, Harsh; Ryan, Jonathan; Yu, Weikuan; Hajcak, Greg (January 2022, Biomedical Signal Processing and Control)

Full Text Available
Encrypted All-reduce on Multi-core Clusters

https://doi.org/10.1109/IPCCC51483.2021.9679399

Gavahi, Mohsen; Naser, Abu; Wu, Cong; Lahijani, Mehran Sadeghi; Wang, Zhi; Yuan, Xin (October 2021, 2021 IEEE International Performance, Computing, and Communications Conference (IPCCC))

Full Text Available
A Simulation Study of Hardware Parameters for Future GPU-based HPC Platforms

https://doi.org/10.1109/IPCCC51483.2021.9679359

Bhowmik, Saptarshi; Jain, Nikhil; Yuan, Xin; Bhatele, Abhinav (October 2021, 2021 IEEE International Performance, Computing, and Communications Conference (IPCCC))

Full Text Available
O(1) Communication for Distributed SGD through Two-Level Gradient Averaging

https://doi.org/10.1109/Cluster48925.2021.00054

Bhattacharya, Subhadeep; Yu, Weikuan; Chowdhury, Fahim Tahmid; Mohror, Kathryn (September 2021, 2021 IEEE International Conference on Cluster Computing (CLUSTER))

Full Text Available
ROBOTune: High-Dimensional Configuration Tuning for Cluster-Based Data Analytics

https://doi.org/10.1145/3472456.3472518

Khan, Md Muhib; Yu, Weikuan (August 2021, ICPP 2021: 50th International Conference on Parallel Processing)

Full Text Available

« Prev Next »

Search for: All records