Creators/Authors contains: "Prasad, Sushil"


  1. Bogaerts, Steven ; Prasad, Sushil K (Ed.)
    EduPar-23 proceedings editorial and organization
  2. Welcome to the 4th Workshop on Education for High Performance Computing (EduHiPC 2022). The EduHiPC 2022 workshop, held in conjunction with the IEEE International Conference on High Performance Computing, Data & Analytics (HiPC 2022), is devoted to the development and assessment of educational and curricular innovations and resources for undergraduate and graduate education in Parallel and Distributed Computing (PDC) and High Performance Computing (HPC). EduHiPC brings together individuals from academia, industry, and other educational and research institutes to explore new ideas, challenges, and experiences related to PDC pedagogy and curricula. The workshop is designed in coordination with the IEEE TCPP curriculum initiative on parallel and distributed computing (https://tcpp.cs.gsu.edu/curriculum/) for undergraduates majoring in computer science and computer engineering. It is supported by C-DAC, India, and by the Center for Parallel and Distributed Computing Curriculum Development and Educational Resources (CDER), which is funded by the US National Science Foundation (NSF). Details for attending the workshop are available on the HiPC webpage (HiPC). The effect of the pandemic on the academic and research community now seems to be receding globally, as was evident from the enthusiastic in-person participation of conference delegates. Please visit the EduHiPC-22 webpage for the complete online proceedings, including copies of papers and presentation slides: EduHiPC 2022 | NSF/IEEE-TCPP Curriculum Initiative.
  3. Parallel and distributed computing (PDC) has become pervasive in all aspects of computing, and thus it is essential that students include parallelism and distribution in the computational thinking that they apply to problem solving, from the very beginning. Computer science education is still teaching to a 20th-century model of algorithmic problem solving. Sequence, branch, and loop are taught in our early courses as the only organizing principles needed for algorithms, and we invest considerable time in showing how best to sequentially process large volumes of data. All computing devices that students currently use have multiple cores and, in many cases, a GPU. Most of their favorite applications use multiple cores and many distributed processors. Often concurrency offers simpler solutions than sequential approaches. Industry is desperate for software engineers who think naturally in terms of exploiting these capabilities, rather than seeing them as an exotic upper-level topic that gets layered over a sequential solution. However, we are still teaching students to solve problems using sequential thinking. In this workshop we give an overview of key PDC concepts and provide examples of how they may naturally be incorporated in early computing classes. We will introduce plugged and unplugged curriculum modules that have been successfully integrated in existing computing classes at multiple institutions. We will highlight the upcoming summer training workshop, for which we have funding to support attendance, as well as other CDER (Center for Parallel and Distributed Computing Curriculum Development and Educational Resources) activities.
  4. Decision trees and tree ensembles are popular supervised learning models on tabular data. Two recent research trends on tree models stand out: (1) bigger and deeper models with many trees, and (2) scalable distributed training frameworks. However, existing implementations on distributed systems are I/O-bound, leaving CPU cores underutilized. They also find the best node-splitting conditions only approximately, due to their row-based data partitioning scheme. In this paper, we target the exact training of tree models by effectively utilizing the available CPU cores. The resulting system, called TreeServer, adopts a column-based data partitioning scheme to minimize communication, and a node-centric task-based engine to fully exploit CPU parallelism. Experiments show that TreeServer is up to 10x faster than models in Spark MLlib. We also showcase TreeServer's high training throughput by using it to build big "deep forest" models.
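The following is a minimal sketch of why column-based partitioning permits exact split finding; it is not TreeServer's implementation. It assumes that a worker holding an entire feature column can sort it and scan every candidate threshold, so that only one small result per column needs to be exchanged. The function names (`gini`, `best_split_for_column`, `best_split`) and the choice of Gini impurity are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def gini(counts, n):
    """Gini impurity of a node with the given class counts and n samples."""
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split_for_column(values, labels):
    """Exact best split on one numeric column.

    Returns (weighted_gini, threshold). A worker that owns the whole
    column (hypothetical setup) can run this without seeing other columns.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    left, right = Counter(), Counter(labels)
    best = (float("inf"), None)
    for k in range(1, n):
        i = order[k - 1]
        # Move one sample from the right child to the left child.
        left[labels[i]] += 1
        right[labels[i]] -= 1
        lo, hi = values[i], values[order[k]]
        if lo == hi:
            continue  # no threshold can separate equal values
        score = (k * gini(left, k) + (n - k) * gini(right, n - k)) / n
        if score < best[0]:
            best = (score, (lo + hi) / 2)
    return best


def best_split(columns, labels):
    """Combine per-column results; only (score, threshold) pairs are compared."""
    results = {name: best_split_for_column(col, labels)
               for name, col in columns.items()}
    name = min(results, key=lambda c: results[c][0])
    score, threshold = results[name]
    return name, threshold, score
```

With a row-based partition, no single worker sees all values of a column, which is one reason such systems typically fall back on approximate candidate thresholds.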
  5. This special session will report on the updated NSF/IEEE-TCPP Curriculum on Parallel and Distributed Computing released in Nov 2020 by the Center for Parallel and Distributed Computing Curriculum Development and Educational Resources (CDER). The purpose of the special session is to obtain SIGCSE community feedback on this curriculum in a highly interactive manner employing the hybrid modality and supported by a full-time CDER booth for the duration of SIGCSE. In this era of big data, cloud, and multi- and many-core systems, it is essential that the computer science (CS) and computer engineering (CE) graduates have basic skills in parallel and distributed computing (PDC). The topics are primarily organized into the areas of architecture, programming, and algorithms topics. A set of pervasive concepts that percolate across area boundaries are also identified. Version 1 of this curriculum was released in December 2012. That curriculum guideline has over 140 early adopter institutions worldwide and has been incorporated into the 2013 ACM/IEEE Computer Science curricula. This Version-II represents a major revision. The updates have focused on enhancing coverage related to the topical aspects of Big Data, Energy, and Distributed Computing. The session will also report on related CDER activities including a workshop series on a PDC institute conceptualization, developing a CE-oriented version of the curriculum, and identifying a minimal set of PDC topics aligned with ABET’s exposure-level PDC requirements. The interested SIGCSE audience includes educators, authors, publishers, curriculum committee members, department chairs and administrators, professional societies, and the computing industry.
  6. Geospatial datasets are growing in size, complexity, and heterogeneity. High performance systems are needed to analyze such data efficiently and produce actionable insights. For polygonal (a.k.a. vector) datasets, operations such as I/O, data partitioning, communication, and load balancing become challenging in a cluster environment. In this work, we present MPI-Vector-IO, a parallel I/O library that we have designed using MPI-IO specifically for partitioning and reading irregular vector data formats such as Well Known Text. It makes MPI aware of spatial data and spatial primitives, and provides support for spatial data types embedded within collective computation and communication using the MPI message-passing library. These abstractions, along with parallel I/O support, are useful for parallel Geographic Information System (GIS) application development on HPC platforms. Performance evaluation is done on Lustre and GPFS filesystems. MPI-Vector-IO scales well with MPI processes and file size and achieves bandwidth up to 22 GB/s for common spatial data access patterns. We observed that independent file read functions performed better than collective functions in MPI-IO for contiguous access patterns on Lustre. In general, the I/O is improved by one to two orders of magnitude over real-world datasets using up to 1152 CPU cores. Spatial Join query is used as an exemplar to demonstrate an end-to-end application using MPI-Vector-IO.
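As a rough illustration of what partitioning an irregular, line-oriented format such as Well Known Text involves, the sketch below splits a file into equal byte ranges and then realigns each range to record boundaries. It does not use or reproduce the MPI-Vector-IO API: plain Python file I/O stands in for MPI-IO, `read_partition` is a hypothetical name, and the ownership rule (a record belongs to the rank whose byte range contains its first byte) is an assumption chosen for the example.

```python
import os


def read_partition(path, rank, nprocs):
    """Return the records owned by `rank` out of `nprocs` readers.

    Assumes one record per line. A record is owned by the rank whose
    byte range [start, end) contains the record's first byte, so records
    that straddle a range boundary are read exactly once.
    """
    size = os.path.getsize(path)
    chunk = size // nprocs
    start = rank * chunk
    end = size if rank == nprocs - 1 else start + chunk

    records = []
    with open(path, "rb") as f:
        if start > 0:
            # Back up one byte and read through the next newline. If
            # `start` is already a record start, this consumes only the
            # previous record's newline; otherwise it skips the tail of
            # a record owned by an earlier rank.
            f.seek(start - 1)
            f.readline()
        while f.tell() < end:
            line = f.readline()
            if not line:
                break
            text = line.decode("utf-8").strip()
            if text:
                records.append(text)
    return records
```

In an MPI program each process would call such a routine with its own rank; because geometries vary widely in length, equal byte ranges do not guarantee equal record counts, which is part of why load balancing is hard for vector data.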
  7. This special issue is devoted to progress in one of the most important challenges facing computing education. The work published here is of relevance to those who teach computing-related topics at all levels, with the greatest implications for undergraduate education. Parallel and distributed computing (PDC) has become ubiquitous to the extent that even casual users feel its impact. This necessitates that every programmer understands how parallelism and a distributed environment affect problem solving. Thus, teaching only traditional, sequential programming is no longer adequate. For this reason, it is essential to impart a range of PDC and high performance computing (HPC) knowledge and skills at various levels within the educational fabric woven by Computer Science (CS), Computer Engineering (CE), and related computational science and engineering curricula. This special issue sought high quality contributions in the fields of PDC and HPC education. Submissions were on the topics of the EduPar2016, Euro-EduPar2016, and EduHPC2016 workshops, but submission was open to all. This special issue includes 12 papers spanning pedagogical techniques, tools, and experiences.