skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: The CROSS Incubator: A Case Study for funding and training RSEs
The incubator and research projects sponsored by the Center for Research in Open Source Software (CROSS, cross.ucsc.edu) at UC Santa Cruz have been very effective at promoting the professional and technical development of research software engineers. Carlos Maltzahn founded CROSS in 2015 with a generous gift of $2,000,000 from UC Santa Cruz alumnus Dr. Sage Weil and founding memberships of Toshiba America Electronic Components, SK Hynix Memory Solutions, and Micron Technology. Over the past five years, CROSS funding has enabled PhD students to not only create re- search software projects but also learn how to draw in new contributors and leverage established open source software communities. This position paper will present CROSS fellowships as case studies for how university-led open source projects can create a real- world, reproducible model for effectively training, funding and sup- porting research software engineers.  more » « less
Award ID(s):
1705021
PAR ID:
10338551
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
RSE-HPC – Introduction: Research Software Engineers in HPC: Creating Community, Building Careers, Addressing Challenges
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Ceph is an open source distributed storage system that is object-based and massively scalable. Ceph provides developers with the capability to create data interfaces that can take advantage of local CPU and memory on the storage nodes (Ceph Object Storage Devices). These interfaces are powerful for application developers and can be created in C, C++, and Lua. Skyhook is an open source storage and database project in the Center for Research in Open Source Software at UC Santa Cruz. Skyhook uses these capabilities in Ceph to create specialized read/write interfaces that leverage IO and CPU within the storage layer toward database processing and management. Specifically, we develop methods to apply predicates locally as well as additional metadata and indexing capabilities using Ceph's internal indexing mechanism built on top of RocksDB. Skyhook's approach helps to enable scale-out of a single node database system by scaling out the storage layer. Our results show the performance benefits for some queries indeed scale well as the storage layer scales out. 
    more » « less
  2. The Skyhook Data Management project (SkyhookDM.com) at the Center for Research in Open Source Software (cross.ucsc.edu) at UC Santa Cruz implements customized extensions through Ceph's object class interface that enables offloading database operations to the storage system. In our previous Vault '19 talk, we showed how SkyhookDM can transparently scale out databases. The SkyhookDM Ceph extensions are an example of our 'programmable storage' research efforts at UCSC, and can be accessed through commonly available external/foreign table database interfaces. Utilizing fast in-memory serialization libraries such as Google Flatbuffers and Apache Arrow, SkyhookDM currently implements common database functions such as SELECT, PROJECT, AGGREGATE, and indexing inside Ceph, along with lower-level data manipulations such as transforming data from row to column formats on RADOS servers. In this talk, we will present three of our latest developments on the SkyhookDM project since Vault '19. First, SkyhookDM can be used to also offload operations of access libraries that support plugins for backends, such as HDF5 and its Virtual Object Layer. Second, in addition to row-oriented data format using Google's Flatbuffers, we have added support for column-oriented data formats using the Apache Arrow library within our Ceph extensions. Third, we added dynamic switching between row and column data formats within Ceph objects, a first step towards physical design management in storage systems, similar to physical design tuning in database systems. 
    more » « less
  3. Open source is ubiquitous and many projects act as critical in- frastructure, yet funding and sustaining the whole ecosystem is challenging. While there are many different funding models for open source and concerted efforts through foundations, donation platforms like PayPal, Patreon, and OpenCollective are popular and low-bar platforms to raise funds for open-source development. With a mixed-method study, we investigate the emerging and largely unexplored phenomenon of donations in open source. Specifically, we quantify how commonly open-source projects ask for donations, statistically model characteristics of projects that ask for and re- ceive donations, analyze for what the requested funds are needed and used, and assess whether the received donations achieve the intended outcomes. We find 25,885 projects asking for donations on GitHub, often to support engineering activities; however, we also find no clear evidence that donations influence the activity level of a project. In fact, we find that donations are used in a multitude of ways, raising new research questions about effective funding. 
    more » « less
  4. null (Ed.)
    Students’ experience with software testing in undergraduate computing courses is often relatively shallow, as compared to the importance of the topic. This experience report describes introducing industrial-strength testing into CMPSC 156, an upper division course in software engineering at UC Santa Barbara . We describe our efforts to modify our software engineering course to introduce rigorous test-coverage requirements into full-stack web development projects, requirements similar to those the authors had experienced in a professional software development setting. We present student feedback on the course and coverage metrics for the projects. We reflect on what about these changes worked (or didn’t), and provide suggestions for other instructors that would like to give their students a deeper experience with software testing in their software engineering courses. 
    more » « less
  5. Open source is ubiquitous and many projects act as critical infrastructure, yet funding and sustaining the whole ecosystem is challenging. While there are many dierent funding models for open source and concerted eorts through foundations, donation platforms like PayPal, Patreon, and OpenCollective are popular and low-bar platforms to raise funds for open-source development. With a mixed-method study, we investigate the emerging and largely unexplored phenomenon of donations in open source. Specically, we quantify how commonly open-source projects ask for donations, statistically model characteristics of projects that ask for and receive donations, analyze for what the requested funds are needed and used, and assess whether the received donations achieve the intended outcomes. We nd 25,885 projects asking for donations on GitHub, often to support engineering activities; however, we also nd no clear evidence that donations inuence the activity level of a project. In fact, we nd that donations are used in a multitude of ways, raising new research questions about eective funding. 
    more » « less