This paper investigates the return on investment (ROI) in cyberinfrastructure (CI) facilities and services by comparing the value of end products created to the cost of operations. We assessed the cost of a US CI facility called XSEDE and the value of the end products created using this facility, categorizing end products according to the International Integrated Reporting Framework. The US federal government invested approximately $0.3B in operating the XSEDE ecosystem from 2016–2022. The estimated value of end products facilitated by XSEDE ranges from around $4.7B to $22.7B or more. Credit for the majority of these end products is shared among various contributors, including the XSEDE ecosystem. Granting the XSEDE ecosystem a seemingly reasonable percentage of credit for its contributions to end product creation suggests that the return on federal investment in the XSEDE ecosystem, in terms of value of end products created, was greater than one and possibly far greater than one. The Framework proved useful for addressing this question. Earlier work showed that the value of services provided by XSEDE was significantly greater than the cost of those services to the US federal government—a positive return on investment for delivery of services. Analyzing the financial efficiency of operations and the financial value of end products are two means for assessing the success of CI facilities in financial terms. Financial analyses should be used as one of many approaches for evaluating the success of CI facilities.
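The credit-sharing argument in the abstract above can be checked with simple arithmetic. The following is a minimal sketch, not the authors' method: it assumes ROI is the credited end-product value divided by cost, uses only the cost and value figures quoted in the abstract, and treats the credit shares as hypothetical illustrations. The function names `roi` and `break_even_share` are introduced here for illustration only.

```python
# Figures quoted in the abstract (US dollars).
COST = 0.3e9          # approximate federal investment in operating XSEDE
VALUE_LOW = 4.7e9     # lower estimate of end-product value
VALUE_HIGH = 22.7e9   # upper estimate of end-product value


def roi(value_usd: float, credit_share: float, cost_usd: float) -> float:
    """Return ROI as (value credited to the facility) / cost.

    Assumption: credit is modeled as a single fraction of total value.
    """
    return value_usd * credit_share / cost_usd


def break_even_share(value_usd: float, cost_usd: float) -> float:
    """Return the credit share at which ROI equals exactly one."""
    return cost_usd / value_usd


if __name__ == "__main__":
    # Hypothetical credit shares; the abstract does not state a figure.
    for share in (0.05, 0.10, 0.25):
        low = roi(VALUE_LOW, share, COST)
        high = roi(VALUE_HIGH, share, COST)
        print(f"credit share {share:.0%}: ROI from {low:.2f} to {high:.2f}")

    print(f"break-even share, low value:  {break_even_share(VALUE_LOW, COST):.1%}")
    print(f"break-even share, high value: {break_even_share(VALUE_HIGH, COST):.1%}")
```

Under these assumptions, ROI exceeds one once the credit share passes about 6.4% at the lower value estimate, or about 1.3% at the upper estimate.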
This paper uses accounting concepts—particularly the concept of Return on Investment (ROI)—to reveal the quantitative value of scientific research pertaining to a major US cyberinfrastructure project (XSEDE—the eXtreme Science and Engineering Discovery Environment). XSEDE provides operational and support services for advanced information technology systems, cloud systems, and supercomputers supporting non-classified US research, with an average budget for XSEDE of US$20M+ per year over the period studied (2014–2021). To assess the financial effectiveness of these services, we calculated a proxy for ROI, and converted quantitative measures of XSEDE service delivery into financial values using costs for service from the US marketplace. We calculated two estimates of ROI: a Conservative Estimate, functioning as a lower bound and using publicly available data for a lower valuation of XSEDE services; and a Best Available Estimate, functioning as a more accurate estimate, but using some unpublished valuation data. Using the largest dataset assembled for analysis of ROI for a cyberinfrastructure project, we found a Conservative Estimate of ROI of 1.87, and a Best Available Estimate of ROI of 3.24. Through accounting methods, we show that XSEDE services offer excellent value to the US government, that the services offered uniquely by XSEDE (that is, not otherwise available for purchase) were the most valuable to the facilitation of US research activities, and that accounting-based concepts hold great value for understanding the mechanisms of scientific research generally.
- NSF-PAR ID:
- 10397646
- Publisher / Repository:
- Springer Science + Business Media
- Date Published:
- Journal Name:
- Scientometrics
- Volume:
- 128
- Issue:
- 6
- ISSN:
- 0138-9130
- Page Range / eLocation ID:
- p. 3225-3255
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
The Allocations Service for the Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support (ACCESS) program is charged with accepting, reviewing, and processing researchers’ requests to use resources that are integrated into the ACCESS ecosystem. We present as a case study the metrics framework used to evaluate the Allocations Service project, a metrics framework that aligns with the project’s goals and identifies key performance indicators (KPIs). Several of our top-level KPIs reflect complex concepts and are composite measures built from suites of metrics compiled from two primary sources: a well-instrumented allocations and accounting system and an annual survey of the ACCESS researcher community. This approach allows us to describe and measure complex concepts such as “democratization” and “ecosystem access time” in a quantitative manner and to target improvements to project activities. The metrics framework is augmented by metrics to measure the performance of the project team, to describe general ecosystem and allocations activity, and to capture publications from the researcher community. We used this framework to gather and present data as part of the ACCESS Allocations Service’s first annual NSF panel review. The metrics were largely successful at communicating our progress, but we also encountered a few unexpected technical issues with the data and calculations themselves, which we are continuing to refine. Presented here as a case study, this approach to a metrics framework for the Allocations Service has proved valuable in complementing more subjective descriptions of the project, its accomplishments, and progress toward our goals.
-
The landscape of research in science and engineering is heavily reliant on computation and data processing. There is continued and expanded usage by disciplines that have historically used advanced computing resources, new usage by disciplines that have not traditionally used HPC, and new modalities of usage in Data Science, Machine Learning, and other areas of AI. Along with these new patterns have come new advanced computing resource methods and approaches, including the availability of commercial cloud resources. The Coalition for Academic Scientific Computation (CASC) has long been an advocate representing the needs of academic researchers using computational resources, sharing best practices and offering advice to create a national cyberinfrastructure to meet US science, engineering, and other academic computing needs. CASC has completed the first of what we intend to be an annual survey of academic cloud and data center usage and practices in analyzing return on investment in cyberinfrastructure. Critically important findings from this first survey include the following: many of the respondents are engaged in some form of analysis of return in research computing investments, but only a minority currently report the results of such analyses to their upper-level administration. Most respondents are experimenting with use of commercial cloud resources, but no respondent indicated that they have found use of commercial cloud services to create financial benefits compared to their current methods. There is a clear correlation between levels of investment in research cyberinfrastructure and the scale of both CPU core-hours delivered and the financial level of supported research grants.
Also interesting is that almost every respondent indicated that they participate in some sort of national cooperative or nationally provided research computing infrastructure project and most were involved in academic computing-related organizations, indicating a high degree of engagement by institutions of higher education in building and maintaining national research computing ecosystems. Institutions continue to evaluate cloud-based HPC service models, despite having generally concluded that so far cloud HPC is too expensive to use compared to their current methods.
-
Large scientific facilities are unique and complex infrastructures that have become fundamental instruments for enabling high quality, world-leading research to tackle scientific problems at unprecedented scales. Cyberinfrastructure (CI) is an essential component of these facilities, providing the user community with access to data, data products, and services with the potential to transform data into knowledge. However, the timely evolution of the CI available at large facilities is challenging and can result in science communities' requirements not being fully satisfied. Furthermore, integrating CI across multiple facilities as part of a scientific workflow is hard, resulting in data silos. In this paper, we explore how science gateways can provide improved user experiences and services that may not be offered at large facility datacenters. Using a science gateway supported by the Science Gateway Community Institute, which provides subscription-based delivery of streamed data and data products from the NSF Ocean Observatories Initiative (OOI), we propose a system that enables streaming-based capabilities and workflows using data from large facilities, such as the OOI, in a scalable manner. We leverage data infrastructure building blocks, such as the Virtual Data Collaboratory, which provides data and computing capabilities in the continuum to efficiently and collaboratively integrate multiple data-centric CIs, build data-driven workflows, and connect large-facility data sources with NSF-funded CI, such as XSEDE. We also introduce architectural solutions for running these workflows using dynamically provisioned federated CI.
-
SLATE (Services Layer at the Edge) is a new project that, when complete, will implement “cyberinfrastructure as code” by augmenting the canonical Science DMZ pattern with a generic, programmable, secure and trusted underlayment platform. This platform will host advanced container-centric services needed for higher-level capabilities such as data transfer nodes, software and data caches, workflow services and science gateway components. SLATE will use best-of-breed data center virtualization components, and where available, software defined networking, to enable distributed automation of deployment and service lifecycle management tasks by domain experts. As such it will simplify creation of scalable platforms that connect research teams, institutions and resources to accelerate science while reducing operational costs and development cycle times. Since SLATE will be designed to require only commodity components for its functional layers, its potential for building distributed systems should extend across all data center types and scales, thus enabling creation of ubiquitous, science-driven cyberinfrastructure. By providing automation and programmatic interfaces to distributed HPC backends and other cyberinfrastructure resources, SLATE will amplify the reach of science gateways and therefore the domain communities they support.