SLATE (Services Layer at the Edge) is a new project that, when complete, will implement “cyberinfrastructure as code” by augmenting the canonical Science DMZ pattern with a generic, programmable, secure and trusted underlayment platform. This platform will host advanced container-centric services needed for higher-level capabilities such as data transfer nodes, software and data caches, workflow services and science gateway components. SLATE will use best-of-breed data center virtualization components, and where available, software defined networking, to enable distributed automation of deployment and service lifecycle management tasks by domain experts. As such it will simplify creation of scalable platforms that connect research teams, institutions and resources to accelerate science while reducing operational costs and development cycle times. Since SLATE will be designed to require only commodity components for its functional layers, its potential for building distributed systems should extend across all data center types and scales, thus enabling creation of ubiquitous, science-driven cyberinfrastructure. By providing automation and programmatic interfaces to distributed HPC backends and other cyberinfrastructure resources, SLATE will amplify the reach of science gateways and therefore the domain communities they support.
A No Code Approach to Infrastructure Provisioning in Support of Science
Communications infrastructures and compute resources are critical to enabling advanced science research projects. Science cyberinfrastructures must meet clear performance requirements, must be adjustable to changing requirements and must facilitate reproducibility. These characteristics can be met by a programmable infrastructure with guaranteed resources, such as the BRIDGES infrastructure, which enables cross-Atlantic research projects. While programmability should be a foundational design principle for research cyberinfrastructures, by itself it might not be sufficient to enable scientists who have little or no experience with advanced IT technologies to operate their testbeds independently of IT support teams. The trend of offering “no code” platforms that enable users without core IT competency to achieve business goals should manifest itself in the context of research and educational infrastructures as well. In this paper we describe the architecture of a “no code” platform that would enable scientists to easily configure and modify a programmable infrastructure by using a large language model-based interface integrated with the composable services language of the infrastructure. The BRIDGES testbed is used as an example of such an integration, where the functionality benefits projects operated by large, diverse teams.
- Award ID(s):
- 2029218
- PAR ID:
- 10561978
- Publisher / Repository:
- IEEE
- Date Published:
- ISBN:
- 979-8-3503-2458-7
- Page Range / eLocation ID:
- 1 to 2
- Subject(s) / Keyword(s):
- Cyberinfrastructure, AI, LLM, RAG, Virtualization, Science
- Format(s):
- Medium: X
- Location:
- Baltimore, MD, USA
- Sponsoring Org:
- National Science Foundation
More Like this
-
In today’s Big Data era, data scientists require modern workflows to quickly analyze large-scale datasets using complex codes to maintain the rate of scientific progress. These scientists often rely on available campus resources or off-the-shelf computational systems for their applications. Unified infrastructure or over-provisioned servers can quickly become bottlenecks for specific tasks, wasting time and resources. Composable infrastructure helps solve these problems by providing users with new ways to increase resource utilization. Composable infrastructure disaggregates a computer’s components – CPU, GPU (accelerators), storage and networking – into fluid pools of resources, but typically relies upon infrastructure engineers to architect individual machines. Infrastructure is managed with specialized command-line utilities, user interfaces or specification files. These management models are cumbersome and difficult to incorporate into data-science workflows. We developed a high-level software API, Composastructure, which, when integrated into modern workflows, can be used by infrastructure engineers as well as data scientists to reorganize composable resources on demand. Composastructure enables infrastructures to be programmable, secure, persistent and reproducible. Our API composes machines, frees resources, supports multi-rack operations, and includes a Python module for Jupyter Notebooks.
-
We present ConflLlama, demonstrating how efficient fine-tuning of large language models can advance automated classification tasks in political science research. While classification of political events has traditionally relied on manual coding or rigid rule-based systems, modern language models offer the potential for more nuanced, context-aware analysis. However, deploying these models requires overcoming significant technical and resource barriers. We demonstrate how to adapt open-source language models to specialized political science tasks, using conflict event classification as our proof of concept. Through quantization and efficient fine-tuning techniques, we show state-of-the-art performance while minimizing computational requirements. Our approach achieves a macro-averaged AUC of 0.791 and a weighted F1-score of 0.753, representing a 37.6% improvement over the base model, with accuracy gains of up to 1463% in challenging classifications. We offer a roadmap for political scientists to adapt these methods to their own research domains, democratizing access to advanced NLP capabilities across the discipline. This work bridges the gap between cutting-edge AI developments and practical political science research needs, enabling broader adoption of these powerful analytical tools.
-
Temporal Logic (TL) bridges the gap between natural language and formal reasoning in the field of complex systems verification. However, in order to leverage the expressivity entailed by TL, the syntax and semantics must first be understood—a large task in itself. This significant knowledge gap leads to several issues: (1) the likelihood of adopting a TL-based verification method is decreased, and (2) the chance of poorly written and inaccurate requirements is increased. In this ongoing work, we present the Pythonic Formal Requirements Language (PyFoReL) tool: a Domain-Specific Language inspired by the programming language Python to simplify the elicitation of TL-based requirements for engineers and non-experts.
-
The project mission was to organize a workshop aimed at exploring how the US data science community can cooperate with and benefit from collaborations with partners in Serbia and the West Balkan region. The scope included fundamental data science methods and high-impact applications related to big data processing, security and privacy in critical infrastructures, biomedical informatics, and computational archeology. The workshop helped close the gap between data science research in the US and in Serbia and the region, and brought together data scientists with researchers from disciplines that until recently had little exposure to data science methods, potentially enabling collaborative breakthroughs in those scientific fields. A large fraction of participants from both sides were early career researchers, including advanced-level graduate students, postdoctoral research associates, and assistant/associate professors within 10 years of obtaining their Ph.D. The participants included a large fraction of female and minority scientists. The workshop objective was achieved by pursuing the following interrelated objectives: (1) establishing new multidisciplinary international collaborations between data science, mathematics, and sciences that generate big data and require advanced methods; (2) reinforcing collaboration mechanisms between the NSF and Serbia’s Ministry of Education, Science and Technological Development and organizing joint research projects; and (3) widening the impact of the workshop by involving researchers and stakeholders from the West Balkan region. The workshop consisted of four tracks, each co-chaired by 3 investigators from the US, Serbia and another West Balkan country.
Tangible outcomes from the workshop include a report describing workshop activities for each of the four tracks and a proposal recommending research collaboration areas of interest for all parties and determining collaboration mechanisms and programs to facilitate collaboration.