Title: SciTokens SSH: Token-based Authentication for Remote Login to Scientific Computing Environments
SciTokens SSH is a pluggable authentication module (PAM) that uses JSON Web Tokens (JWTs) for authentication to the Secure Shell (SSH) remote login service. SciTokens SSH supports multiple token issuers with local token verification, so scientific computing providers are not forced to rely on a single OAuth server for token issuance and verification. The decentralized design for SciTokens SSH was motivated by the distributed nature of scientific computing environments, where scientists use computational resources from multiple providers, with a variety of security policies, distributed across the globe.
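To make the verification flow concrete, here is a minimal Python sketch (using the PyJWT library) of the local-verification step such a PAM module performs: mapping the token's issuer to a locally configured trust list, fetching and caching that issuer's public key, and validating the signature and standard claims locally, so that verification does not depend on any single central OAuth server. The issuer URL, JWKS endpoint, and function names are illustrative assumptions, not SciTokens SSH's actual configuration or API.

```python
# Minimal sketch of local JWT verification, similar in spirit to what a
# PAM module performing SSH token authentication must do.
# Requires PyJWT with the crypto extra: pip install "pyjwt[crypto]"
import jwt  # PyJWT

# Hypothetical map of trusted issuers to their JWKS endpoints; a real
# deployment would load this from local configuration so that multiple
# independent issuers can be trusted at once.
TRUSTED_ISSUERS = {
    "https://issuer.example.org": "https://issuer.example.org/jwks",
}

def verify_token(token: str, expected_audience: str) -> dict:
    """Verify a JWT locally against a trusted issuer's public key."""
    # Read the unverified 'iss' claim first to pick the right key set.
    unverified = jwt.decode(token, options={"verify_signature": False})
    issuer = unverified.get("iss")
    if issuer not in TRUSTED_ISSUERS:
        raise PermissionError(f"untrusted issuer: {issuer}")

    # Fetch (and cache) the issuer's signing key, then verify the
    # signature, expiry, audience, and issuer entirely locally.
    jwks_client = jwt.PyJWKClient(TRUSTED_ISSUERS[issuer])
    signing_key = jwks_client.get_signing_key_from_jwt(token)
    return jwt.decode(
        token,
        signing_key.key,
        algorithms=["RS256"],
        audience=expected_audience,
        issuer=issuer,
    )
```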
Award ID(s): 1738962
NSF-PAR ID: 10158905
Author(s) / Creator(s):
Date Published:
Journal Name: Practice and Experience in Advanced Research Computing
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. In modern healthcare, smart medical devices are used to ensure better and more informed patient care. Such devices can connect to and communicate with the hospital's network or a mobile application over Wi-Fi or Bluetooth, allowing doctors to remotely configure them, exchange data, or update the firmware. For example, Cardiovascular Implantable Electronic Devices (CIEDs), more commonly known as pacemakers, are increasingly becoming smarter, connected to the cloud or healthcare information systems, and capable of being programmed remotely. Healthcare providers can upload new configurations to such devices to change the treatment. Such configurations are often exchanged, reused, and/or modified to match the patient's specific health scenario. Such capabilities, unfortunately, come at a price. Malicious entities can provide a faulty configuration to such devices, leading to the patient's death. Any update to the state or configuration of such devices must be thoroughly vetted before it is applied to the device. In case of any adverse events, we must also be able to trace the lineage and propagation of the faulty configuration to determine the cause and liability issues. In a highly distributed environment such as today's hospitals, ensuring the integrity of configurations and security policies is difficult and often requires a complex setup. As configurations propagate, traditional access control and authentication of the healthcare provider applying the configuration are not enough to prevent the installation of malicious configurations. In this paper, we argue that a provenance-based approach can provide an effective solution towards hardening the security of such medical devices. In this approach, devices would maintain a verifiable provenance chain that would allow assessing not just the current state, but also the past history of the device's configuration. Also, any configuration update would be accompanied by its own secure provenance chain, allowing verification of the origin and lineage of the configuration. The ability to protect and verify the provenance of devices and configurations would lead to better patient care, prevent malfunction of the device due to malicious configurations, and allow after-the-fact investigation of device configuration issues. In this paper, we advocate the benefits of such an approach and sketch the requirements, implementation challenges, and deployment strategies for such a provenance-based system.
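As one way to make the proposed provenance chain concrete, the following Python sketch hash-chains configuration records so that tampering with any past entry, or grafting a configuration onto a different lineage, becomes detectable. The field names, SHA-256 linking, and omission of digital signatures are simplifying assumptions for illustration; a deployed system would also sign each entry.

```python
# Hash-chained provenance records for device configurations: each entry
# embeds the hash of its parent, so the full lineage can be verified.
import hashlib
import json
import time

GENESIS = "0" * 64  # parent hash for the first entry in a chain

def make_record(config: dict, author: str, parent_hash: str) -> dict:
    """Create an append-only provenance entry linked to its parent."""
    body = {
        "config": config,
        "author": author,
        "timestamp": time.time(),
        "parent": parent_hash,  # hash of the previous entry in the chain
    }
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    return {"body": body, "hash": digest}

def verify_chain(chain: list) -> bool:
    """Check that every record's hash is intact and links to its parent."""
    prev = GENESIS
    for record in chain:
        body = record["body"]
        recomputed = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        if body["parent"] != prev or recomputed != record["hash"]:
            return False
        prev = record["hash"]
    return True

# Example: a pacing-rate update whose origin and lineage can be verified.
genesis = make_record({"pacing_rate_bpm": 60}, "dr.alice", GENESIS)
update = make_record({"pacing_rate_bpm": 72}, "dr.bob", genesis["hash"])
assert verify_chain([genesis, update])
```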
  2. The management of security credentials (e.g., passwords, secret keys) for computational science workflows is a burden for scientists and information security officers. Problems with credentials (e.g., expiration, privilege mismatch) cause workflows to fail to fetch needed input data or store valuable scientific results, distracting scientists from their research by requiring them to diagnose the problems, re-run their computations, and wait longer for their results. In this paper, we introduce SciTokens, open source software to help scientists manage their security credentials more reliably and securely. We describe the SciTokens system architecture, design, and implementation addressing use cases from the Laser Interferometer Gravitational-Wave Observatory (LIGO) Scientific Collaboration and the Large Synoptic Survey Telescope (LSST) projects. We also present our integration with widely used software that supports distributed scientific computing, including HTCondor, CVMFS, and XrootD. SciTokens uses IETF-standard OAuth tokens for capability-based secure access to remote scientific data. The access tokens convey the specific authorizations needed by the workflows, rather than general-purpose authentication impersonation credentials, to address the risks of scientific workflows running on distributed infrastructure including NSF resources (e.g., LIGO Data Grid, Open Science Grid, XSEDE) and public clouds (e.g., Amazon Web Services, Google Cloud, Microsoft Azure). By improving the interoperability and security of scientific workflows, SciTokens 1) enables use of distributed computing for scientific domains that require greater data protection and 2) enables use of more widely distributed computing resources by reducing the risk of credential abuse on remote systems.
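As an illustration of this token model, the following hedged Python sketch (again using PyJWT) mints a short-lived token whose scope claim names only the specific operations a workflow needs. The issuer URL, audience, and paths are illustrative assumptions, not LIGO's or LSST's actual deployment values.

```python
# Minting a capability-style access token with scope claims.
# Requires PyJWT with the crypto extra: pip install "pyjwt[crypto]"
import time

import jwt  # PyJWT
from cryptography.hazmat.primitives.asymmetric import rsa

# Demo-only key pair; a real issuer publishes its public key via JWKS.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

claims = {
    "iss": "https://issuer.example.org",    # hypothetical token issuer
    "sub": "workflow-1234",                 # the workflow, not a person
    "aud": "https://storage.example.org",   # hypothetical storage service
    "exp": int(time.time()) + 600,          # short-lived: 10 minutes
    # Capability claims: only the specific operations the workflow needs,
    # rather than a general-purpose impersonation credential.
    "scope": "read:/frames/o3 write:/results/workflow-1234",
}

token = jwt.encode(claims, private_key, algorithm="RS256")
```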
  3. The management of security credentials (e.g., passwords, secret keys) for computational science workflows is a burden for scientists and information security officers. Problems with credentials (e.g., expiration, privilege mismatch) cause workflows to fail to fetch needed input data or store valuable scientific results, distracting scientists from their research by requiring them to diagnose the problems, re-run their computations, and wait longer for their results. SciTokens introduces a capabilities-based authorization infrastructure for distributed scientific computing, to help scientists manage their security credentials more reliably and securely. SciTokens uses IETF-standard OAuth JSON Web Tokens for capability-based secure access to remote scientific data. These access tokens convey the specific authorizations needed by the workflows, rather than general-purpose authentication impersonation credentials, to address the risks of scientific workflows running on distributed infrastructure including NSF resources (e.g., LIGO Data Grid, Open Science Grid, XSEDE) and public clouds (e.g., Amazon Web Services, Google Cloud, Microsoft Azure). By improving the interoperability and security of scientific workflows, SciTokens 1) enables use of distributed computing for scientific domains that require greater data protection and 2) enables use of more widely distributed computing resources by reducing the risk of credential abuse on remote systems. In this extended abstract, we present the results over the past year of our open source implementation of the SciTokens model and its deployment in the Open Science Grid, including new OAuth support added in the HTCondor 8.8 release series. 
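Complementing the minting sketch above, here is a hedged Python sketch of the resource-side check: the storage service grants an operation only if some scope in the already-verified token covers it. The `op:/path` scope syntax follows the SciTokens convention, but the simple prefix-matching rule here is an illustrative assumption, not the normative matching algorithm.

```python
# Capability enforcement: authorize a request only if a scope covers it.
def authorized(scopes: str, op: str, path: str) -> bool:
    """Return True if any 'op:/prefix' scope covers the requested path."""
    for scope in scopes.split():
        scope_op, _, scope_path = scope.partition(":")
        if scope_op == op and (
            path == scope_path
            or path.startswith(scope_path.rstrip("/") + "/")
        ):
            return True
    return False

# Using the scope claim from the minting sketch above:
scopes = "read:/frames/o3 write:/results/workflow-1234"
assert authorized(scopes, "read", "/frames/o3/segment-001.gwf")
assert not authorized(scopes, "write", "/frames/o3/segment-001.gwf")
```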
  4. Reed, Daniel A.; Lifka, David; Swanson, David; Amaro, Rommie; Wilkins-Diehr, Nancy (Eds.)
    This report summarizes the discussions from a workshop convened at NSF on May 30-31, 2018 in Alexandria, VA. The overarching objective of the workshop was to rethink the nature and composition of the NSF-supported computational ecosystem given changing application requirements and an evolving resource and technology landscape. The workshop included roughly 50 participants, drawn from high-performance computing (HPC) centers, campus computing facilities, cloud service providers (academic and commercial), and distributed resource providers. Participants spanned both large research institutions and smaller universities. Organized by Daniel Reed (University of Utah, chair), David Lifka (Cornell University), David Swanson (University of Nebraska), Rommie Amaro (UCSD), and Nancy Wilkins-Diehr (UCSD/SDSC), the workshop was motivated by the following observations. First, there have been dramatic changes in the number and nature of applications using NSF-funded resources, as well as their resource needs. As a result, there are new demands on the type (e.g., data centric) and location (e.g., close to the data or the users) of the resources, as well as new usage modes (e.g., on-demand and elastic). Second, there have been dramatic changes in the landscape of technologies, resources, and delivery mechanisms, spanning large scientific instruments, ubiquitous sensors, and cloud services, among others.
  5. Hypervisors are widely deployed by cloud computing providers to support virtual machines, but their growing complexity poses a security risk, as large codebases contain many vulnerabilities. We present SeKVM, a layered Linux KVM hypervisor architecture that has been formally verified on multiprocessor hardware. Using layers, we isolate KVM's trusted computing base into a small core such that only the core needs to be verified to ensure KVM's security guarantees. Using layers, we model hardware features at different levels of abstraction tailored to each layer of software. Lower hypervisor layers that configure and control hardware are verified using a novel machine model that includes multiprocessor memory management hardware such as multi-level shared page tables, tagged TLBs, and a coherent cache hierarchy with cache bypass support. Higher hypervisor layers that build on the lower layers are then verified using a more abstract and simplified model, taking advantage of layer encapsulation to reduce proof burden. Furthermore, layers provide modularity to reduce verification effort across multiple implementation versions. We have retrofitted and verified multiple versions of KVM on Arm multiprocessor hardware, proving the correctness of the implementations and that they contain no vulnerabilities that can affect KVM's security guarantees. Our work is the first machine-checked proof for a commodity hypervisor using multiprocessor memory management hardware. SeKVM requires only modest KVM modifications and incurs only modest performance overhead versus unmodified KVM on real application workloads. 