Abstract
For this study, we present and evaluate an improved agent-based modeling framework, the Forecasting Laboratory for Exploring the Evacuation-system, version 2.0 (FLEE 2.0), designed to investigate relationships between hurricane forecast uncertainty and evacuation outcomes. Presented improvements include doubling its spatial resolution, using a quantitative approach to map real-world data onto the model’s virtual world, and increasing the number of possible risk magnitudes for wind, surge, and rain. To assess model realism, we compare FLEE 2.0’s simulated evacuations—specifically its evacuation orders, evacuation rates, and traffic—to available observational data collected during Hurricanes Irma, Dorian, and Ian. FLEE 2.0’s evacuation response is encouraging: the model responds reasonably, and distinctly, to each of the three types of forecast scenarios. FLEE 2.0 represents the spatial distribution of observed evacuation rates well, and relative to a lower-spatial-resolution version of the model, FLEE 2.0 better captures sharp gradients in evacuation behavior across coastlines and metropolitan areas. Quantitatively evaluating FLEE 2.0’s evacuation rates during Irma establishes model errors, uncertainties, and opportunities for improvement. In summary, this paper increases our confidence in FLEE 2.0, develops a framework for evaluating and improving these types of models, and sets the stage for additional analyses to quantify the impacts of forecast track, intensity, and other positional errors on evacuation.

Significance Statement
This paper describes and evaluates an updated version of a modeling system [the Forecasting Laboratory for Exploring the Evacuation-system, version 2.0 (FLEE 2.0)] designed to explore relationships between hurricane forecasts and evacuation impacts. FLEE 2.0’s simulated evacuations compare favorably with different types of observational evacuation data collected during Hurricanes Irma, Dorian, and Ian. A statistical comparison with Irma’s observed evacuation rates highlights uncertainties and opportunities for improvement in FLEE 2.0. In summary, this paper increases our confidence in FLEE 2.0, develops a framework for evaluating these types of models, and provides a foundation for additional work using FLEE 2.0 as a research tool.
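The abstract describes agents that weigh discrete wind, surge, and rain risk magnitudes when deciding whether to evacuate, but it does not give the decision rule. The snippet below is only a minimal, hypothetical sketch of how an agent-based evacuation step of that general shape could look; the HouseholdAgent class, the 0-4 risk scale, and the thresholds are assumptions for illustration, not FLEE 2.0's actual logic.

```python
import random

# Hypothetical sketch of an agent-based evacuation decision step.
# The risk scale (0-4 for wind, surge, and rain) and the thresholds are
# illustrative assumptions, not values taken from FLEE 2.0.

class HouseholdAgent:
    def __init__(self, risk_tolerance):
        self.risk_tolerance = risk_tolerance  # higher = harder to convince
        self.evacuated = False

    def step(self, wind_risk, surge_risk, rain_risk, order_issued):
        # Combine the discrete hazard magnitudes into one perceived risk.
        perceived = max(wind_risk, surge_risk, rain_risk)
        if order_issued:
            perceived += 1  # an evacuation order raises perceived risk
        if not self.evacuated and perceived >= self.risk_tolerance:
            self.evacuated = True

# Simulate a small population under a single forecast scenario.
agents = [HouseholdAgent(risk_tolerance=random.randint(2, 6)) for _ in range(1000)]
for agent in agents:
    agent.step(wind_risk=3, surge_risk=4, rain_risk=2, order_issued=True)

print(sum(a.evacuated for a in agents) / len(agents))  # simulated evacuation rate
```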
virMine 2.0: Identifying Viral Sequences in Microbial Communities
Abstract
Here, we present virMine 2.0, the next generation of the virMine software tool. virMine 2.0 uses an exclusion technique to remove nonviral data from sequencing reads and scores the remaining data based on relatedness to viral elements, eliminating the sole dependency on homology identification.
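The abstract summarizes virMine 2.0's approach only at a high level: exclude data that looks nonviral, then score what remains by relatedness to viral elements rather than relying on homology alone. As a rough conceptual analogue, not virMine 2.0's actual pipeline or scoring scheme, an exclusion-then-score filter over candidate contigs might look like the following; the predicates, weights, and threshold are assumptions.

```python
# Conceptual sketch of an exclusion-then-score filter for candidate viral
# contigs. The annotations, weights, and cutoff are illustrative assumptions;
# virMine 2.0's real pipeline operates on sequencing data with dedicated tools.

def looks_nonviral(contig):
    # Stand-in for the exclusion step, e.g. hits to bacterial marker genes.
    return "BACTERIAL_MARKER" in contig["annotations"]

def viral_relatedness_score(contig):
    # Stand-in for scoring by relatedness to viral elements
    # (e.g. viral hallmark genes, characteristic gene density).
    score = 0.0
    if "VIRAL_HALLMARK" in contig["annotations"]:
        score += 0.7
    if contig["gene_density"] > 1.0:  # genes per kb, illustrative cutoff
        score += 0.3
    return score

def classify(contigs, threshold=0.5):
    candidates = [c for c in contigs if not looks_nonviral(c)]  # exclusion
    return [c for c in candidates if viral_relatedness_score(c) >= threshold]

contigs = [
    {"name": "c1", "annotations": {"VIRAL_HALLMARK"}, "gene_density": 1.2},
    {"name": "c2", "annotations": {"BACTERIAL_MARKER"}, "gene_density": 0.9},
]
print([c["name"] for c in classify(contigs)])  # ['c1']
```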
- Award ID(s): 1661357
- PAR ID: 10374309
- Editor(s): Newton, Irene L.
- Date Published:
- Journal Name: Microbiology Resource Announcements
- Volume: 11
- Issue: 5
- ISSN: 2576-098X
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
MANA-2.0 is a scalable, future-proof design for transparent checkpointing of MPI-based computations. Its network transparency (“network-agnostic”) feature ensures that MANA-2.0 will provide a viable, efficient mechanism for transparently checkpointing MPI applications on current and future supercomputers. MANA-2.0 is an enhancement of previous work, the original MANA, which interposes MPI calls, and is a work in progress intended for production deployment. MANA-2.0 implements a series of new algorithms and features that improve MANA's scalability and reliability, enabling transparent checkpoint-restart over thousands of MPI processes. MANA-2.0 is being tested on today's Cori supercomputer at NERSC using the Cray MPICH library over the Cray GNI network, but it is designed to work over any standard MPI running over an arbitrary network. Two widely used HPC applications were selected to demonstrate the enhanced features of MANA-2.0: GROMACS, a molecular dynamics simulation code with frequent point-to-point communication, and VASP, a materials science code with frequent MPI collective communication. Perhaps the most important lesson to be learned from MANA-2.0 is a series of algorithms and data structures for library-based transformations that enable MPI-based computations over MANA-2.0 to reliably survive the checkpoint-restart transition. (A conceptual sketch of MPI call interposition appears after this list.)
-
The P4 language and programmable switch hardware, like the Intel Tofino, have made it possible for network engineers to write new programs that customize operation of computer networks, thereby improving performance, fault-tolerance, energy use, and security. Unfortunately, possible does not mean easy—there are many implicit constraints that programmers must obey if they wish their programs to compile to specialized networking hardware. In particular, all computations on the same switch must access data structures in a consistent order, or it will not be possible to lay that data out along the switch's packet-processing pipeline. In this paper, we define Lucid 2.0, a new language and type system that guarantees programs access data in a consistent order and hence are pipeline-safe. Lucid 2.0 builds on top of the original Lucid language, which is also pipeline-safe, but lacks the features needed for modular construction of data structure libraries. Hence, Lucid 2.0 adds (1) polymorphism and ordering constraints for code reuse; (2) abstract, hierarchical pipeline locations and data types to support information hiding; (3) compile-time constructors, vectors, and loops to allow for construction of flexible data structures; and (4) type inference to lessen the burden of program annotations. We develop the meta-theory of Lucid 2.0, prove soundness, and show how to encode constraint checking as an SMT problem. We demonstrate the utility of Lucid 2.0 by developing a suite of useful networking libraries and applications that exploit our new language features, including Bloom filters, sketches, cuckoo hash tables, distributed firewalls, DNS reflection defenses, network address translators (NATs), and a probabilistic traffic monitoring service. (A sketch of the consistent-ordering check appears after this list.)
-
Recently, speech foundation models have gained popularity due to their superiority in finetuning downstream ASR tasks. However, models finetuned on certain domains, such as LibriSpeech (adult read speech), behave poorly on other domains (child or noisy speech). One solution could be collecting as much labeled and diverse data as possible for joint finetuning on various domains. However, collecting target-domain speech-text paired data and retraining the model is often costly and computationally expensive. In this paper, we introduce a simple yet effective method, speech only adaptation (SOA), based on speech foundation models (Wav2vec 2.0), which requires only speech input data from the target domain. Specifically, the Wav2vec 2.0 feature encoder is continually pretrained with the Wav2vec 2.0 loss on both the source- and target-domain data for domain adaptation, while the contextual encoder is frozen. Compared to a source-domain finetuned model with the feature encoder frozen during training, we find that replacing the frozen feature encoder with the adapted one provides significant WER improvements on the target domain while preserving the performance of the source domain. The effectiveness of SOA is examined on various low-resource or domain-mismatched ASR settings, including adult-child and clean-noisy speech. (A sketch of the encoder-freezing step appears after this list.)
-
Reusing massive collections of publicly available biomedical data can significantly impact knowledge discovery. However, these public samples and studies are typically described using unstructured plain text, hindering the findability and further reuse of the data. To combat this problem, we propose txt2onto 2.0, a general-purpose method based on natural language processing and machine learning for annotating biomedical unstructured metadata to controlled vocabularies of diseases and tissues. Compared to the previous version (txt2onto 1.0), which uses numerical embeddings as features, this new version uses words as features, resulting in improved interpretability and performance, especially when few positive training instances are available. Txt2onto 2.0 uses embeddings from a large language model during prediction to deal with unseen-yet-relevant words related to each disease and tissue term being predicted from the input text, thereby explaining the basis of every annotation. We demonstrate the generalizability of txt2onto 2.0 by accurately predicting disease annotations for studies from independent datasets, using proteomics and clinical trials as examples. Overall, our approach can annotate biomedical text regardless of experimental types or sources. Code, data, and trained models are available at https://github.com/krishnanlab/txt2onto2.0. (A rough word-feature classifier sketch appears after this list.)
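For the MANA-2.0 summary above, the key idea is interposing MPI calls so a checkpointing layer can reason about communication. MANA itself works at the native MPI library level in C; the Python/mpi4py wrapper below is only a conceptual illustration of call interposition, and the class and log structure are hypothetical.

```python
# Conceptual sketch of interposing MPI calls (see the MANA-2.0 summary above).
# This is not MANA's mechanism; it only shows the general idea of logging
# communication calls so a hypothetical checkpoint routine could inspect them.
from mpi4py import MPI

class InterposedComm:
    """Forwards send/recv to a real communicator while logging each call."""
    def __init__(self, comm):
        self._comm = comm
        self.call_log = []  # read by a hypothetical checkpointing layer

    def send(self, obj, dest, tag=0):
        self.call_log.append(("send", dest, tag))
        return self._comm.send(obj, dest=dest, tag=tag)

    def recv(self, source=MPI.ANY_SOURCE, tag=MPI.ANY_TAG):
        self.call_log.append(("recv", source, tag))
        return self._comm.recv(source=source, tag=tag)

# Application code uses `comm` exactly as it would use MPI.COMM_WORLD for
# send/recv, while the wrapper tracks communication transparently.
comm = InterposedComm(MPI.COMM_WORLD)
```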
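For the Lucid 2.0 summary above, the core pipeline-safety property is that every computation must access shared state in an order consistent with a single global layout. The checker below is a minimal sketch of that ordering check, not Lucid's actual type system; the handler and array names are made up.

```python
# Minimal sketch of a consistent-ordering ("pipeline-safety") check.
# Illustrative only; Lucid 2.0 enforces this via its type system and SMT.

def pipeline_safe(handlers):
    """handlers maps handler name -> list of arrays it accesses, in order.
    Returns True iff one global array order is consistent with every handler."""
    # Build a precedence graph: edge a -> b if some handler touches a before b.
    edges = {}
    for accesses in handlers.values():
        for i, a in enumerate(accesses):
            for b in accesses[i + 1:]:
                edges.setdefault(a, set()).add(b)

    # Pipeline-safe iff the precedence graph is acyclic, i.e. a topological
    # order of the arrays exists and can serve as the physical pipeline layout.
    visiting, done = set(), set()

    def acyclic_from(node):
        if node in done:
            return True
        if node in visiting:  # back edge found: conflicting access orders
            return False
        visiting.add(node)
        ok = all(acyclic_from(n) for n in edges.get(node, ()))
        visiting.discard(node)
        if ok:
            done.add(node)
        return ok

    return all(acyclic_from(n) for n in edges)

print(pipeline_safe({"h1": ["counts", "flags"], "h2": ["counts", "flags"]}))  # True
print(pipeline_safe({"h1": ["counts", "flags"], "h2": ["flags", "counts"]}))  # False
```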
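For the SOA summary above, the described scheme continues pretraining only the Wav2vec 2.0 feature encoder while the contextual (transformer) encoder stays frozen. The snippet below is a hedged sketch of that freezing step using the Hugging Face Wav2Vec2 implementation, not the authors' code; attribute names follow current `transformers` conventions and may differ across versions.

```python
# Hedged sketch of freezing everything except the convolutional feature
# encoder before continued pretraining (see the SOA summary above).
from transformers import Wav2Vec2ForPreTraining

model = Wav2Vec2ForPreTraining.from_pretrained("facebook/wav2vec2-base")

# Freeze all parameters, then re-enable gradients for the feature encoder only.
for param in model.parameters():
    param.requires_grad = False
for param in model.wav2vec2.feature_extractor.parameters():
    param.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable parameters (feature encoder only): {trainable}")
# Continued pretraining would then optimize only these parameters with the
# wav2vec 2.0 self-supervised loss on source- plus target-domain audio.
```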
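For the txt2onto 2.0 summary above, the interpretable part of the method is using words as features for annotating free-text metadata to controlled-vocabulary terms. The snippet below is only a rough analogue of that idea with scikit-learn, not txt2onto 2.0 itself; the training texts, the single "lung" tissue label, and the model choice are illustrative assumptions.

```python
# Rough analogue of word-feature annotation of unstructured sample
# descriptions to one controlled-vocabulary term (see txt2onto 2.0 above).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "RNA-seq of lung adenocarcinoma biopsy",
    "single-cell profiling of lung airway epithelium",
    "liver hepatocyte expression under fasting",
    "whole blood transcriptome of healthy donors",
]
is_lung = [1, 1, 0, 0]  # binary annotation for one hypothetical tissue term

# Word-level features keep the model interpretable: each coefficient maps
# back to a vocabulary word that supports (or argues against) the annotation.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, is_lung)

print(clf.predict(["bulk RNA-seq of lung tumor tissue"]))  # likely [1]
```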