

Search for: All records

Award ID contains: 1738420

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full-text articles may not yet be available free of charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from those of this site.

  1. Free, publicly-accessible full text available January 1, 2025
  2. Free, publicly-accessible full text available October 30, 2024
  3. For the next generation of wireless technologies, Orthogonal Frequency Division Multiplexing (OFDM) remains a key signaling technique. Peak-to-Average Power Ratio (PAPR) reduction must be paired with OFDM to mitigate the detrimentally high PAPR that OFDM exhibits. The cost of PAPR reduction techniques stems from the multiple IFFT iterations they add, which are computationally expensive and increase latency. We propose a novel PAPR estimation technique, PESTNet, which reduces the IFFT operations required by PAPR reduction techniques by using deep learning to estimate the PAPR before the IFFT is applied. This paper gives a brief background on PAPR in OFDM systems and describes the PESTNet algorithm and its training methodology. A case study of the estimation model is provided, where results demonstrate that PESTNet gives an accurate estimate of PAPR and can process large batches of resource grids up to 10 times faster than IFFT-based techniques.
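As a point of reference for the abstract above, the sketch below shows the conventional, IFFT-based way of computing PAPR that PESTNet is designed to bypass. It is a minimal NumPy illustration; the oversampling factor, grid dimensions, and modulation are assumptions, not details from the paper.

```python
# Baseline PAPR computation via the IFFT (the step PESTNet aims to avoid).
import numpy as np

def papr_db(freq_grid: np.ndarray, oversample: int = 4) -> np.ndarray:
    """freq_grid: (num_symbols, num_subcarriers) complex frequency-domain symbols."""
    n_sym, n_sc = freq_grid.shape
    half = n_sc // 2
    # Zero-pad in frequency to oversample the time-domain waveform.
    padded = np.zeros((n_sym, n_sc * oversample), dtype=complex)
    padded[:, :half] = freq_grid[:, :half]       # positive-frequency half
    padded[:, -half:] = freq_grid[:, half:]      # negative-frequency half
    x = np.fft.ifft(padded, axis=1)              # time-domain OFDM waveform
    power = np.abs(x) ** 2
    return 10 * np.log10(power.max(axis=1) / power.mean(axis=1))

# Example: 1000 OFDM symbols with 256 random unit-power subcarriers each.
grid = (np.random.randn(1000, 256) + 1j * np.random.randn(1000, 256)) / np.sqrt(2)
print(papr_db(grid)[:5])     # typical values fall in the 8-12 dB range
```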
  4. Satellite communication (SATCOM) is a critical infrastructure for tactical networks, especially for the intermittent communication of submarines. To ensure data reliability, recent SATCOM research has begun to embrace several advances, such as low Earth orbit (LEO) satellite networks, which reduce latency and increase throughput compared to long-distance geostationary (GEO) satellites, and software-defined networking (SDN), which increases network control and security. This paper proposes a software-defined LEO (SD-LEO) constellation for submarine communication networks, with an architecture that supports Denial-of-Service (DoS) attack detection and classification using the extreme gradient boosting (XGBoost) algorithm. Numerical results demonstrate greater than ninety-eight percent accuracy, precision, recall, and F1-score.
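A hedged sketch of the classification step described above: an XGBoost model separating DoS from benign traffic. The feature set, labels, and data here are synthetic placeholders, since the abstract does not specify them; only the choice of classifier comes from the paper.

```python
# Illustrative XGBoost-based DoS detection on synthetic stand-in features.
import numpy as np
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_recall_fscore_support, accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 6))                   # e.g. packet rate, byte counts, IAT stats (placeholders)
y = (X[:, 0] + 0.5 * X[:, 1] > 1).astype(int)    # toy benign/DoS decision boundary

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
clf.fit(X_tr, y_tr)

pred = clf.predict(X_te)
prec, rec, f1, _ = precision_recall_fscore_support(y_te, pred, average="binary")
print(f"acc={accuracy_score(y_te, pred):.3f} prec={prec:.3f} rec={rec:.3f} f1={f1:.3f}")
```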
  5. As the design space for high-performance computing (HPC) systems grows larger and more complex, modeling and simulation (MODSIM) techniques become more important for optimizing systems. Furthermore, recent extreme-scale systems and newer technologies can lead to higher system fault rates, which negatively affect system performance and other metrics. Therefore, it is important for system designers to consider the effects of faults and fault-tolerance (FT) techniques on system design through MODSIM. BE-SST is an existing MODSIM methodology and workflow that facilitates preliminary exploration and reduction of large design spaces, particularly by highlighting areas of the space for detailed study and pruning less optimal areas. This paper presents the overall methodology for adding fault-tolerance awareness (FT-awareness) to BE-SST. We present the process used to extend BE-SST, enabling the creation of models that predict the time needed to perform a checkpoint instance for a given system configuration. Additionally, this paper presents a case study in which a full HPC system is simulated using BE-SST, including application, hardware, and checkpointing. We validate the models and simulation against actual system measurements, finding an average percent error of less than 17% for the instance models and about 20% for the system simulation, a level of accuracy acceptable for initial exploration and pruning of the design space. Finally, we show how FT-aware simulation results are used to compare FT levels in the design space.
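The abstract above describes models that predict the time of a single checkpoint instance for a given system configuration. The exact model form used in BE-SST is not given here, so the following is a minimal illustrative surrogate, a linear fit over hypothetical configuration parameters, meant only to show the predict-then-validate pattern.

```python
# Toy surrogate: predict checkpoint-instance time from configuration parameters.
# The features and measurements below are hypothetical placeholders.
import numpy as np
from sklearn.linear_model import LinearRegression

# (checkpoint size in GB, node count) -> measured checkpoint time in seconds.
X = np.array([[10, 16], [20, 16], [40, 32], [80, 32], [160, 64], [320, 64]], float)
t = np.array([2.1, 4.0, 4.3, 8.2, 8.9, 17.5])   # placeholder measurements

model = LinearRegression().fit(X, t)
cfg = np.array([[240, 64]])                      # a candidate design point
print("predicted checkpoint time (s):", model.predict(cfg)[0])
print("mean percent error on training data:",
      100 * np.mean(np.abs(model.predict(X) - t) / t))
```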
  6. Large Convolutional Neural Networks (CNNs) are often pruned and compressed to reduce the number of parameters and the memory requirement. However, the resulting irregularity in the sparse data makes it difficult for FPGA accelerators that contain systolic arrays of Multiply-and-Accumulate (MAC) units, such as Intel's FPGA-based Deep Learning Accelerator (DLA), to achieve their maximum potential. Moreover, FPGAs with low-bandwidth off-chip memory cannot satisfy the memory bandwidth requirement of sparse matrix computation. In this paper, we present 1) a sparse matrix packing technique that condenses sparse inputs and filters before feeding them into the systolic array of MAC units in the Intel DLA, and 2) a customization of the Intel DLA that allows the FPGA to efficiently utilize the high-bandwidth memory (HBM2) integrated in the same package. For end-to-end inference with randomly pruned ResNet-50/MobileNet CNN models, our experiments demonstrate a 2.7x/3x performance improvement compared to an FPGA with DDR4, a 2.2x/2.1x speedup against a server-class Intel Skylake CPU, and comparable performance with a 1.7x/2x power-efficiency gain compared to an NVIDIA V100 GPU.
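To make the packing idea above concrete, here is a hedged sketch: zero entries are dropped and only (value, column index) pairs are kept, so the MAC array does useful work on every element. The CSR-style format below is an illustration; the actual packing scheme used in the customized Intel DLA is not described in the abstract.

```python
# CSR-style packing of a pruned weight matrix and a packed matrix-vector product.
import numpy as np

def pack_sparse(mat: np.ndarray):
    """Return per-row packed non-zero values and their column indices."""
    values, indices = [], []
    for row in mat:
        nz = np.nonzero(row)[0]
        values.append(row[nz])
        indices.append(nz)
    return values, indices

def sparse_dot(values, indices, dense_vec):
    """Multiply the packed matrix by a dense vector (the work the MAC units see)."""
    return np.array([v @ dense_vec[idx] for v, idx in zip(values, indices)])

W = np.random.randn(8, 16) * (np.random.rand(8, 16) > 0.8)   # ~80% pruned weights
x = np.random.randn(16)
vals, idxs = pack_sparse(W)
assert np.allclose(sparse_dot(vals, idxs, x), W @ x)          # matches the dense result
```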
  7. AI and deep learning are experiencing explosive growth in almost every domain involving analysis of big data. Deep learning using Deep Neural Networks (DNNs) has shown great promise for such scientific data analysis applications. However, traditional CPU-based sequential computing without special instructions can no longer meet the requirements of mission-critical applications, which are compute-intensive and require low latency and high throughput. Heterogeneous computing (HGC), with CPUs integrated with GPUs, FPGAs, and other science-targeted accelerators, offers unique capabilities to accelerate DNNs. Collaborating researchers at SHREC at the University of Florida, CERN Openlab, NERSC at Lawrence Berkeley National Lab, Dell EMC, and Intel are studying the application of HGC to scientific problems using DNN models. This paper focuses on the use of FPGAs to accelerate the inferencing stage of the HGC workflow. We present case studies and results in inferencing state-of-the-art DNN models for scientific data analysis, using the Intel Distribution of OpenVINO running on an Intel Programmable Acceleration Card (PAC) equipped with an Arria 10 GX FPGA. Using the Intel Deep Learning Acceleration (DLA) development suite to optimize existing FPGA primitives and develop new ones, we were able to accelerate the scientific DNN models under study with speedups from 2.46x to 9.59x for a single Arria 10 FPGA against a single core (single thread) of a server-class Skylake CPU.
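A minimal sketch of the OpenVINO inference flow referenced above, using the current Python API. The model path, input shape, and device string are placeholders; the paper's setup targeted an Arria 10 PAC through the FPGA/HETERO plugins of that era rather than the "CPU" device used here.

```python
# Hedged OpenVINO inference sketch; model.xml is a hypothetical IR produced offline.
import numpy as np
from openvino.runtime import Core

core = Core()
model = core.read_model("model.xml")             # placeholder path to an IR model
compiled = core.compile_model(model, "CPU")      # swap in the FPGA/HETERO device as appropriate

request = compiled.create_infer_request()
batch = np.random.rand(1, 3, 224, 224).astype(np.float32)   # placeholder input tensor
results = request.infer({0: batch})              # inputs keyed by index or name

output = compiled.output(0)
print("output shape:", results[output].shape)
```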
  8. In the past decades, memory devices have been playing catch-up to the improving performance of processors. Although memory performance can be improved by the introduction of various configurations of a memory cache hierarchy, memory remains the performance bottleneck at a system level for big-data analytics and machine learning applications. An emerging solution for this problem is the use of a complementary compute cache architecture, using Compute-in-Memory (CiM) technologies, to bring computation close to memory. CiM implements compute primitives (e.g., arithmetic ops, data-ordering ops) which are simple enough to be embedded in the logic layers of emerging memory devices. Analogous to in-core memory caches, the CiM primitives provide low functionality but high performance by reducing data transfers. In this abstract, we describe a novel methodology to perform design space exploration (DSE) through system-level performance modeling and simulation (MODSIM) of CiM architectures for big-data analytics and machine learning applications. 
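As a flavor of the system-level MODSIM reasoning described above, the sketch below compares data movement for an aggregation query with and without an in-memory reduce primitive. The bandwidth figure and the choice of primitive are illustrative assumptions, not values from the paper.

```python
# First-order data-movement model: baseline (ship all data to the CPU) vs. a
# CiM reduce primitive (ship back only the result).  All numbers are assumed.
def transfer_time_s(bytes_moved: float, bw_gbps: float) -> float:
    return bytes_moved / (bw_gbps * 1e9)

N = 1 << 30                      # elements scanned by a filter/reduce query
elem_bytes = 4
bus_bw_gbps = 25.6               # assumed off-chip DRAM bandwidth

baseline = transfer_time_s(N * elem_bytes, bus_bw_gbps)   # move every element
cim      = transfer_time_s(8, bus_bw_gbps)                # move a single 8-byte result
print(f"baseline transfer: {baseline * 1e3:.1f} ms, CiM transfer: ~{cim * 1e9:.1f} ns")
```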
  9. Wireless infrastructure is steadily evolving into wireless access for all humans and most devices, from 5G to the Internet of Things. This widespread access creates the expectation of custom and adaptive services, from the personal network to the backbone network. In addition, challenges of scale and interoperability exist across networks, applications, and services, requiring an effective wireless network management infrastructure. For this reason, Software-Defined Networks (SDN) have become an attractive research area for wireless and mobile systems. SDN can respond to sporadic topology issues such as dropped packets, message latency, and conflicting resource management, and it can improve collaboration between mobile access points, reduce interference, and increase security options. Until recently, the main focus of wireless SDN has been a more centralized approach, which has issues with scalability, fault tolerance, and security. In this work, we propose a state-of-the-art WAM-SDN system for large-scale network management. We discuss requirements for large-scale wireless distributed WAM-SDN and provide preliminary benchmarking and performance analysis based on our hybrid distributed and decentralized architecture. Keywords: software-defined networks, controller optimization, resilience.
  10. In this paper, we propose a responsive, autonomic, and data-driven adaptive virtual networking framework (RAvN) to detect and mitigate anomalous network behavior. The proposed detection scheme detects both low-rate and high-rate denial-of-service (DoS) attacks using (1) a new centroid-based clustering technique, (2) a proposed Intragroup variance technique for data features within network traffic (C.Intra), and (3) a multivariate Gaussian distribution model fitted to the constant changes in the IP addresses of the network. RAvN integrates the adaptive reconfigurable features of a popular SDN platform, the Open Network Operating System (ONOS); the network performance statistics provided by traffic-monitoring tools (such as TShark or sFlow-RT); and the analytics and decision-making tools provided by new and current machine learning techniques. The decision-making and execution components generate adaptive policy updates (i.e., anomaly mitigation solutions) on the fly to the ONOS SDN controller for updating network configurations and flows. In addition, we compare our anomaly detection schemes for detecting low-rate and high-rate DoS attacks against a commonly used unsupervised machine learning technique, K-means. K-means recorded 72.38% accuracy, while the multivariate clustering and Intragroup variance methods recorded 80.54% and 96.13% accuracy, respectively, a significant performance improvement.
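A hedged sketch of one detection idea from the abstract above: fit a multivariate Gaussian to features of normal traffic and flag low-likelihood samples as anomalous. The actual RAvN features (statistics of changing IP addresses) and threshold-selection procedure are not detailed in the abstract, so the data and the 1% quantile below are placeholders.

```python
# Multivariate-Gaussian anomaly detection on synthetic placeholder features.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(1)
normal_traffic = rng.normal(loc=0.0, scale=1.0, size=(2000, 4))   # training features

mu = normal_traffic.mean(axis=0)
cov = np.cov(normal_traffic, rowvar=False)
model = multivariate_normal(mean=mu, cov=cov)

# Threshold chosen so ~1% of normal samples would be flagged (assumed budget).
threshold = np.quantile(model.logpdf(normal_traffic), 0.01)

test = np.vstack([rng.normal(size=(50, 4)),                  # benign samples
                  rng.normal(loc=4.0, size=(10, 4))])        # DoS-like outliers
is_anomaly = model.logpdf(test) < threshold
print("flagged:", int(is_anomaly.sum()), "of", len(test))
```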