skip to main content

Search for: All records

Creators/Authors contains: "Wang, Jie"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Free, publicly-accessible full text available October 1, 2024
  2. Free, publicly-accessible full text available August 1, 2024
  3. Abstract: Task offloading, which refers to processing (computation-intensive) data at facilitating servers, is an exemplary service that greatly benefits from the fog computing paradigm, which brings computation resources to the edge network for reduced application latency. However, the resource-consuming nature of task execution, as well as the sheer scale of IoT systems, raises an open and challenging question: whether fog is a remedy or a resource drain, considering frequent and massive offloading operations? This question is nontrivial, because participants of offloading processes, i.e., fog nodes, may have diversified technical specifications, while task generators, i.e., task nodes, may employ a variety of criteria to select offloading targets, resulting in an unmanageable space for performance evaluation. To overcome these challenges of heterogeneity, we propose a gravity model that characterizes offloading criteria with various gravity functions, in which individual/system resource consumption can be examined by the device/network effort metrics, respectively. Simulation results show that the proposed gravity model can flexibly describe different offloading schemes in terms of application and node-level behavior. We find that the expected lifetime and device effort of individual tasks decrease as O(1/N) over the network size N , while the network effort decreases much slower, even remain O(1) when load balancing measures are employed, indicating a possible resource drain in the edge network. 
    more » « less
  4. Abstract

    Two-sample tests are important areas aiming to determine whether two collections of observations follow the same distribution or not. We propose two-sample tests based on integral probability metric (IPM) for high-dimensional samples supported on a low-dimensional manifold. We characterize the properties of proposed tests with respect to the number of samples $n$ and the structure of the manifold with intrinsic dimension $d$. When an atlas is given, we propose a two-step test to identify the difference between general distributions, which achieves the type-II risk in the order of $n^{-1/\max \{d,2\}}$. When an atlas is not given, we propose Hölder IPM test that applies for data distributions with $(s,\beta )$-Hölder densities, which achieves the type-II risk in the order of $n^{-(s+\beta )/d}$. To mitigate the heavy computation burden of evaluating the Hölder IPM, we approximate the Hölder function class using neural networks. Based on the approximation theory of neural networks, we show that the neural network IPM test has the type-II risk in the order of $n^{-(s+\beta )/d}$, which is in the same order of the type-II risk as the Hölder IPM test. Our proposed tests are adaptive to low-dimensional geometric structure because their performance crucially depends on the intrinsic dimension instead of the data dimension.

    more » « less
  5. Abstract

    Propagating deformation bands are observed to accommodate the initial plasticity in an as-extruded Mg–1.5Nd alloy under tension using digital-image-correlation. The propagating bands cause an uncommon plateau in the stress–strain response of the alloy prior to restoring a common decreasing work hardening with further straining. Effects of the deformation banding and underlying plateau in the flow stress on small scale yielding are investigated during low cycle fatigue (LCF) and tension of notched specimens. Alternating formation/disappearance of deformation bands in the gauge section of as-extruded LCF specimens during testing is observed to reduce life compared to annealed specimens exhibiting no instabilities. In contrast, the bands deflect the plastic zone ahead of the notch from the principal plane orthogonal to the applied loading inducing positive effect on toughness of the alloy.

    more » « less
  6. With reduced data reuse and parallelism, recent convolutional neural networks (CNNs) create new challenges for FPGA acceleration. Systolic arrays (SAs) are efficient, scalable architectures for convolutional layers, but without proper optimizations, their efficiency drops dramatically for reasons: 1) the different dimensions within same-type layers, 2) the different convolution layers especially transposed and dilated convolutions, and 3) CNN’s complex dataflow graph. Furthermore, significant overheads arise when integrating FPGAs into machine learning frameworks. Therefore, we present a flexible, composable architecture called FlexCNN, which delivers high computation efficiency by employing dynamic tiling, layer fusion, and data layout optimizations. Additionally, we implement a novel versatile SA to process normal, transposed, and dilated convolutions efficiently. FlexCNN also uses a fully-pipelined software-hardware integration that alleviates the software overheads. Moreover, with an automated compilation flow, FlexCNN takes a CNN in the ONNX representation, performs a design space exploration, and generates an FPGA accelerator. The framework is tested using three complex CNNs: OpenPose, U-Net, and E-Net. The architecture optimizations achieve 2.3 × performance improvement. Compared to a standard SA, the versatile SA achieves close-to-ideal speedups, with up to 15.98 × and 13.42 × for transposed and dilated convolutions, with a 6% average area overhead. The pipelined integration leads to a 5 × speedup for OpenPose. 
    more » « less
  7. Transposable elements (TEs) and the silencing machinery of their hosts are engaged in a germline arms-race dynamic that shapes TE accumulation and, therefore, genome size. In animal species with extremely large genomes (>10 Gb), TE accumulation has been pushed to the extreme, prompting the question of whether TE silencing also deviates from typical conditions. To address this question, we characterize TE silencing via two pathways—the piRNA pathway and KRAB-ZFP transcriptional repression—in the male and female gonads of Ranodon sibiricus , a salamander species with a ∼21 Gb genome. We quantify 1) genomic TE diversity, 2) TE expression, and 3) small RNA expression and find a significant relationship between the expression of piRNAs and TEs they target for silencing in both ovaries and testes. We also quantified TE silencing pathway gene expression in R. sibiricus and 14 other vertebrates with genome sizes ranging from 1 to 130 Gb and find no association between pathway expression and genome size. Taken together, our results reveal that the gigantic R. sibiricus genome includes at least 19 putatively active TE superfamilies, all of which are targeted by the piRNA pathway in proportion to their expression levels, suggesting comprehensive piRNA-mediated silencing. Testes have higher TE expression than ovaries, suggesting that they may contribute more to the species’ high genomic TE load. We posit that apparently conflicting interpretations of TE silencing and genomic gigantism in the literature, as well as the absence of a correlation between TE silencing pathway gene expression and genome size, can be reconciled by considering whether the TE community or the host is currently “on the attack” in the arms race dynamic. 
    more » « less