Title: Data Integration Tasks on Heterogeneous Systems Using OpenCL
In the era of big data, many new algorithms are developed to find the most efficient way to perform computations over massive amounts of data. However, the preprocessing step these applications depend on is often overlooked. The Data Integration Benchmark Suite (DIBS) was designed to characterize dataset transformations in a hardware-agnostic way. While on the surface these applications exhibit a high degree of data parallelism, caveats in their specifications can limit that parallelism in practice. Even so, OpenCL can be an effective deployment environment for these applications. In this work we take a subset of the data transformations from each category in DIBS and implement them in OpenCL to evaluate their performance on heterogeneous systems. To target heterogeneous systems, we take a common application and deploy it to the three platform classes OpenCL can target (CPU, GPU, and FPGA). The applications are evaluated by their average transformation data rate. We illustrate the advantages of each compute device in the data integration space, along with the host/device communication schemes the OpenCL platform allows.
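To make the deployment model concrete, here is a minimal sketch of the kind of byte-level, data-parallel transformation and host/device traffic discussed above. It uses pyopencl and a made-up delimiter-swap kernel purely for illustration; the original work does not specify its host language, and this is not one of the DIBS transformations. The timing includes the device-to-host copy, reflecting the host/device communication cost that factors into the average transformation data rate.

```python
# Illustrative only: a byte-wise delimiter-swap "transformation" in OpenCL,
# driven from Python via pyopencl. Kernel, data, and sizes are made up.
import time
import numpy as np
import pyopencl as cl

KERNEL_SRC = r"""
__kernel void swap_delim(__global const uchar *src, __global uchar *dst) {
    size_t i = get_global_id(0);
    uchar c = src[i];
    dst[i] = (c == ',') ? (uchar)'\t' : c;   /* one byte per work-item */
}
"""

ctx = cl.create_some_context()            # picks a CPU, GPU, or FPGA device
queue = cl.CommandQueue(ctx)
prog = cl.Program(ctx, KERNEL_SRC).build()

text = ("field1,field2,field3\n" * 100_000).encode("ascii")
src = np.frombuffer(text, dtype=np.uint8).copy()
dst = np.empty_like(src)

mf = cl.mem_flags
src_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=src)
dst_buf = cl.Buffer(ctx, mf.WRITE_ONLY, dst.nbytes)

t0 = time.perf_counter()
prog.swap_delim(queue, src.shape, None, src_buf, dst_buf)
cl.enqueue_copy(queue, dst, dst_buf)      # device -> host transfer
queue.finish()
elapsed = time.perf_counter() - t0

print(f"transformed {src.nbytes / 1e6:.1f} MB "
      f"at {src.nbytes / elapsed / 1e6:.1f} MB/s")
```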
Award ID(s):
1763503 1527510
NSF-PAR ID:
10108237
Author(s) / Creator(s):
Date Published:
Journal Name:
Proc. of 7th International Workshop on OpenCL
Page Range / eLocation ID:
1 to 1
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Python's ease of use and rich collection of numeric libraries make it an excellent choice for rapidly developing scientific applications. However, composing these libraries to take advantage of complex heterogeneous nodes is still difficult. To simplify writing multi-device code, we created Parla, a heterogeneous task-based programming framework that fully supports Python's scientific programming stack. Parla's API is based on Python decorators and allows users to wrap code in Parla tasks for parallel execution. Parla arrays enable automatic movement of data between devices. The Parla runtime handles resource-aware mapping, scheduling, and execution of tasks. Compared to other Python tasking systems, Parla is unique in its parallelization of tasks within a single process, its GPU context and resource-aware runtime, and its design around gradual adoption to provide easy migration of and integration into existing Python applications. We show that Parla can achieve performance competitive with hand-optimized code while improving ease of development. 
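As a rough illustration of the decorator-wrapping pattern the abstract describes, the sketch below turns a function into a task whose calls run concurrently within a single process. This is a hypothetical mini-framework, not Parla's actual API; a real system like Parla additionally handles data movement between devices and resource-aware mapping and scheduling.

```python
# Hypothetical sketch of a decorator-based tasking pattern; not Parla's API.
from concurrent.futures import ThreadPoolExecutor

_pool = ThreadPoolExecutor()          # tasks run inside a single process

def task(fn):
    """Wrap a function so every call is submitted to the pool as a task."""
    def spawn(*args, **kwargs):
        return _pool.submit(fn, *args, **kwargs)   # returns a Future
    return spawn

@task
def scale(block, alpha):
    return [alpha * x for x in block]

# Launch four tasks, then gather their results.
futures = [scale(list(range(i, i + 4)), 2.0) for i in range(0, 16, 4)]
print([f.result() for f in futures])
```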
  2. Colloidal nanocrystals (NCs) have emerged as a diverse class of materials with tunable composition, size, shape, and surface chemistry. From their facile syntheses to unique optoelectronic properties, these solution-processed nanomaterials are a promising alternative to materials grown as bulk crystals or by vapor-phase methods. However, the integration of colloidal nanomaterials in real-world devices is held back by challenges in making patterned NC films with the resolution, throughput, and cost demanded by device components and applications. Therefore, suitable approaches to pattern NCs need to be established to aid the transition from individual proof-of-concept NC devices to integrated and multiplexed technological systems. In this Account, we discuss the development of stimuli-sensitive surface ligands that enable NCs to be patterned directly with good pattern fidelity while retaining desirable properties. We focus on rationally selected ligands that enable changes in the NC dispersibility by responding to light, electron beam, and/or heat. First, we summarize the fundamental forces between colloidal NCs and discuss the principles behind NC stabilization/destabilization. These principles are applied to understanding the mechanisms of the NC dispersibility change upon stimuli-induced ligand modifications. Six ligand-based patterning mechanisms are introduced: ligand cross-linking, ligand decomposition, ligand desorption, in situ ligand exchange, ion/ligand binding, and ligand-aided increase of ionic strength. We discuss examples of stimuli-sensitive ligands that fall under each mechanism, including their chemical transformations, and address how these ligands are used to pattern either sterically or electrostatically stabilized colloidal NCs. Following that, we explain the rationale behind the exploration of different types of stimuli, as well as the advantages and disadvantages of each stimulus. We then discuss relevant figures-of-merit that should be considered when choosing a particular ligand chemistry or stimulus for patterning NCs. These figures-of-merit pertain to either the pattern quality (e.g., resolution, edge and surface roughness, layer thickness), or to the NC material quality (e.g., photo/electro-luminescence, electrical conductivity, inorganic fraction). We outline the importance of these properties and provide insights on optimizing them. Both the pattern quality and NC quality impact the performance of patterned NC devices such as field-effect transistors, light-emitting diodes, color-conversion pixels, photodetectors, and diffractive optical elements. We also give examples of proof-of-concept patterned NC devices and evaluate their performance. Finally, we provide an outlook on further expanding the chemistry of stimuli-sensitive ligands, improving the NC pattern quality, progress toward 3D printing, and other potential research directions. Ultimately, we hope that the development of a patterning toolbox for NCs will expedite their implementation in a broad range of applications. 
  3.
    Because of the increasing demand for intensive computation in deep neural networks, researchers have developed both hardware and software mechanisms to reduce the compute and memory burden. A widely adopted approach is to use mixed precision data types. However, it is hard to benefit from mixed precision without hardware specialization because of the overhead of data casting. Recently, hardware vendors have offered tensorized instructions specialized for mixed-precision tensor operations, such as Intel VNNI, Nvidia Tensor Core, and ARM DOT. These instructions involve a new computing idiom, which reduces multiple low precision elements into one high precision element. The lack of compilation techniques for this emerging idiom makes it hard to utilize these instructions. In practice, one approach is to use vendor-provided libraries for computationally-intensive kernels, but this is inflexible and prevents further optimizations. Another approach is to manually write hardware intrinsics, which is error-prone and difficult for programmers. Some prior works tried to address this problem by creating compilers for each instruction. This requires excessive effort as the number of tensorized instructions grows. In this work, we develop a compiler framework, UNIT, to unify the compilation for tensorized instructions. The key to this approach is a unified semantics abstraction which makes the integration of new instructions easy, and the reuse of the analysis and transformations possible. Tensorized instructions from different platforms can be compiled via UNIT with moderate effort for favorable performance. Given a tensorized instruction and a tensor operation, UNIT automatically detects the applicability of the instruction, transforms the loop organization of the operation, and rewrites the loop body to take advantage of the tensorized instruction. According to our evaluation, UNIT is able to target various mainstream hardware platforms. The generated end-to-end inference model achieves 1.3x speedup over Intel oneDNN on an x86 CPU, 1.75x speedup over Nvidia cuDNN on an Nvidia GPU, and 1.13x speedup over a carefully tuned TVM solution for ARM DOT on an ARM CPU.
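The "new computing idiom" the abstract refers to can be sketched in plain NumPy as a reference (illustrative only; the tensorized instructions perform this reduction in hardware, e.g. four 8-bit products reduced into one 32-bit lane for VNNI/DOT-style instructions). A compiler such as UNIT must recognize this widen-multiply-accumulate loop nest and map it onto one such instruction.

```python
# Reference semantics of the mixed-precision reduce idiom: int8 multiplies,
# int32 accumulation (what VNNI / Tensor Core / ARM DOT provide in hardware).
import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(-128, 128, size=(4, 64), dtype=np.int8)
b = rng.integers(-128, 128, size=(64, 4), dtype=np.int8)

# Widen before multiplying so products accumulate in int32,
# avoiding the overflow an int8 accumulator would suffer.
acc = a.astype(np.int32) @ b.astype(np.int32)

# Scalar reference: multiply low precision, reduce into a high-precision sum.
ref = np.zeros((4, 4), dtype=np.int32)
for i in range(4):
    for j in range(4):
        s = np.int32(0)
        for k in range(64):
            s += np.int32(a[i, k]) * np.int32(b[k, j])
        ref[i, j] = s

assert np.array_equal(acc, ref)
```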
  4. Wickert, A. (Ed.)

    Abstract. Progress in better understanding and modeling Earth surface systems requires an ongoing integration of data and numerical models. Advances are currently hampered by technical barriers that inhibit finding, accessing, and executing modeling software with related datasets. We propose a design framework for Data Components, which are software packages that provide access to particular research datasets or types of data. Because they use a standard interface based on the Basic Model Interface (BMI), Data Components can function as plug-and-play components within modeling frameworks to facilitate seamless data–model integration. To illustrate the design and potential applications of Data Components and their advantages, we present several case studies in Earth surface processes analysis and modeling. The results demonstrate that the Data Component design provides a consistent and efficient way to access heterogeneous datasets from multiple sources and to seamlessly integrate them with various models. This design supports the creation of open data–model integration workflows that can be discovered, accessed, and reproduced through online data sharing platforms, which promotes data reuse and improves research transparency and reproducibility.

     
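To make the "standard interface" idea concrete, here is a minimal sketch, assumed for illustration only, of a data component that serves a small synthetic rainfall series through BMI-style calls; real CSDMS Data Components implement the full BMI specification and access actual datasets, and the variable name below is only an example.

```python
# Minimal, assumed sketch of a BMI-style Data Component (not the CSDMS code).
import numpy as np

class RainfallDataComponent:
    """Serves a synthetic daily rainfall series through BMI-like methods."""

    def initialize(self, config=None):
        # A real component would parse a config file and open/fetch a dataset.
        self._day = 0
        self._rain = np.array([0.0, 2.5, 11.0, 0.4, 0.0])  # mm/day, made up

    def get_output_var_names(self):
        return ("atmosphere_water__precipitation_rate",)   # illustrative name

    def get_value(self, name, dest):
        dest[0] = self._rain[self._day]
        return dest

    def update(self):
        self._day += 1

    def finalize(self):
        self._rain = None

# Model side: pull data through the interface instead of touching files.
comp = RainfallDataComponent()
comp.initialize()
buf = np.empty(1)
series = []
for _ in range(3):
    comp.get_value("atmosphere_water__precipitation_rate", buf)
    series.append(float(buf[0]))
    comp.update()
comp.finalize()
print(series)   # [0.0, 2.5, 11.0]
```

Because every such component exposes the same calls, swapping in a different dataset leaves the model-side loop unchanged, which is what enables the plug-and-play data–model integration the abstract describes.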
  5.
    The physical architecture of materials plays an integral role in determining material properties and functionality. While many processing techniques now exist for fabricating parts of any shape or size, a couple of techniques have emerged as facile and effective methods for creating unique structures: dealloying and additive manufacturing. This review discusses progress and challenges in the integration of dealloying techniques with the additive manufacturing (AM) platform to take advantage of the material processing capabilities established by each field. These methods are uniquely complementary: not only can we use AM to make nanoporous metals of complex, customized shapes—for instance, with applications in biomedical implants and microfluidics—but dealloying can occur simultaneously during AM to produce unique composite materials with nanoscale features of two interpenetrating phases. We discuss the experimental challenges of implementing these processing methods and how future efforts could be directed to address these difficulties. Our premise is that combining these synergistic techniques offers both new avenues for creating 3D functional materials and new functional materials that cannot be synthesized any other way. Dealloying and AM will continue to grow both independently and together as the materials community realizes the potential of this compelling combination. 