Search for: All records

Award ID contains: 2211557

« Prev Next »

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Detecting political biases of named entities and hashtags on Twitter

https://doi.org/10.1140/epjds/s13688-023-00386-6

Xiao, Zhiping; Zhu, Jeffrey; Wang, Yining; Zhou, Pei; Lam, Wen Hong; Porter, Mason A.; Sun, Yizhou (June 2023, EPJ Data Science)

Abstract Ideological divisions in the United States have become increasingly prominent in daily communication. Accordingly, there has been much research on political polarization, including many recent efforts that take a computational perspective. By detecting political biases in a text document, one can attempt to discern and describe its polarity. Intuitively, the named entities (i.e., the nouns and the phrases that act as nouns) and hashtags in text often carry information about political views. For example, people who use the term “pro-choice” are likely to be liberal and people who use the term “pro-life” are likely to be conservative. In this paper, we seek to reveal political polarities in social-media text data and to quantify these polarities by explicitly assigning a polarity score to entities and hashtags. Although this idea is straightforward, it is difficult to perform such inference in a trustworthy quantitative way. Key challenges include the small number of known labels, the continuous spectrum of political views, and the preservation of both a polarity score and a polarity-neutral semantic meaning in an embedding vector of words. To attempt to overcome these challenges, we propose thePolarity-awareEmbeddingMulti-task learning (PEM) model. This model consists of (1) a self-supervised context-preservation task, (2) an attention-based tweet-level polarity-inference task, and (3) an adversarial learning task that promotes independence between an embedding’s polarity component and its semantic component. Our experimental results demonstrate that ourPEMmodel can successfully learn polarity-aware embeddings that perform well at tweet-level and account-level classification tasks. We examine a variety of applications—including a study of spatial and temporal distributions of polarities and a comparison between tweets from Twitter and posts from Parler—and we thereby demonstrate the effectiveness of ourPEMmodel. We also discuss important limitations of our work and encourage caution when applying thePEMmodel to real-world scenarios.
more » « less
Cross-Modality Program Representation Learning for Electronic Design Automation with High-Level Synthesis

https://doi.org/10.1145/3670474.3685952

Qin, Zongyue; Bai, Yunsheng; Sohrabizadeh, Atefeh; Ding, Zijian; Hu, Ziniu; Sun, Yizhou; Cong, Jason (September 2024, ACM)

In recent years, domain-specific accelerators (DSAs) have gained popularity for applications such as deep learning and autonomous driving. To facilitate DSA designs, programmers use high-level synthesis (HLS) to compile a high-level description written in C/C++ into a design with low-level hardware description languages that eventually synthesize DSAs on circuits. However, creating a highquality HLS design still demands significant domain knowledge, particularly in microarchitecture decisions expressed as pragmas. Thus, it is desirable to automate such decisions with the help of machine learning for predicting the quality of HLS designs, requiring a deeper understanding of the program that consists of original code and pragmas. Naturally, these programs can be considered as sequence data. In addition, these programs can be compiled and converted into a control data flow graph (CDFG). But existing works either fail to leverage both modalities or combine the two in shallow or coarse ways. We propose ProgSG, a model that allows interaction between the source code sequence modality and the graph modality in a deep and fine-grained way. To alleviate the scarcity of labeled designs, a pre-training method is proposed based on a suite of compiler’s data flow analysis tasks. Experimental results show that ProgSG reduces the RMSE of design performance predictions by up to 22%, and identifies designs with an average of 1.10× and 1.26× (up to 8.17× and 13.31×) performance improvement in design space exploration (DSE) task compared to HARP and AutoDSE, respectively.
more » « less
Full Text Available
Learning to Compare Hardware Designs for High-Level Synthesis

https://doi.org/10.1145/3670474.3685940

Bai, Yunsheng; Sohrabizadeh, Atefeh; Ding, Zijian; Liang, Rongjian; Li, Weikai; Wang, Ding; Ren, Haoxing; Sun, Yizhou; Cong, Jason (September 2024, ACM)

High-level synthesis (HLS) is an automated design process that transforms high-level code into optimized hardware designs, enabling rapid development of efficient hardware accelerators for various applications such as image processing, machine learning, and signal processing. To achieve optimal performance, HLS tools rely on pragmas, which are directives inserted into the source code to guide the synthesis process, and these pragmas can have various settings and values that significantly impact the resulting hardware design. State-of the-art ML-based HLS methods, such as harp, first train a deep learning model, typically based on graph neural networks (GNNs) applied to graph-based representations of the source code and its pragmas. They then perform design space exploration (DSE) to explore the pragma design space, rank candidate designs using the trained model, and return the top designs as the final designs. However, traditional DSE methods face challenges due to the highly nonlinear relationship between pragma settings and performance metrics, along with complex interactions between pragmas that affect performance in non-obvious ways. To address these challenges, we propose compareXplore, a novel approach that learns to compare hardware designs for effective HLS optimization. compareXplore introduces a hybrid loss function that combines pairwise preference learning with pointwise performance prediction, enabling the model to capture both relative preferences and absolute performance values. Moreover, we introduce a novel Node Difference Attention module that focuses on the most informative differences between designs, enhancing the model’s ability to identify critical pragmas impacting performance. compareXplore adopts a two-stage DSE approach, where a pointwise prediction model is used for the initial design pruning, followed by a pairwise comparison stage for precise performance verification. Experimental results demonstrate that compareXplore achieves significant improvements in ranking metrics and generates high quality HLS results for the selected designs, outperforming the existing state-of-the-art method.
more » « less
Full Text Available
Towards a Comprehensive Benchmark for High-Level Synthesis Targeted to FPGAs

Bai, Yunsheng; Sohrabizadeh, Atefeh; Qin, Zongyue; Hu, Ziniu; Sun, Yizhou; Cong, Jason (December 2023, 37th Conference on Neural Information Processing Systems (NeurIPS 2023))

High-level synthesis (HLS) aims to raise the abstraction layer in hardware design, enabling the design of domain-specific accelerators (DSAs) targeted for field- programmable gate arrays (FPGAs) using C/C++ instead of hardware description languages (HDLs). Compiler directives in the form of pragmas play a crucial role in modifying the microarchitecture within the HLS framework. However, the number of possible microarchitectures grows exponentially with the number of pragmas. Moreover, the evaluation of each candidate design using the HLS tool consumes significant time, ranging from minutes to hours, leading to a slow optimization process. To accelerate this process, machine learning models have been used to predict design quality in milliseconds. However, existing open-source datasets for training such models are limited in terms of design complexity and available optimizations. In this paper, we present HLSYN, a new benchmark that addresses these limitations. It contains more complex programs with a wider range of optimization pragmas, making it a comprehensive dataset for training and evaluating design quality prediction models. The HLSYN benchmark consists of 42 unique programs/kernels, each of which has many different pragma configurations, resulting in over 42,000 labeled designs. We conduct an extensive comparison of state-of-the-art baselines to assess their effectiveness in predicting design quality. As an ongoing project, we anticipate expanding the HLSYN benchmark in terms of both quantity and variety of programs to further support the development of this field.
more » « less
Full Text Available
Robust GNN-Based Representation Learning for HLS

https://doi.org/10.1109/ICCAD57390.2023.10323853

Sohrabizadeh, Atefeh; Bai, Yunsheng; Sun, Yizhou; Cong, Jason (October 2023, IEEE)

The efficient and timely optimization of microarchitecture for a target application is hindered by the long evaluation runtime of a design candidate, creating a serious burden. To tackle this problem, researchers have started using learning algorithms such as graph neural networks (GNNs) to accelerate the process by developing a surrogate of the target tool. However, challenges arise when developing such models for HLS tools due to the program's long dependency range and deeply coupled input program and transformations (i.e., pragmas). To address them, in this paper, we present HARP ( H ierarchical A ugmentation for R epresentation with P ragma optimization) with a novel hierarchical graph representation of the HLS design by introducing auxiliary nodes to include high-level hierarchical information about the design. Additionally, HARP decouples the representation of the program and its transformations and includes a neural pragma transformer (NPT) approach to facilitate a more systematic treatment of this process. Our proposed graph representation and model architecture of HARP not only enhance the performance of the model and design space exploration based on it but also improve the model's transfer learning capability, enabling easier adaptation to new environments 1 1 All materials available at https://github.com/UCLA-VAST/HARP.
more » « less
Full Text Available
Generalizing Graph ODE for Learning Complex System Dynamics across Environments

https://doi.org/10.1145/3580305.3599362

Huang, Zijie; Sun, Yizhou; Wang, Wei (August 2023, Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD ’23))

Learning multi-agent system dynamics has been extensively studied for various real-world applications, such as molecular dynamics in biology, multi-body system in physics, and particle dynamics in material science. Most of the existing models are built to learn single system dynamics, which learn the dynamics from observed historical data and predict the future trajectory. In practice, however, we might observe multiple systems that are generated across different environments, which differ in latent exogenous factors such as temperature and gravity. One simple solution is to learn multiple environment-specific models, but it fails to exploit the potential commonalities among the dynamics across environments and offers poor prediction results where per-environment data is sparse or limited. Here, we present GG-ODE (Generalized Graph Ordinary Differential Equations), a machine learning framework for learning continuous multi-agent system dynamics across environments. Our model learns system dynamics using neural ordinary differential equations (ODE) parameterized by Graph Neural Networks (GNNs) to capture the continuous interaction among agents. We achieve the model generalization by assuming the dynamics across different environments are governed by common physics laws that can be captured via learning a shared ODE function. The distinct latent exogenous factors learned for each environment are incorporated into the ODE function to account for their differences. To improve model performance, we additionally design two regularization losses to (1) enforce the orthogonality between the learned initial states and exogenous factors via mutual information minimization; and (2) reduce the temporal variance of learned exogenous factors within the same system via contrastive learning. Experiments over various physical simulations show that our model can accurately predict system dynamics, especially in the long range, and can generalize well to new systems with few observations.
more » « less
Full Text Available
CF-GODE: Continuous-Time Causal Inference for Multi-Agent Dynamical Systems

https://doi.org/10.1145/3580305.3599272

Jiang, Song; Huang, Zijie; Luo, Xiao; Sun, Yizhou (August 2023, KDD’23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining)

Multi-agent dynamical systems refer to scenarios where multiple units (aka agents) interact with each other and evolve collectively over time. For instance, people’s health conditions are mutually influenced. Receiving vaccinations not only strengthens the longterm health status of one unit but also provides protection for those in their immediate surroundings. To make informed decisions in multi-agent dynamical systems, such as determining the optimal vaccine distribution plan, it is essential for decision-makers to estimate the continuous-time counterfactual outcomes. However, existing studies of causal inference over time rely on the assumption that units are mutually independent, which is not valid for multi-agent dynamical systems. In this paper, we aim to bridge this gap and study how to estimate counterfactual outcomes in multi-agent dynamical systems. Causal inference in a multi-agent dynamical system has unique challenges: 1) Confounders are timevarying and are present in both individual unit covariates and those of other units; 2) Units are affected by not only their own but also others’ treatments; 3) The treatments are naturally dynamic, such as receiving vaccines and boosters in a seasonal manner. To this end, we model a multi-agent dynamical system as a graph and propose a novel model called CF-GODE (CounterFactual Graph Ordinary Differential Equations). CF-GODE is a causal model that estimates continuous-time counterfactual outcomes in the presence of inter-dependencies between units. To facilitate continuous-time estimation,we propose Treatment-Induced GraphODE, a novel ordinary differential equation based on graph neural networks (GNNs), which can incorporate dynamical treatments as additional inputs to predict potential outcomes over time. To remove confounding bias, we propose two domain adversarial learning based objectives that learn balanced continuous representation trajectories, which are not predictive of treatments and interference. We further provide theoretical justification to prove their effectiveness. Experiments on two semi-synthetic datasets confirm that CF-GODE outperforms baselines on counterfactual estimation. We also provide extensive analyses to understand how our model works.
more » « less
Full Text Available
Concept2Box: Joint Geometric Embeddings for Learning Two-View Knowledge Graphs

https://doi.org/10.18653/v1/2023.findings-acl.642

Huang, Zijie; Wang, Daheng; Huang, Binxuan; Zhang, Chenwei; Shang, Jingbo; Liang, Yan; Wang, Zhengyang; Li, Xian; Faloutsos, Christos; Sun, Yizhou; et al (July 2023, Findings of the Association for Computational Linguistics: ACL 2023)

Knowledge graph embeddings (KGE) have been extensively studied to embed large-scale relational data for many real-world applications. Existing methods have long ignored the fact many KGs contain two fundamentally different views: high-level ontology-view concepts and fine-grained instance-view entities. They usually embed all nodes as vectors in one latent space. However, a single geometric representation fails to capture the structural differences between two views and lacks probabilistic semantics towards concepts’ granularity. We propose Concept2Box, a novel approach that jointly embeds the two views of a KG using dual geometric representations. We model concepts with box embeddings, which learn the hierarchy structure and complex relations such as overlap and disjoint among them. Box volumes can be interpreted as concepts’ granularity. Different from concepts, we model entities as vectors. To bridge the gap between concept box embeddings and entity vector embeddings, we propose a novel vector-to-box distance metric and learn both embeddings jointly. Experiments on both the public DBpedia KG and a newly-created industrial KG showed the effectiveness of Concept2Box.
more » « less
Full Text Available
Introducing Semantics into Speech Encoders

https://doi.org/10.18653/v1/2023.acl-long.639

Xu, Derek; Dong, Shuyan; Wang, Changhan; Kim, Suyoun; Lin, Zhaojiang; Liu, Bing; Shrivastava, Akshat; Li, Shang-Wen; Tseng, Liang-Hsuan; Lin, Guan-Ting; et al (July 2023, Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics)

Recent studies find existing self-supervised speech encoders contain primarily acoustic rather than semantic information. As a result, pipelined supervised automatic speech recognition (ASR) to large language model (LLM) systems achieve state-of-the-art results on semantic spoken language tasks by utilizing rich semantic representations from the LLM. These systems come at the cost of labeled audio transcriptions, which is expensive and time-consuming to obtain. We propose a taskagnostic unsupervised way of incorporating semantic information from LLMs into selfsupervised speech encoders without labeled audio transcriptions. By introducing semantics, we improve existing speech encoder spoken language understanding (SLU) performance by over 5% on intent classification (IC), with modest gains in named entity resolution (NER) and slot filling (SF), and spoken question answering (SQA) FF1 score by over 2%. Our approach, which uses no ASR data, achieves similar performance as methods trained on over 100 hours of labeled audio transcripts, demonstrating the feasibility of unsupervised semantic augmentations to existing speech encoders.
more » « less
Full Text Available
Tab-Cleaner: Weakly Supervised Tabular Data Cleaning via Pre-training for E-commerce Catalog

https://doi.org/10.18653/v1/2023.acl-industry.18

Cheng, Kewei; Li, Xian; Wang, Zhengyang; Zhang, Chenwei; Huang, Binxuan; Xu, Yifan Ethan; Dong, Xin Luna; Sun, Yizhou (July 2023, Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics)

Product catalogs, conceptually in the form of text-rich tables, are self-reported by individual retailers and thus inevitably contain noisy facts. Verifying such textual attributes in product catalogs is essential to improve their reliability. However, popular methods for processing free-text content, such as pre-trained language models, are not particularly effective on structured tabular data since they are typically trained on free-form natural language texts. In this paper, we present Tab-Cleaner, a model designed to handle error detection over text-rich tabular data following a pre-training / fine-tuning paradigm. We train Tab-Cleaner on a real-world Amazon Product Catalog table w.r.t millions of products and show improvements over state-of-the-art methods by 16% on PR AUC over attribute applicability classification task and by 11% on PR AUC over attribute value validation task.
more » « less
Full Text Available

« Prev Next »