

Search for: All records

Award ID contains: 2107150

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full-text articles may not yet be available without a charge during the embargo period.

Some links on this page may take you to non-federal websites. Their policies may differ from those of this site.

  1. Public sentiment toward the COVID-19 vaccine as expressed on social media can interfere with communication by public health agencies on the importance of getting vaccinated. We investigated Twitter data to understand differences in sentiment, moral values, and language use between political ideologies regarding the COVID-19 vaccine. Guided by the tenets of moral foundations theory (MFT), we estimated political ideology and conducted a sentiment analysis of 262,267 English-language tweets from the United States containing COVID-19 vaccine-related keywords posted between May 2020 and October 2021. We applied the Moral Foundations Dictionary and used topic modeling and Word2Vec to understand the moral values and the context of words central to the vaccine debate. A quadratic trend showed that extreme ideologies among both Liberals and Conservatives expressed more negative sentiment than Moderates, with Conservatives expressing more negative sentiment than Liberals. Compared to Conservative tweets, Liberal tweets were rooted in a wider set of moral values, associated with the moral foundations of care (getting the vaccine for protection), fairness (having access to the vaccine), liberty (related to the vaccine mandate), and authority (trusting the vaccine mandate imposed by the government). Conservative tweets were associated with harm (around the safety of the vaccine) and oppression (around the government mandate). Furthermore, political ideology was associated with the expression of different meanings for the same words, e.g., “science” and “death.” Our results inform public health outreach communication strategies for tailoring vaccine information to different groups.
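The quadratic ("U-shaped") trend described above can be illustrated with a minimal sketch. The ideology buckets and per-tweet negative-sentiment scores below are illustrative stand-ins, not the study's data; the check simply confirms that both ideological extremes score above the moderates, with conservatives highest.

```python
# A minimal sketch (illustrative values, not the study's data) of the kind of
# comparison behind the reported quadratic trend: mean negative sentiment at
# the ideological extremes versus the moderates.
from statistics import mean

# Hypothetical per-tweet negative-sentiment scores, bucketed by estimated ideology.
tweets = {
    "liberal":      [0.62, 0.55, 0.58],
    "moderate":     [0.31, 0.28, 0.35],
    "conservative": [0.71, 0.66, 0.69],
}

means = {group: mean(scores) for group, scores in tweets.items()}

# The U-shaped pattern: both extremes above the moderates, with
# conservatives the most negative, matching the abstract's finding.
u_shaped = means["liberal"] > means["moderate"] < means["conservative"]
print(means["moderate"] < means["liberal"] < means["conservative"])  # True
```

In the actual study this pattern was established by fitting a quadratic trend over a continuous ideology estimate rather than three coarse buckets.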
  2. Using GUI-based workflows for data analysis is an iterative process. During each iteration, an analyst makes changes to the workflow to improve it, generating a new version each time. The results produced by executing these versions are materialized to help users refer to them in the future. In many cases, a new version of the workflow, when submitted for execution, produces a result equivalent to that of a previous one. Identifying such equivalence can save computational resources and time by reusing the materialized result. One way to optimize the performance of executing a new version is to compare the current version with a previous one and test whether they produce the same results using a workflow version equivalence verifier. As the number of versions grows, this testing can become a computational bottleneck. In this paper, we present Raven, an optimization framework to accelerate the execution of a new version request by detecting and reusing the results of previous equivalent versions with the help of a version equivalence verifier. Raven ranks and prunes the set of prior versions to quickly identify those that may produce a result equivalent to the requested version. Additionally, when the verifier performs computation to verify the equivalence of a version pair, there may be a significant overlap with previously tested version pairs. Raven identifies and avoids such repeated computations by extending the verifier to reuse previous knowledge of equivalence tests. We evaluated the effectiveness of Raven compared to baselines on real workflows and datasets.
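The reuse idea above can be sketched in a few lines. This is a hedged toy, not Raven's implementation: the "verifier" here is just a canonical-form comparison, and all names (`normalize`, `execute`, `results_store`) are illustrative. The two ingredients it shows are (1) consulting prior versions' materialized results before executing, and (2) caching verified pairs so no equivalence test is repeated.

```python
# Toy sketch of result reuse via an equivalence verifier (assumed names,
# not Raven's API). Workflows are lists of operator names; "equivalence"
# is stand-in canonicalization; lru_cache memoizes verified version pairs.
from functools import lru_cache

results_store = {}   # version id -> materialized result
versions = {}        # version id -> workflow definition

def normalize(workflow):
    # Stand-in for a real verifier's canonical form.
    return tuple(sorted(workflow))

@lru_cache(maxsize=None)
def equivalent(vid_a, vid_b):
    # Cached, so each version pair is verified at most once.
    return normalize(versions[vid_a]) == normalize(versions[vid_b])

def execute(vid, workflow, run):
    versions[vid] = workflow
    for prior in results_store:
        if equivalent(prior, vid):
            return results_store[prior]       # reuse, skip execution
    results_store[vid] = run(workflow)        # actually execute
    return results_store[vid]

runs = []
def run(wf):
    runs.append(wf)
    return sorted(wf)

execute("v1", ["filter", "scan"], run)
execute("v2", ["scan", "filter"], run)   # equivalent to v1: reused, not re-run
print(len(runs))  # 1
```

Raven's contribution is precisely what this sketch omits: ranking and pruning the prior versions so the verifier is invoked on few candidates, and sharing partial verification work across pairs.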
  3. Data analytics using workflows is an iterative process, in which an analyst makes many iterations of changes, such as additions, deletions, and alterations of operators and their links. In many cases, the analyst wants to compare these workflow versions and their execution results to help in deciding the next iterations of changes. Moreover, the analyst needs to know which versions produced undesired results to avoid refining the workflow in those versions. To enable the analyst to get an overview of the workflow versions and their results, we introduce Drove, a framework that manages the end-to-end lifecycle of constructing, refining, and executing workflows on large data sets and provides a dashboard to monitor these execution results. In many cases, the result of an execution is the same as the result of a prior execution. Identifying such equivalence between the execution results of different workflow versions is important for two reasons. First, it can help us reduce the storage cost of the results by storing equivalent results only once. Second, stored results of early executions can be reused for future executions with the same results. Existing tools that track such executions are geared towards small-scale data and lack the means to reuse existing results in future executions. In Drove, we reason about the semantic equivalence of the workflow versions to reduce the storage space and reuse the materialized results.
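The "store equivalent results only once" idea can be sketched with content-addressed storage. This is an assumption-laden toy, not Drove's design: Drove reasons about semantic equivalence of workflow versions, whereas this sketch only deduplicates by hashing the materialized results themselves; all class and method names are hypothetical.

```python
# Hedged sketch: deduplicated result storage keyed by a content hash.
# (Illustrative names; Drove itself reasons about version equivalence
# rather than hashing raw results.)
import hashlib
import json

class ResultStore:
    def __init__(self):
        self._blobs = {}     # content hash -> result payload
        self._index = {}     # version id -> content hash

    def put(self, version_id, result):
        key = hashlib.sha256(
            json.dumps(result, sort_keys=True).encode()
        ).hexdigest()
        self._blobs.setdefault(key, result)   # identical results share one copy
        self._index[version_id] = key

    def get(self, version_id):
        return self._blobs[self._index[version_id]]

    def unique_copies(self):
        return len(self._blobs)

store = ResultStore()
store.put("v1", {"rows": [1, 2, 3]})
store.put("v2", {"rows": [1, 2, 3]})   # same result: stored once
print(store.unique_copies())  # 1
```

Hashing results requires producing them first; reasoning about version equivalence, as Drove does, can additionally skip the execution that would produce the duplicate.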
  4. We will demonstrate a prototype query-processing engine that utilizes correlations among predicates to accelerate machine learning (ML) inference queries on unstructured data. Expensive operators such as feature extractors and classifiers are deployed as user-defined functions (UDFs), which are not penetrable by classic query optimization techniques such as predicate push-down. Recent optimization schemes (e.g., Probabilistic Predicates, or PP) build a cheap proxy model for each predicate offline and inject these proxy models ahead of the expensive ML UDFs, assuming the predicates are independent. Input records that do not satisfy query predicates are filtered early by the proxy models to bypass the ML UDFs. However, enforcing the independence assumption may result in sub-optimal plans. We use correlative proxy models to better exploit predicate correlations and accelerate ML queries. We will demonstrate our query optimizer, called CORE, which builds proxy models online, allocates parameters to each model, and reorders them. We will also show end-to-end query processing with and without proxy models.
  5. Collaborative data analytics is becoming increasingly important due to the higher complexity of data science, more diverse skills from different disciplines, more common asynchronous schedules of team members, and the global trend of working remotely. In this demo we will show how Texera supports this emerging computing paradigm to achieve high productivity among collaborators with various backgrounds. Based on our active joint projects on the system, we use a scenario of social media analysis to show how a data science task can be conducted on a user-friendly yet powerful platform by a multi-disciplinary team including domain scientists with limited coding skills and experienced machine learning experts. We will present how to do collaborative editing of a workflow and collaborative execution of the workflow in Texera. We will focus on data-centric features such as synchronization of operator schemas among the users during the construction phase, and monitoring and controlling the shared runtime during the execution phase.
  6. We consider accelerating machine learning (ML) inference queries on unstructured datasets. Expensive operators such as feature extractors and classifiers are deployed as user-defined functions (UDFs), which are not penetrable with classic query optimization techniques such as predicate push-down. Recent optimization schemes (e.g., Probabilistic Predicates, or PP) assume independence among the query predicates, build a proxy model for each predicate offline, and rewrite a new query by injecting these cheap proxy models in front of the expensive ML UDFs. In such a manner, unlikely inputs that do not satisfy query predicates are filtered early to bypass the ML UDFs. We show that enforcing the independence assumption in this context may result in sub-optimal plans. In this paper, we propose CORE, a query optimizer that better exploits the predicate correlations and accelerates ML inference queries. Our solution builds the proxy models online for a new query and leverages a branch-and-bound search process to reduce the building costs. Results on three real-world text, image, and video datasets show that CORE improves the query throughput by up to 63% compared to PP and up to 80% compared to running the queries as-is.
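The proxy-model prefiltering that PP and CORE build on can be sketched simply. This is a schematic toy under assumed names, not either system's implementation: the proxies here are plain score/threshold pairs, and the correlation-aware ordering, online model building, and branch-and-bound search that distinguish CORE are abstracted away.

```python
# Simplified sketch of proxy-model prefiltering for ML inference queries:
# cheap proxies run first and discard records unlikely to satisfy the
# predicates, so the expensive UDF runs on fewer inputs. (Illustrative
# names; thresholds and model ordering are what PP/CORE actually optimize.)
def run_query(records, proxies, expensive_udf):
    """proxies: list of (score_fn, threshold); a record reaches the
    expensive UDF only if every proxy scores it at or above its threshold."""
    out = []
    for r in records:
        if all(score(r) >= t for score, t in proxies):
            if expensive_udf(r):              # ground-truth predicate
                out.append(r)
    return out

records = list(range(10))
expensive_calls = []
def udf(r):
    expensive_calls.append(r)   # count how often the costly model runs
    return r >= 8

proxies = [(lambda r: r, 5)]    # toy proxy: keep records scoring >= 5
print(run_query(records, proxies, udf), len(expensive_calls))  # [8, 9] 5
```

Note the trade-off the sketch hides: an aggressive proxy threshold can drop records the UDF would have accepted, which is why PP and CORE calibrate proxies against an accuracy target rather than picking thresholds by hand.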
  7. While it has been scientifically proven that the COVID-19 vaccine is a safe and effective measure to reduce the severity of infection and curb the spread of the SARS-CoV-2 virus, skepticism remains widespread, and in many countries vaccine mandates have been met with strong opposition. In this study, we applied machine learning-based analyses to U.S.-based tweets covering the periods leading up to and following the Biden Administration’s announcement of federal vaccine mandates, supplemented by a qualitative content analysis of a random sample of relevant tweets. The objective was to examine the beliefs held among Twitter users toward vaccine mandates, as well as the evidence they used to support their positions. The results show that while approximately 30% of the Twitter users included in the dataset supported the measure, more users expressed differing opinions. Concerns raised included questions about the political motives behind the mandates, the infringement of personal liberties, and ineffectiveness in preventing infection.
  8. Introduction Twitter represents a mainstream news source for the American public, offering a valuable vehicle for learning how citizens make sense of pandemic health threats like Covid-19. Masking as a risk mitigation measure became controversial in the US. The social amplification risk framework offers insight into how a risk event interacts with psychological, social, institutional, and cultural communication processes to shape Covid-19 risk perception. Methods Qualitative content analysis was conducted on 7,024 mask tweets reflecting 6,286 users between January 24 and July 7, 2020, to identify how citizens expressed Covid-19 risk perception over time. Descriptive statistics were computed for (a) proportion of tweets using hyperlinks, (b) mentions, (c) hashtags, (d) questions, and (e) location. Results Six themes emerged regarding how mask tweets amplified and attenuated Covid-19 risk: (a) severity perceptions (18.0%) steadily increased across 5 months; (b) mask effectiveness debates (10.7%) persisted; (c) who is at risk (26.4%) peaked in April and May 2020; (d) mask guidelines (15.6%) peaked April 3, 2020, with federal guidelines; (e) political legitimizing of Covid-19 risk (18.3%) steadily increased; and (f) mask behavior of others (31.6%) composed the largest discussion category and increased over time. Of tweets, 45% contained a hyperlink, 40% contained mentions, 33% contained hashtags, and 16.5% were expressed as a question. Conclusions Users ascribed many meanings to mask wearing in the social media information environment, revealing that COVID-19 risk was expressed in a more expanded range than objective risk. The simultaneous amplification and attenuation of COVID-19 risk perception on social media complicates public health messaging about mask wearing.