- Award ID(s):
- 1838193
- PAR ID:
- 10494815
- Publisher / Repository:
- IEEE
- Date Published:
- Journal Name:
- 2023 11th International Conference on Affective Computing and Intelligent Interaction (ACII)
- ISBN:
- 979-8-3503-2743-4
- Page Range / eLocation ID:
- 1 to 8
- Format(s):
- Medium: X
- Location:
- Cambridge, MA, USA
- Sponsoring Org:
- National Science Foundation
More Like this
-
Espinosa-Anke, Luis; Martín-Vide, Carlos; Spasić, Irena (Eds.)
Algorithmic journalism refers to news stories constructed automatically by AI. There have been successful commercial implementations for news stories in sports, weather, financial reporting, and similar domains with highly structured, well-defined tabular data sources. Other domains, such as local reporting, have not seen adoption of algorithmic journalism, so no automated reporting systems are available in these categories, which has important implications for the industry. In this paper, we demonstrate a novel approach for producing news stories on government legislative activity, an area that has not widely adopted algorithmic journalism. Our data source is state legislative proceedings, primarily the transcribed speeches and dialogue from floor sessions and committee hearings in US state legislatures. Specifically, we create a library of potential events called phenoms. We systematically analyze the transcripts for the presence of phenoms using a custom partial order planner. Each phenom, if present, contributes some natural language text to the generated article: stating facts, quoting individuals, or summarizing some aspect of the discussion. We evaluate two randomly chosen articles in a user study on Amazon Mechanical Turk using mostly Likert-scale questions. Our results indicate a high degree of accuracy of facts and readability of the final content, with 13 of 22 subjects for the first article and 19 of 20 for the second agreeing or strongly agreeing that the articles included the most important facts of the hearings. Other results strengthen this finding in terms of accuracy, focus, and writing quality.
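A rough sketch of the phenom idea described in this abstract follows. The phenom names, regex triggers, and text templates here are hypothetical illustrations; the paper's actual library and partial order planner are far more sophisticated than this keyword scan.

```python
import re
from dataclasses import dataclass

@dataclass
class Phenom:
    """A potential event to detect in a transcript, with a text template."""
    name: str
    pattern: str   # regex that signals the phenom is present
    template: str  # text contributed to the article if matched

# Hypothetical phenom library; the paper's library is far richer.
PHENOMS = [
    Phenom("vote_passed", r"\bmotion (carries|passes)\b",
           "The committee voted to advance the bill."),
    Phenom("public_testimony", r"\bI represent\b",
           "Members of the public testified on the measure."),
]

def generate_story(transcript: str) -> str:
    """Scan the transcript for phenoms and assemble the matched templates."""
    sentences = [p.template for p in PHENOMS
                 if re.search(p.pattern, transcript, re.IGNORECASE)]
    return " ".join(sentences)

print(generate_story("After debate, the chair announced the motion carries."))
```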
-
Music is one of the most universal forms of communication and entertainment across cultures. This can largely be credited to synesthesia, the combining of senses. Based on this concept, we explore whether generative AI can create visual representations for music, aiming to inspire the user's imagination and enhance the experience of enjoying music. Our approach has the following steps: (a) Music is analyzed and classified along multiple dimensions (including instruments, emotion, tempo, pitch range, harmony, and dynamics) to produce textual descriptions. (b) These texts serve as inputs to machine models that predict the genre of the input audio. (c) The resulting prompts are fed to generative models to create visual representations. The visual representations are continuously updated as the music plays, ensuring that the visual effects aptly mirror the musical changes. A comprehensive user study with 88 users confirms that our approach can generate visual art reflecting the music pieces. From a list of images covering both abstract and realistic styles, users judged that our system-generated images represent pieces of music better than human-chosen images. This suggests that generative art can become a promising method for enhancing users' listening experience. Our method provides a new approach to visualizing music and enjoying it through generative art.
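A minimal sketch of the (a)-(c) pipeline described in this abstract follows. The feature values, function names, and prompt wording are invented for illustration; a real system would extract features with audio-analysis models and pass the prompt to a text-to-image generator.

```python
def analyze_music(audio_window) -> dict:
    """Stand-in for step (a): a real system would extract these dimensions
    with audio-analysis models; here we return fixed example values."""
    return {"instruments": "piano and strings", "emotion": "melancholic",
            "tempo": "slow", "dynamics": "soft"}

def build_prompt(features: dict, genre: str) -> str:
    """Steps (b)/(c): combine the textual description with the predicted
    genre into a prompt for a text-to-image generator."""
    return (f"An abstract painting evoking {features['emotion']} "
            f"{genre} music played by {features['instruments']}, "
            f"{features['tempo']} tempo, {features['dynamics']} dynamics")

features = analyze_music(audio_window=None)
prompt = build_prompt(features, genre="classical")
print(prompt)  # would be re-generated as the music plays to update visuals
```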
-
Humans routinely extract important information from images and videos, relying on their gaze. In contrast, computational systems still have difficulty annotating important visual information in a human-like manner, in part because human gaze is often not included in the modeling process. Human input is also particularly relevant for processing and interpreting affective visual information. To address this challenge, we captured human gaze, spoken language, and facial expressions simultaneously in an experiment with visual stimuli characterized by subjective and affective content. Observers described the content of complex emotional images and videos depicting positive and negative scenarios, as well as their feelings about the imagery being viewed. We explore patterns in these modalities, for example by comparing the affective nature of participant-elicited linguistic tokens with image valence. Additionally, we expand a framework for generating automatic alignments between the gaze and spoken-language modalities for visual annotation of images. Multimodal alignment is challenging due to the varying temporal offsets between modalities. We explore alignment robustness when images have affective content and whether image valence influences alignment results. We also study whether word-frequency-based filtering affects results; both the unfiltered and filtered scenarios perform better than baseline comparisons, and filtering yields a substantial decrease in alignment error rate. We provide visualizations of the resulting annotations from multimodal alignment. This work has implications for areas such as image understanding, media accessibility, and multimodal data fusion.
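To illustrate why the temporal offset between modalities matters, here is a toy alignment that pairs spoken-word onsets (shifted by a candidate offset) with the nearest gaze fixation. The timestamps, labels, and offsets are invented; the paper's framework is considerably richer.

```python
# (time in seconds, label of fixated region) and (word onset, spoken token)
fixations = [(0.5, "dog"), (2.1, "ball"), (3.8, "tree")]
words     = [(1.4, "dog"), (3.1, "ball"), (4.0, "tree")]

def alignment_accuracy(fixations, words, offset):
    """Shift each word onset earlier by `offset`, pair it with the closest
    fixation in time, and count label matches as correct alignments."""
    correct = 0
    for onset, token in words:
        t = onset - offset
        nearest = min(fixations, key=lambda f: abs(f[0] - t))
        if nearest[1] == token:
            correct += 1
    return correct / len(words)

# Gaze typically leads the naming word, so a nonzero offset aligns better.
for off in (0.0, 0.4, 0.8):
    print(f"offset={off}: accuracy={alignment_accuracy(fixations, words, off):.2f}")
```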
-
We aim to develop methods for understanding how multimedia news exposure can affect people's emotional responses, focusing especially on news content related to gun violence, an important yet polarizing issue in the U.S. We created the NEmo+ dataset by significantly extending the U.S. gun violence news-to-emotions dataset, BU-NEmo, from 320 to 1,297 news headline and lead-image pairings and collecting 38,910 annotations in a large crowdsourcing experiment. In curating NEmo+, we developed methods to identify news items that trigger similar versus divergent emotional responses. News items that trigger similar emotional responses were compiled into the NEmo+-Consensus dataset, on which we benchmark models that predict a person's dominant emotional response to the target news item (single-label prediction). On the full NEmo+ dataset, containing news items that lead to both differing and similar emotional responses, we also benchmark models for the novel task of predicting the distribution of emotional responses evoked in humans when presented with multimodal news content. Our single-label and multi-label prediction models outperform baselines by large margins across several metrics.
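A minimal sketch of the distribution-prediction task follows: per-item annotation counts define the target emotion distribution, and a model's predicted distribution is scored against it, here with KL divergence. The emotion categories, counts, and metric choice are illustrative assumptions, not the paper's actual label set or models.

```python
import math

EMOTIONS = ["anger", "sadness", "fear", "joy", "surprise"]  # hypothetical labels

def to_distribution(counts):
    """Normalize raw annotation counts into an empirical distribution."""
    total = sum(counts)
    return [c / total for c in counts]

def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) between the target and predicted distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

target = to_distribution([12, 9, 6, 2, 1])   # e.g. votes from 30 annotators
predicted = [0.35, 0.30, 0.20, 0.10, 0.05]   # model output (sums to 1)
print(round(kl_divergence(target, predicted), 4))  # lower is better
```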
-
Meta-analytic techniques for mining the neuroimaging literature continue to shape our conceptualization of the functional brain networks contributing to human emotion and cognition. Traditional theories regarding the neurobiological substrates of affective processing are shifting from regional- towards more network-based heuristic frameworks. To elucidate differential brain network involvement linked to distinct aspects of emotion processing, we applied an emergent meta-analytic clustering approach to the extensive body of affective neuroimaging results archived in the BrainMap database. Specifically, we performed hierarchical clustering on the modeled activation maps from 1,747 experiments in the affective processing domain, resulting in five meta-analytic groupings of experiments demonstrating whole-brain recruitment. Behavioral inference analyses conducted for each of these groupings suggested dissociable networks supporting: (1) visual perception within primary and associative visual cortices, (2) auditory perception within primary auditory cortices, (3) attention to emotionally salient information within insular, anterior cingulate, and subcortical regions, (4) appraisal and prediction of emotional events within medial prefrontal and posterior cingulate cortices, and (5) induction of emotional responses within the amygdala and fusiform gyri. These meta-analytic outcomes are consistent with a contemporary psychological model of affective processing in which emotionally salient information from perceived stimuli is integrated with previous experiences to engender a subjective affective response. This study highlights the utility of emergent meta-analytic methods for informing and extending psychological theories and suggests that emotions are manifest as the eventual consequence of interactions between large-scale brain networks.
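An illustrative sketch of hierarchical clustering over experiment-level activation maps, cut into five groupings as in the abstract above, follows. The data here are synthetic random vectors standing in for BrainMap-modeled activation images, and the Ward-linkage choice is an assumption for demonstration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
# 100 stand-in "experiments", each a flattened activation map (1000 voxels)
maps = rng.random((100, 1000))

# Agglomerative (hierarchical) clustering on the experiment-by-voxel matrix
Z = linkage(maps, method="ward")

# Cut the dendrogram into five meta-analytic groupings
labels = fcluster(Z, t=5, criterion="maxclust")
print(np.bincount(labels)[1:])  # number of experiments in each grouping
```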