NSF PAR Search | NSF Public Access Repository

The Affective Nature of AI-Generated News Images: Impact on Visual Journalism

https://doi.org/10.1109/ACII59096.2023.10388166

Paik, Sejin; Bonna, Sarah; Novozhilova, Ekaterina; Gao, Ge; Kim, Jongin; Wijaya, Derry; Betke, Margrit (September 2023, 2023 11th International Conference on Affective Computing and Intelligent Interaction (ACII))

This study explores the affective responses and newsworthiness perceptions of generative AI for visual journalism. While generative AI offers advantages for newsrooms in terms of producing unique images and cutting costs, the potential misuse of AI-generated news images is a cause for concern. For our study, we designed a 3-part news image codebook for affect-labeling news images based on journalism ethics and photography guidelines. We collected 200 news headlines and images retrieved from a variety of U.S. news sources on the topics of gun violence and climate change, generated corresponding news images with DALL-E 2, and asked annotators for their emotional responses to the human-selected and AI-generated news images following the codebook. We also examined the impact of modality on emotions by measuring the effects of visual and textual modalities on emotional responses. The findings of this study provide insights into the quality and emotional impact of news images selected by humans and generated by AI. Further, results of this work can be useful in developing technical guidelines as well as policy measures for the ethical use of generative AI systems in journalistic production. The codebook, images, and annotations are made publicly available to facilitate future research in affective computing, specifically tailored to civic and public-interest journalism.
Full Text Available
Prediction of People’s Emotional Response towards Multi-modal News

Gao, Ge; Paik, Sejin; Reardon, Carley; Zhao, Yanling; Guo, Lei; Ishwar, Prakash; Betke, Margrit; Wijaya, Derry Tanti (November 2022, Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers))

We aim to develop methods for understanding how multimedia news exposure can affect people’s emotional responses, and we especially focus on news content related to gun violence, a very important yet polarizing issue in the U.S. We created the dataset NEmo+ by significantly extending the U.S. gun violence news-to-emotions dataset, BU-NEmo, from 320 to 1,297 news headline and lead image pairings and collecting 38,910 annotations in a large crowdsourcing experiment. In curating the NEmo+ dataset, we developed methods to identify news items that will trigger similar versus divergent emotional responses. For news items that trigger similar emotional responses, we compiled them into the NEmo+-Consensus dataset. We benchmark models on this dataset that predict a person’s dominant emotional response toward the target news item (single-label prediction). On the full NEmo+ dataset, containing news items that would lead to both differing and similar emotional responses, we also benchmark models for the novel task of predicting the distribution of evoked emotional responses in humans when presented with multi-modal news content. Our single-label and multi-label prediction models outperform baselines by large margins across several metrics.
Full Text Available
An Unsupervised Approach to Discover Media Frames

Lai, Sha; Jiang, Yanru; Guo, Lei; Betke, Margrit; Ishwar, Prakash; Wijaya, Derry Tanti (June 2022, Proceedings of the LREC 2022 Workshop on Natural Language Processing for Political Sciences)

Media framing refers to highlighting certain aspects of an issue in the news to promote a particular interpretation to the audience. Supervised learning has often been used to recognize frames in news articles, but it requires a known pool of frames for a particular issue, which communication researchers must identify through thorough manual content analysis. In this work, we devise an unsupervised learning approach to discover the frames in news articles automatically. Given a set of news articles for a given issue, e.g., gun violence, our method first extracts frame elements from these articles using related Wikipedia articles and the Wikipedia category system. It then uses a community detection approach to identify frames from these frame elements. We discuss the effectiveness of our approach by comparing the frames it generates in an unsupervised manner to the domain-expert-derived frames for the issue of gun violence, for which a supervised learning model for frame recognition exists.
Full Text Available
NEmo: An Affective Dataset of Gun Violence News

Reardon, Carley; Paik, Sejin; Gao, Ge; Meet, Parekh; Zhao, Yanling; Guo, Lei; Betke, Margrit; Wijaya, Derry (June 2022, Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022), Palais du Pharo, Marseille, France, June 20-25, 2022)

Full Text Available
Semantic-Based Sentence Recognition in Images Using Bimodal Deep Learning

https://doi.org/10.1109/ICIP42928.2021.9506688

Zheng, Yi; Wang, Qitong; Betke, Margrit (September 2021, IEEE)
Consistency Regularization with High-dimensional Non-adversarial Source-guided Perturbation for Unsupervised Domain Adaptation in Segmentation

https://doi.org/10.1609/aaai.v35i11.17216

Wang, Kaihong; Yang, Chenhongyi; Betke, Margrit (May 2021, Proceedings of the AAAI Conference on Artificial Intelligence)

Unsupervised domain adaptation for semantic segmentation has been intensively studied due to the low cost of pixel-level annotation for synthetic data. The most common approaches try to generate images or features mimicking the distribution in the target domain while preserving the semantic contents in the source domain so that a model can be trained with annotations from the latter. However, such methods rely heavily on an image translator or feature extractor trained through an elaborate mechanism that includes adversarial training, which brings extra complexity and instability to the adaptation process. Furthermore, these methods mainly focus on taking advantage of the labeled source dataset, leaving the unlabeled target dataset not fully utilized. In this paper, we propose a bidirectional style-induced domain adaptation method, called BiSIDA, that employs consistency regularization to efficiently exploit information from the unlabeled target domain dataset, requiring only a simple neural style transfer model. BiSIDA aligns domains by not only transferring source images into the style of target images but also transferring target images into the style of source images to perform high-dimensional perturbation on the unlabeled target images, which is crucial to the success of applying consistency regularization in segmentation tasks. Extensive experiments show that BiSIDA achieves a new state of the art on two commonly used synthetic-to-real domain adaptation benchmarks: GTA5-to-CityScapes and SYNTHIA-to-CityScapes. Code and the pretrained style transfer model are available at: https://github.com/wangkaihong/BiSIDA.
Full Text Available
Proposing an Open-Sourced Tool for Computational Framing Analysis of Multilingual Data

https://doi.org/10.1080/21670811.2022.2031241

Guo, Lei; Su, Chao; Paik, Sejin; Bhatia, Vibhu; Akavoor, Vidya Prasad; Gao, Ge; Betke, Margrit; Wijaya, Derry (February 2022, Digital Journalism)

We propose a five-step computational framing analysis framework that researchers can use to analyze multilingual news data. The framework combines unsupervised and supervised machine learning and leverages a state-of-the-art multilingual deep learning model, which can significantly enhance frame prediction performance while requiring only a small sample of manual annotations. Most importantly, anyone can perform the proposed computational framing analysis using a free, open-source system created by a team of communication scholars, computer scientists, web designers, and web developers. Making advanced computational analysis available to researchers without a programming background helps, to some degree, bridge the digital divide within the communication research discipline in particular and the academic community in general.
Full Text Available
Extracting text from scanned Arabic books: a large-scale benchmark dataset and a fine-tuned Faster-R-CNN model

https://doi.org/10.1007/s10032-021-00382-4

Elanwar, Randa; Qin, Wenda; Betke, Margrit; Wijaya, Derry (June 2021, International Journal on Document Analysis and Recognition (IJDAR))
Datasets of documents in Arabic are urgently needed to promote computer vision and natural language processing research that addresses the specifics of the language. Unfortunately, publicly available Arabic datasets are limited in size and restricted to certain document domains. This paper presents the release of BE-Arabic-9K, a dataset of more than 9000 high-quality scanned images from over 700 Arabic books. Among these, 1500 images have been manually segmented into regions and labeled by their functionality. BE-Arabic-9K includes book pages with a wide variety of complex layouts and page contents, making it suitable for various document layout analysis and text recognition research tasks. The paper also presents a page layout segmentation and text extraction baseline model based on fine-tuned Faster R-CNN structure (FFRA). This baseline model yields cross-validation results with an average accuracy of 99.4% and F1 score of 99.1% for text versus non-text block classification on 1500 annotated images of BE-Arabic-9K. These results are remarkably better than those of the state-of-the-art Arabic book page segmentation system ECDP. FFRA also outperforms three other prior systems when tested on a competition benchmark dataset, making it an outstanding baseline model to challenge.
Full Text Available
Multimodal Neural and Behavioral Data Predict Response to Rehabilitation in Chronic Poststroke Aphasia

https://doi.org/10.1161/STROKEAHA.121.036749

Billot, Anne; Lai, Sha; Varkanitsa, Maria; Braun, Emily J.; Rapp, Brenda; Parrish, Todd B.; Higgins, James; Kurani, Ajay S.; Caplan, David; Thompson, Cynthia K.; et al (May 2022, Stroke)

Background: Poststroke recovery depends on multiple factors and varies greatly across individuals. Using machine learning models, this study investigated the independent and complementary prognostic role of different patient-related factors in predicting response to language rehabilitation after a stroke. Methods: Fifty-five individuals with chronic poststroke aphasia underwent a battery of standardized assessments and structural and functional magnetic resonance imaging scans, and received 12 weeks of language treatment. Support vector machine and random forest models were constructed to predict responsiveness to treatment using pretreatment behavioral, demographic, and structural and functional neuroimaging data. Results: The best prediction performance was achieved by a support vector machine model trained on aphasia severity, demographics, measures of anatomic integrity and resting-state functional connectivity (F1=0.94). This model resulted in a significantly superior prediction performance compared with support vector machine models trained on all feature sets (F1=0.82, P <0.001) or a single feature set (F1 range=0.68–0.84, P <0.001). Across random forest models, training on resting-state functional magnetic resonance imaging connectivity data yielded the best F1 score (F1=0.87). Conclusions: While behavioral, multimodal neuroimaging data and demographic information carry complementary information in predicting response to rehabilitation in chronic poststroke aphasia, functional connectivity of the brain at rest after stroke is a particularly important predictor of responsiveness to treatment, both alone and combined with other patient-related factors.
Full Text Available
What makes gun violence a (less) prominent issue? A computational analysis of compelling arguments and selective agenda setting

https://doi.org/10.1080/15205436.2021.1898644

Guo, Lei; Mays, Kate; Zhang, Yiyan; Wijaya, Derry; Betke, Margrit (January 2021, Mass Communication and Society)
Full Text Available
