Abstract
Purpose: Social media users share their ideas, thoughts, and emotions with other users. However, it is not clear how online users respond to new research outcomes. This study aims to predict the nature of the emotions expressed by Twitter users toward scientific publications. Additionally, we investigate which features of the research articles help in such prediction. Identifying the sentiments toward research articles on social media will help scientists gauge a new societal impact of their research articles.
Design/methodology/approach: Because several tools exist for sentiment analysis, we applied five of them to check which are suitable for capturing a tweet's sentiment value, and selected NLTK VADER and TextBlob. We segregated the sentiment value into negative, positive, and neutral. We measured the mean and median of tweets’ sentiment value for research articles with more than one tweet. We next built machine learning models to predict the sentiments of tweets related to scientific publications and investigated the essential features that controlled the prediction models.
Findings: We found that the most important feature in all the models was the sentiment of the research article title, followed by the author count. We observed that the tree-based models performed better than other classification models, with Random Forest achieving 89% accuracy for binary classification and 73% accuracy for three-label classification.
Research limitations: In this research, we used state-of-the-art sentiment analysis libraries. However, these libraries might vary at times in their sentiment prediction behavior. Tweet sentiment may be influenced by a multitude of circumstances and is not always directly tied to the paper's details. In the future, we intend to broaden the scope of our research by employing word2vec models.
Practical implications: Many studies have focused on understanding the impact of science on scientists or how science communicators can improve their outcomes. Research in this area has relied on few, limited measures, such as citations and user studies with small datasets. There is currently a critical need to find novel methods to quantify and evaluate the broader impact of research. This study will help scientists better comprehend the emotional impact of their work. Additionally, understanding the public's interest and reactions helps science communicators identify effective ways to engage with the public and build positive connections between scientific communities and the public.
Originality/value: This study will extend work on public engagement with science, sociology of science, and computational social science. It will enable researchers to identify areas in which there is a gap between public and expert understanding and provide strategies by which this gap can be bridged.
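The labeling and per-article aggregation step described in the abstract can be illustrated with a minimal sketch. It assumes each tweet already has a sentiment score in [-1, 1] (for example, a VADER compound score), and it assumes the ±0.05 cutoff that the VADER documentation conventionally suggests; the abstract does not state which thresholds the authors used. The article identifiers and scores are hypothetical.

```python
from statistics import mean, median

# Assumed cutoff (not stated in the abstract): scores at or above 0.05 are
# treated as positive, scores at or below -0.05 as negative.
THRESHOLD = 0.05


def label_sentiment(score, threshold=THRESHOLD):
    """Map a sentiment score in [-1, 1] to one of three labels."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def summarize_article(tweet_scores):
    """Return the mean and median tweet score for one article, with labels."""
    avg = mean(tweet_scores)
    mid = median(tweet_scores)
    return {
        "mean": avg,
        "median": mid,
        "mean_label": label_sentiment(avg),
        "median_label": label_sentiment(mid),
    }


# Hypothetical per-tweet scores, grouped by article (illustration only).
scores_by_article = {
    "article_a": [0.6, 0.2, -0.1],
    "article_b": [-0.5, -0.3, 0.0, 0.1],
}

summaries = {
    article: summarize_article(scores)
    for article, scores in scores_by_article.items()
    if len(scores) > 1  # the abstract aggregates articles with more than one tweet
}

for article, summary in summaries.items():
    print(article, summary)
```

Reporting both statistics is a reasonable design choice here because a single strongly worded tweet can shift the mean while leaving the median unchanged.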
Exploring social contextual influences on healthy eating using big data analytics
An alarming proportion of the US population is overweight: 2/3 of US adults are overweight, and 1/3 of those overweight are obese. Obesity increases the risk of illnesses such as diabetes and cardiovascular diseases. This epidemic can be attributed to the combination of cheap, high-calorie food and lack of physical activity. In this paper, we propose a Big Data Analytics framework, called BiDAF, that aims to explore social contextual influences on healthy eating. For this purpose, we classified food tweets and social media images as either healthy or unhealthy and food sentiments as either positive or negative, and further mapped them to an obesity prevalence map. The classification outcomes are useful for revealing social food trends and sentiments in the Centers for Disease Control and Prevention (CDC) obesity regions of the USA. The BiDAF framework has been implemented on the Apache Spark and TensorFlow platforms. We have evaluated the BiDAF framework in terms of accuracy on food tweet classification and sentiment analysis. The experimental results indicate that the BiDAF framework is effective in classification and sentiment analysis of food tweet messages and show its potential in exploring social contextual influences that may contribute to healthy eating.
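The step of mapping classified tweets onto regions can be sketched as a simple aggregation. This is not the BiDAF implementation, which the abstract says runs on Apache Spark and TensorFlow; it is a plain-Python illustration that assumes the classifiers have already produced a food label and a sentiment label for each tweet. The region names and records are hypothetical.

```python
from collections import defaultdict

# Hypothetical classifier outputs: (region, food_label, sentiment_label).
classified_tweets = [
    ("state_x", "healthy", "positive"),
    ("state_x", "unhealthy", "positive"),
    ("state_x", "unhealthy", "negative"),
    ("state_y", "healthy", "positive"),
]


def summarize_by_region(records):
    """Compute the share of healthy and positive food tweets per region."""
    counts = defaultdict(lambda: {"total": 0, "healthy": 0, "positive": 0})
    for region, food_label, sentiment_label in records:
        entry = counts[region]
        entry["total"] += 1
        entry["healthy"] += food_label == "healthy"
        entry["positive"] += sentiment_label == "positive"
    return {
        region: {
            "healthy_share": entry["healthy"] / entry["total"],
            "positive_share": entry["positive"] / entry["total"],
        }
        for region, entry in counts.items()
    }


region_summary = summarize_by_region(classified_tweets)
for region, shares in region_summary.items():
    print(region, shares)
```

Per-region shares of this kind could then be joined with an obesity prevalence table keyed by the same region names, which is one way to read the mapping step the abstract describes.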
- Award ID(s): 1650549
- PAR ID: 10051099
- Date Published:
- Journal Name: 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
- Page Range / eLocation ID: 1507 to 1514
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract In April 2023, the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), in partnership with the National Institute of Child Health and Human Development, the National Institute on Aging, and the Office of Behavioral and Social Sciences Research, hosted a 2‐day online workshop to discuss neural plasticity in energy homeostasis and obesity. The goal was to provide a broad view of current knowledge while identifying research questions and challenges regarding neural systems that control food intake and energy balance. This review includes highlights from the meeting and is intended both to introduce unfamiliar audiences to concepts central to energy homeostasis, feeding, and obesity and to highlight up‐and‐coming research in these areas that may be of special interest to those with a background in these fields. The overarching theme of this review addresses plasticity within the central and peripheral nervous systems that regulates and influences eating, emphasizing distinctions between healthy and disease states. This is by no means a comprehensive review because this is a broad and rapidly developing area. However, we have pointed out relevant reviews and primary articles throughout, as well as gaps in current understanding and opportunities for developments in the field.
-
Abstract An unhealthy diet is a major risk factor for chronic diseases including cardiovascular disease, type 2 diabetes, and cancer [1–4]. Limited access to healthy food options may contribute to unhealthy diets [5,6]. Studying diets is challenging, typically restricted to small sample sizes, single locations, and non-uniform design across studies, and has led to mixed results on the impact of the food environment [7–23]. Here we leverage smartphones to track diet health, operationalized through the self-reported consumption of fresh fruits and vegetables, fast food and soda, as well as body-mass index status in a country-wide observational study of 1,164,926 U.S. participants (MyFitnessPal app users) and 2.3 billion food entries to study the independent contributions of fast food and grocery store access, income and education to diet health outcomes. This study constitutes the largest nationwide study examining the relationship between the food environment and diet to date. We find that higher access to grocery stores, lower access to fast food, higher income and college education are independently associated with higher consumption of fresh fruits and vegetables, lower consumption of fast food and soda, and lower likelihood of being affected by overweight and obesity. However, these associations vary significantly across zip codes with predominantly Black, Hispanic or white populations. For instance, high grocery store access has a significantly larger association with higher fruit and vegetable consumption in zip codes with predominantly Hispanic populations (7.4% difference) and Black populations (10.2% difference) in contrast to zip codes with predominantly white populations (1.7% difference). Policy targeted at improving food access, income and education may increase healthy eating, but intervention allocation may need to be optimized for specific subpopulations and locations.
-
This study evaluates the level of service of shared transportation facilities by mining geotagged data from social media and analyzing the perceptions of road users. An algorithm is developed that adopts a text classification approach with contextual understanding to filter out relevant information related to users’ perceptions toward active mobility. A heuristic-based keyword matching approach yields about 75% out-of-context tweets, so that approach is deemed unsuitable for information extraction from Twitter. This study implements six different text classification models and compares their performance for tweet classification. The model is applied to real-world data to filter out relevant information, and content analysis is performed to check the distribution of keywords within the filtered data. The logistic regression model built on a term frequency-inverse document frequency (TF-IDF) vectorizer performed best at classifying the tweets. To select the best model, the performances of the models are compared based on precision, recall, F1 score (harmonic mean of precision and recall), and accuracy metrics. The findings from the analysis show that the proposed method can help produce more relevant information on walking and biking facilities as well as safety concerns. By analyzing the sentiments of the filtered data, the existing condition of biking and walking facilities in the DC area can be inferred. This method can be a critical part of the decision support system to understand the qualitative level of service of existing transportation facilities.
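The four comparison metrics named in this abstract can be computed from true and predicted labels as in the minimal sketch below. F1 is the harmonic mean of precision and recall, that is, 2PR / (P + R). The function name and the label lists are hypothetical and are not taken from the study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return precision, recall, F1, and accuracy for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Harmonic mean of precision and recall; defined as 0 when both are 0.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = correct / len(pairs)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical labels: 1 = relevant tweet, 0 = out-of-context tweet.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

With these labels, precision is 2/3 and recall is 1/2, so F1 is 4/7, which is lower than the arithmetic mean of the two because the harmonic mean penalizes imbalance between them.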
-
Proc. 2023 ACM Int. Conf. on Web Search and Data Mining (Ed.) Target-oriented opinion summarization is to profile a target by extracting user opinions from multiple related documents. Instead of simply mining opinion ratings on a target (e.g., a restaurant) or on multiple aspects (e.g., food, service) of a target, it is desirable to go deeper, to mine opinion on fine-grained sub-aspects (e.g., fish). However, it is expensive to obtain high-quality annotations at such fine-grained scale. This leads to our proposal of a new framework, FineSum, which advances the frontier of opinion analysis in three aspects: (1) minimal supervision, where no document-summary pairs are provided, only aspect names and a few aspect/sentiment keywords are available; (2) fine-grained opinion analysis, where sentiment analysis drills down to a specific subject or characteristic within each general aspect; and (3) phrase-based summarization, where short phrases are taken as basic units for summarization, and semantically coherent phrases are gathered to improve the consistency and comprehensiveness of summary. Given a large corpus with no annotation, FineSum first automatically identifies potential spans of opinion phrases, and further reduces the noise in identification results using aspect and sentiment classifiers. It then constructs multiple fine-grained opinion clusters under each aspect and sentiment. Each cluster expresses uniform opinions towards certain sub-aspects (e.g., “fish” in “food” aspect) or characteristics (e.g., “Mexican” in “food” aspect). To accomplish this, we train a spherical word embedding space to explicitly represent different aspects and sentiments. We then distill the knowledge from embedding to a contextualized phrase classifier, and perform clustering using the contextualized opinion-aware phrase embedding. Both automatic evaluations on the benchmark and quantitative human evaluation validate the effectiveness of our approach.

