skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: User recommendation in healthcare social media by assessing user similarity in heterogeneous network
Objective: The rapid growth of online health social websites has captured a vast amount of healthcare information and made the information easy to access for health consumers. E-patients often use these social websites for informational and emotional support. However, health consumers could be easily overwhelmed by the overloaded information. Healthcare information searching can be very difficult for consumers, not to mention most of them are not skilled information searcher. In this work, we investigate the approaches for measuring user similarity in online health social websites. By recommending similar users to consumers, we can help them to seek informational and emotional support in a more efficient way. Methods: We propose to represent the healthcare social media data as a heterogeneous healthcare information network and introduce the local and global structural approaches for measuring user similarity in a heterogeneous network. We compare the proposed structural approaches with the content-based approach. Results: Experiments were conducted on a data set collected from a popular online health social website,and the results showed that content-based approach performed better for inactive users, while structural approaches performed better for active users. Moreover, global structural approach outperformed local structural approach for all user groups. In addition, we conducted experiments on local and global structural approaches using different weight schemas for the edges in the network. Leverage performed the best for both local and global approaches. Finally, we integrated different approaches and demonstrated that hybrid method yielded better performance than the individual approach. Conclusion: The results indicate that content-based methods can effectively capture the similarity of inactive users who usually have focused interests, while structural methods can achieve better performance when rich structural information is available. Local structural approach only considers direct connections between nodes in the network, while global structural approach takes the indirect connections into account. Therefore, the global similarity approach can deal with sparse networks and capture the implicit similarity between two users. Different approaches may capture different aspects of the similarity relationship between two users. When we combine different methods together, we could achieve a better performance than using each individual method.  more » « less
Award ID(s):
1650531
PAR ID:
10048043
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Artificial intelligence in medicine
ISSN:
1873-2860
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We investigate relationships between online self-disclosure and received social support and user engagement during the COVID-19 crisis. We crawl a total of 2,399 posts and 29,851 associated comments from the r/COVID19_support subreddit and manually extract fine-grained personal information categories and types of social support sought from each post. We develop a BERT-based ensemble classifier to automatically identify types of support offered in users’ comments. We then analyze the effect of personal information sharing and posts’ topical, lexical, and sentiment markers on the acquisition of support and five interaction measures (submission scores, the number of comments, the number of unique commenters, the length and sentiments of comments). Our findings show that: (1) users were more likely to share their age, education, and location information when seeking both informational and emotional support as opposed to pursuing either one; (2) while personal information sharing was positively correlated with receiving informational support when requested, it did not correlate with emotional support; (3) as the degree of self-disclosure increased, information support seekers obtained higher submission scores and longer comments, whereas emotional support seekers’ self-disclosure resulted in lower submission scores, fewer comments, and fewer unique commenters; and (4) post characteristics affecting audience response differed significantly based on types of support sought by post authors. These results provide empirical evidence for the varying effects of self-disclosure on acquiring desired support and user involvement online during the COVID-19 pandemic. Furthermore, this work can assist support seekers hoping to enhance and prioritize specific types of social support and user engagement. 
    more » « less
  2. Recent years have witnessed tremendous interest in understanding and predicting information spread on social media platforms such as Twitter, Facebook, etc. Existing diffusion prediction methods primarily exploit the sequential order of influenced users by projecting diffusion cascades onto their local social neighborhoods. However, this fails to capture global social structures that do not explicitly manifest in any of the cascades, resulting in poor performance for inactive users with limited historical activities. In this paper, we present a novel variational autoencoder framework (Inf-VAE) to jointly embed homophily and influence through proximity-preserving social and position-encoded temporal latent variables. To model social homophily, Inf-VAE utilizes powerful graph neural network architectures to learn social variables that selectively exploit the social connections of users. Given a sequence of seed user activations, Inf-VAE uses a novel expressive co-attentive fusion network that jointly attends over their social and temporal variables to predict the set of all influenced users. Our experimental results on multiple real-world social network datasets, including Digg, Weibo, and Stack-Exchanges demonstrate significant gains (22% MAP@10) for Inf-VAE over state-of-the-art diffusion prediction models; we achieve massive gains for users with sparse activities, and users who lack direct social neighbors in seed sets. 
    more » « less
  3. Off-label drug use is an important healthcare topic as it is quite common and sometimes inevitable in medical practice. Though gaining information about off-label drug uses could benefit a lot of healthcare stakeholders such as patients, physicians, and pharmaceutical companies, there is no such data repository of such information available. There is a desire for a systematic approach to detect off-label drug uses. Other than using data sources such as EHR and clinical notes that are provided by healthcare providers, we exploited social media data especially online health community (OHC) data to detect the off-label drug uses, with consideration of the increasing social media users and the large volume of valuable and timely user-generated contents. We adopted tensor decomposition technique, CP decomposition in this work, to deal with the sparsity and missing data problem in social media data. On the basis of tensor decomposition results, we used two approaches to identify off-label drug use candidates: (1) one is via ranking the CP decomposition resulting components, (2) the other one is applying a heterogeneous network mining method, proposed in our previous work [9], on the reconstructed dataset by CP decomposition. The first approach identified a number of significant off-label use candidates, for which we were able to conduct case studies and found medical explanations for 7 out of 12 identified off-label use candidates. The second approach achieved better performance than the previous method [9] by improving the F1-score by 3%. It demonstrated the effectiveness of performing tensor decomposition on social media data for detecting off-label drug use. 
    more » « less
  4. null (Ed.)
    Cyberbullying, identified as intended and repeated online bullying behavior, has become increasingly prevalent in the past few decades. Despite the significant progress made thus far, the focus of most existing work on cyberbullying detection lies in the independent content analysis of different comments within a social media session. We argue that such leading notions of analysis suffer from three key limitations: they overlook the temporal correlations among different comments; they only consider the content within a single comment rather than the topic coherence across comments; they remain generic and exploit limited interactions between social media users. In this work, we observe that user comments in the same session may be inherently related, e.g., discussing similar topics, and their interaction may evolve over time. We also show that modeling such topic coherence and temporal interaction are critical to capture the repetitive characteristics of bullying behavior, thus leading to better predicting performance. To achieve the goal, we first construct a unified temporal graph for each social media session. Drawing on recent advances in graph neural network, we then propose a principled graph-based approach for modeling the temporal dynamics and topic coherence throughout user interactions. We empirically evaluate the effectiveness of our approach with the tasks of session-level bullying detection and comment-level case study. Our code is released to public. 
    more » « less
  5. User representation learning is vital to capture diverse user preferences, while it is also challenging as user intents are latent and scattered among complex and different modalities of user-generated data, thus, not directly measurable. Inspired by the concept of user schema in social psychology, we take a new perspective to perform user representation learning by constructing a shared latent space to capture the dependency among different modalities of user-generated data. Both users and topics are embedded to the same space to encode users' social connections and text content, to facilitate joint modeling of different modalities, via a probabilistic generative framework. We evaluated the proposed solution on large collections of Yelp reviews and StackOverflow discussion posts, with their associated network structures. The proposed model outperformed several state-of-the-art topic modeling based user models with better predictive power in unseen documents, and state-of-the-art network embedding based user models with improved link prediction quality in unseen nodes. The learnt user representations are also proved to be useful in content recommendation, e.g., expert finding in StackOverflow. 
    more » « less