

Title: User recommendation in healthcare social media by assessing user similarity in heterogeneous network
Objective: The rapid growth of online health social websites has captured a vast amount of healthcare information and made it easy for health consumers to access. E-patients often use these social websites for informational and emotional support. However, health consumers can easily be overwhelmed by the volume of information. Searching for healthcare information can be very difficult for consumers, and most of them are not skilled information searchers. In this work, we investigate approaches for measuring user similarity in online health social websites. By recommending similar users to consumers, we can help them seek informational and emotional support more efficiently. Methods: We propose to represent healthcare social media data as a heterogeneous healthcare information network and introduce local and global structural approaches for measuring user similarity in a heterogeneous network. We compare the proposed structural approaches with a content-based approach. Results: Experiments were conducted on a data set collected from a popular online health social website, and the results showed that the content-based approach performed better for inactive users, while the structural approaches performed better for active users. Moreover, the global structural approach outperformed the local structural approach for all user groups. In addition, we conducted experiments on the local and global structural approaches using different weight schemas for the edges in the network. Leverage performed best for both the local and global approaches. Finally, we integrated the different approaches and demonstrated that a hybrid method yielded better performance than any individual approach. Conclusion: The results indicate that content-based methods can effectively capture the similarity of inactive users, who usually have focused interests, while structural methods can achieve better performance when rich structural information is available. 
The local structural approach considers only direct connections between nodes in the network, while the global structural approach also takes indirect connections into account. Therefore, the global similarity approach can deal with sparse networks and capture the implicit similarity between two users. Different approaches may capture different aspects of the similarity relationship between two users. Combining different methods can achieve better performance than using any individual method alone.
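The contrast between local and global structural similarity can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not name the specific measures, so the sketch assumes Jaccard overlap of direct neighbours as a stand-in for the local approach, a truncated Katz-style damped walk count as a stand-in for the global approach, and a simple linear blend as a stand-in for the hybrid method. The function names, the toy user/thread data, and the parameters `max_len`, `beta`, and `alpha` are all hypothetical.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected graph from (user, item) pairs.

    Assumption: user and item names do not collide, so one
    adjacency map can hold both node types.
    """
    graph = defaultdict(set)
    for user, item in edges:
        graph[user].add(item)
        graph[item].add(user)
    return graph


def local_similarity(graph, u, v):
    """Local stand-in: Jaccard overlap of the direct neighbours of u and v."""
    nu, nv = graph.get(u, set()), graph.get(v, set())
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0


def global_similarity(graph, u, v, max_len=4, beta=0.5):
    """Global stand-in: damped count of walks from u to v up to max_len steps.

    Walks of length k contribute beta**k each, so indirect
    connections count, but less than short ones.
    """
    counts = {u: 1.0}  # number of walks of the current length ending at each node
    score = 0.0
    for length in range(1, max_len + 1):
        nxt = defaultdict(float)
        for node, c in counts.items():
            for nb in graph.get(node, ()):
                nxt[nb] += c
        counts = nxt
        score += (beta ** length) * counts.get(v, 0.0)
    return score


def hybrid_similarity(graph, u, v, alpha=0.5):
    """Hybrid stand-in: linear blend of the two scores (an assumed scheme)."""
    return alpha * local_similarity(graph, u, v) + (1 - alpha) * global_similarity(graph, u, v)


# Toy network: alice and carol never post in the same thread,
# but both are linked to bob.
g = build_graph([
    ("alice", "t1"), ("bob", "t1"),
    ("bob", "t2"), ("carol", "t2"),
])
print(local_similarity(g, "alice", "carol"))   # 0.0
print(global_similarity(g, "alice", "carol"))  # 0.0625
```

In this toy network the local score for alice and carol is zero because they share no thread, while the walk-based score is positive because of the path through bob. This mirrors the point above that a global measure can still relate users in a sparse network.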
Award ID(s):
1650531
NSF-PAR ID:
10048043
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Artificial intelligence in medicine
ISSN:
1873-2860
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Off-label drug use is an important healthcare topic because it is quite common and sometimes inevitable in medical practice. Although information about off-label drug uses could benefit many healthcare stakeholders, such as patients, physicians, and pharmaceutical companies, no data repository of such information is available, and a systematic approach to detecting off-label drug uses is needed. Other than using data sources such as EHR and clinical notes that are provided by healthcare providers, we exploited social media data, especially online health community (OHC) data, to detect off-label drug uses, given the increasing number of social media users and the large volume of valuable and timely user-generated content. We adopted a tensor decomposition technique, CP decomposition in this work, to deal with the sparsity and missing data problems in social media data. On the basis of the tensor decomposition results, we used two approaches to identify off-label drug use candidates: (1) ranking the components resulting from CP decomposition, and (2) applying a heterogeneous network mining method, proposed in our previous work [9], to the dataset reconstructed by CP decomposition. The first approach identified a number of significant off-label use candidates, for which we conducted case studies and found medical explanations for 7 out of 12 identified candidates. The second approach achieved better performance than the previous method [9], improving the F1-score by 3%. This demonstrates the effectiveness of performing tensor decomposition on social media data for detecting off-label drug use. 
  2. Recent years have witnessed tremendous interest in understanding and predicting information spread on social media platforms such as Twitter, Facebook, etc. Existing diffusion prediction methods primarily exploit the sequential order of influenced users by projecting diffusion cascades onto their local social neighborhoods. However, this fails to capture global social structures that do not explicitly manifest in any of the cascades, resulting in poor performance for inactive users with limited historical activities. In this paper, we present a novel variational autoencoder framework (Inf-VAE) to jointly embed homophily and influence through proximity-preserving social and position-encoded temporal latent variables. To model social homophily, Inf-VAE utilizes powerful graph neural network architectures to learn social variables that selectively exploit the social connections of users. Given a sequence of seed user activations, Inf-VAE uses a novel expressive co-attentive fusion network that jointly attends over their social and temporal variables to predict the set of all influenced users. Our experimental results on multiple real-world social network datasets, including Digg, Weibo, and Stack-Exchanges demonstrate significant gains (22% MAP@10) for Inf-VAE over state-of-the-art diffusion prediction models; we achieve massive gains for users with sparse activities, and users who lack direct social neighbors in seed sets. 
  3. Cloud computing services have enjoyed explosive growth over the last decade. Users are typically businesses and government agencies who are able to scale their storage and processing requirements, and choose from pre-defined services (e.g. specific software-as-a-service applications). But with this outsourcing has also come the potential for data breaches targeted at the end-user, typically consumers (e.g. who purchase goods at an online retail store), and citizens (e.g. who transact information for their social security needs). This paper briefly introduces U.S.-based cloud computing regulation, including the U.S. Health Insurance Portability and Accountability Act (HIPAA), the Gramm-Leach-Bliley Act (GLBA), and the U.S. Stored Communications Act (SCA). We present how data breach notification (DBN) works in the U.S. by examining three mini-case examples: the 2011 Sony PlayStation Network data breach, the 2015 Anthem Healthcare data breach, and the 2017 Equifax data breach. The findings of the paper show that there is a systemic failure to learn from past data breaches, and that data breaches not only affect business and government clients of cloud computing services but their respective end-user customer base. Finally, the level of sensitivity of data breaches is increasing, from cloud computing hacks on video game platforms, to the targeting of more lucrative network and computer crime abuses aiming at invasive private health and financial data. 
  4. Cyberbullying, identified as intended and repeated online bullying behavior, has become increasingly prevalent in the past few decades. Despite the significant progress made thus far, the focus of most existing work on cyberbullying detection lies in the independent content analysis of different comments within a social media session. We argue that such leading notions of analysis suffer from three key limitations: they overlook the temporal correlations among different comments; they only consider the content within a single comment rather than the topic coherence across comments; they remain generic and exploit limited interactions between social media users. In this work, we observe that user comments in the same session may be inherently related, e.g., discussing similar topics, and their interaction may evolve over time. We also show that modeling such topic coherence and temporal interaction is critical to capturing the repetitive characteristics of bullying behavior, thus leading to better prediction performance. To achieve this goal, we first construct a unified temporal graph for each social media session. Drawing on recent advances in graph neural networks, we then propose a principled graph-based approach for modeling the temporal dynamics and topic coherence throughout user interactions. We empirically evaluate the effectiveness of our approach with the tasks of session-level bullying detection and comment-level case study. Our code is released to the public. 
  5. In this paper we describe the iterative evaluation and refinement of a consent flow for a chatbot being developed by a large U.S. health insurance company. This chatbot’s use of a cloud service provider triggers a requirement for users to agree to a HIPAA authorization. We highlight remote usability study and online survey findings indicating that simplifying the interface and language of the consent flow can improve the user experience and help users who read the content understand how their data may be used. However, we observe that most users in our studies, even those using our improved consent flows, missed important information in the authorization until we asked them to review it again. We also show that many people are overconfident about the privacy and security of healthcare data and that many people believe HIPAA protects in far more contexts than it actually does. Given that our redesigns following best practices did not produce many meaningful improvements in informed consent, we argue for the need for research on alternate approaches to health data disclosures such as standardized disclosures; methods borrowed from clinical research contexts such as multimedia formats, quizzes, and conversational approaches; and automated privacy assistants. 