NSF PAR Search | NSF Public Access Repository


Spatio-Temporal Event Forecasting Using Incremental Multi-Source Feature Learning

https://doi.org/10.1145/3464976

Zhao, Liang; Gao, Yuyang; Ye, Jieping; Chen, Feng; Ye, Yanfang; Lu, Chang-Tien; Ramakrishnan, Naren (April 2022, ACM Transactions on Knowledge Discovery from Data)

Forecasting significant societal events such as civil unrest and economic crises is an interesting and challenging problem that requires timeliness, precision, and comprehensiveness. Significant societal events are influenced and indicated jointly by multiple aspects of a society, including its economics, politics, and culture. Traditional forecasting methods based on a single data source find it hard to cover all these aspects comprehensively, which limits model performance. Multi-source event forecasting has proven promising but still faces several challenges, including (1) geographical hierarchies in multi-source data features, (2) hierarchical missing values, (3) characterization of structured feature sparsity, and (4) the difficulty of updating the model online when multiple sources are incomplete. This article proposes a novel feature learning model that concurrently addresses all the above challenges. Specifically, given multi-source data from different geographical levels, we design a new forecasting model by characterizing the lower-level features’ dependence on higher-level features. To handle the correlations among structured feature sets and deal with missing values among the coupled features, we propose a novel feature learning model based on an Nth-order strong hierarchy and fused-overlapping group Lasso. An efficient algorithm is developed to optimize model parameters and ensure global optima. More importantly, to enable the model to be updated in real time, an online learning algorithm is formulated, and active set techniques are leveraged to resolve the crucial challenge that arises when new patterns of missing features appear in real time. Extensive experiments on 10 datasets in different domains demonstrate the effectiveness and efficiency of the proposed models.
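As a rough illustration of the group-sparsity idea mentioned in the abstract above, the sketch below applies block soft-thresholding, the standard proximal step for a plain group Lasso penalty with non-overlapping groups. It is not the article's fused-overlapping group Lasso or its Nth-order strong hierarchy; the function name `group_soft_threshold`, the grouping, and the penalty value are all hypothetical choices made for this example.

```python
import math


def group_soft_threshold(weights, groups, lam):
    """Proximal step for a plain (non-overlapping) group Lasso penalty.

    weights: list of floats (model coefficients).
    groups:  list of index lists; assumed here to be disjoint.
    lam:     non-negative penalty strength (hypothetical value).
    Each group is shrunk toward zero as a block, and a group whose
    Euclidean norm is at most lam is set to zero entirely.
    """
    out = list(weights)
    for idx in groups:
        norm = math.sqrt(sum(weights[i] ** 2 for i in idx))
        # Whole-group shrinkage factor; zero removes the group.
        scale = 0.0 if norm <= lam else 1.0 - lam / norm
        for i in idx:
            out[i] = scale * weights[i]
    return out


if __name__ == "__main__":
    w = [3.0, 4.0, 0.1, 0.1]
    # First group is shrunk, second group is removed as a block.
    print(group_soft_threshold(w, [[0, 1], [2, 3]], lam=1.0))
```

With this penalty an entire feature group (for example, all features from one source) is either kept and shrunk or dropped together, which is the kind of structured sparsity the abstract refers to.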
Full Text Available
DescribeCtx: context-aware description synthesis for sensitive behaviors in mobile apps

https://doi.org/10.1145/3510003.3510058

Yang, Shao; Wang, Yuehan; Yao, Yuan; Wang, Haoyu; Ye, Yanfang; Xiao, Xusheng (January 2022, 44th International Conference on Software Engineering (ICSE))

Full Text Available
Heterogeneous Temporal Graph Neural Network

https://doi.org/10.1137/1.9781611977172.74

Fan, Yujie; Ju, Mingxuan; Zhang, Chuxu; Ye, Yanfang (January 2022, SIAM International Conference on Data Mining (SIAM SDM))
Identifying Illicit Drug Dealers on Instagram with Large-scale Multimodal Data Fusion

https://doi.org/10.1145/3472713

Hu, Chuanbo; Yin, Minglei; Liu, Bin; Li, Xin; Ye, Yanfang (October 2021, ACM Transactions on Intelligent Systems and Technology)

Illicit drug trafficking via social media sites such as Instagram has become a severe problem, thus drawing a great deal of attention from law enforcement and public health agencies. How to identify illicit drug dealers from social media data has remained a technical challenge for the following reasons. On the one hand, the available data are limited because of privacy concerns with crawling social media sites; on the other hand, the diversity of drug dealing patterns makes it difficult to reliably distinguish drug dealers from common drug users. Unlike existing methods that focus on posting-based detection, we propose to tackle the problem of illicit drug dealer identification by constructing a large-scale multimodal dataset named Identifying Drug Dealers on Instagram (IDDIG). Nearly 4,000 user accounts, of which more than 1,400 are drug dealers, have been collected from Instagram with multiple data sources including post comments, post images, homepage bio, and homepage images. We then design a quadruple-based multimodal fusion method to combine the multiple data sources associated with each user account for drug dealer identification. Experimental results on the constructed IDDIG dataset demonstrate the effectiveness of the proposed method in identifying drug dealers (almost 95% accuracy). Moreover, we have developed a hashtag-based community detection technique for discovering evolving patterns, especially those related to geography and drug types.
Full Text Available
Detection of Illicit Drug Trafficking Events on Instagram: A Deep Multimodal Multilabel Learning Approach

https://doi.org/10.1145/3459637.3481908

Hu, Chuanbo; Yin, Minglei; Liu, Bin; Li, Xin; Ye, Yanfang (October 2021, International Conference on Information and Knowledge Management (CIKM))

Full Text Available
WebEvo: taming web application evolution via detecting semantic structure changes

https://doi.org/10.1145/3460319.3464800

Shao, Fei; Xu, Rui; Haque, Wasif; Xu, Jingwei; Zhang, Ying; Yang, Wei; Ye, Yanfang; Xiao, Xusheng (July 2021, International Symposium on Software Testing and Analysis (ISSTA))

Full Text Available
Hyperbolic Graph Attention Network

https://doi.org/10.1109/TBDATA.2021.3081431

Zhang, Yiding; Wang, Xiao; Shi, Chuan; Jiang, Xunqiang; Ye, Yanfang (May 2021, IEEE Transactions on Big Data)
Full Text Available
Heterogeneous Information Network Embedding with Adversarial Disentangler

https://doi.org/10.1109/TKDE.2021.3096231

Wang, Ruijia; Shi, Chuan; Zhao, Tianyu; Wang, Xiao; Ye, Yanfang Fanny (January 2021, IEEE Transactions on Knowledge and Data Engineering)

Full Text Available
Differentially private binary- and matrix-valued data query: an XOR mechanism

https://doi.org/10.14778/3446095.3446106

Ji, Tianxi; Li, Pan; Yilmaz, Emre; Ayday, Erman; Ye, Yanfang; Sun, Jinyuan (January 2021, Proceedings of the VLDB Endowment)
Differential privacy has been widely adopted to release continuous- and scalar-valued information on a database without compromising the privacy of individual data records in it. The problem of querying binary- and matrix-valued information on a database in a differentially private manner has rarely been studied. However, binary- and matrix-valued data are ubiquitous in real-world applications, and privacy concerns about them may arise under a variety of circumstances. In this paper, we devise an exclusive or (XOR) mechanism that perturbs binary- and matrix-valued query results by conducting an XOR operation on the query result with calibrated noise drawn from a matrix-valued Bernoulli distribution. We first rigorously analyze the privacy and utility guarantees of the proposed XOR mechanism. Then, to generate the parameters of the matrix-valued Bernoulli distribution, we develop a heuristic approach to minimize the expected square query error rate under an ϵ-differential privacy constraint. Additionally, to address the intractability of calculating the probability density function (PDF) of this distribution and to efficiently generate samples from it, we adapt an Exact Hamiltonian Monte Carlo based sampling scheme. Finally, we experimentally demonstrate the efficacy of the XOR mechanism by considering binary data classification and social network analysis, all in a differentially private manner. Experimental results show that the XOR mechanism notably outperforms other state-of-the-art differentially private methods in terms of utility (such as classification accuracy and F1 score), and even achieves utility comparable to that of non-private mechanisms.
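To make the XOR-perturbation idea in the abstract above concrete, here is a minimal sketch that XORs a binary matrix with noise bits. It deliberately swaps in a simpler technique: each noise bit is drawn independently (classic randomized response), not from the paper's matrix-valued Bernoulli distribution, and no Hamiltonian Monte Carlo sampling is involved. The names `flip_probability` and `xor_perturb` are hypothetical, and the ϵ guarantee noted in the comments applies per bit under this independence assumption only.

```python
import math
import random


def flip_probability(epsilon):
    """Flip probability 1 / (1 + e^epsilon) used by randomized response.

    Written with exp(-epsilon) so that large epsilon values do not overflow.
    """
    if epsilon < 0:
        raise ValueError("epsilon must be non-negative")
    e = math.exp(-epsilon)
    return e / (1.0 + e)


def xor_perturb(bits, epsilon, rng=None):
    """XOR every entry of a binary matrix with an independent noise bit.

    bits:    list of rows, each a list of 0/1 integers (the query result).
    epsilon: per-bit privacy parameter (assumption: independent noise bits,
             unlike the correlated matrix-valued noise in the paper).
    rng:     optional random.Random instance for reproducibility.
    """
    rng = rng or random.Random()
    p = flip_probability(epsilon)
    noisy = []
    for row in bits:
        noise = [1 if rng.random() < p else 0 for _ in row]
        noisy.append([b ^ n for b, n in zip(row, noise)])
    return noisy


if __name__ == "__main__":
    query_result = [[1, 0, 1], [0, 1, 1]]
    print(xor_perturb(query_result, epsilon=1.0, rng=random.Random(7)))
```

Smaller values of `epsilon` push the flip probability toward 0.5, giving more privacy and less utility; larger values leave the query result nearly unchanged.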
Full Text Available
RxNet: Rx-refill Graph Neural Network for Overprescribing Detection

https://doi.org/10.1145/3459637.3482465

Zhang, Jianfei; Kuo, Ai-Te; Zhao, Jianan; Wen, Qianlong; Winstanley, Erin; Zhang, Chuxu; Ye, Yanfang (January 2021, International Conference on Information and Knowledge Management (CIKM))

Full Text Available
