Search for: All records

Creators/Authors contains: "Wu, Xindong"

NTP-Miner: Nonoverlapping Three-Way Sequential Pattern Mining

https://doi.org/10.1145/3480245

Wu, Youxi ; Luo, Lanfang ; Li, Yan ; Guo, Lei ; Fournier-Viger, Philippe ; Zhu, Xingquan ; Wu, Xindong ( June 2022 , ACM Transactions on Knowledge Discovery from Data)

Nonoverlapping sequential pattern mining is an important type of sequential pattern mining (SPM) with gap constraints, which not only can reveal interesting patterns to users but also can effectively reduce the search space using the Apriori (anti-monotonicity) property. However, the existing algorithms do not focus on attributes of interest to users, meaning that existing methods may discover many frequent patterns that are redundant. To solve this problem, this article proposes a task called nonoverlapping three-way sequential pattern (NTP) mining, where attributes are categorized according to three levels of interest: strong, medium, and weak interest. NTP mining can effectively avoid mining redundant patterns since the NTPs are composed of strong and medium interest items. Moreover, NTPs can avoid serious deviations (the occurrence is significantly different from its pattern) since gap constraints cannot match with strong interest patterns. To mine NTPs, an effective algorithm is put forward, called NTP-Miner, which applies two main steps: support (frequency occurrence) calculation and candidate pattern generation. To calculate the support of an NTP, depth-first and backtracking strategies are adopted, which do not require creating a whole Nettree structure, meaning that many redundant nodes and parent–child relationships do not need to be created. Hence, time and space efficiency is improved. To generate candidate patterns while reducing their number, NTP-Miner employs a pattern join strategy and only mines patterns of strong and medium interest. Experimental results on stock market and protein datasets show that NTP-Miner not only is more efficient than other competitive approaches but can also help users find more valuable patterns. More importantly, NTP mining has achieved better performance than other competitive methods in clustering tasks. Algorithms and data are available at: https://github.com/wuc567/Pattern-Mining/tree/master/NTP-Miner .
Full Text Available
Unsupervised Lifelong Learning with Curricula

https://doi.org/10.1145/3442381.3449839

He, Yi ; Chen, Sheng ; Wu, Baijun ; Yuan, Xu ; Wu, Xindong ( April 2021 , Proceedings of the Web Conference 2021)

Full Text Available
Active Learning with Multi-granular Graph Auto-Encoder

https://doi.org/10.1109/ICDM50108.2020.00125

He, Yi ; Yuan, Xu ; Tzeng, Nian-Feng ; Wu, Xindong ( November 2020 , 2020 IEEE International Conference on Data Mining (ICDM))
Full Text Available
Learning Interpretable Representations with Informative Entanglements

https://doi.org/10.24963/ijcai.2020/273

Beyazıt, Ege ; Tuncel, Doruk ; Yuan, Xu ; Tzeng, Nian-Feng ; Wu, Xindong ( July 2020 , Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence)

Learning interpretable representations in an unsupervised setting is an important yet challenging task. Existing unsupervised interpretable methods focus on extracting independent salient features from data. However, they overlook the fact that the entanglement of salient features may also be informative. Acknowledging these entanglements can improve interpretability, resulting in the extraction of higher-quality and more varied salient features. In this paper, we propose a new method to enable Generative Adversarial Networks (GANs) to discover salient features that may be entangled in an informative manner, instead of extracting only disentangled features. Specifically, we propose a regularizer to penalize the disagreement between the extracted feature interactions and a given dependency structure during training. We model these interactions using a Bayesian network, estimate the maximum likelihood parameters, and calculate a negative likelihood score to measure the disagreement. Upon qualitatively and quantitatively evaluating the proposed method using both synthetic and real-world datasets, we show that our proposed regularizer guides GANs to learn representations with disentanglement scores competitive with the state of the art, while extracting a wider variety of salient features.

Full Text Available
A Framework for Subgraph Detection in Interdependent Networks via Graph Block-Structured Optimization

https://doi.org/10.1109/ACCESS.2020.3018497

Jie, Fei ; Wang, Chunpai ; Chen, Feng ; Li, Lei ; Wu, Xindong ( January 2020 , IEEE Access)
Full Text Available
Toward Mining Capricious Data Streams: A Generative Approach

https://doi.org/10.1109/TNNLS.2020.2981386

He, Yi ; Wu, Baijun ; Wu, Di ; Beyazit, Ege ; Chen, Sheng ; Wu, Xindong ( January 2020 , IEEE Transactions on Neural Networks and Learning Systems)
Full Text Available
Block-Structured Optimization for Anomalous Pattern Detection in Interdependent Networks

https://doi.org/10.1109/ICDM.2019.00137

Jie, Fei ; Wang, Chunpai ; Chen, Feng ; Li, Lei ; Wu, Xindong ( November 2019 , 2019 IEEE International Conference on Data Mining (ICDM))

We propose a generalized optimization framework for detecting anomalous patterns (subgraphs that are interesting or unexpected) in interdependent networks, such as multi-layer networks, temporal networks, networks of networks, and many others. We frame the problem as a non-convex optimization that has a general nonlinear score function and a set of block-structured and non-convex constraints. We develop an effective, efficient, and parallelizable projection-based algorithm, namely Graph Block-structured Gradient Projection (GBGP), to solve the problem. It is proved that our algorithm 1) runs in nearly-linear time on the network size, and 2) enjoys a theoretical approximation guarantee. Moreover, we demonstrate how our framework can be applied to two very practical applications, and we conduct comprehensive experiments to show the effectiveness and efficiency of our proposed algorithm.
Full Text Available
Online Learning from Capricious Data Streams: A Generative Approach

https://doi.org/10.24963/ijcai.2019/346

He, Yi ; Wu, Baijun ; Wu, Di ; Beyazit, Ege ; Chen, Sheng ; Wu, Xindong ( August 2019 , International Joint Conference on Artificial Intelligence Main track)

Learning with streaming data has received extensive attention during the past few years. Existing approaches assume the feature space is fixed or changes by following explicit regularities, limiting their applicability in dynamic environments where the data streams are described by an arbitrarily varying feature space. To handle such capricious data streams, in this paper we develop a novel algorithm, named OCDS (Online learning from Capricious Data Streams), which does not make any assumption about feature space dynamics. OCDS trains a learner on a universal feature space that establishes relationships between old and new features, so that the patterns learned in the old feature space can be used in the new feature space. Specifically, the universal feature space is constructed by leveraging the relatedness among features. We propose a generative graphical model to model the construction process, and show through theoretical analysis that learning from the universal feature space can effectively improve performance. The experimental results demonstrate that OCDS achieves conspicuous performance on synthetic and real datasets.

Full Text Available
REMIAN: Real-Time and Error-Tolerant Missing Value Imputation

https://doi.org/10.1145/3412364

Ma, Qian ; Gu, Yu ; Lee, Wang-Chien ; Yu, Ge ; Liu, Hongbo ; Wu, Xindong ( October 2020 , ACM Transactions on Knowledge Discovery from Data)

Missing value (MV) imputation is a critical preprocessing step for data mining. Nevertheless, existing MV imputation methods are mostly designed for batch processing, and thus are not applicable to streaming data, especially data of poor quality. In this article, we propose a framework, called Real-time and Error-tolerant Missing vAlue ImputatioN (REMAIN), to impute MVs in poor-quality streaming data. Instead of imputing MVs based on all the observed data, REMAIN first initializes the MV imputation model based on a-RANSAC, which is capable of detecting and rejecting anomalies efficiently, and then incrementally updates the model parameters upon the arrival of new data to support real-time MV imputation. As the correlations among attributes of the data may change over time in unforeseeable ways, we devise a deterioration detection mechanism to capture the deterioration of the imputation model and further improve the imputation accuracy. Finally, we conduct an extensive evaluation of the proposed algorithms using real-world and synthetic datasets. Experimental results demonstrate that REMAIN achieves significantly higher imputation accuracy than existing solutions. Meanwhile, REMAIN reduces time cost by up to one order of magnitude compared with existing approaches.
Full Text Available
NOSEP: Nonoverlapping Sequence Pattern Mining With Gap Constraints

https://doi.org/10.1109/TCYB.2017.2750691

Wu, Youxi ; Tong, Yao ; Zhu, Xingquan ; Wu, Xindong ( October 2018 , IEEE Transactions on Cybernetics)
