NSF PAR Search | NSF Public Access Repository


A Large-scale Training Paradigm for Graph Generative Models

Wang, Yu; Rossi, Ryan; Park, Namyong; Chen, Huiyuan; Ahmed, Nesreen; Trivedi, Puja; Dernoncourt, Franck; Koutra, Danai; Derr, Tyler (January 2025, The Thirteenth International Conference on Learning Representations (ICLR))

Free, publicly-accessible full text available January 22, 2026
Enhancing Distribution and Label Consistency for Graph Out-of-Distribution Generalization

https://doi.org/10.1109/ICDM59182.2024.00108

Wang, Song; Yang, Xiaodong; Islam, Rashidul; Chen, Huiyuan; Xu, Minghua; Li, Jundong; Cai, Yiwei (December 2024, IEEE)

To deal with distribution shifts in graph data, various graph out-of-distribution (OOD) generalization techniques have been recently proposed. These methods often employ a two-step strategy that first creates augmented environments and subsequently identifies invariant subgraphs to improve generalizability. Nevertheless, this approach could be suboptimal from the perspective of consistency. First, the process of augmenting environments by altering the graphs while preserving labels may lead to graphs that are not realistic or meaningfully related to the original distribution, thus lacking distribution consistency. Second, the extracted subgraphs are obtained from directly modifying graphs, and may not necessarily maintain a consistent predictive relationship with their labels, thereby impacting label consistency. In response to these challenges, we introduce an innovative approach that aims to enhance these two types of consistency for graph OOD generalization. We propose a modifier to obtain both augmented and invariant graphs in a unified manner. With the augmented graphs, we enrich the training data without compromising the integrity of label-graph relationships. The label consistency enhancement in our framework further preserves the supervision information in the invariant graph. We conduct extensive experiments on real-world datasets to demonstrate the superiority of our framework over other state-of-the-art baselines.
Free, publicly-accessible full text available December 9, 2025
Discrete-state Continuous-time Diffusion for Graph Generation

Xu, Zhe; Qiu, Ruizhong; Chen, Yuzhong; Chen, Huiyuan; Ran, Xiran; Pan, Menghai; Zeng, Zhichen; Das, Mahashweta; Tong, Hanghang (December 2024, NeurIPS 2024)

Free, publicly-accessible full text available December 10, 2025
TabLog: Test-Time Adaptation for Tabular Data Using Logic Rules

Ren, Weijieying; Li, Xiaoting; Chen, Huiyuan; Rakesh, Vineeth; Wang, Zhuoyi; Das, Mahashweta; Honavar, Vasant (July 2024, Proceedings of Machine Learning Research: International Conference on Machine Learning)

We consider the problem of test-time adaptation of predictive models trained on tabular data. Effective solution of this problem requires adaptation of predictive models trained on the source domain to a target domain, using only unlabeled target domain data, without access to source domain data. Existing test-time adaptation methods for tabular data have difficulty coping with the heterogeneous features and their complex dependencies inherent in tabular data. To overcome these limitations, we consider test-time adaptation in the setting wherein the logical structure of the rules is assumed to remain invariant despite distribution shift between source and target domains, whereas the numerical parameters associated with the rules and the weights assigned to them can vary to accommodate distribution shift. TabLog discretizes numerical features, models dependencies between heterogeneous features, introduces a novel contrastive loss for coping with distribution shift, and presents an end-to-end framework for efficient training and test-time adaptation by taking advantage of a logical neural network representation of a rule ensemble. We present results of experiments using several benchmark data sets that demonstrate TabLog is competitive with or improves upon the state-of-the-art methods for test-time adaptation of predictive models trained on tabular data. Our code is available at https://github.com/WeijieyingRen/TabLog.
Full Text Available
Can One Embedding Fit All? A Multi-Interest Learning Paradigm Towards Improving User Interest Diversity Fairness

https://doi.org/10.1145/3589334.3645662

Zhao, Yuying; Xu, Minghua; Chen, Huiyuan; Chen, Yuzhong; Cai, Yiwei; Islam, Rashidul; Wang, Yu; Derr, Tyler (May 2024, ACM)

Full Text Available
PaCEr: Network Embedding From Positional to Structural

https://doi.org/10.1145/3589334.3645516

Yan, Yuchen; Hu, Yongyi; Zhou, Qinghai; Liu, Lihui; Zeng, Zhichen; Chen, Yuzhong; Pan, Menghai; Chen, Huiyuan; Das, Mahashweta; Tong, Hanghang (May 2024, ACM)

Full Text Available
Federated Few-shot Learning

https://doi.org/10.1145/3580305.3599347

Wang, Song; Fu, Xingbo; Ding, Kaize; Chen, Chen; Chen, Huiyuan; Li, Jundong (August 2023, ACM)

Federated Learning (FL) enables multiple clients to collaboratively learn a machine learning model without exchanging their own local data. In this way, the server can exploit the computational power of all clients and train the model on a larger set of data samples among all clients. Although such a mechanism is proven to be effective in various fields, existing works generally assume that each client preserves sufficient data for training. In practice, however, certain clients can only contain a limited number of samples (i.e., few-shot samples). For example, the available photo data taken by a specific user with a new mobile device is relatively rare. In this scenario, existing FL efforts typically encounter a significant performance drop on these clients. Therefore, it is urgent to develop a few-shot model that can generalize to clients with limited data under the FL scenario. In this paper, we refer to this novel problem as federated few-shot learning. Nevertheless, the problem remains challenging due to two major reasons: the global data variance among clients (i.e., the difference in data distributions among clients) and the local data insufficiency in each client (i.e., the lack of adequate local data for training). To overcome these two challenges, we propose a novel federated few-shot learning framework with two separately updated models and dedicated training strategies to reduce the adverse impact of global data variance and local data insufficiency. Extensive experiments on four prevalent datasets that cover news articles and images validate the effectiveness of our framework compared with the state-of-the-art baselines.
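The general federated learning loop described at the start of this abstract (clients train on their own data, the server only combines model weights) can be illustrated with a minimal sketch. This is plain federated averaging on a toy 1-D linear model, not the two-model framework the paper proposes; the function names, the learning rate, and the toy data are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # (x, y) pair held privately by one client


def local_update(weights: Sequence[float], data: Sequence[Sample],
                 lr: float = 0.1, epochs: int = 5) -> List[float]:
    """Hypothetical client step: fit y ~ w*x + b by gradient descent on local data only."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * ((w * x + b) - y) * x for x, y in data) / n
        grad_b = sum(2 * ((w * x + b) - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def fed_avg(client_weights: Sequence[Sequence[float]],
            client_sizes: Sequence[int]) -> List[float]:
    """Server step: average client weights, weighted by each client's sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def run_rounds(clients: Sequence[Sequence[Sample]], rounds: int = 200) -> List[float]:
    """Alternate local training and server averaging; raw samples never leave a client."""
    global_w = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(global_w, data) for data in clients]
        global_w = fed_avg(updates, [len(d) for d in clients])
    return global_w


if __name__ == "__main__":
    # Toy data from y = 2x + 1; the second client is "few-shot" with a single sample.
    clients = [[(0.0, 1.0), (1.0, 3.0)], [(2.0, 5.0)]]
    print(run_rounds(clients))
```

The single-sample client cannot determine both parameters on its own, which is a small-scale picture of the local data insufficiency the abstract refers to; in this toy case the averaged model still recovers the shared line because the clients' data are mutually consistent.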
Full Text Available
Sketching Multidimensional Time Series for Fast Discord Mining

Yeh, Chin-Chia Michael; Zheng, Yan; Pan, Menghai; Chen, Huiyuan; Zhuang, Zhongfang; Wang, Junpeng; Wang, Liang; Zhang, Wei; Phillips, Jeff M.; Keogh, Eamonn (December 2023, IEEE International Conference on Big Data)

Full Text Available
Kernel Ridge Regression-Based Graph Dataset Distillation

https://doi.org/10.1145/3580305.3599398

Xu, Zhe; Chen, Yuzhong; Pan, Menghai; Chen, Huiyuan; Das, Mahashweta; Yang, Hao; Tong, Hanghang (August 2023, ACM)

Full Text Available
Improving Fairness in Graph Neural Networks via Mitigating Sensitive Attribute Leakage

https://doi.org/10.1145/3534678.3539404

Wang, Yu; Zhao, Yuying; Dong, Yushun; Chen, Huiyuan; Li, Jundong; Derr, Tyler (August 2022, Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining)

Graph Neural Networks (GNNs) have shown great power in learning node representations on graphs. However, they may inherit historical prejudices from training data, leading to discriminatory bias in predictions. Although some work has developed fair GNNs, most of them directly borrow fair representation learning techniques from non-graph domains without considering the potential problem of sensitive attribute leakage caused by feature propagation in GNNs. However, we empirically observe that feature propagation could vary the correlation of previously innocuous non-sensitive features to the sensitive ones. This can be viewed as a leakage of sensitive information which could further exacerbate discrimination in predictions. Thus, we design two feature masking strategies according to feature correlations to highlight the importance of considering feature propagation and correlation variation in alleviating discrimination. Motivated by our analysis, we propose Fair View Graph Neural Network (FairVGNN) to generate fair views of features by automatically identifying and masking sensitive-correlated features considering correlation variation after feature propagation. Given the learned fair views, we adaptively clamp weights of the encoder to avoid using sensitive-related features. Experiments on real-world datasets demonstrate that FairVGNN enjoys a better trade-off between model utility and fairness.
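The idea of masking features by their correlation with the sensitive attribute, measured after feature propagation, can be sketched roughly as follows. This is a fixed top-k heuristic with mean-neighbor propagation, not FairVGNN's learned view generator or its weight clamping; every function name and the toy data are assumptions for illustration.

```python
from math import sqrt
from typing import List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def propagate(features: Sequence[Sequence[float]],
              adjacency: Sequence[Sequence[int]]) -> List[List[float]]:
    """One round of mean aggregation over each node and its neighbors."""
    dim = len(features[0])
    out = []
    for i, neighbors in enumerate(adjacency):
        group = [i] + list(neighbors)
        out.append([sum(features[j][d] for j in group) / len(group) for d in range(dim)])
    return out


def sensitive_scores(features: Sequence[Sequence[float]],
                     sensitive: Sequence[float]) -> List[float]:
    """Absolute correlation of each feature channel with the sensitive attribute."""
    dim = len(features[0])
    return [abs(pearson([row[d] for row in features], sensitive)) for d in range(dim)]


def mask_top_k(features: Sequence[Sequence[float]],
               scores: Sequence[float], k: int):
    """Zero out the k channels with the highest scores; return masked features and indices."""
    ranked = sorted(range(len(scores)), key=lambda d: scores[d], reverse=True)
    masked = set(ranked[:k])
    new_features = [[0.0 if d in masked else v for d, v in enumerate(row)]
                    for row in features]
    return new_features, sorted(masked)


if __name__ == "__main__":
    X = [[1.0, 5.0], [0.9, 1.0], [0.1, 4.0], [0.0, 2.0]]
    s = [1.0, 1.0, 0.0, 0.0]
    adj = [[1], [0], [3], [2]]  # two disconnected edges: 0-1 and 2-3
    # Score channels on propagated features, then mask the raw features.
    scores = sensitive_scores(propagate(X, adj), s)
    print(mask_top_k(X, scores, k=1))
```

Scoring on the propagated features rather than the raw ones is the point of the sketch: a channel that looks innocuous before aggregation can become correlated with the sensitive attribute once neighbor information is mixed in.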
Full Text Available
