NSF PAR Search | NSF Public Access Repository

Transfer Learning for Diffusion Models

Ouyang, Yidong; Xie, Liyan; Zha, Hongyuan; Cheng, Guang (December 2024, NeurIPS)

Full Text Available
SMURF-THP: score matching-based uncertainty quantification for transformer Hawkes process

Li, Zichong; Xu, Yanbo; Zuo, Simiao; Jiang, Haoming; Zhang, Chao; Zhao, Tuo; Zha, Hongyuan (September 2023, International Conference on Machine Learning)

Full Text Available
Neural Parametric Fokker-Planck Equation

https://doi.org/10.1137/20M1344986

Liu, Shu; Li, Wuchen; Zha, Hongyuan; Zhou, Haomin (June 2022, SIAM Journal on Numerical Analysis)

Full Text Available
Hessian-Free High-Resolution Nesterov Acceleration for Sampling

Li, Ruilin; Zha, Hongyuan; Tao, Molei (January 2022, International Conference on Machine Learning)

Full Text Available
Sqrt(d) Dimension Dependence of Langevin Monte Carlo

Li, Ruilin; Zha, Hongyuan; Tao, Molei (January 2022, The International Conference on Learning Representations)

Full Text Available
Self-Training with Differentiable Teacher

https://doi.org/10.18653/v1/2022.findings-naacl.70

Zuo, Simiao; Yu, Yue; Liang, Chen; Jiang, Haoming; Er, Siawpeng; Zhang, Chao; Zhao, Tuo; Zha, Hongyuan (January 2022, Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies)

Full Text Available
Network Diffusions via Neural Mean-Field Dynamics

He, Shushan; Zha, Hongyuan; Ye, Xiaojing (December 2020, Advances in Neural Information Processing Systems)
Larochelle, H.; Ranzato, M.; Hadsell, R.; Balcan, M. F.; Lin, H. (Ed.)
We propose a novel learning framework based on neural mean-field dynamics for inference and estimation problems of diffusion on networks. Our new framework is derived from the Mori-Zwanzig formalism to obtain an exact evolution of the node infection probabilities, which renders a delay differential equation with a memory integral approximated by learnable time convolution operators, resulting in a highly structured and interpretable RNN. Directly using cascade data, our framework can jointly learn the structure of the diffusion network and the evolution of infection probabilities, which are the cornerstone of important downstream applications such as influence maximization. Connections between parameter learning and optimal control are also established. An empirical study shows that our approach is versatile and robust to variations of the underlying diffusion network models, and significantly outperforms existing approaches in accuracy and efficiency on both synthetic and real-world data.
Full Text Available
Improving sampling accuracy of stochastic gradient MCMC methods via non-uniform subsampling of gradients

https://doi.org/10.3934/dcdss.2021157

Li, Ruilin; Wang, Xin; Zha, Hongyuan; Tao, Molei (January 2021, Discrete & Continuous Dynamical Systems - S)

Many Markov Chain Monte Carlo (MCMC) methods leverage gradient information of the potential function of the target distribution to explore the sample space efficiently. However, computing gradients can often be computationally expensive for large-scale applications, such as those in contemporary machine learning. Stochastic Gradient (SG-)MCMC methods approximate gradients by stochastic ones, commonly via uniformly subsampled data points, and achieve improved computational efficiency, but at the price of introducing sampling error. We propose a non-uniform subsampling scheme to improve the sampling accuracy. The proposed exponentially weighted stochastic gradient (EWSG) is designed so that a non-uniform-SG-MCMC method mimics the statistical behavior of a batch-gradient-MCMC method, and hence the inaccuracy due to the SG approximation is reduced. EWSG differs from classical variance reduction (VR) techniques in that it focuses on the entire distribution instead of just the variance; nevertheless, its reduced local variance is also proved. EWSG can also be viewed as an extension of the importance sampling idea, successful for stochastic-gradient-based optimization, to sampling tasks. In our practical implementation of EWSG, the non-uniform subsampling is performed efficiently via a Metropolis-Hastings chain on the data index, which is coupled to the MCMC algorithm. Numerical experiments are provided, not only to demonstrate EWSG's effectiveness, but also to guide hyperparameter choices and to validate our non-asymptotic global error bound despite approximations in the implementation. Notably, while statistical accuracy is improved, convergence speed can be comparable to that of the uniform version, which renders EWSG a practical alternative to VR (although EWSG and VR can also be combined).
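
For orientation, the uniform-subsampling baseline that the abstract contrasts EWSG with can be sketched as a plain stochastic gradient Langevin sampler. This is a minimal illustration, not the paper's EWSG scheme: the toy target (the posterior of a Gaussian mean with unit variance and a flat prior), the function name `sgld_gaussian_mean`, and all step-size and batch-size values are assumptions chosen for the example.

```python
import math
import random


def sgld_gaussian_mean(data, step, n_steps, batch_size, seed=0):
    """Sample the mean of unit-variance Gaussian data with uniform-minibatch SGLD.

    Potential (flat prior assumed): U(theta) = sum_i (theta - x_i)^2 / 2.
    """
    rng = random.Random(seed)
    n = len(data)
    theta = 0.0
    samples = []
    for _ in range(n_steps):
        # Uniform subsampling (with replacement) of data points.
        batch = [data[rng.randrange(n)] for _ in range(batch_size)]
        # Unbiased estimate of the full gradient sum_i (theta - x_i).
        grad = (n / batch_size) * sum(theta - x for x in batch)
        # Langevin update: gradient step plus injected Gaussian noise.
        theta = theta - step * grad + math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0)
        samples.append(theta)
    return samples


data_rng = random.Random(1)
data = [data_rng.gauss(2.0, 1.0) for _ in range(100)]
samples = sgld_gaussian_mean(data, step=1e-3, n_steps=20000, batch_size=10)
kept = samples[2000:]  # discard burn-in
chain_mean = sum(kept) / len(kept)
data_mean = sum(data) / len(data)
print(chain_mean, data_mean)
```

The chain mean should track the data mean, while the spread of the samples is inflated by the minibatch noise; that inflation is the kind of sampling error a non-uniform scheme such as EWSG aims to reduce.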
Full Text Available
A Hypergradient Approach to Robust Regression without Correspondence

Xie, Yujia; Mao, Yixiu; Zuo, Simiao; Xu, Hongteng; Ye, Xiaojing; Zhao, Tuo; Zha, Hongyuan (April 2021, International Conference on Learning Representations)
We consider a regression problem where the correspondence between the input and output data is not available. Such shuffled data are commonly observed in many real-world problems. Take flow cytometry as an example: the measuring instruments are unable to preserve the correspondence between the samples and the measurements. Due to the combinatorial nature of the problem, most existing methods are only applicable when the sample size is small, and are limited to linear regression models. To overcome such bottlenecks, we propose a new computational framework, ROBOT, for the shuffled regression problem, which is applicable to large data and complex models. Specifically, we propose to formulate regression without correspondence as a continuous optimization problem. Then, by exploiting the interaction between the regression model and the data correspondence, we propose to develop a hypergradient approach based on differentiable programming techniques. Such a hypergradient approach essentially views the data correspondence as an operator of the regression model, and therefore allows us to find a better descent direction for the model parameters by differentiating through the data correspondence. ROBOT is quite general, and can be further extended to an inexact correspondence setting, where the input and output data are not necessarily exactly aligned. Thorough numerical experiments show that ROBOT achieves better performance than existing methods in both linear and nonlinear regression tasks, including real-world applications such as flow cytometry and multi-object tracking.
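
To make the shuffled-regression setting concrete, the sketch below fits a one-dimensional linear model when the outputs arrive in random order. It is not ROBOT and uses no hypergradients: it is a classical alternating scheme (re-match by sorting, then refit by least squares), which only works this simply in one dimension. The function name `fit_shuffled_1d`, the model y = w x, and the synthetic data are all assumptions made for illustration.

```python
import random


def fit_shuffled_1d(xs, ys_shuffled, w_init=1.0, n_iters=20):
    """Alternate between matching outputs to inputs and refitting y = w * x."""
    w = w_init
    sum_xx = sum(x * x for x in xs)
    ys_sorted = sorted(ys_shuffled)
    for _ in range(n_iters):
        # Assignment step: in 1D, the least-squares matching pairs the
        # k-th smallest prediction w * x with the k-th smallest output.
        order = sorted(range(len(xs)), key=lambda i: w * xs[i])
        matched = [0.0] * len(xs)
        for rank, i in enumerate(order):
            matched[i] = ys_sorted[rank]
        # Regression step: ordinary least squares for the slope.
        w_new = sum(x * y for x, y in zip(xs, matched)) / sum_xx
        if abs(w_new - w) < 1e-12:
            break
        w = w_new
    return w


rng = random.Random(0)
xs = [rng.uniform(-3.0, 3.0) for _ in range(50)]
ys = [2.0 * x + rng.gauss(0.0, 0.1) for x in xs]
rng.shuffle(ys)  # the correspondence between xs and ys is lost here
w_hat = fit_shuffled_1d(xs, ys)
print(w_hat)
```

In higher dimensions the assignment step becomes a combinatorial matching problem and alternating schemes of this kind can stall in poor local minima, which is the bottleneck the abstract describes.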
Full Text Available
