NSF PAR Search | NSF Public Access Repository

Fitting networks with a cancellation trick

Jin, Jiashun; Wang, Jingming (August 2025, International Conference on Learning Representations)

Free, publicly-accessible full text available August 1, 2026
Network Goodness-of-Fit for the Block-Model Family

https://doi.org/10.1080/01621459.2025.2479242

Jin, Jiashun; Ke, Zheng Tracy; Tang, Jiajun; Wang, Jingming (July 2025, Journal of the American Statistical Association)

Free, publicly-accessible full text available July 3, 2026
Optimal Network Membership Estimation under Severe Degree Heterogeneity

https://doi.org/10.1080/01621459.2024.2388903

Ke, Zheng Tracy; Wang, Jingming (April 2025, Journal of the American Statistical Association)

Free, publicly-accessible full text available April 3, 2026
Entry-Wise Eigenvector Analysis and Improved Rates for Topic Modeling on Short Documents

https://doi.org/10.3390/math12111682

Ke, Zheng Tracy; Wang, Jingming (June 2024, Mathematics)

Topic modeling is a widely utilized tool in text analysis. We investigate the optimal rate for estimating a topic model. Specifically, we consider a scenario with n documents, a vocabulary of size p, and document lengths on the order of N. When N≥c·p, referred to as the long-document case, the optimal rate is established in the literature at p/(Nn). However, when N=o(p), referred to as the short-document case, the optimal rate remains unknown. In this paper, we first provide new entry-wise large-deviation bounds for the empirical singular vectors of a topic model. We then apply these bounds to improve the error rate of a spectral algorithm, Topic-SCORE. Finally, by comparing the improved error rate with the minimax lower bound, we conclude that the optimal rate is still p/(Nn) in the short-document case.
Full Text Available
Improved algorithm and bounds for successive projection

Jin, Jiashun; Moryoussef, Gabriel; Ke, Zheng Tracy; Tang, Jiajun; Wang, Jingming (May 2024, International Conference on Learning Representations)

Full Text Available
Improved Algorithm and Bounds for Successive Projection

Jin, Jiashun; Ke, Zheng Tracy; Moryoussef, Gabriel; Tang, Jiajun; Wang, Jingming (March 2024, International Conference on Learning Representations)

Given a K-vertex simplex in a d-dimensional space, suppose we measure n points on the simplex with noise (hence, some of the observed points fall outside the simplex). Vertex hunting is the problem of estimating the K vertices of the simplex. A popular vertex hunting algorithm is the successive projection algorithm (SPA). However, SPA is observed to perform unsatisfactorily under strong noise or outliers. We propose pseudo-point SPA (pp-SPA). It uses a projection step and a denoise step to generate pseudo-points and feeds them into SPA for vertex hunting. We derive error bounds for pp-SPA, leveraging extreme value theory of (possibly) high-dimensional random vectors. The results suggest that pp-SPA has faster rates and better numerical performance than SPA. Our analysis includes an improved non-asymptotic bound for the original SPA, which is of independent interest.
Full Text Available
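
The vertex hunting problem in the abstract above can be illustrated with a minimal sketch of the textbook successive projection algorithm. This is not the pp-SPA procedure from the paper (its projection and denoise steps are not described in this record), and it is not taken from the authors' code. The function name `spa`, the toy vertex matrix `V`, and the noise-free data are all assumptions made for illustration; the sketch also assumes the K vertices are linearly independent.

```python
import numpy as np


def spa(X, K):
    """Textbook successive projection (illustrative sketch, not pp-SPA).

    X: (n, d) array whose rows are the observed points.
    K: number of vertices to estimate.
    Returns the row indices of the K points chosen as vertex estimates.
    """
    R = np.array(X, dtype=float)  # residuals; a copy, so X is not modified
    picked = []
    for _ in range(K):
        norms = np.linalg.norm(R, axis=1)
        j = int(np.argmax(norms))   # the point farthest from the origin
        picked.append(j)
        u = R[j] / norms[j]         # unit vector along the chosen residual
        R = R - np.outer(R @ u, u)  # project every residual orthogonally to u
    return picked


# Hypothetical noise-free example: three vertices in R^3, followed by
# 50 random convex combinations of them.
V = np.array([[2.0, 0.0, 0.0],
              [0.0, 1.5, 0.0],
              [0.0, 0.0, 1.0]])
rng = np.random.default_rng(0)
W = rng.dirichlet(np.ones(3), size=50)  # each row is a set of convex weights
X = np.vstack([V, W @ V])
idx = spa(X, 3)
print(sorted(idx))
```

In this noise-free setting the selected rows are the three vertex rows placed at the top of `X`. With noise, the same steps can select points outside the simplex, which is the weakness of SPA that the abstract describes.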
Statistical inference for principal components of spiked covariance matrices

https://doi.org/10.1214/21-AOS2143

Bao, Zhigang; Ding, Xiucai; Wang, Jingming; Wang, Ke (April 2022, The Annals of Statistics)

Full Text Available
