NSF PAR Search | NSF Public Access Repository

Estimation and Clustering in Popularity Adjusted Block Model

https://doi.org/10.1111/rssb.12410

Noroozi, Majid; Rimal, Ramchandra; Pensky, Marianna (February 2021, Journal of the Royal Statistical Society Series B: Statistical Methodology)

Abstract The paper considers the Popularity Adjusted Block Model (PABM) introduced by Sengupta and Chen (Journal of the Royal Statistical Society Series B, 2018, 80, 365–386). We argue that the main appeal of the PABM is the flexibility of the spectral properties of the graph, which makes the PABM an attractive choice for modelling networks that appear in the biological sciences. We expand the theory of the PABM to the case of an arbitrary number of communities, which may grow with the number of nodes in the network and is not assumed to be known. We produce estimators of the probability matrix and of the community structure and, in addition, provide non-asymptotic upper bounds for the estimation and clustering errors. We use the Sparse Subspace Clustering (SSC) approach for partitioning the network into communities, an approach that, to the best of our knowledge, has not previously been used for clustering network data. The theory is supplemented by a simulation study. In addition, we show the advantages of the PABM for modelling a butterfly similarity network and a human brain functional network.
Sparse Popularity Adjusted Stochastic Block Model

Majid Noroozi, Marianna Pensky (October 2021, Journal of machine learning research)
In the present paper we study a sparse stochastic network endowed with a block structure. The popular Stochastic Block Model (SBM) and the Degree Corrected Block Model (DCBM) address sparsity by placing an upper bound on the maximum probability of connection between any pair of nodes. As a result, sparsity describes only the behavior of the network as a whole, without distinguishing between block-dependent sparsity patterns. To the best of our knowledge, the recently introduced Popularity Adjusted Block Model (PABM) is the only block model that allows one to introduce structural sparsity, where some probabilities of connection are identically equal to zero while the rest remain above a certain threshold. The latter presents a more nuanced view of the network.
Full Text Available
Sparse One-Grab Sampling with Probabilistic Guarantees

https://doi.org/10.1109/TPAMI.2018.2871850

Jaberi, Maryam; Pensky, Marianna; Foroosh, Hassan (December 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence)

Full Text Available
Classification with many classes: Challenges and pluses

https://doi.org/10.1016/j.jmva.2019.104536

Abramovich, Felix; Pensky, Marianna (November 2019, Journal of Multivariate Analysis)
Full Text Available
Dynamic network models and graphon estimation

https://doi.org/10.1214/18-AOS1751

Pensky, Marianna (August 2019, The Annals of Statistics)

Full Text Available
Anisotropic functional Laplace deconvolution

https://doi.org/10.1016/j.jspi.2018.07.004

Benhaddou, Rida; Pensky, Marianna; Rajapakshage, Rasika (March 2019, Journal of Statistical Planning and Inference)

Full Text Available
Spectral clustering in the dynamic stochastic block model

https://doi.org/10.1214/19-EJS1533

Pensky, Marianna; Zhang, Teng (January 2019, Electronic Journal of Statistics)

Full Text Available
Density Deconvolution with Small Berkson Errors

https://doi.org/10.3103/S1066530719030025

Rimal, R.; Pensky, M. (January 2019, Mathematical methods of statistics)

The present paper studies density deconvolution in the presence of small Berkson errors, in particular, when the variances of the errors tend to zero as the sample size grows. It is known that when Berkson errors are present, in some cases, an estimator of the unknown density can be obtained by simple averaging without using kernels. However, this may not be the case when the Berkson errors are asymptotically small. By treating the former case as a kernel estimator with zero bandwidth, we obtain the optimal expressions for the bandwidth. We show that the density of the Berkson errors acts as a regularizer, so that the kernel estimator is unnecessary when the variance of the Berkson errors lies above some threshold that depends on the shapes of the densities in the model and the number of observations.
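As an illustration of the "simple averaging without using kernels" mentioned in this abstract, the following is a minimal sketch and not the authors' estimator. It assumes the standard Berkson setup X = W + δ, where W is observed and δ is independent of W with a known density, so that the density of X at x equals the expectation of the error density evaluated at x − W. The Gaussian error with known standard deviation `sigma`, the simulated data, and the function name `berkson_density_estimate` are all assumptions made for the example.

```python
import numpy as np

def berkson_density_estimate(x, w, sigma):
    """Estimate the density of X = W + delta by averaging the error density.

    Assumption (not from the abstract): delta ~ N(0, sigma^2) with known sigma.
    The estimate at x is the sample mean of the N(0, sigma^2) density
    evaluated at x - w_i; no kernel or bandwidth is involved.
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    w = np.asarray(w, dtype=float)
    z = (x[:, None] - w[None, :]) / sigma          # shape (len(x), len(w))
    return np.exp(-0.5 * z**2).mean(axis=1) / (sigma * np.sqrt(2.0 * np.pi))

# Hypothetical data: observed W drawn from a standard normal.
rng = np.random.default_rng(0)
w = rng.normal(0.0, 1.0, size=2000)

grid = np.linspace(-8.0, 8.0, 1601)
f_hat = berkson_density_estimate(grid, w, sigma=0.5)
```

In this sketch the error density itself does the smoothing. As `sigma` shrinks, the averaged density becomes increasingly spiky around the observations, which is consistent with the abstract's point that plain averaging may no longer suffice when the Berkson errors are asymptotically small.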
Full Text Available
Solution of Linear Ill-Posed Problems Using Random Dictionaries

https://doi.org/10.1007/s13571-018-0151-8

Gupta, Pawan; Pensky, Marianna (May 2018, Sankhya B)

Full Text Available
Probabilistic Sparse Subspace Clustering Using Delayed Association

Jaberi, M. (January 2018, Proceedings of the 2018 24th International Conference on Pattern Recognition (ICPR))

Discovering and clustering subspaces in high-dimensional data is a fundamental problem of machine learning with a wide range of applications in data mining, computer vision, and pattern recognition. Earlier methods divided the problem into two separate stages: finding the similarity matrix and finding the clusters. Similar to some recent works, we integrate these two steps using a joint optimization approach. We make the following contributions: (i) We estimate the reliability of the cluster assignment for each point before assigning the point to a subspace. We divide the data points into two groups, "certain" and "uncertain", with the assignment of the latter group delayed until their subspace association certainty improves. (ii) We demonstrate that delayed association is better suited for clustering subspaces that have ambiguities, i.e. when subspaces intersect or data are contaminated with outliers or noise. (iii) We demonstrate experimentally that such delayed probabilistic association leads to a more accurate self-representation and more accurate final clusters. The proposed method has higher accuracy both for points that lie exclusively in one subspace and for those that lie on the intersection of subspaces. (iv) We show that delayed association leads to a large reduction in computational cost, since it allows for incremental spectral clustering.
Full Text Available
