- 
            We propose a combined model for network data that integrates a latent factor model and a sparse graphical model. We observe that neither a latent factor model nor a sparse graphical model alone may be sufficient to capture the structure of the data. The proposed model has a latent (i.e., factor analysis) component that represents the main trends (a.k.a. factors), and a sparse graphical component that captures the remaining ad‐hoc dependence. Model selection and parameter estimation are carried out simultaneously via a penalized likelihood approach. The convexity of the objective function allows us to develop an efficient algorithm, while the penalty terms push towards low‐dimensional latent components and a sparse graphical structure. The effectiveness of our model is demonstrated via simulation studies, and the model is also applied to four real datasets: Zachary's Karate club data, Krebs' U.S. political book dataset (http://www.orgnet.com), the U.S. political blog dataset, and a citation network of statisticians. The results show meaningful performance in practical situations.
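            The abstract does not name the two penalties. As a minimal sketch, assume the common convex choices: an L1 norm on the sparse graphical component and a nuclear norm on the low-dimensional latent component. Under that assumption, the building blocks of a proximal-type algorithm are the two operators below. The function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def soft_threshold(M, tau):
    """Proximal operator of tau * ||M||_1 (assumed sparsity penalty).

    Shrinks every entry toward zero, so small entries become exactly zero.
    """
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def singular_value_threshold(M, tau):
    """Proximal operator of tau * ||M||_* (assumed low-rank penalty).

    Shrinks the singular values, so small ones vanish and the rank drops.
    """
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt
```

            Entrywise shrinkage yields sparsity and singular-value shrinkage yields low rank, which is one way penalty terms can "push towards" the two structures described above.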
- 
            Testing for independence plays a fundamental role in many statistical techniques. Among the nonparametric approaches, distance-based methods (such as distance correlation-based hypothesis testing for independence) have many advantages over the alternatives. A known limitation of distance-based methods is that their computational complexity can be high. In general, when the sample size is n, a distance-based method, which typically requires computing all pairwise distances, has computational complexity of order O(n^2). Recent advances have shown that in the univariate case, a fast method with O(n log n) computational complexity and O(n) memory requirement exists. In this paper, we introduce a test of independence based on random projection and distance correlation, which achieves nearly the same power as the state-of-the-art distance-based approach, works in the multivariate case, and has O(nK log n) computational complexity and O(max{n, K}) memory requirement, where K is the number of random projections. A computational saving over O(n^2) is achieved when K < n / log n. We name our method Randomly Projected Distance Covariance (RPDC). The theoretical analysis takes advantage of random projection techniques that are rooted in contemporary machine learning. Numerical experiments demonstrate the efficiency of the proposed method relative to numerous competitors.
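            The random-projection idea can be sketched as follows. This is an illustration under stated assumptions, not the paper's estimator: it uses the naive O(n^2) univariate distance covariance instead of the fast O(n log n) algorithm, omits any normalizing constants, and calibrates the test by permutation. All function names are hypothetical.

```python
import numpy as np

def dcov_sq_1d(u, v):
    """Squared sample distance covariance of two 1-D samples (naive O(n^2) V-statistic)."""
    a = np.abs(u[:, None] - u[None, :])
    b = np.abs(v[:, None] - v[None, :])
    # Double-center each pairwise distance matrix.
    A = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
    B = b - b.mean(axis=0, keepdims=True) - b.mean(axis=1, keepdims=True) + b.mean()
    return float((A * B).mean())

def random_directions(K, dim, rng):
    """Draw K directions uniformly from the unit sphere in R^dim."""
    D = rng.standard_normal((K, dim))
    return D / np.linalg.norm(D, axis=1, keepdims=True)

def projected_dcov(X, Y, U, V):
    """Average the 1-D statistic over K paired projections (rows of U and V)."""
    return float(np.mean([dcov_sq_1d(X @ u, Y @ v) for u, v in zip(U, V)]))

def permutation_test(X, Y, K=10, n_perm=99, seed=0):
    """Return (statistic, permutation p-value); the projections are drawn once and reused."""
    rng = np.random.default_rng(seed)
    U = random_directions(K, X.shape[1], rng)
    V = random_directions(K, Y.shape[1], rng)
    observed = projected_dcov(X, Y, U, V)
    exceed = sum(
        projected_dcov(X, Y[rng.permutation(len(Y))], U, V) >= observed
        for _ in range(n_perm)
    )
    return observed, (1 + exceed) / (1 + n_perm)
```

            Replacing `dcov_sq_1d` with an O(n log n) univariate routine would give the O(nK log n) cost quoted above; with the naive version shown here the cost is O(n^2 K).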
- 
            Under the linear regression framework, we study the variable selection problem when the underlying model is assumed to have a small number of nonzero coefficients. Non-convex penalties in specific forms are well studied in the literature on sparse estimation. A recent work, Ahn, Pang, and Xin (2017), pointed out that nearly all existing non-convex penalties can be represented as difference-of-convex (DC) functions, that is, as the difference of two convex functions, although the penalty itself may not be convex. There is a large literature on optimization problems whose objectives and/or constraints involve DC functions, and efficient numerical solutions have been proposed. Under the DC framework, directional-stationary (d-stationary) solutions are considered, and they are usually not unique. In this paper, we show that under some mild conditions, a certain subset of d-stationary solutions of an optimization problem with a DC objective has ideal statistical properties: asymptotic estimation consistency, asymptotic model selection consistency, and asymptotic efficiency. Our assumptions are either weaker than or comparable with the conditions adopted in other existing works. This work shows that DC is a suitable framework for a unified approach to existing works that involve non-convex penalties. Our work bridges the communities of optimization and statistics.
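            As an illustration of the DC representation, consider the minimax concave penalty (MCP) with parameters $$\lambda > 0$$ and $$\gamma > 0$$. The abstract does not single out this penalty; it is chosen here only as a familiar example.

$$
p_{\lambda,\gamma}(t) =
\begin{cases}
\lambda |t| - \dfrac{t^2}{2\gamma}, & |t| \le \gamma\lambda, \\
\dfrac{\gamma\lambda^2}{2}, & |t| > \gamma\lambda,
\end{cases}
\qquad
p_{\lambda,\gamma}(t) = \underbrace{\lambda |t|}_{g(t)} - \underbrace{h(t)}_{\text{convex}},
\qquad
h(t) =
\begin{cases}
\dfrac{t^2}{2\gamma}, & |t| \le \gamma\lambda, \\
\lambda |t| - \dfrac{\gamma\lambda^2}{2}, & |t| > \gamma\lambda.
\end{cases}
$$

            Both $$g$$ and $$h$$ are convex: $$h$$ is a Huber-type function whose value ($$\gamma\lambda^2/2$$) and slope ($$\pm\lambda$$) match at $$|t| = \gamma\lambda$$. Their difference $$p_{\lambda,\gamma}$$ is not convex.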
- 
            In image detection, one problem is to test whether a set of points, though mainly consisting of uniformly scattered points, also contains a small fraction of points sampled from some (a priori unknown) curve, for example, a curve with $$C^\alpha$$-norm bounded by $$\beta$$. One approach is to analyze the data by counting membership in multiscale multianisotropic strips, which involves an algorithm that examines the length of the path connecting many consecutive “significant” nodes. In this paper, we develop the mathematical formalism of this algorithm and analyze the statistical properties of the length of the longest significant run. The rate of convergence is derived. Using percolation theory and random graph theory, we present a novel probabilistic model named the pseudo-tree model. Based on the asymptotic results for the pseudo-tree model, we further study the length of the longest significant run in an “inflating” Bernoulli net. We find that the probability parameter $$p$$ of a significant node plays an important role: there is a threshold $$p_c$$ such that the cases $$p < p_c$$ and $$p > p_c$$ exhibit very different asymptotic behaviors of the length of the significant runs. We apply our results to the detection of an underlying curvilinear feature and prove that the test based on our proposed longest run theory is asymptotically powerful.
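            To make the notion of a "longest significant run" concrete, here is a one-dimensional analogue only: nodes along a single line are marked significant independently with probability p, and the longest stretch of consecutive significant nodes is measured. It does not implement the Bernoulli net or the pseudo-tree model, and it has no threshold $$p_c$$. The function names are hypothetical.

```python
import numpy as np

def longest_run(flags):
    """Length of the longest stretch of consecutive truthy entries."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best

def simulate_longest_run(n, p, seed=0):
    """Longest run among n i.i.d. Bernoulli(p) 'significant' indicators."""
    rng = np.random.default_rng(seed)
    return longest_run(rng.random(n) < p)
```

            In this one-dimensional setting the longest run is known to grow roughly like the logarithm of n to base 1/p, so a run much longer than that is evidence of an embedded structure.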
 An official website of the United States government