 Award ID(s):
 1712996
 Publication Date:
 NSFPAR ID:
 10383980
 Journal Name:
 Biometrika
 Volume:
 107
 Issue:
 2
 Page Range or eLocationID:
 293 to 310
 ISSN:
 0006-3444
 Sponsoring Org:
 National Science Foundation
More Like this

In sparse linear regression, the SLOPE estimator generalizes LASSO by assigning magnitude-dependent regularizations to different coordinates of the estimate. In this paper, we present an asymptotically exact characterization of the performance of SLOPE in the high-dimensional regime where the number of unknown parameters grows in proportion to the number of observations. Our asymptotic characterization enables us to derive optimal regularization sequences to either minimize the MSE or to maximize the power in variable selection under any given level of Type I error. In both cases, we show that the optimal design can be recast as certain infinite-dimensional convex optimization problems, which have efficient and accurate finite-dimensional approximations. Numerical simulations verify our asymptotic predictions. They also demonstrate the superiority of our optimal design over LASSO and a regularization sequence previously proposed in the literature.
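The SLOPE penalty pairs a non-increasing regularization sequence with the coordinates of the estimate sorted by decreasing magnitude; with all entries of the sequence equal, it reduces to the LASSO penalty. A minimal sketch (function names are illustrative, not from the paper):

```python
# Sorted-L1 (SLOPE) penalty: lambdas is a non-increasing sequence paired
# with |beta| sorted in decreasing order, so the largest coordinates
# receive the largest penalties.

def slope_penalty(beta, lambdas):
    """Return sum_i lambdas[i] * |beta|_(i), where |beta|_(1) >= |beta|_(2) >= ..."""
    mags = sorted((abs(b) for b in beta), reverse=True)
    return sum(lam * m for lam, m in zip(lambdas, mags))

def lasso_penalty(beta, lam):
    """LASSO is the special case where every lambda_i equals lam."""
    return lam * sum(abs(b) for b in beta)

beta = [0.5, -2.0, 1.0]
print(slope_penalty(beta, [3.0, 2.0, 1.0]))  # 3*2.0 + 2*1.0 + 1*0.5 = 8.5
print(lasso_penalty(beta, 2.0))              # 2 * (0.5 + 2.0 + 1.0) = 7.0
```

The abstract's design question is which sequence of lambdas to use; this snippet only evaluates the penalty for a given sequence.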

We initiate the algorithmic study of retracting a graph into a cycle in the graph, which seeks a mapping of the graph vertices to the cycle vertices so as to minimize the maximum stretch of any edge, subject to the constraint that the restriction of the mapping to the cycle is the identity map. This problem has its roots in the rich theory of retraction of topological spaces, and has strong ties to well-studied metric embedding problems such as minimum bandwidth and 0-extension. Our first result is an O(min{k, √n})-approximation for retracting any graph on n nodes to a cycle with k nodes. We also show a surprising connection to Sperner’s Lemma that rules out the possibility of improving this result using certain natural convex relaxations of the problem. Nevertheless, if the problem is restricted to planar graphs, we show that we can overcome these integrality gaps by giving an optimal combinatorial algorithm, which is the technical centerpiece of the paper. Building on our planar graph algorithm, we also obtain a constant-factor approximation algorithm for retraction of points in the Euclidean plane to a uniform cycle.
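The objective being approximated can be sketched concretely: given a mapping of vertices to positions on a k-node cycle (with cycle vertices fixed in place), the stretch of an edge is the cycle distance between the images of its endpoints, and the goal is to minimize the maximum stretch. This toy evaluator only scores a candidate retraction; it is not the paper's approximation algorithm.

```python
# Evaluate the max-stretch objective of a candidate retraction onto a
# k-node cycle whose vertices occupy positions 0..k-1.

def cycle_dist(a, b, k):
    """Shortest-path distance between positions a and b on a k-cycle."""
    d = abs(a - b) % k
    return min(d, k - d)

def max_stretch(edges, f, k):
    """Maximum cycle distance between the images of any edge's endpoints."""
    return max(cycle_dist(f[u], f[v], k) for u, v in edges)

# Toy instance: a 4-cycle 0-1-2-3 plus an extra vertex 4 adjacent to 0 and 2.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (4, 0), (4, 2)]
k = 4
# Retraction constraint: cycle vertices 0..3 map to themselves;
# only vertex 4 is free, here mapped to position 1.
f = {0: 0, 1: 1, 2: 2, 3: 3, 4: 1}
print(max_stretch(edges, f, k))  # every edge has stretch 1
```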

In this paper, we propose MetaMobi, a novel spatiotemporal multi-dots connectivity-aware modeling and meta model update approach for crowd mobility learning. MetaMobi analyzes real-world WiFi association data collected from our campus wireless infrastructure, with the goal of enabling a smart connected campus. Specifically, MetaMobi aims at addressing the following two major challenges with existing crowd mobility sensing system designs: (a) how to handle the spatially, temporally, and contextually varying features in large-scale human crowd mobility distributions; and (b) how to adapt to the impacts of such crowd mobility patterns as well as the dynamic changes in crowd sensing infrastructures. To handle the first challenge, we design a novel multi-dots connectivity-aware learning approach, which jointly learns the crowd flow time series of multiple buildings with fusion of spatial graph connectivities and temporal attention mechanisms. Furthermore, to overcome the adaptivity issues due to changes in the crowd sensing infrastructures (e.g., installation of new access points), we further design a novel meta model update approach with Bernoulli dropout, which mitigates the overfitting behaviors of the model given few-shot distributions of new crowd mobility datasets. Extensive experimental evaluations based on the real-world campus wireless dataset (including over 76 million WiFi association and …
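Bernoulli dropout, the regularizer the abstract uses during meta model updates, randomly zeroes units to curb overfitting on few-shot data. A minimal sketch of the standard (inverted) variant, standing in for the paper's actual meta-update procedure:

```python
import random

# Inverted Bernoulli dropout: each unit is kept with probability 1-p and
# rescaled by 1/(1-p), so the expected activation is unchanged.

def bernoulli_dropout(x, p, rng):
    """Drop each entry of x independently with probability p."""
    if p == 0.0:
        return list(x)
    keep = 1.0 - p
    return [xi / keep if rng.random() < keep else 0.0 for xi in x]

rng = random.Random(0)          # seeded for reproducibility
x = [1.0, 2.0, 3.0, 4.0]
print(bernoulli_dropout(x, 0.0, rng))  # p=0 keeps every unit unchanged
print(bernoulli_dropout(x, 0.5, rng))  # surviving units are doubled
```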

Summary We consider the supervised classification setting, in which the data consist of p features measured on n observations, each of which belongs to one of K classes. Linear discriminant analysis (LDA) is a classical method for this problem. However, in the high-dimensional setting where p ≫ n, LDA is not appropriate for two reasons. First, the standard estimate for the within-class covariance matrix is singular, and so the usual discriminant rule cannot be applied. Second, when p is large, it is difficult to interpret the classification rule that is obtained from LDA, since it involves all p features. We propose penalized LDA, which is a general approach for penalizing the discriminant vectors in Fisher’s discriminant problem in a way that leads to greater interpretability. The discriminant problem is not convex, so we use a minorization–maximization approach to optimize it efficiently when convex penalties are applied to the discriminant vectors. In particular, we consider the use of L1 and fused lasso penalties. Our proposal is equivalent to recasting Fisher’s discriminant problem as a biconvex problem. We evaluate the performances of the resulting methods on a simulation study, and on three gene expression data sets. We also survey past methods for extending …
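The workhorse for applying an L1 penalty inside an update loop like the minorization–maximization scheme above is the soft-thresholding operator, which zeroes small coordinates and yields a sparse, interpretable discriminant direction. A hedged sketch of the generic operator (not the paper's full penalized-LDA algorithm):

```python
# Soft-thresholding: shrink each coordinate toward zero by lam;
# coordinates with magnitude at most lam become exactly zero.

def soft_threshold(v, lam):
    out = []
    for x in v:
        if x > lam:
            out.append(x - lam)
        elif x < -lam:
            out.append(x + lam)
        else:
            out.append(0.0)
    return out

# Entries below the threshold drop out of the discriminant vector,
# so the classification rule involves only a subset of the p features.
print(soft_threshold([2.5, -0.25, 1.0, -1.75], 1.0))  # [1.5, 0.0, 0.0, -0.75]
```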

Abstract Kernelized Gram matrix $W$ constructed from data points $\{x_i\}_{i=1}^N$ as $W_{ij} = k_0\big( \frac{\| x_i - x_j \|^2}{\sigma^2} \big)$ is widely used in graph-based geometric data analysis and unsupervised learning. An important question is how to choose the kernel bandwidth $\sigma$, and a common practice called self-tuned kernel adaptively sets a $\sigma_i$ at each point $x_i$ by the $k$-nearest neighbor (kNN) distance. When the $x_i$'s are sampled from a $d$-dimensional manifold embedded in a possibly high-dimensional space, unlike with fixed-bandwidth kernels, theoretical results on graph Laplacian convergence with self-tuned kernels have been incomplete. This paper proves the convergence of the graph Laplacian operator $L_N$ to the manifold (weighted) Laplacian for a new family of kNN self-tuned kernels $W^{(\alpha)}_{ij} = k_0\big( \frac{\| x_i - x_j \|^2}{\epsilon \hat{\rho}(x_i) \hat{\rho}(x_j)} \big) / \big( \hat{\rho}(x_i)^\alpha \hat{\rho}(x_j)^\alpha \big)$, where $\hat{\rho}$ is the bandwidth function estimated by kNN and the limiting operator is also parametrized by $\alpha$. When $\alpha = 1$, the limiting operator is the weighted manifold Laplacian $\varDelta_p$. Specifically, we prove the pointwise convergence of $L_N f$ and convergence of the graph Dirichlet form with rates. Our analysis is based on first establishing a $C^0$ …
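The kernel family above can be sketched directly: estimate $\hat{\rho}(x_i)$ as the distance to the $k$-th nearest neighbor, then form $W^{(\alpha)}$. A pure-Python toy on 1-D points with the Gaussian choice $k_0(t) = e^{-t}$ as an assumption (the paper works with points sampled from a $d$-dimensional manifold and a general $k_0$):

```python
import math

# kNN self-tuned kernel: W_ij = k0(|x_i-x_j|^2 / (eps*rho_i*rho_j))
#                               / (rho_i^alpha * rho_j^alpha),
# with rho_i the distance from x_i to its k-th nearest neighbor.

def knn_bandwidths(xs, k):
    """rho_hat(x_i): distance from x_i to its k-th nearest neighbor."""
    rho = []
    for i, xi in enumerate(xs):
        dists = sorted(abs(xi - xj) for j, xj in enumerate(xs) if j != i)
        rho.append(dists[k - 1])
    return rho

def self_tuned_kernel(xs, k=2, eps=1.0, alpha=1.0):
    rho = knn_bandwidths(xs, k)
    n = len(xs)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            t = (xs[i] - xs[j]) ** 2 / (eps * rho[i] * rho[j])
            W[i][j] = math.exp(-t) / (rho[i] ** alpha * rho[j] ** alpha)
    return W

xs = [0.0, 1.0, 2.0, 4.0]
W = self_tuned_kernel(xs, k=1)
# The construction is symmetric in i and j, so W is a symmetric matrix.
print(all(abs(W[i][j] - W[j][i]) < 1e-12 for i in range(4) for j in range(4)))
```

Points in sparse regions get larger bandwidths, which is what makes the kernel "self-tuned" compared with a fixed-$\sigma$ Gram matrix.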