We introduce two generalizations of the paradigm of using Random Khatri-Rao Product (RKRP) codes for distributed matrix multiplication. First, we introduce a class of codes called Sparse Random Khatri-Rao Product (SRKRP) codes, which have sparse generator matrices. SRKRP codes incur lower encoding, computation, and communication costs than RKRP codes when the input matrices are sparse, while exhibiting numerical stability similar to other state-of-the-art schemes. We empirically study the relationship between the probability that the generator matrix (restricted to the set of non-stragglers) of a randomly chosen SRKRP code is rank deficient and various parameters of the coding scheme, including the degree of sparsity of the generator matrix and the number of non-stragglers. Second, we show that if the master node can perform a very small number of matrix product computations in addition to the computations performed by the workers, the failure probability can be substantially reduced.
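A minimal sketch of the rank-check experiment described above, assuming a simple i.i.d. sparse coefficient model (the paper's actual sparsity pattern and coefficient distribution may differ); `sparse_coeffs` and `khatri_rao_rows` are hypothetical helper names introduced only for this illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def sparse_coeffs(n_workers, n_blocks, density, rng):
    # Keep each coefficient with probability `density`; zero it otherwise.
    # (An i.i.d. sparsity model chosen for illustration only.)
    mask = rng.random((n_workers, n_blocks)) < density
    return mask * rng.standard_normal((n_workers, n_blocks))

def khatri_rao_rows(Ga, Gb):
    # Row-wise Khatri-Rao product: row i is kron(Ga[i], Gb[i]).
    return np.einsum('ij,ik->ijk', Ga, Gb).reshape(Ga.shape[0], -1)

p = q = 3        # A and B each split into 3 block-columns
n_workers = 12   # decoding needs any p*q = 9 full-rank worker rows
density = 0.5    # sparsity knob for the per-worker coefficients

Ga = sparse_coeffs(n_workers, p, density, rng)
Gb = sparse_coeffs(n_workers, q, density, rng)
G = khatri_rao_rows(Ga, Gb)  # effective generator of the product code

# Decodability from a given set of non-stragglers = full column rank of
# the corresponding rows of G; repeating this over random draws estimates
# the rank-deficiency probability as a function of `density`.
survivors = rng.choice(n_workers, size=p * q, replace=False)
decodable = np.linalg.matrix_rank(G[survivors]) == p * q
print(f"decodable from workers {sorted(survivors.tolist())}: {decodable}")
```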
Smoothed Estimators for Markov Chains with Sparse Spatial Observations
Empirical applications of the Markov chain model and its spatial extensions suffer from issues induced by the sparse transition probability matrix, which usually results from adopting maximum likelihood estimators (MLEs). Two discrete kernel estimators with cross-validated parameters are proposed for reducing the sparsity in the estimated transition probability matrix. Monte Carlo experiments suggest that these estimators are not only quite effective in producing a much less sparse matrix, alleviating issues related to sparsity, but also superior to MLEs in lowering the mean squared error for individual and total transition probabilities, giving rise to better recovery of the underlying dynamics.
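A minimal sketch of the contrast between the MLE and a smoothed estimator, assuming a simple uniform-kernel shrinkage with a held-out-likelihood choice of the smoothing weight (the paper's two discrete kernel estimators and their cross-validation criterion may take a different form); all function names are illustrative:

```python
import numpy as np

def mle_transition(counts):
    # MLE: row-normalized transition counts; unvisited rows left uniform.
    counts = np.asarray(counts, float)
    k = counts.shape[1]
    rows = counts.sum(axis=1, keepdims=True)
    return np.where(rows > 0, counts / np.where(rows > 0, rows, 1), 1.0 / k)

def smoothed_transition(counts, lam):
    # Convex combination of the MLE and the uniform distribution:
    # every cell becomes positive for lam > 0, removing sparsity.
    k = counts.shape[1]
    return (1 - lam) * mle_transition(counts) + lam / k

def choose_lambda(chain, k, grid=np.linspace(0.01, 0.5, 25), seed=0):
    # Cross-validate lam on a 50/50 split of the observed transitions,
    # scoring held-out log-likelihood (a generic CV surrogate).
    rng = np.random.default_rng(seed)
    pairs = np.column_stack([chain[:-1], chain[1:]])
    rng.shuffle(pairs)
    train, test = pairs[: len(pairs) // 2], pairs[len(pairs) // 2 :]
    counts = np.zeros((k, k))
    np.add.at(counts, (train[:, 0], train[:, 1]), 1)
    def loglik(lam):
        P = smoothed_transition(counts, lam)
        return np.log(P[test[:, 0], test[:, 1]]).sum()
    return max(grid, key=loglik)

# Toy usage: a short chain on k = 5 states leaves many cells unobserved,
# so the MLE matrix is sparse while the smoothed one is not.
rng = np.random.default_rng(1)
chain = rng.integers(0, 5, size=60)
lam = choose_lambda(chain, k=5)
counts = np.zeros((5, 5))
np.add.at(counts, (chain[:-1], chain[1:]), 1)
print("zero cells, MLE:", int((mle_transition(counts) == 0).sum()),
      "| smoothed:", int((smoothed_transition(counts, lam) == 0).sum()))
```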
- PAR ID: 10120325
- Publisher / Repository: Wiley-Blackwell
- Date Published:
- Journal Name: Geographical Analysis
- Volume: 53
- Issue: 2
- ISSN: 0016-7363
- Page Range / eLocation ID: p. 307-328
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
SpMV, the product of a sparse matrix and a dense vector, is emblematic of a new class of applications that are memory-bandwidth and communication driven, not flop driven. Sparsity and randomness in such computations play havoc with performance, especially when strong, instead of weak, scaling is attempted. In this study we develop and evaluate a hybrid implementation for strong scaling of the Compressed Vectorization-oriented sparse Row (CVR) approach to SpMV on a cluster of Intel Xeon Phi Knights Landing (KNL) processors. We show how our hybrid SpMV implementation achieves increased computational performance, yet does not address the dominant communication overhead factor at extreme scale. Issues with workload distribution, data placement, and remote reductions are assessed over a range of matrix characteristics. Our results indicate that as P → ∞, communication overhead is by far the dominant factor despite improved computational performance.
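For reference, a textbook CSR SpMV kernel (the CVR format itself reorganizes nonzeros for vectorization and is not reproduced here; this sketch only fixes what computation is being scaled):

```python
import numpy as np
from scipy.sparse import random as sprand, csr_matrix

def spmv_csr(indptr, indices, data, x):
    # Textbook CSR sparse matrix-vector product: y = A @ x.
    y = np.zeros(indptr.size - 1)
    for i in range(indptr.size - 1):
        start, end = indptr[i], indptr[i + 1]
        y[i] = data[start:end] @ x[indices[start:end]]
    return y

# Random 1000 x 1000 matrix at 1% density; irregular row lengths and
# index patterns are exactly what makes SpMV bandwidth-bound.
A = csr_matrix(sprand(1000, 1000, density=0.01, random_state=0))
x = np.ones(A.shape[1])
y = spmv_csr(A.indptr, A.indices, A.data, x)
assert np.allclose(y, A @ x)
```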
-
SpMV, the product of a sparse matrix and a dense vector, is emblematic of a new class of applications that are memory-bandwidth and communication driven, not flop driven. Sparsity and randomness in such computations play havoc with conventional implementations, especially when strong, instead of weak, scaling is attempted. This paper studies improved hybrid SpMV codes that have better performance, especially for the sparsest of such problems. Issues with both data placement and remote reductions are modeled over a range of matrix characteristics. Those factors that limit strong scalability are quantified.
-
Abstract: In this paper, we propose a new framework to construct confidence sets for a $d$-dimensional unknown sparse parameter ${\boldsymbol \theta}$ under the normal mean model ${\boldsymbol X}\sim N({\boldsymbol \theta},\sigma^{2}\mathbf{I})$. A key feature of the proposed confidence set is its capability to account for the sparsity of ${\boldsymbol \theta}$, thus named a sparse confidence set. This is in sharp contrast with the classical methods, such as the Bonferroni confidence intervals and other resampling-based procedures, where the sparsity of ${\boldsymbol \theta}$ is often ignored. Specifically, we require the desired sparse confidence set to satisfy the following two conditions: (i) uniformly over the parameter space, the coverage probability for ${\boldsymbol \theta}$ is above a pre-specified level; (ii) there exists a random subset $S$ of $\{1,\ldots,d\}$ such that $S$ guarantees the pre-specified true negative rate for detecting non-zero $\theta_{j}$'s. To exploit the sparsity of ${\boldsymbol \theta}$, we allow the confidence interval for $\theta_{j}$ to degenerate to the single point 0 for any $j\notin S$. Under this new framework, we first consider whether there exist sparse confidence sets that satisfy the above two conditions. To address this question, we establish a non-asymptotic minimax lower bound for the non-coverage probability over a suitable class of sparse confidence sets. The lower bound deciphers the role of sparsity and the minimum signal-to-noise ratio (SNR) in the construction of sparse confidence sets. Furthermore, under suitable conditions on the SNR, a two-stage procedure is proposed to construct a sparse confidence set. To evaluate its optimality, the proposed sparse confidence set is shown to attain a minimax lower bound of a properly defined risk function up to a constant factor. Finally, we develop a procedure adaptive to the unknown sparsity. Numerical studies are conducted to verify the theoretical results.
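A schematic of the two-stage construction under the normal mean model, assuming a universal screening threshold and Bonferroni intervals on the selected set (illustrative choices, not the paper's minimax-optimal thresholds; `sparse_confidence_set` is a hypothetical name):

```python
import numpy as np
from scipy.stats import norm

def sparse_confidence_set(X, sigma, alpha=0.05):
    # Stage 1: select S by a universal threshold sigma * sqrt(2 log d).
    # Stage 2: Bonferroni intervals on S; intervals off S degenerate to {0}.
    d = X.size
    t = sigma * np.sqrt(2 * np.log(d))
    S = np.flatnonzero(np.abs(X) > t)
    z = norm.ppf(1 - alpha / (2 * max(len(S), 1)))
    lo = np.zeros(d)
    hi = np.zeros(d)
    lo[S] = X[S] - z * sigma
    hi[S] = X[S] + z * sigma
    return S, lo, hi

# Toy usage: a 3-sparse theta in d = 200 dimensions with strong signals.
rng = np.random.default_rng(2)
theta = np.zeros(200)
theta[:3] = 6.0
X = theta + rng.standard_normal(200)
S, lo, hi = sparse_confidence_set(X, sigma=1.0)
print("selected:", S.tolist(),
      "| covered:", bool(np.all((lo <= theta) & (theta <= hi))))
```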