Random feature maps are used to decrease the computational cost of kernel machines in large-scale problems. The Mondrian kernel is one such example of a fast random feature approximation of the Laplace kernel, generated by a computationally efficient hierarchical random partition of the input space known as the Mondrian process. In this work, we study a variation of this random feature map by applying a uniform random rotation to the input space before running the Mondrian process to approximate a kernel that is invariant under rotations. We obtain a closed-form expression for the isotropic kernel that is approximated, as well as a uniform convergence rate of the uniformly rotated Mondrian kernel to this limit. To this end, we utilize techniques from the theory of stationary random tessellations in stochastic geometry and prove a new result on the geometry of the typical cell of the superposition of uniformly rotated Mondrian tessellations. Finally, we test the empirical performance of this random feature map on both synthetic and real-world datasets, demonstrating its improved performance over the Mondrian kernel on a dataset that is debiased from the standard coordinate axes.
more »
« less
Random Feature Maps for the Itemset Kernel
Although kernel methods efficiently use feature combinations without computing them directly, they do not scale well with the size of the training dataset. Factorization machines (FMs) and related models, on the other hand, enable feature combinations efficiently, but their optimization generally requires solving a non-convex problem. We present random feature maps for the itemset kernel, which uses feature combinations, and includes the ANOVA kernel, the all-subsets kernel, and the standard dot product. Linear models using one of our proposed maps can be used as an alternative to kernel methods and FMs, resulting in better scalability during both training and evaluation. We also present theoretical results for a proposed map, discuss the relationship between factorization machines and linear models using a proposed map for the ANOVA kernel, and relate the proposed feature maps to prior work. Furthermore, we show that the maps can be calculated more efficiently by using a signed circulant matrix projection technique. Finally, we demonstrate the effectiveness of using the proposed maps for real-world datasets.
more »
« less
- Award ID(s):
- 1749833
- PAR ID:
- 10097822
- Date Published:
- Journal Name:
- Association for the Advancement of Artificial Intelligence (AAAI), 2019
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Abstract Phylogenetic regression is a type of generalised least squares (GLS) method that incorporates a modelled covariance matrix based on the evolutionary relationships between species (i.e. phylogenetic relationships). While this method has found widespread use in hypothesis testing via phylogenetic comparative methods, such as phylogenetic ANOVA, its ability to account for non‐linear relationships has received little attention.To address this, here we implement a phylogenetic Kernel Ridge Regression (phyloKRR) method that utilises GLS in a high‐dimensional feature space, employing linear combinations of phylogenetically weighted data to account for non‐linearity. We analysed two biological datasets using the Radial Basis Function and linear kernel function. The first dataset contained morphometric data, while the second dataset comprised discrete trait data and diversification rates as response variable. Hyperparameter tuning of the model was achieved through cross‐validation rounds in the training set.In the tested biological datasets, phyloKRR reduced the error rate (as measured by RMSE) by around 20% compared to linear‐based regression when data did not exhibit linear relationships. In simulated datasets, the error rate decreased almost exponentially with the level of non‐linearity.These results show that introducing kernels into phylogenetic regression analysis presents a novel and promising tool for complementing phylogenetic comparative methods. We have integrated this method into Python package named phyloKRR, which is freely available at:https://github.com/ulises‐rosas/phylokrr.more » « less
-
We present Neural Kernel Fields: a novel method for reconstructing implicit 3D shapes based on a learned kernel ridge regression. Our technique achieves state-of-the-art results when reconstructing 3D objects and large scenes from sparse oriented points, and can reconstruct shape categories outside the training set with almost no drop in accuracy. The core insight of our approach is that kernel methods are extremely effective for reconstructing shapes when the chosen kernel has an appropriate inductive bias. We thus factor the problem of shape reconstruction into two parts: (1) a backbone neural network which learns kernel parameters from data, and (2) a kernel ridge regression that fits the input points on-the-fly by solving a simple positive definite linear system using the learned kernel. As a result of this factorization, our reconstruction gains the benefits of datadriven methods under sparse point density while maintaining interpolatory behavior, which converges to the ground truth shape as input sampling density increases. Our experiments demonstrate a strong generalization capability to objects outside the train-set category and scanned scenes. Source code and pretrained models are available at https:// nv-tlabs.github.io/nkf.more » « less
-
Kernel methods provide an elegant framework for developing nonlinear learning algorithms from simple linear methods. Though these methods have superior empirical performance in several real data applications, their usefulness is inhibited by the significant computational burden incurred in large sample situations. Various approximation schemes have been proposed in the literature to alleviate these computational issues, and the approximate kernel machines are shown to retain the empirical performance. However, the theoretical properties of these approximate kernel machines are less well understood. In this work, we theoretically study the trade-off between computational complexity and statistical accuracy in Nystrom approximate kernel principal component analysis (KPCA), wherein we show that the Nystrom approximate KPCA matches the statistical performance of (non-approximate) KPCA while remaining computationally beneficial. Additionally, we show that Nystrom approximate KPCA outperforms the statistical behavior of another popular approximation scheme, the random feature approximation, when applied to KPCA.more » « less
-
The US Drought Monitor (USDM) is a hallmark in real time drought monitoring and assessment as it was developed by multiple agencies to provide an accurate and timely assessment of drought conditions in the US on a weekly basis. The map is built based on multiple physical indicators as well as reported observations from local contributors before human analysts combine the information and produce the drought map using their best judgement. Since human subjectivity is included in the production of the USDM maps, it is not an entirely clear quantitative procedure for other entities to reproduce the maps. In this study, we developed a framework to automatically generate the maps through a machine learning approach by predicting the drought categories across the domain of study. A persistence model served as the baseline model for comparison in the framework. Three machine learning algorithms, logistic regression, random forests, and support vector machines, with four different groups of input data, which formed an overall of 12 different configurations, were used for the prediction of drought categories. Finally, all the configurations were evaluated against the baseline model to select the best performing option. The results showed that our proposed framework could reproduce the drought maps to a near-perfect level with the support vector machines algorithm and the group 4 data. The rest of the findings of this study can be highlighted as: 1) employing the past week drought data as a predictor in the models played an important role in achieving high prediction scores, 2) the nonlinear models, random forest, and support vector machines had a better overall performance compared to the logistic regression models, and 3) with borrowing the neighboring grid cells information, we could compensate the lack of training data in the grid cells with insufficient historical USDM data particularly for extreme and exceptional drought conditions.more » « less
An official website of the United States government

