 NSF-PAR ID:
 10402273
 Date Published:
 Journal Name:
 2022 IEEE International Conference on Data Mining (ICDM)
 Page Range / eLocation ID:
 1047 to 1052
 Format(s):
 Medium: X
 Sponsoring Org:
 National Science Foundation
More Like this

The 2-Wasserstein distance (or RMS distance) is a useful measure of similarity between probability distributions with exciting applications in machine learning. For discrete distributions, the problem of computing this distance can be expressed in terms of finding a minimum-cost perfect matching on a complete bipartite graph given by two multisets of points A, B ⊂ ℝ², with |A| = |B| = n, where the ground distance between any two points is the squared Euclidean distance between them. Although there is a near-linear time relative ε-approximation algorithm for the case where the ground distance is Euclidean (Sharathkumar and Agarwal, JACM 2020), all existing relative ε-approximation algorithms for the RMS distance take Ω(n^{3/2}) time. This is primarily because, unlike Euclidean distance, squared Euclidean distance is not a metric. In this paper, for the RMS distance, we present a new ε-approximation algorithm that runs in O(n^{5/4} poly{log n, 1/ε}) time. Our algorithm is inspired by a recent approach for finding a minimum-cost perfect matching in bipartite planar graphs (Asathulla et al., TALG 2020). Their algorithm depends heavily on the existence of sublinear-sized vertex separators as well as shortest-path data structures that require planarity. Surprisingly, we are able to design a similar algorithm for a complete geometric graph that is far from planar and does not have any vertex separators. Central components of our algorithm include a quadtree-based distance that approximates the squared Euclidean distance and a data structure that supports both Hungarian search and augmentation in sublinear time.
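To make the matching formulation above concrete, here is a minimal Python sketch that computes the RMS distance exactly by brute force over all perfect matchings. It is not the approximation algorithm described in the abstract: it takes factorial time and is only meant for tiny inputs. The function names `squared_euclidean` and `rms_distance` are hypothetical, and the sketch assumes each point carries uniform mass 1/n, so the distance is the square root of the average matched cost.

```python
from itertools import permutations
from math import sqrt


def squared_euclidean(p, q):
    """Ground distance: squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def rms_distance(A, B):
    """Exact RMS (2-Wasserstein) distance between two equal-size point multisets.

    Assumption: every point has mass 1/n. Brute force over all n! perfect
    matchings, so this is an illustration only, not an efficient algorithm.
    """
    n = len(A)
    if n == 0 or n != len(B):
        raise ValueError("A and B must be non-empty and of equal size")
    # Minimum total cost over all perfect matchings A[i] -> B[perm[i]].
    best_cost = min(
        sum(squared_euclidean(A[i], B[j]) for i, j in enumerate(perm))
        for perm in permutations(range(n))
    )
    return sqrt(best_cost / n)
```

For example, `rms_distance([(0, 0), (1, 0)], [(1, 0), (0, 0)])` matches each point to its copy in the other set, so the total cost is zero.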

Metric learning, which aims at learning a distance metric for a given data set, plays an important role in measuring the distance or similarity between data objects. Due to its broad usefulness, it has attracted a lot of interest in machine learning and related areas in the past few decades. This paper proposes to learn the distance metric from side information in the form of must-links and cannot-links. Given the pairwise constraints, our goal is to learn a Mahalanobis distance that minimizes the ratio of the distances of the data pairs in the must-links to those in the cannot-links. Unlike many existing papers that use the traditional squared L2-norm distance, we develop a robust model that is less sensitive to data noise and outliers by using the not-squared L2-norm distance. In our objective, an orthonormal constraint is enforced to avoid degenerate solutions. To solve our objective, we have derived an efficient iterative solution algorithm. We have conducted extensive experiments, which demonstrated the superiority of our method over state-of-the-art methods.
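As a rough illustration of the kind of objective this abstract describes, the Python sketch below evaluates a ratio of summed not-squared L2 distances over must-link pairs to the same sum over cannot-link pairs, after projecting the data with a matrix W. The exact form of the paper's objective is not given in the abstract, so this formula, the function names, and the representation of W as a list of rows are all assumptions. The sketch only evaluates the ratio; it neither enforces the orthonormal constraint nor implements the iterative solver.

```python
from math import sqrt


def project(W, x):
    """Return W^T x, where W is a d x k matrix stored as a list of d rows."""
    k = len(W[0])
    return [sum(W[r][c] * x[r] for r in range(len(x))) for c in range(k)]


def l2_norm(v):
    """Not-squared L2 norm of a vector."""
    return sqrt(sum(a * a for a in v))


def ratio_objective(W, X, must_links, cannot_links):
    """Assumed objective: sum of projected must-link distances divided by
    the sum of projected cannot-link distances (smaller is better)."""
    def total(pairs):
        return sum(
            l2_norm(project(W, [a - b for a, b in zip(X[i], X[j])]))
            for i, j in pairs
        )
    return total(must_links) / total(cannot_links)
```

A projection that drives this ratio down pulls must-link pairs together relative to cannot-link pairs; using the not-squared norm means a single distant pair contributes linearly rather than quadratically.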

We study the mean-standard deviation minimum cost flow (MSDMCF) problem, where the objective is minimizing a linear combination of the mean and standard deviation of flow costs. Due to the nonlinearity and nonseparability of the objective, the problem is not amenable to the standard algorithms developed for network flow problems. We prove that the solution for the MSDMCF problem coincides with the solution for a particular mean-variance minimum cost flow (MVMCF) problem. Leveraging this result, we propose three methods, bisection (BSC), Newton–Raphson (NR), and a hybrid (NRBSC), that seek the specific MVMCF problem whose optimal solution coincides with the optimal solution for the given MSDMCF problem. We further show that this approach can be extended to solve more general nonseparable parametric minimum cost flow problems under certain conditions. Computational experiments show that the NR algorithm is about twice as fast as the CPLEX solver on benchmark networks generated with NETGEN.

Low-rank matrix recovery is a fundamental problem in machine learning with numerous applications. In practice, the problem can be solved by convex optimization, namely nuclear norm minimization, or by non-convex optimization, as it is well known that for low-rank matrix problems like matrix sensing and matrix completion, all local optima of the natural non-convex objectives are also globally optimal under certain ideal assumptions. In this paper, we study new approaches for matrix sensing in a semi-random model where an adversary can add any number of arbitrary sensing matrices. More precisely, the problem is to recover a low-rank matrix $X^\star$ from linear measurements $b_i = \langle A_i, X^\star \rangle$, where an unknown subset of the sensing matrices satisfies the Restricted Isometry Property (RIP) and the rest of the $A_i$'s are chosen adversarially. It is known that in the semi-random model, existing non-convex objectives can have bad local optima. To fix this, we present a descent-style algorithm that provably recovers the ground-truth matrix $X^\star$. For the closely related problem of semi-random matrix completion, prior work [CG18] showed that all bad local optima can be eliminated by reweighting the input data. However, the analogous approach for matrix sensing requires reweighting a set of matrices to satisfy RIP, which is a condition that is NP-hard to check. Instead, we build on the framework proposed in [KLL$^+$23] for semi-random sparse linear regression, where the algorithm in each iteration reweights the input based on the current solution, and then takes a weighted gradient step that is guaranteed to work well locally. Our analysis crucially exploits the connection between sparsity in vector problems and low-rankness in matrix problems, which may have other applications in obtaining robust algorithms for sparse and low-rank problems.

We study offline reinforcement learning (RL) with heavy-tailed reward distribution and data corruption: (i) Moving beyond sub-Gaussian reward distribution, we allow the rewards to have infinite variances; (ii) We allow corruptions where an attacker can arbitrarily modify a small fraction of the rewards and transitions in the dataset. We first derive a sufficient optimality condition for generalized Pessimistic Value Iteration (PEVI), which allows various estimators with proper confidence bounds and can be applied to multiple learning settings. In order to handle the data corruption and heavy-tailed reward setting, we prove that the trimmed-mean estimation achieves the minimax optimal error rate for robust mean estimation under heavy-tailed distributions. In the PEVI algorithm, we plug in the trimmed-mean estimation and the confidence bound to solve the robust offline RL problem. Standard analysis reveals that data corruption induces a bias term in the suboptimality gap, which gives the false impression that any data corruption prevents optimal policy learning. By using the optimality condition for the generalized PEVI, we show that as long as the bias term is less than the "action gap", the policy returned by PEVI achieves the optimal value given sufficient data.
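The trimmed-mean idea mentioned above can be sketched in a few lines of Python. This is the textbook symmetric trimmed mean, shown only to illustrate why discarding extreme samples limits the influence of corrupted or heavy-tailed rewards; the estimator analyzed in the paper may differ in its details, and the name `trimmed_mean` and the parameter `trim_count` are hypothetical.

```python
def trimmed_mean(samples, trim_count):
    """Symmetric trimmed mean: drop the `trim_count` smallest and the
    `trim_count` largest samples, then average what remains.

    Assumption: the caller picks `trim_count` at least as large as the
    number of samples it expects to be corrupted or extreme.
    """
    n = len(samples)
    if trim_count < 0 or 2 * trim_count >= n:
        raise ValueError("trim_count must satisfy 0 <= 2 * trim_count < len(samples)")
    ordered = sorted(samples)
    kept = ordered[trim_count:n - trim_count]
    return sum(kept) / len(kept)
```

With rewards `[1, 2, 3, 4, 1000]` and `trim_count=1`, the outlier 1000 and the smallest sample 1 are dropped, so the estimate is the average of 2, 3, and 4 rather than the plain mean of 202.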