NSF PAR Search | NSF Public Access Repository


Approximation Scheme for Weighted Metric Clustering via Sherali-Adams

https://doi.org/10.1609/aaai.v38i8.28629

Avdiukhin, Dmitrii; Chatziafratis, Vaggos; Makarychev, Konstantin; Yaroslavtsev, Grigory (March 2024, Proceedings of the AAAI Conference on Artificial Intelligence)

Motivated by applications to classification problems on metric data, we study the Weighted Metric Clustering problem: given a metric d over n points and a k x k symmetric matrix A with non-negative entries, the goal is to find a k-partition of these points into clusters C1,...,Ck that minimizes the sum of A[i,j] * d(u,v) over all pairs of clusters Ci and Cj and all pairs of points u from Ci and v from Cj. Specific choices of A lead to Weighted Metric Clustering capturing well-studied graph partitioning problems in metric spaces, such as Min-Uncut, Min-k-Sum, Min-k-Cut, and more. Our main result is that Weighted Metric Clustering admits a polynomial-time approximation scheme (PTAS). Our algorithm handles all the above problems using the Sherali-Adams linear programming relaxation. This subsumes several prior works, unifies many of the techniques for various metric clustering objectives, and yields a PTAS for several new problems, including metric clustering on manifolds and a new family of hierarchical clustering objectives. Our experiments on the hierarchical clustering objective show that it captures the ground-truth structural information better than Dasgupta's popular objective.
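To make the objective in the abstract above concrete, here is a minimal Python sketch that evaluates the cost of a given partition. It is not the paper's PTAS or its Sherali-Adams relaxation. It only computes the quantity being minimized. The function name `weighted_clustering_cost`, the distance-matrix representation, and the choice to count each unordered pair of points once are assumptions made for illustration.

```python
def weighted_clustering_cost(d, A, labels):
    """Cost of a partition: sum of A[i][j] * d[u][v] over point pairs.

    d      -- n x n symmetric distance matrix (list of lists)
    A      -- k x k symmetric non-negative weight matrix
    labels -- labels[u] is the cluster index (0..k-1) of point u

    Assumption: each unordered pair {u, v} with u != v is counted once.
    """
    n = len(labels)
    cost = 0.0
    for u in range(n):
        for v in range(u + 1, n):
            cost += A[labels[u]][labels[v]] * d[u][v]
    return cost


# Hypothetical toy instance: four points on a line at 0, 1, 10, 11.
positions = [0, 1, 10, 11]
d = [[abs(p - q) for q in positions] for p in positions]

# Identity weights charge only pairs inside the same cluster.
intra = [[1, 0], [0, 1]]
# Off-diagonal weights charge only pairs in different clusters.
cross = [[0, 1], [1, 0]]

good = [0, 0, 1, 1]  # groups nearby points together
bad = [0, 1, 0, 1]   # splits nearby points apart

print(weighted_clustering_cost(d, intra, good))
print(weighted_clustering_cost(d, intra, bad))
print(weighted_clustering_cost(d, cross, good))
```

Changing A changes which pairs are charged, which is how a single objective can express several partitioning problems.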
Full Text Available
Objective-Based Hierarchical Clustering of Deep Embedding Vectors

Naumov, Stanislav; Yaroslavtsev, Grigory; Avdiukhin, Dmitrii (January 2021, Thirty-Fifth AAAI Conference on Artificial Intelligence)

Full Text Available
Fast Fourier Sparsity Testing

https://doi.org/10.1137/1.9781611976014.10

Yaroslavtsev, Grigory; Zhou, Samson (April 2020, 3rd Symposium on Simplicity in Algorithms)

Full Text Available
"Bring Your Own Greedy"+Max: Near-Optimal 1/2-Approximations for Submodular Knapsack

Yaroslavtsev, Grigory; Zhou, Samson; Avdiukhin, Dmitrii (April 2020, The 23rd International Conference on Artificial Intelligence and Statistics)

Full Text Available
Approximate F_2-Sketching of Valuation Functions

https://doi.org/10.4230/LIPIcs.APPROX-RANDOM.2019.69

Yaroslavtsev, Grigory; Zhou, Samson (July 2019, Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019))

Full Text Available
Bisect and Conquer: Hierarchical Clustering via Max-Uncut Bisection

Ahmadian, Sara; Chatziafratis, Vaggos; Epasto, Alessandro; Lee, Euiwoong; Mahdian, Mohammad; Yaroslavtsev, Grigory (April 2020, The 23rd International Conference on Artificial Intelligence and Statistics)

Full Text Available
Optimality of Linear Sketching Under Modular Updates

Hosseini, Kaave; Lovett, Shachar; Yaroslavtsev, Grigory (July 2019, 34th Computational Complexity Conference (CCC 2019))

Full Text Available
Hierarchical clustering for Euclidean data

Charikar, Moses; Chatziafratis, Vaggos; Niazadeh, Rad; Yaroslavtsev, Grigory (April 2019, The 22nd International Conference on Artificial Intelligence and Statistics)

Full Text Available
Massively Parallel Algorithms and Hardness for Single-Linkage Clustering under ℓp-Distances

Yaroslavtsev, Grigory; Vadapalli, Adithya (July 2018, 35th International Conference on Machine Learning (ICML'18))

We present the first massively parallel computation (MPC) algorithms and hardness of approximation results for computing Single-Linkage Clustering of $n$ input $d$-dimensional vectors under Hamming, $\ell_1$, $\ell_2$ and $\ell_\infty$ distances. All our algorithms run in $O(\log n)$ rounds of MPC for any fixed $d$ and achieve a $(1+\epsilon)$-approximation for all distances (except Hamming, for which we show an exact algorithm). We also show constant-factor inapproximability results for $o(\log n)$-round algorithms under standard MPC hardness assumptions (for sufficiently large dimension depending on the distance used). Experiments on the largest available vector datasets from the UCI Machine Learning Repository demonstrate the efficiency of our Apache Spark implementation, with speedups of several orders of magnitude.
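For readers unfamiliar with the problem in the abstract above, the following Python sketch shows a standard sequential way to compute a single-linkage clustering: process pairwise distances in increasing order and merge components with a union-find structure (Kruskal's minimum spanning tree algorithm, stopped once k clusters remain). This is not the MPC algorithm from the paper and makes no claim about its round complexity. The names `single_linkage` and `l1` and the sample points are hypothetical.

```python
def l1(p, q):
    """Manhattan (l1) distance between two equal-length tuples."""
    return sum(abs(a - b) for a, b in zip(p, q))


def single_linkage(points, k, dist=l1):
    """Return a cluster representative per point, with k clusters.

    Sequential O(n^2 log n) sketch: sort all pairwise distances and
    merge the closest components until only k components remain.
    """
    n = len(points)
    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    clusters = n
    for _, i, j in edges:
        if clusters <= k:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj  # merge the two closest components
            clusters -= 1
    return [find(i) for i in range(n)]


# Hypothetical toy input: two well-separated groups in the plane.
pts = [(0, 0), (0, 1), (5, 5), (5, 6)]
labels = single_linkage(pts, k=2)
print(labels[0] == labels[1], labels[2] == labels[3], labels[0] != labels[2])
```

The quadratic number of candidate pairs is what makes this sequential approach impractical for very large inputs.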
Full Text Available
Multi-dimensional balanced graph partitioning via projected gradient descent

https://doi.org/10.14778/3324301.3324307

Avdiukhin, Dmitrii; Pupyrev, Sergey; Yaroslavtsev, Grigory (April 2019, Proceedings of the VLDB Endowment)

Motivated by performance optimization of large-scale graph processing systems that distribute the graph across multiple machines, we consider the balanced graph partitioning problem. Unlike most previous work, we study the multi-dimensional variant, in which balance according to multiple weight functions is required. As we demonstrate by experimental evaluation, such multi-dimensional balance is essential for achieving performance improvements for typical distributed graph processing workloads. We propose a new scalable technique for the multi-dimensional balanced graph partitioning problem. It is based on applying randomized projected gradient descent to a non-convex continuous relaxation of the objective. We show how to implement the new algorithm efficiently in both theory and practice, utilizing various approaches for the projection step. Experiments with large-scale graphs containing up to hundreds of billions of edges indicate that our algorithm has superior performance compared to the state of the art.
Full Text Available
