skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.

Attention:

The NSF Public Access Repository (PAR) system and access will be unavailable from 10:00 PM ET on Friday, February 6 until 10:00 AM ET on Saturday, February 7 due to maintenance. We apologize for the inconvenience.


Title: Learning Tree-Structured Composition of Data Augmentation
Data augmentation is widely used for training a neural network given little labeled data. A common practice of augmentation training is applying a composition of multiple transformations sequentially to the data. Existing augmentation methods such as RandAugment randomly sample from a list of pre-selected transformations, while methods such as AutoAugment apply advanced search to optimize over an augmentation set of size $k^d$, which is the number of transformation sequences of length $$d$$, given a list of $$k$$ transformations. In this paper, we design efficient algorithms whose running time complexity is much faster than the worst-case complexity of $O(k^d)$, provably. We propose a new algorithm to search for a binary tree-structured composition of $$k$$ transformations, where each tree node corresponds to one transformation. The binary tree generalizes sequential augmentations, such as the SimCLR augmentation scheme for contrastive learning. Using a top-down, recursive search procedure, our algorithm achieves a runtime complexity of $O(2^d k)$, which is much faster than $O(k^d)$ as $$k$$ increases above $$2$$. We apply our algorithm to tackle data distributions with heterogeneous subpopulations by searching for one tree in each subpopulation and then learning a weighted combination, resulting in a \emph{forest} of trees. We validate our proposed algorithms on numerous graph and image datasets, including a multi-label graph classification dataset we collected. The dataset exhibits significant variations in the sizes of graphs and their average degrees, making it ideal for studying data augmentation. We show that our approach can reduce the computation cost by 43% over existing search methods while improving performance by 4.3%. The tree structures can be used to interpret the relative importance of each transformation, such as identifying the important transformations on small vs. large graphs.  more » « less
Award ID(s):
2412008
PAR ID:
10593552
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Journal of Machine Learning Research Inc., United States
Date Published:
Journal Name:
Transactions on Machine Learning Research
ISSN:
2835-8856
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    This paper presents a policy-driven sequential image augmentation approach for image-related tasks. Our approach applies a sequence of image transformations (e.g., translation, rotation) over a training image, one transformation at a time, with the augmented image from the previous time step treated as the input for the next transformation. This sequential data augmentation substantially improves sample diversity, leading to improved test performance, especially for data-hungry models (e.g., deep neural networks). However, the search for the optimal transformation of each image at each time step of the sequence has high complexity due to its combination nature. To address this challenge, we formulate the search task as a sequential decision process and introduce a deep policy network that learns to produce transformations based on image content. We also develop an iterative algorithm to jointly train a classifier and the policy network in the reinforcement learning setting. The immediate reward of a potential transformation is defined to encourage transformations producing hard samples for the current classifier. At each iteration, we employ the policy network to augment the training dataset, train a classifier with the augmented data, and train the policy net with the aid of the classifier. We apply the above approach to both public image classification benchmarks and a newly collected image dataset for material recognition. Comparisons to alternative augmentation approaches show that our policy-driven approach achieves comparable or improved classification performance while using significantly fewer augmented images. The code is available at https://github.com/Paul-LiPu/rl_autoaug. 
    more » « less
  2. Bringmann, Karl; Grohe, Martin; Puppis, Gabriele; Svensson, Ola (Ed.)
    In the k-Disjoint Shortest Paths (k-DSP) problem, we are given a graph G (with positive edge weights) on n nodes and m edges with specified source vertices s_1, … , s_k, and target vertices t_1, … , t_k, and are tasked with determining if G contains vertex-disjoint (s_i,t_i)-shortest paths. For any constant k, it is known that k-DSP can be solved in polynomial time over undirected graphs and directed acyclic graphs (DAGs). However, the exact time complexity of k-DSP remains mysterious, with large gaps between the fastest known algorithms and best conditional lower bounds. In this paper, we obtain faster algorithms for important cases of k-DSP, and present better conditional lower bounds for k-DSP and its variants. Previous work solved 2-DSP over weighted undirected graphs in O(n⁷) time, and weighted DAGs in O(mn) time. For the main result of this paper, we present optimal linear time algorithms for solving 2-DSP on weighted undirected graphs and DAGs. Our linear time algorithms are algebraic however, and so only solve the detection rather than search version of 2-DSP (we show how to solve the search version in O(mn) time, which is faster than the previous best runtime in weighted undirected graphs, but only matches the previous best runtime for DAGs). We also obtain a faster algorithm for k-Edge Disjoint Shortest Paths (k-EDSP) in DAGs, the variant of k-DSP where one seeks edge-disjoint instead of vertex-disjoint paths between sources and their corresponding targets. Algorithms for k-EDSP on DAGs from previous work take Ω(m^k) time. We show that k-EDSP can be solved over DAGs in O(mn^{k-1}) time, matching the fastest known runtime for solving k-DSP over DAGs. Previous work established conditional lower bounds for solving k-DSP and its variants via reductions from detecting cliques in graphs. Prior work implied that k-Clique can be reduced to 2k-DSP in DAGs and undirected graphs with O((kn)²) nodes. We improve this reduction, by showing how to reduce from k-Clique to k-DSP in DAGs and undirected graphs with O((kn)²) nodes (halving the number of paths needed in the reduced instance). A variant of k-DSP is the k-Disjoint Paths (k-DP) problem, where the solution paths no longer need to be shortest paths. Previous work reduced from k-Clique to p-DP in DAGs with O(kn) nodes, for p = k + k(k-1)/2. We improve this by showing a reduction from k-Clique to p-DP, for p = k + ⌊k²/4⌋. Under the k-Clique Hypothesis from fine-grained complexity, our results establish better conditional lower bounds for k-DSP for all k ≥ 4, and better conditional lower bounds for p-DP for all p ≤ 4031. Notably, our work gives the first nontrivial conditional lower bounds 4-DP in DAGs and 4-DSP in undirected graphs and DAGs. Before our work, nontrivial conditional lower bounds were only known for k-DP and k-DSP on such graphs when k ≥ 6. 
    more » « less
  3. The K-Truss of a graph is a cohesive subgraph that has been widely used for community detection in applications such as social networks and security analysis. In this paper, we first propose one optimized triangle search kernel with a few operations that can be used in both triangle counting and triangle search to replace the existing list intersection method. Based on the optimized kernel, three truss analytics algorithms, an optimized K-Truss parallel algorithm, a maximal K-Truss parallel algorithm, and a Truss decomposition parallel algorithm, are developed to efficiently enable different kinds of graph analysis. Moreover, all proposed parallel algorithms have been implemented in the highly-productive parallel language Chapel and integrated into the open-source framework Arkouda. Experimental results compared with the existing list intersection-based method show that for both synthetic and real-world graphs, the proposed method can significantly improve the performance of truss analysis on large graphs. The implemented method is publicly available from GitHub. 
    more » « less
  4. Bun, Mark (Ed.)
    We introduce and study the problem of balanced districting, where given an undirected graph with vertices carrying two types of weights (different population, resource types, etc) the goal is to maximize the total weights covered in vertex disjoint districts such that each district is a star or (in general) a connected induced subgraph with the two weights to be balanced. This problem is strongly motivated by political redistricting, where contiguity, population balance, and compactness are essential. We provide hardness and approximation algorithms for this problem. In particular, we show NP-hardness for an approximation better than n^{1/2-δ} for any constant δ > 0 in general graphs even when the districts are star graphs, as well as NP-hardness on complete graphs, tree graphs, planar graphs and other restricted settings. On the other hand, we develop an algorithm for balanced star districting that gives an O(√n)-approximation on any graph (which is basically tight considering matching hardness of approximation results), an O(log n) approximation on planar graphs with extensions to minor-free graphs. Our algorithm uses a modified Whack-a-Mole algorithm [Bhattacharya, Kiss, and Saranurak, SODA 2023] to find a sparse solution of a fractional packing linear program (despite exponentially many variables) which requires a new design of a separation oracle specific for our balanced districting problem. To turn the fractional solution to a feasible integer solution, we adopt the randomized rounding algorithm by [Chan and Har-Peled, SoCG 2009]. To get a good approximation ratio of the rounding procedure, a crucial element in the analysis is the balanced scattering separators for planar graphs and minor-free graphs - separators that can be partitioned into a small number of k-hop independent sets for some constant k - which may find independent interest in solving other packing style problems. Further, our algorithm is versatile - the very same algorithm can be analyzed in different ways on various graph classes, which leads to class-dependent approximation ratios. We also provide a FPTAS algorithm for complete graphs and tree graphs, as well as greedy algorithms and approximation ratios when the district cardinality is bounded, the graph has bounded degree or the weights are binary. We refer the readers to the full version of the paper for complete set of results and proofs. 
    more » « less
  5. Gortz, Inge Li; Farach-Colton, Martin; Puglisi, Simon J.; Herman, Grzegorz (Ed.)
    Computing the diameter of a graph, i.e. the largest distance, is a fundamental problem that is central in fine-grained complexity. In undirected graphs, the Strong Exponential Time Hypothesis (SETH) yields a lower bound on the time vs. approximation trade-off that is quite close to the upper bounds. In directed graphs, however, where only some of the upper bounds apply, much larger gaps remain. Since d(u,v) may not be the same as d(v,u), there are multiple ways to define the problem, the two most natural being the (one-way) diameter (max_(u,v) d(u,v)) and the roundtrip diameter (max_{u,v} d(u,v)+d(v,u)). In this paper we make progress on the outstanding open question for each of them. - We design the first algorithm for diameter in sparse directed graphs to achieve n^{1.5-ε} time with an approximation factor better than 2. The new upper bound trade-off makes the directed case appear more similar to the undirected case. Notably, this is the first algorithm for diameter in sparse graphs that benefits from fast matrix multiplication. - We design new hardness reductions separating roundtrip diameter from directed and undirected diameter. In particular, a 1.5-approximation in subquadratic time would refute the All-Nodes k-Cycle hypothesis, and any (2-ε)-approximation would imply a breakthrough algorithm for approximate 𝓁_∞-Closest-Pair. Notably, these are the first conditional lower bounds for diameter that are not based on SETH. 
    more » « less