Data augmentation is widely used for training a neural network given little labeled data. A common practice is to apply a composition of multiple transformations sequentially to the data. Existing augmentation methods such as RandAugment randomly sample from a list of pre-selected transformations, while methods such as AutoAugment apply advanced search to optimize over an augmentation set of size $k^d$, the number of transformation sequences of length $d$ given a list of $k$ transformations. In this paper, we design efficient search algorithms whose running time is provably much lower than the worst-case $O(k^d)$ complexity. We propose a new algorithm to search for a binary tree-structured composition of $k$ transformations, where each tree node corresponds to one transformation. The binary tree generalizes sequential augmentations, such as the SimCLR augmentation scheme for contrastive learning. Using a top-down, recursive search procedure, our algorithm achieves a runtime complexity of $O(2^d k)$, which is much faster than $O(k^d)$ once $k$ exceeds $2$. We apply our algorithm to handle data distributions with heterogeneous subpopulations by searching for one tree per subpopulation and then learning a weighted combination, resulting in a \emph{forest} of trees. We validate the proposed algorithms on numerous graph and image datasets, including a multi-label graph classification dataset we collected. The dataset exhibits significant variation in graph sizes and average degrees, making it well suited for studying data augmentation. We show that our approach reduces computation cost by 43% over existing search methods while improving performance by 4.3%. The tree structures can also be used to interpret the relative importance of each transformation, for example by identifying the transformations that matter most on small versus large graphs.
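As an illustration of the tree-structured composition described in the abstract above, here is a minimal Python sketch (not the authors' search algorithm): each node of a binary tree applies one transformation and the leaves yield the augmented views, with a two-branch root recovering a SimCLR-style two-view scheme as a special case. The transformation functions are hypothetical placeholders.

```python
import random

# Hypothetical placeholder transformations (e.g., crop, blur, node dropping).
def flip(x): return x[::-1]
def jitter(x): return [v + random.gauss(0, 0.01) for v in x]
def identity(x): return x

class TreeNode:
    """One node of a binary augmentation tree: apply `transform`, then recurse."""
    def __init__(self, transform, left=None, right=None):
        self.transform = transform
        self.left = left
        self.right = right

def augment(node, x):
    """Return all augmented views produced by the subtree rooted at `node`."""
    if node is None:
        return [x]
    y = node.transform(x)
    if node.left is None and node.right is None:
        return [y]
    return augment(node.left, y) + augment(node.right, y)

# A depth-2 tree whose root branches into two leaves, producing two views
# of one example, analogous to the SimCLR augmentation scheme.
tree = TreeNode(identity, left=TreeNode(flip), right=TreeNode(jitter))
views = augment(tree, [0.1, 0.2, 0.3])
```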
                    
                            
                            Distributionally Robust Losses for Latent Covariate Mixtures
                        
                    
    
While modern large-scale data sets often consist of heterogeneous subpopulations—for example, multiple demographic groups or multiple text corpora—the standard practice of minimizing average loss fails to guarantee uniformly low losses across all subpopulations. We propose a convex procedure that controls the worst-case performance over all subpopulations of a given size. Our procedure comes with finite-sample (nonparametric) convergence guarantees on the worst-off subpopulation. Empirically, we observe on lexical similarity, wine quality, and recidivism prediction tasks that our worst-case procedure learns models that do well against unseen subpopulations.
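To make the objective above concrete, here is a minimal, hypothetical sketch (not the paper's estimator or guarantees) of the finite-sample version of a worst-case subpopulation loss: over all subsets containing at least an α fraction of the examples, the worst-case average loss is simply the mean of the largest ⌈αn⌉ per-example losses, a CVaR-style quantity that remains convex in the model parameters whenever the per-example loss is convex.

```python
import numpy as np

def worst_case_subpop_loss(losses, alpha):
    """Average loss on the worst-off subset holding an `alpha` fraction of examples.

    On a finite sample this equals the mean of the top ceil(alpha * n) per-example
    losses (a CVaR-style upper bound on any latent subpopulation of mass >= alpha).
    """
    losses = np.sort(np.asarray(losses, dtype=float))[::-1]  # largest losses first
    k = max(1, int(np.ceil(alpha * len(losses))))             # subpopulation size
    return losses[:k].mean()

# Toy example: a small minority group with high loss is invisible to the
# average loss but dominates the worst-case subpopulation loss.
rng = np.random.default_rng(0)
losses = np.concatenate([rng.normal(0.2, 0.05, 900),   # majority group
                         rng.normal(1.0, 0.10, 100)])  # minority group
print(losses.mean())                          # ~0.28: looks fine on average
print(worst_case_subpop_loss(losses, 0.1))    # ~1.0: minority group exposed
```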
        
    
- Award ID(s): 2006777
- PAR ID: 10378817
- Date Published:
- Journal Name: Operations Research
- ISSN: 0030-364X
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Weighting methods are popular tools for estimating causal effects, and assessing their robustness under unobserved confounding is important in practice. Current approaches to sensitivity analyses rely on bounding a worst-case error from omitting a confounder. In this paper, we introduce a new sensitivity model, called the variance-based sensitivity model, which instead bounds the distributional differences that arise in the weights from omitting a confounder. The variance-based sensitivity model can be parameterized by an $R^2$ parameter that is both standardized and bounded. We demonstrate, both empirically and theoretically, that the variance-based sensitivity model provides improvements on the stability of the sensitivity analysis procedure over existing methods. We show that by moving away from worst-case bounds, we are able to obtain more interpretable and informative bounds. We illustrate our proposed approach on a study examining blood mercury levels using the National Health and Nutrition Examination Survey.
- Maximal Independent Set (MIS) is one of the fundamental problems in distributed computing. The round (time) complexity of distributed MIS has traditionally focused on the worst-case time for all nodes to finish. The best-known (randomized) MIS algorithms take O(log n) worst-case rounds on general graphs (where n is the number of nodes). Breaking the O(log n) worst-case bound has been a longstanding open problem, while currently the best-known lower bound is [EQUATION] rounds. Motivated by the goal of reducing total energy consumption in energy-constrained networks such as sensor and ad hoc wireless networks, we take an alternative approach to measuring performance. We focus on minimizing the total (or equivalently, the average) time for all nodes to finish. It is not clear whether the currently best-known algorithms yield constant-round (or even o(log n)) node-averaged round complexity for MIS in general graphs. We posit the sleeping model, a generalization of the traditional model that allows nodes to enter either a "sleep" or "waking" state at any round. While the waking state corresponds to the default state in the traditional model, in the sleeping state a node is "offline", i.e., it does not send or receive messages (messages sent to it are dropped as well) and does not incur any time, communication, or local computation cost. Hence, in this model, only rounds in which a node is awake are counted, and we are interested in minimizing the average as well as the worst-case number of rounds a node spends in the awake state, besides the traditional worst-case round complexity (i.e., the rounds for all nodes to finish, including both awake and sleeping rounds). Our main result shows that MIS can be solved in (expected) O(1) rounds under the node-averaged awake complexity measure in the sleeping model. In particular, we present a randomized distributed algorithm for MIS that has expected O(1)-round node-averaged awake complexity and, with high probability, has O(log n)-round worst-case awake complexity and O(log^3.41 n)-round worst-case complexity. Our work is a step towards understanding the node-averaged complexity of MIS in both the traditional and sleeping models, as well as designing energy-efficient distributed algorithms for energy-constrained networks.
- Recent work has shown that Tor is vulnerable to attacks that manipulate inter-domain routing to compromise user privacy. Proposed solutions such as Counter-RAPTOR [29] attempt to ameliorate this issue by favoring Tor entry relays that have high resilience to these attacks. However, because these defenses bias Tor path selection on the identity of the client, they invariably leak probabilistic information about client identities. In this work, we make the following contributions. First, we identify a novel means to quantify privacy leakage in guard selection algorithms using the metric of Max-Divergence. Max-Divergence ensures that probabilistic privacy loss is within strict bounds while also providing composability over time. Second, we utilize Max-Divergence and multiple notions of entropy to understand worst-case privacy loss in Counter-RAPTOR. Our worst-case analysis provides a fresh perspective to the field, as prior work such as Counter-RAPTOR only analyzed average-case privacy loss. Third, we propose modifications to Counter-RAPTOR that incorporate worst-case Max-Divergence in its design. Specifically, we utilize the exponential mechanism (a mechanism for differential privacy) to guarantee a worst-case bound on Max-Divergence/privacy loss (a minimal sketch of the exponential mechanism appears after this list). For the quality function used in the exponential mechanism, we show that a Monte-Carlo sampling-based method for stochastic optimization can be used to improve multi-dimensional trade-offs between security, privacy, and performance. Finally, we demonstrate that compared to Counter-RAPTOR, our approach achieves an 83% decrease in Max-Divergence after one guard selection and a 245% increase in worst-case Shannon entropy after 5 guard selections. Notably, experimental evaluations using the Shadow emulator show that our approach provides these privacy benefits with minimal impact on system performance.
- Federated Survival Analysis (FSA) is an emerging Federated Learning (FL) paradigm that enables training survival models on decentralized data while preserving privacy. However, existing FSA approaches largely overlook the potential risk of bias in predictions arising from demographic and censoring disparities across clients' datasets, which impacts the fairness and performance of federated survival models, especially for underrepresented groups. To address this gap, we introduce FairFSA, a novel FSA framework that adapts existing fair survival models to the federated setting. FairFSA jointly trains survival models using distributionally robust optimization, penalizing worst-case errors across subpopulations that exceed a specified probability threshold. Partially observed survival outcomes in clients are reconstructed with federated pseudo values (FPV) before model training to address censoring. Furthermore, we design a weight aggregation strategy by enhancing the FedAvg algorithm with a fairness-aware concordance-index-based aggregation method to foster equitable performance distribution across clients. To the best of our knowledge, this is the first work to study and integrate fairness into Federated Survival Analysis. Comprehensive experiments on distributed non-IID datasets demonstrate FairFSA's superiority in fairness and accuracy over state-of-the-art FSA methods, establishing it as a robust FSA approach capable of handling censoring while providing equitable and accurate survival predictions for all subjects.
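For the Counter-RAPTOR item above, here is the promised minimal sketch of the standard exponential mechanism. The relay "quality" scores are hypothetical placeholders for the paper's actual quality function; the sketch only illustrates the generic mechanism, in which sampling with probability proportional to exp(ε·q / (2Δ)) bounds the Max-Divergence between selection distributions whose quality scores differ by at most the sensitivity Δ.

```python
import numpy as np

def exponential_mechanism(qualities, epsilon, sensitivity=1.0, rng=None):
    """Sample an index with probability proportional to exp(eps * q / (2 * sensitivity)).

    Standard exponential mechanism: if each candidate's quality changes by at most
    `sensitivity` between two inputs (here, hypothetically, two clients), the
    Max-Divergence between the resulting selection distributions is at most `epsilon`.
    """
    rng = rng or np.random.default_rng()
    logits = epsilon * np.asarray(qualities, dtype=float) / (2.0 * sensitivity)
    logits -= logits.max()                 # shift for numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

# Hypothetical resilience scores for four candidate entry relays.
scores = [0.9, 0.7, 0.4, 0.2]
guard = exponential_mechanism(scores, epsilon=1.0)
```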