Deep neural networks are powerful tools to detect hidden patterns in data and leverage them to make predictions, but they are not designed to understand uncertainty and estimate reliable probabilities. In particular, they tend to be overconfident. We begin to address this problem in the context of multi-class classification by developing a novel training algorithm producing models with more dependable uncertainty estimates, without sacrificing predictive power. The idea is to mitigate overconfidence by minimizing a loss function, inspired by advances in conformal inference, that quantifies model uncertainty by carefully leveraging hold-out data. Experiments with synthetic and real data demonstrate this method can lead to smaller conformal prediction sets with higher conditional coverage, after exact calibration with hold-out data, compared to state-of-the-art alternatives. 
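For illustration, below is a minimal sketch of one standard way to quantify the overconfidence mentioned above: the expected calibration error (ECE) computed on hold-out data. This is a common diagnostic, not the loss function proposed in the paper; the function name, binning scheme, and array shapes are illustrative assumptions.

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=15):
    """Estimate the expected calibration error from hold-out predictions.

    probs:  (n, k) array of predicted class probabilities (softmax outputs).
    labels: (n,) array of true class indices.
    """
    conf = probs.max(axis=1)                              # top-class confidence
    correct = (probs.argmax(axis=1) == labels).astype(float)
    bin_edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (conf > lo) & (conf <= hi)
        if in_bin.any():
            # |average confidence - accuracy| in the bin, weighted by bin mass;
            # a well-calibrated model keeps these gaps close to zero.
            ece += in_bin.mean() * abs(conf[in_bin].mean() - correct[in_bin].mean())
    return ece
```

An overconfident network typically reports bin-wise confidence well above bin-wise accuracy, which inflates this quantity.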
Conformal Inference is (almost) Free for Neural Networks Trained with Early Stopping
Early stopping based on hold-out data is a popular regularization technique designed to mitigate overfitting and increase the predictive accuracy of neural networks. Models trained with early stopping often provide relatively accurate predictions, but they generally still lack precise statistical guarantees unless they are further calibrated using independent hold-out data. This paper addresses the above limitation with conformalized early stopping: a novel method that combines early stopping with conformal calibration while efficiently recycling the same hold-out data. This leads to models that are both accurate and able to provide exact predictive inferences without multiple data splits or overly conservative adjustments. Practical implementations are developed for different learning tasks (outlier detection, multi-class classification, and regression), and their competitive performance is demonstrated on real data.
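For context, here is a minimal sketch of the standard split-conformal calibration step referenced above, applied to multi-class classification with an independent hold-out set. The conformity score (one minus the softmax probability of the true label), the function name, and the quantile adjustment follow the usual split-conformal recipe and are assumptions here; this is not the paper's conformalized early stopping procedure, whose point is precisely to recycle the early-stopping hold-out data for this step.

```python
import numpy as np

def split_conformal_sets(cal_probs, cal_labels, test_probs, alpha=0.1):
    """Split-conformal prediction sets for multi-class classification.

    cal_probs:  (n, k) softmax probabilities on the hold-out calibration set.
    cal_labels: (n,) true class indices for the calibration set.
    test_probs: (m, k) softmax probabilities on test points.
    Returns a boolean (m, k) matrix whose rows mark the classes kept per test point.
    """
    n = len(cal_labels)
    # Conformity score: one minus the probability assigned to the true class.
    cal_scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample-adjusted quantile of the calibration scores.
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    q_hat = np.quantile(cal_scores, level, method="higher")
    # A class is included whenever its score does not exceed the threshold,
    # which yields marginal coverage of at least 1 - alpha for exchangeable data.
    return (1.0 - test_probs) <= q_hat
```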
- Award ID(s): 2210637
- PAR ID: 10504894
- Publisher / Repository: Proceedings of Machine Learning Research
- Date Published:
- Journal Name: Proceedings of the 40th International Conference on Machine Learning
- Format(s): Medium: X
- Location: Honolulu, Hawaii
- Sponsoring Org: National Science Foundation
More Like this
- Distinguishing two models is a fundamental and practically important statistical problem. Error rate control is crucial to the testing logic, but in complex nonparametric settings it can be difficult to achieve, especially when the stopping rule that determines the data collection process is not available. This paper proposes an e-process construction based on the predictive recursion algorithm originally designed to recursively fit nonparametric mixture models. The resulting predictive recursion e-process affords anytime-valid inference and is asymptotically efficient in the sense that its growth rate is first-order optimal relative to the predictive recursion's mixture model. (A toy sketch of anytime-valid testing with an e-process appears after this list.)
- We derive new algorithms for online multiple testing that provably control false discovery exceedance (FDX) while achieving orders of magnitude more power than previous methods. This statistical advance is enabled by the development of new algorithmic ideas: earlier algorithms are more “static,” while our new ones allow for the dynamical adjustment of testing levels based on the amount of wealth the algorithm has accumulated. We demonstrate that our algorithms achieve higher power in a variety of synthetic experiments. We also prove that SupLORD can provide error control for both FDR and FDX, and controls FDR at stopping times. Stopping times are particularly important as they permit the experimenter to end the experiment arbitrarily early while maintaining desired control of the FDR. SupLORD is the first non-trivial algorithm, to our knowledge, that can control FDR at stopping times in the online setting.
- The ability to accurately measure the recovery rate of infrastructure systems and communities impacted by disasters is vital to ensure effective response and resource allocation before, during, and after a disruption. However, a challenge in quantifying such measures resides in the lack of data, as community recovery information is seldom recorded. To provide accurate community recovery measures, a hierarchical Bayesian kernel model (HBKM) is developed to predict the recovery rate of communities experiencing power outages during storms. The performance of the proposed method is evaluated using cross-validation and compared with two models, the hierarchical Bayesian regression model and the Poisson generalized linear model. A case study focusing on the recovery of communities in Shelby County, Tennessee after severe storms between 2007 and 2017 is presented to illustrate the proposed approach. The predictive accuracy of the models is evaluated using the log-likelihood and root mean squared error. The HBKM yields on average the highest out-of-sample predictive accuracy. This approach can help assess the recoverability of a community when data are scarce and inform decision making in the aftermath of a disaster. An illustrative example is presented demonstrating how accurate measures of community resilience can help reduce the cost of infrastructure restoration.
- For predictive models to provide reliable guidance in decision-making processes, they are often required to be accurate and robust to distribution shifts. Shortcut learning, where a model relies on spurious correlations or shortcuts to predict the target label, undermines the robustness property, leading to models with poor out-of-distribution accuracy despite good in-distribution performance. Existing work on shortcut learning either assumes that the set of possible shortcuts is known a priori or assumes that it is discoverable using interpretability methods such as saliency maps, which might not always be true. Instead, we propose a two-step approach to (1) efficiently identify relevant shortcuts, and (2) leverage the identified shortcuts to build models that are robust to distribution shifts. Our approach relies on having access to a (possibly) high-dimensional set of auxiliary labels at training time, some of which correspond to possible shortcuts. We show both theoretically and empirically that our approach is able to identify a sufficient set of shortcuts, leading to more efficient predictors in finite samples.
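As a toy illustration of the anytime-valid testing idea behind e-processes (referenced in the first item of this list), the sketch below builds a simple likelihood-ratio e-process for testing whether a coin is fair against a fixed alternative. This is not the predictive recursion construction, and the parameter values are arbitrary; under the null the running product is a nonnegative martingale starting at one, so by Ville's inequality rejecting once it exceeds 1/alpha keeps the type-I error below alpha at any stopping time.

```python
import numpy as np

def likelihood_ratio_e_process(flips, p_null=0.5, p_alt=0.7):
    """Running likelihood-ratio e-process for H0: coin bias equals p_null.

    flips: array of 0/1 outcomes observed sequentially.
    Returns the e-process value after each observation.
    """
    flips = np.asarray(flips)
    ratios = np.where(flips == 1, p_alt / p_null, (1 - p_alt) / (1 - p_null))
    return np.cumprod(ratios)

# Example: data actually drawn from the alternative; reject H0 the first time
# the e-process crosses 1/alpha, no matter when we decide to stop collecting data.
rng = np.random.default_rng(0)
e_proc = likelihood_ratio_e_process(rng.binomial(1, 0.7, size=300))
alpha = 0.05
crossed = np.nonzero(e_proc >= 1 / alpha)[0]
stop_time = int(crossed[0]) + 1 if crossed.size else None  # None means never rejected
```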