Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to nonfederal websites. Their policies may differ from this site.

Noisy labels can significantly affect the performance of deep neural networks (DNNs). In medical image segmentation tasks, annotations are errorprone due to the high demand in annotation time and in the annotators' expertise. Existing methods mostly tackle label noise in classification tasks. Their independentnoise assumptions do not fit label noise in segmentation task. In this paper, we propose a novel noise model for segmentation problems that encodes spatial correlation and bias, which are prominent in segmentation annotations. Further, to mitigate such label noise, we propose a label correction method to recover true label progressively. We provide theoretical guarantees of the correctness of the proposed method. Experiments show that our approach outperforms current stateoftheart methods on both synthetic and realworld noisy annotations.more » « less

Noisy labels can significantly affect the performance of deep neural networks (DNNs). In medical image segmentation tasks, annotations are errorprone due to the high demand in annotation time and in the annotators' expertise. Existing methods mostly tackle label noise in classification tasks. Their independentnoise assumptions do not fit label noise in segmentation task. In this paper, we propose a novel noise model for segmentation problems that encodes spatial correlation and bias, which are prominent in segmentation annotations. Further, to mitigate such label noise, we propose a label correction method to recover true label progressively. We provide theoretical guarantees of the correctness of the proposed method. Experiments show that our approach outperforms current stateoftheart methods on both synthetic and realworld noisy annotations.more » « less

The adversarial risk of a machine learning model has been widely studied. Most previous works assume that the data lies in the whole ambient space. We propose to take a new angle and take the manifold assumption into consideration. Assuming data lies in a manifold, we investigate two new types of adversarial risk, the normal adversarial risk due to perturbation along normal direction, and the inmanifold adversarial risk due to perturbation within the manifold. We prove that the classic adversarial risk can be bounded from both sides using the normal and inmanifold adversarial risks. We also show with a surprisingly pessimistic case that the standard adversarial risk can be nonzero even when both normal and inmanifold risks are zero. We finalize the paper with empirical studies supporting our theoretical results. Our results suggest the possibility of improving the robustness of a classifier by only focusing on the normal adversarial risk.more » « less

There has been a longstanding interest in computing diverse solutions to optimization problems. In 1995 J. Krarup [28] posed the problem of finding kedge disjoint Hamiltonian Circuits of minimum total weight, called the peripatetic salesman problem (PSP). Since then researchers have investigated the complexity of finding diverse solutions to spanning trees, paths, vertex covers, matchings, and more. Unlike the PSP that has a constraint on the total weight of the solutions, recent work has involved finding diverse solutions that are all optimal. However, sometimes the space of exact solutions may be too small to achieve sufficient diversity. Motivated by this, we initiate the study of obtaining sufficientlydiverse, yet approximatelyoptimal solutions to optimization problems. Formally, given an integer k, an approximation factor c, and an instance I of an optimization problem, we aim to obtain a set of k solutions to I that a) are all c approximatelyoptimal for I and b) maximize the diversity of the k solutions. Finding such solutions, therefore, requires a better understanding of the global landscape of the optimization function. Given a metric on the space of solutions, and the diversity measure as the sum of pairwise distances between solutions, we first provide a general reduction to an associated budgetconstrained optimization (BCO) problem, where one objective function is to optimized subject to a bound on the second objective function. We then prove that biapproximations to the BCO can be used to give biapproximations to the diverse approximately optimal solutions problem. As applications of our result, we present polynomial time approximation algorithms for several problems such as diverse capproximate maximum matchings, shortest paths, global mincut, and minimum weight bases of a matroid. The last result gives us diverse capproximate minimum spanning trees, advancing a step towards achieving diverse capproximate TSP tours. We also explore the connection to the field of multiobjective optimization and show that the class of problems to which our result applies includes those for which the associated DUALRESTRICT problem defined by Papadimitriou and Yannakakis [35], and recently explored by Herzel et al. [26] can be solved in polynomial timore » « less

There has been a longstanding interest in computing diverse solutions to optimization problems. In 1995 J. Krarup [28] posed the problem of finding kedge disjoint Hamiltonian Circuits of minimum total weight, called the peripatetic salesman problem (PSP). Since then researchers have investigated the complexity of finding diverse solutions to spanning trees, paths, vertex covers, matchings, and more. Unlike the PSP that has a constraint on the total weight of the solutions, recent work has involved finding diverse solutions that are all optimal. However, sometimes the space of exact solutions may be too small to achieve sufficient diversity. Motivated by this, we initiate the study of obtaining sufficientlydiverse, yet approximatelyoptimal solutions to optimization problems. Formally, given an integer k, an approximation factor c, and an instance I of an optimization problem, we aim to obtain a set of k solutions to I that a) are all c approximatelyoptimal for I and b) maximize the diversity of the k solutions. Finding such solutions, therefore, requires a better understanding of the global landscape of the optimization function. Given a metric on the space of solutions, and the diversity measure as the sum of pairwise distances between solutions, we first provide a general reduction to an associated budgetconstrained optimization (BCO) problem, where one objective function is to optimized subject to a bound on the second objective function. We then prove that biapproximations to the BCO can be used to give biapproximations to the diverse approximately optimal solutions problem. As applications of our result, we present polynomial time approximation algorithms for several problems such as diverse capproximate maximum matchings, shortest paths, global mincut, and minimum weight bases of a matroid. The last result gives us diversecapproximate minimum spanning trees, advancing a step towards achieving diverse capproximate TSP tours. We also explore the connection to the field of multiobjective optimization and show that the class of problems to which our result applies includes those for which the associated DUALRESTRICT problem defined by Papadimitriou and Yannakakis [35], and recently explored by Herzel et al. [26] can be solved in polynomial time.more » « less

Stochastic Gradient Descent (SGD) based methods have been widely used for training largescale machine learning models that also generalize well in practice. Several explanations have been offered for this generalization performance, a prominent one being algorithmic stability [18]. However, there are no known examples of smooth loss functions for which the analysis can be shown to be tight. Furthermore, apart from the properties of the loss function, data distribution has also been shown to be an important factor in generalization performance. This raises the question: is the stability analysis of [18] tight for smooth functions, and if not, for what kind of loss functions and data distributions can the stability analysis be improved? In this paper we first settle open questions regarding tightness of bounds in the dataindependent setting: we show that for general datasets, the existing analysis for convex and stronglyconvex loss functions is tight, but it can be improved for nonconvex loss functions. Next, we give a novel and improved datadependent bounds: we show stability upper bounds for a large class of convex regularized loss functions, with negligible regularization parameters, and improve existing datadependent bounds in the nonconvex setting. We hope that our results will initiate further efforts to better understand the datadependent setting under nonconvex loss functions, leading to an improved understanding of the generalization abilities of deep networks.more » « less

Stochastic Gradient Descent (SGD) based methods have been widely used for training largescale machine learning models that also generalize well in practice. Several explanations have been offered for this generalization performance, a prominent one being algorithmic stability Hardt et al [2016]. However, there are no known examples of smooth loss functions for which the analysis can be shown to be tight. Furthermore, apart from properties of the loss function, data distribution has also been shown to be an important factor in generalization performance. This raises the question: is the stability analysis of Hardt et al [2016] tight for smooth functions, and if not, for what kind of loss functions and data distributions can the stability analysis be improved? In this paper we first settle open questions regarding tightness of bounds in the dataindependent setting: we show that for general datasets, the existing analysis for convex and stronglyconvex loss functions is tight, but it can be improved for nonconvex loss functions. Next, we give novel and improved datadependent bounds: we show stability upper bounds for a large class of convex regularized loss functions, with negligible regularization parameters, and improve existing datadependent bounds in the nonconvex setting. We hope that our results will initiate further efforts to better understand the datadependent setting under nonconvex loss functions, leading to an improved understanding of the generalization abilities of deep networks.more » « less

The adversarial risk of a machine learning model has been widely studied. Most previous works assume that the data lies in the whole ambient space. We propose to take a new angle and take the manifold assumption into consideration. Assuming data lies in a manifold, we investigate two new types of adversarial risk, the normal adversarial risk due to perturbation along normal direction, and the inmanifold adversarial risk due to perturbation within the manifold. We prove that the classic adversarial risk can be bounded from both sides using the normal and inmanifold adversarial risks. We also show with a surprisingly pessimistic case that the standard adversarial risk can be nonzero even when both normal and inmanifold risks are zero. We finalize the paper with empirical studies supporting our theoretical results. Our results suggest the possibility of improving the robustness of a classifier by only focusing on the normal adversarial risk.more » « less