Recently, contrastive learning has achieved strong results in self-supervised learning, where the main idea is to pull two augmentations of an image (a positive pair) closer together than other random images (negative pairs). We argue that not all negative images are equally negative. Hence, we introduce a self-supervised learning algorithm that uses a soft similarity for the negative images rather than a binary distinction between positive and negative pairs. We iteratively distill a slowly evolving teacher model into the student model by capturing the similarity of a query image to a set of random images and transferring that knowledge to the student. In particular, our method should handle unbalanced and unlabeled data better than existing contrastive learning methods, because the randomly chosen negative set may contain many samples that are semantically similar to the query image. In this case, our method labels them as highly similar, while standard contrastive methods label them as negatives. Our method achieves results comparable to state-of-the-art models.
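A minimal sketch of the soft-similarity distillation described above, written in PyTorch; the temperatures, the momentum value, and the memory-bank interface are illustrative assumptions rather than the authors' exact implementation:

```python
# Minimal sketch of soft-similarity distillation (illustrative only; the
# temperatures, momentum value, and memory-bank layout are assumptions,
# not the authors' exact implementation).
import torch
import torch.nn.functional as F

def distillation_loss(student, teacher, query_aug1, query_aug2, memory_bank,
                      t_teacher=0.04, t_student=0.1):
    """KL-distill the teacher's soft similarity over random images to the student."""
    with torch.no_grad():
        z_t = F.normalize(teacher(query_aug1), dim=1)   # teacher embedding of the query
        sim_t = z_t @ memory_bank.T / t_teacher         # similarity to random images
        target = F.softmax(sim_t, dim=1)                # soft (not binary) labels

    z_s = F.normalize(student(query_aug2), dim=1)       # student embedding of the query
    sim_s = z_s @ memory_bank.T / t_student
    log_pred = F.log_softmax(sim_s, dim=1)
    return F.kl_div(log_pred, target, reduction="batchmean")

@torch.no_grad()
def update_teacher(teacher, student, momentum=0.999):
    """Slowly evolve the teacher as an exponential moving average of the student."""
    for p_t, p_s in zip(teacher.parameters(), student.parameters()):
        p_t.mul_(momentum).add_(p_s, alpha=1.0 - momentum)
```

Keeping the teacher as a slow moving average of the student gives stable soft targets while still letting them evolve with training.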
The strange geometry of skip-gram with negative sampling
Despite their ubiquity, word embeddings trained with skip-gram negative sampling (SGNS) remain poorly understood. We find that vector positions are not simply determined by semantic similarity, but rather occupy a narrow cone, diametrically opposed to the context vectors. We show that this geometric concentration depends on the ratio of positive to negative examples, and that it is neither theoretically nor empirically inherent in related embedding algorithms.
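A quick way to probe the geometry described above is to compare average cosine similarities within the word vectors and between word and context vectors. The snippet below is a generic diagnostic; how the two matrices are extracted depends on the toolkit (gensim, for instance, exposes them as model.wv.vectors and model.syn1neg, but treat those attribute names as an assumption).

```python
# Quick diagnostic for the "narrow cone" geometry: W holds the word (input)
# vectors and C the context (output) vectors of a trained SGNS model.
import numpy as np

def mean_cosine(A, B):
    """Average cosine similarity between all rows of A and all rows of B."""
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    B = B / np.linalg.norm(B, axis=1, keepdims=True)
    return float((A @ B.T).mean())

def cone_report(W, C):
    # Word vectors concentrated in a narrow cone => mean word-word cosine well above 0.
    print("word-word    mean cosine:", mean_cosine(W, W))
    # "Diametrically opposed" to context vectors => mean word-context cosine below 0.
    print("word-context mean cosine:", mean_cosine(W, C))
```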
- Award ID(s): 1652536
- PAR ID: 10057832
- Date Published:
- Journal Name: Empirical Methods in Natural Language Processing
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Contrastive learning is a self-supervised representation learning method that achieves milestone performance in various classification tasks. However, due to its unsupervised nature, it suffers from the false negative sample problem: randomly drawn negative samples are assumed to have a different label than the anchor but may actually share it. This deteriorates the performance of contrastive learning, since it contradicts the motivation of contrasting semantically similar and dissimilar pairs. Finding legitimate negative samples has therefore attracted growing attention; it requires distinguishing between 1) true vs. false negatives and 2) easy vs. hard negatives. However, previous works were limited to statistical approaches that handle false negatives and hard negatives through hyperparameter tuning. In this paper, we go beyond the statistical approach and explore the connection between hard negative samples and data bias. We introduce a novel debiased contrastive learning method that finds hard negatives by their relative difficulty with reference to a bias-amplifying counterpart. We propose a triplet loss for training a biased encoder that focuses more on easy negative samples, theoretically show that this triplet loss amplifies the bias in self-supervised representation learning, and empirically show that the proposed method improves downstream classification performance. (A minimal triplet-loss sketch is given after this list.)
-
Biological and technological processes that involve liquids under negative pressure are vulnerable to the formation of cavities. Maximal negative pressures found in plants are around −100 bar, even though cavitation in pure bulk water only occurs at much more negative pressures on the relevant time scales. Here, we investigate the influence of small solutes and lipid bilayers, both constituents of all biological liquids, on the formation of cavities under negative pressure. By combining molecular dynamics simulations with kinetic modeling, we quantify cavitation rates on biologically relevant length and time scales. We find that lipid bilayers, in contrast to small solutes, increase the rate of cavitation, which remains unproblematically low at the pressures found in most plants. Only when the negative pressures approach −100 bar does cavitation occur on biologically relevant time scales. Our results suggest that bilayer-based cavitation is what generally limits the magnitude of negative pressures in liquids that contain lipid bilayers. (A rough order-of-magnitude estimate for pure water is sketched after this list.)
-
Why do brains have inhibitory connections? Why do deep networks have negative weights? We propose an answer from the perspective of representation capacity. We believe representing functions is the primary role of both (i) the brain in natural intelligence and (ii) deep networks in artificial intelligence. Our answer to why there are inhibitory/negative weights is: to learn more functions. We prove that, in the absence of negative weights, neural networks with non-decreasing activation functions are not universal approximators. While this may be an intuitive result to some, to the best of our knowledge there is no formal theory, in either machine learning or neuroscience, that demonstrates why negative weights are crucial in the context of representation capacity. Further, we provide insights on the geometric properties of the representation space that non-negative deep networks cannot represent. We expect these insights will yield a deeper understanding of more sophisticated inductive priors imposed on the distribution of weights that lead to more efficient biological and machine learning. (A small numerical illustration of the monotonicity argument follows this list.)
-
We show the existence of a complete negative Kähler–Einstein metric on Stein manifolds with holomorphic sectional curvature bounded from above by a negative constant. We prove that any Kähler metric on such manifolds can be deformed to the complete negative Kähler–Einstein metric using the normalized Kähler–Ricci flow (written out below).
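For the debiased contrastive learning entry above, here is a minimal triplet-loss sketch in which the "biased" encoder is trained against the easiest (most distant) negatives; the margin, the mining rule, and the encoder interface are assumptions made for illustration, not the paper's exact formulation.

```python
# Illustrative triplet loss for a "biased" encoder that concentrates on easy
# negatives (the margin and the easiest-negative mining rule are assumptions,
# not the paper's exact formulation).
import torch
import torch.nn.functional as F

def easy_negative_triplet_loss(encoder, anchor, positive, negatives, margin=0.2):
    """Triplet loss using the easiest (most distant) negative for each anchor."""
    z_a = F.normalize(encoder(anchor), dim=1)      # (B, d) anchor embeddings
    z_p = F.normalize(encoder(positive), dim=1)    # (B, d) positive embeddings
    z_n = F.normalize(encoder(negatives), dim=1)   # (N, d) negative embeddings

    d_ap = 1.0 - (z_a * z_p).sum(dim=1)            # anchor-positive cosine distance
    d_an = 1.0 - z_a @ z_n.T                       # (B, N) anchor-negative distances
    d_easy = d_an.max(dim=1).values                # easiest negative = farthest one

    return F.relu(d_ap - d_easy + margin).mean()
```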
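For the cavitation entry above, a generic classical-nucleation-theory estimate (not the authors' MD-plus-kinetic model) shows why homogeneous cavitation in pure bulk water is negligible at −100 bar; the surface tension value and room-temperature assumption are rough inputs chosen only for illustration.

```python
# Rough classical-nucleation-theory (CNT) estimate of homogeneous cavitation in
# pure water; NOT the authors' MD + kinetic model, just an order-of-magnitude
# illustration with assumed surface tension and temperature.
import math

def cavitation_barrier_over_kT(neg_pressure_bar, surface_tension=0.072, T=300.0):
    """Return the CNT energy barrier 16*pi*gamma^3 / (3*dP^2) in units of kT."""
    k_B = 1.380649e-23                   # Boltzmann constant, J/K
    dP = abs(neg_pressure_bar) * 1e5     # bar -> Pa of tension
    barrier = 16.0 * math.pi * surface_tension**3 / (3.0 * dP**2)
    return barrier / (k_B * T)

# At -100 bar the barrier is on the order of 1e4 kT, so homogeneous cavitation
# in pure bulk water is effectively impossible on realistic time scales.
print(cavitation_barrier_over_kT(-100.0))   # ~1.5e4
```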
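For the inhibitory/negative-weights entry above, a small numerical check of the underlying observation: with non-negative weights and a non-decreasing activation, every output of the network is a non-decreasing function of its input, so non-monotone targets such as f(x) = −x are unreachable. The layer sizes and inputs below are arbitrary.

```python
# Numerical check: with non-negative weights and a non-decreasing activation
# (ReLU here), the network output can never decrease as an input increases.
import numpy as np

rng = np.random.default_rng(0)
W1 = np.abs(rng.normal(size=(8, 1)))      # non-negative weights, layer 1
W2 = np.abs(rng.normal(size=(1, 8)))      # non-negative weights, layer 2

def net(x):
    h = np.maximum(W1 @ x, 0.0)           # ReLU is non-decreasing
    return (W2 @ h).item()

xs = np.linspace(-2.0, 2.0, 9).reshape(-1, 1, 1)
outs = [net(x) for x in xs]
assert all(a <= b + 1e-12 for a, b in zip(outs, outs[1:])), "expected non-decreasing outputs"
print(outs)                               # non-decreasing in x
```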
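For the Kähler–Einstein entry above, the normalized Kähler–Ricci flow in question is, under one standard normalization (sign conventions vary),

```latex
% Normalized Kähler–Ricci flow starting from an initial Kähler form \omega_0:
\frac{\partial}{\partial t}\,\omega(t) \;=\; -\operatorname{Ric}\!\bigl(\omega(t)\bigr) \;-\; \omega(t),
\qquad \omega(0) = \omega_0,
% whose long-time limit, when it exists, satisfies the negative Kähler–Einstein equation
\operatorname{Ric}(\omega_\infty) \;=\; -\,\omega_\infty .
```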

