Contrastive learning learns visual representations by enforcing feature consistency under different augmented views. In this work, we explore contrastive learning from a new perspective. Interestingly, we find that quantization, when properly engineered, can enhance the effectiveness of contrastive learning. To this end, we propose a novel contrastive learning framework, dubbed Contrastive Quant, to encourage feature consistency under both differently augmented inputs via various data transformations and differently augmented weights/activations via various quantization levels. Extensive experiments, built on top of two state-of-the-art contrastive learning methods SimCLR and BYOL, show that Contrastive Quant consistently improves the learned visual representation.
more »
« less
Incorporating simulated spatial context information improves the effectiveness of contrastive learning models
Visual learning often occurs in a specific context, where an agent acquires skills through exploration and tracking of its location in a consistent environment. The historical spatial context of the agent provides a similarity signal for self-supervised contrastive learning. We present a unique approach, termed Environmental Spatial Similarity (ESS), that complements existing contrastive learning methods. Using images from simulated, photorealistic environments as an experimental setting, we demonstrate that ESS outperforms traditional instance discrimination approaches. Moreover, sampling additional data from the same environment substantially improves accuracy and provides new augmentations. ESS allows remarkable proficiency in room classification and spatial prediction tasks, especially in unfamiliar environments. This learning paradigm has the potential to enable rapid visual learning in agents operating in new environments with unique visual characteristics. Potentially transformative applications span from robotics to space exploration. Our proof of concept demonstrates improved efficiency over methods that rely on extensive, disconnected datasets.
more »
« less
- Award ID(s):
- 2216127
- PAR ID:
- 10628467
- Editor(s):
- Wang, Wanying
- Publisher / Repository:
- Cell Press
- Date Published:
- Journal Name:
- Patterns
- Volume:
- 5
- Issue:
- 5
- ISSN:
- 2666-3899
- Page Range / eLocation ID:
- 100964
- Subject(s) / Keyword(s):
- Contrastive learning Representation learning Virtual environment Developmental psychology Deep learning Bio-inspired computing Intelligent agent
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Traditional cancer rate estimations are often limited in spatial resolutions and lack considerations of environmental factors. Satellite imagery has become a vital data source for monitoring diverse urban environments, supporting applications across environmental, socio-demographic, and public health domains. However, while deep learning (DL) tools, particularly convolutional neural networks, have demonstrated strong performance in extracting features from high-resolution imagery, their reliance on local spatial cues often limits their ability to capture complex, non-local, and higher-order structural information. To overcome this limitation, we propose a novel LLM-based multi-agent coordination system for satellite image analysis, which integrates visual and contextual reasoning through a simplicial contrastive learning framework (Agent- SNN). Our Agent-SNN contains two augmented superpixel-based graphs and maximizes mutual information between their latent simplicial complex representations, thereby enabling the system to learn both local and global topological features. The LLM-based agents generate structured prompts that guide the alignment of these representations across modalities. Experiments with satellite imagery of Los Angeles and San Diego demonstrate that Agent-SNN achieves signi cant improvements over state-of-the-art baselines in regional cancer prevalence estimation tasks.more » « less
-
IntroductionComputer vision and deep learning (DL) techniques have succeeded in a wide range of diverse fields. Recently, these techniques have been successfully deployed in plant science applications to address food security, productivity, and environmental sustainability problems for a growing global population. However, training these DL models often necessitates the large-scale manual annotation of data which frequently becomes a tedious and time-and-resource- intensive process. Recent advances in self-supervised learning (SSL) methods have proven instrumental in overcoming these obstacles, using purely unlabeled datasets to pre-train DL models. MethodsHere, we implement the popular self-supervised contrastive learning methods of NNCLR Nearest neighbor Contrastive Learning of visual Representations) and SimCLR (Simple framework for Contrastive Learning of visual Representations) for the classification of spatial orientation and segmentation of embryos of maize kernels. Maize kernels are imaged using a commercial high-throughput imaging system. This image data is often used in multiple downstream applications across both production and breeding applications, for instance, sorting for oil content based on segmenting and quantifying the scutellum’s size and for classifying haploid and diploid kernels. Results and discussionWe show that in both classification and segmentation problems, SSL techniques outperform their purely supervised transfer learning-based counterparts and are significantly more annotation efficient. Additionally, we show that a single SSL pre-trained model can be efficiently finetuned for both classification and segmentation, indicating good transferability across multiple downstream applications. Segmentation models with SSL-pretrained backbones produce DICE similarity coefficients of 0.81, higher than the 0.78 and 0.73 of those with ImageNet-pretrained and randomly initialized backbones, respectively. We observe that finetuning classification and segmentation models on as little as 1% annotation produces competitive results. These results show SSL provides a meaningful step forward in data efficiency with agricultural deep learning and computer vision.more » « less
-
Agents that are aware of the separation between themselves and their environments can leverage this understanding to form effective representations of visual input. We propose an approach for learning such structured representations for RL algorithms, using visual knowledge of the agent, such as its shape or mask, which is often inexpensive to obtain. This is incorporated into the RL objective using a simple auxiliary loss. We show that our method, Structured Environment-Agent Representations, outperforms state-of-the-art model-free approaches over 18 different challenging visual simulation environments spanning 5 different robots.more » « less
-
Contrastive learning is an effective unsupervised method in graph representation learning. The key component of contrastive learning lies in the construction of positive and negative samples. Previous methods usually utilize the proximity of nodes in the graph as the principle. Recently, the data-augmentation-based contrastive learning method has advanced to show great power in the visual domain, and some works have extended this method from images to graphs. However, unlike the data augmentation on images, the data augmentation on graphs is far less intuitive and it is much harder to provide high-quality contrastive samples, which leaves much space for improvement. In this work, by introducing an adversarial graph view for data augmentation, we propose a simple but effective method,Adversarial Graph Contrastive Learning(ArieL), to extract informative contrastive samples within reasonable constraints. We develop a new technique calledinformation regularizationfor stable training and use subgraph sampling for scalability. We generalize our method from node-level contrastive learning to the graph level by treating each graph instance as a super-node.ArieLconsistently outperforms the current graph contrastive learning methods for both node-level and graph-level classification tasks on real-world datasets. We further demonstrate thatArieLis more robust in the face of adversarial attacks.more » « less
An official website of the United States government

