
Title: Discovering Visual Patterns in Art Collections with Spatially-consistent Feature Learning
Our goal in this paper is to discover near duplicate patterns in large collections of artworks. This is harder than standard instance mining due to differences in the artistic media (oil, pastel, drawing, etc.) and imperfections inherent in the copying process. The key technical insight is to adapt a standard deep feature to this task by fine-tuning it on the specific art collection using self-supervised learning. More specifically, spatial consistency between neighbouring feature matches is used as the supervisory fine-tuning signal. The adapted feature leads to more accurate style-invariant matching, and can be used with a standard discovery approach, based on geometric verification, to identify duplicate patterns in the dataset. The approach is evaluated on several different datasets and shows surprisingly good qualitative discovery results. For quantitative evaluation of the method, we annotated 273 near duplicate details in a dataset of 1587 artworks attributed to Jan Brueghel and his workshop. Beyond artwork, we also demonstrate improved localization on the Oxford5K photo dataset as well as improved historical photograph localization on the Large Time Lags Location (LTLL) dataset.
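The idea of using spatial consistency between neighbouring matches as a training signal can be illustrated with a small sketch. This is not the paper's implementation: the grid representation, the function names (`consistency_score`, `select_positive_pairs`), and the parameters `radius`, `tol`, and `min_score` are all assumptions made for illustration. A match is kept as a candidate positive pair only if enough of its neighbours are matched with roughly the same displacement.

```python
from typing import Dict, List, Tuple

Pos = Tuple[int, int]  # (row, col) of a cell in a feature grid


def consistency_score(matches: Dict[Pos, Pos], p: Pos,
                      radius: int = 1, tol: int = 1) -> int:
    """Count neighbours of p whose match has about the same displacement as p's.

    `matches` maps a cell in image A's feature grid to its best-matching
    cell in image B's feature grid (hypothetical representation).
    """
    qy, qx = matches[p]
    dy, dx = qy - p[0], qx - p[1]  # displacement of the match at p
    score = 0
    for oy in range(-radius, radius + 1):
        for ox in range(-radius, radius + 1):
            if oy == 0 and ox == 0:
                continue  # skip p itself
            n = (p[0] + oy, p[1] + ox)
            if n not in matches:
                continue  # neighbour has no match
            ny, nx = matches[n]
            # The neighbour agrees if its displacement is within `tol` cells.
            if abs((ny - n[0]) - dy) <= tol and abs((nx - n[1]) - dx) <= tol:
                score += 1
    return score


def select_positive_pairs(matches: Dict[Pos, Pos], min_score: int = 4,
                          radius: int = 1, tol: int = 1) -> List[Tuple[Pos, Pos]]:
    """Keep matches supported by at least `min_score` consistent neighbours."""
    return [(p, q) for p, q in matches.items()
            if consistency_score(matches, p, radius, tol) >= min_score]
```

In a fine-tuning loop, the selected pairs would serve as positives for a metric-learning loss; an isolated match with no agreeing neighbours would be filtered out before it can contribute.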
Journal Name:
IEEE Conference on Computer Vision and Pattern Recognition
Sponsoring Org:
National Science Foundation
More Like this
  1. Feature representations from pre-trained deep neural networks have been known to exhibit excellent generalization and utility across a variety of related tasks. Fine-tuning is by far the simplest and most widely used approach that seeks to exploit and adapt these feature representations to novel tasks with limited data. Despite the effectiveness of fine-tuning, it is often sub-optimal and requires very careful optimization to prevent severe over-fitting to small datasets. The problems of sub-optimality and over-fitting are due in part to the large number of parameters used in a typical deep convolutional neural network. To address these problems, we propose a simple yet effective regularization method for fine-tuning pre-trained deep networks for the task of k-shot learning. To prevent overfitting, our key strategy is to cluster the model parameters while ensuring intra-cluster similarity and inter-cluster diversity of the parameters, effectively regularizing the dimensionality of the parameter search space. In particular, we identify groups of neurons within each layer of a deep network that share similar activation patterns. When the network is to be fine-tuned for a classification task using only k examples, we propagate a single gradient to all of the neuron parameters that belong to the same group. The grouping of neurons is non-trivial as neuron activations depend on the distribution of the input data. To efficiently search for optimal groupings conditioned on the input data, we propose a reinforcement learning search strategy using recurrent networks to learn the optimal group assignments for each network layer. Experimental results show that our method can be easily applied to several popular convolutional neural networks and improve upon other state-of-the-art fine-tuning based k-shot learning strategies by more than 10%.
  2. Al-Kadi, Omar Sultan (Ed.)
    In this paper, we capture and explore the painterly depictions of materials to enable the study of depiction and perception of materials through the artists’ eye. We annotated a dataset of 19k paintings with 200k+ bounding boxes from which polygon segments were automatically extracted. Each bounding box was assigned a coarse material label (e.g., fabric) and half were also assigned a fine-grained label (e.g., velvety, silky). The dataset in its entirety is available for browsing and downloading at . We demonstrate the cross-disciplinary utility of our dataset by presenting novel findings across human perception, art history, and computer vision. Our experiments include a demonstration of how painters create convincing depictions using a stylized approach. We further provide an analysis of the spatial and probabilistic distributions of materials depicted in paintings, in which we show, for example, that strong patterns exist for material presence and location. Furthermore, we demonstrate how paintings could be used to build more robust computer vision classifiers by learning a more perceptually relevant feature representation. Additionally, we demonstrate that training classifiers on paintings could be used to uncover hidden perceptual cues by visualizing the features used by the classifiers. We conclude that our dataset of painterly material depictions is a rich source for gaining insights into the depiction and perception of materials across multiple disciplines and hope that the release of this dataset will drive multidisciplinary research.
  3. Deep learning algorithms have been moderately successful in diagnosing diseases by analyzing medical images, especially in neuroimaging, which is rich in annotated data. Transfer learning methods have demonstrated strong performance in tackling annotated data. Transfer learning utilizes and transfers knowledge learned from a source domain to a target domain even when the dataset is small. There are multiple approaches to transfer learning that result in a range of performance estimates in diagnosis, detection, and classification of clinical problems. Therefore, in this paper, we reviewed transfer learning approaches, their design attributes, and their applications to neuroimaging problems. We reviewed two main literature databases and included the most relevant studies using predefined inclusion criteria. Among 50 reviewed studies, more than half are on transfer learning for Alzheimer's disease. Brain mapping and brain tumor detection were the second and third most discussed research problems, respectively. The most common source dataset for transfer learning was ImageNet, which is not a neuroimaging dataset. This suggests that the majority of studies preferred pre-trained models instead of training their own model on a neuroimaging dataset. Although about one third of studies designed their own architecture, most studies used existing Convolutional Neural Network architectures. Magnetic Resonance Imaging was the most common imaging modality. In almost all studies, transfer learning contributed to better performance in diagnosis, classification, and segmentation of different neuroimaging diseases and problems than methods without transfer learning. Among the different transfer learning approaches, two demonstrated superior accuracy: fine-tuning all convolutional and fully-connected layers, and freezing the convolutional layers while fine-tuning the fully-connected layers. These recent transfer learning approaches not only show great performance but also require fewer computational resources and less time.
  4. This paper presents a semi-supervised framework for multi-level description learning aiming for robust and accurate camera relocalization across large perception variations. Our proposed network, namely DLSSNet, simultaneously learns weakly-supervised semantic segmentation and local feature description in the hierarchy. Therefore, the augmented descriptors, trained in an end-to-end manner, provide a more stable high-level representation for local feature disambiguation. To facilitate end-to-end semantic description learning, the descriptor segmentation module is proposed to jointly learn semantic descriptors and cluster centers using standard semantic segmentation loss. We show that our model can be easily fine-tuned for domain-specific usage without any further semantic annotations, requiring instead only 2D-2D pixel correspondences. The learned descriptors, trained with our proposed pipeline, can boost cross-season localization performance relative to other state-of-the-art methods.
  5. Continuous bending under tension (CBT) is known to achieve elongation-to-failure well above that achieved under a conventional uniaxial simple tension (ST) strain path. However, the detailed mechanism for supplying this increased ductility has not been fully understood. It is clear that the necking that occurs in a typical ST specimen is avoided by constantly moving the region of plastic deformation during the CBT process. The volume of material in which the flow stress is greatest is limited to a moving line where the rollers contact the sheet and superimpose bending stress on the applied tensile load. Hence the condition of a large volume of material experiencing stress greater than the material flow stress, leading to strain localization during ST, is avoided. However, the magnitude of the contribution of this phenomenon to the overall increase in elongation is unclear. In the current set of experiments, an elongation to fracture (ETF) of 4.56x and 3.7x higher than ST was achieved by fine-tuning CBT forming parameters for Q&P 1180 and TBF 1180, respectively. A comparison of maximum local strains near the final point of fracture in ST and CBT sheets via digital image correlation revealed that avoidance of localization of plastic strain during CBT accounts for less than half of the increased elongation in the CBT specimens for two steels containing different amounts of retained austenite (RA). Geometrically necessary dislocation evolution is monitored using high-resolution EBSD (HREBSD) for both strain paths, indicating a lower hardening rate in the CBT samples in the bulk of the sheet, potentially relating to the cyclical nature of the stress in the outer layers of the sheet. Interestingly, the GND evolution in the center of the sheet, which does not experience the same amplitude of cyclic stress, follows the ST behavior more closely than the sheet edges. This appears to contribute to a precipitous drop in residual ductility for the specimens that are pulled in ST after partial CBT processing. The rate of transformation of RA is also tracked in the steels, with a significantly lower rate of transformation during CBT, compared to ST. This suggests that a slower transformation rate achieved under CBT also contributed to higher strain-to-failure levels.