Title: Learning Instance Occlusion for Panoptic Segmentation
Panoptic segmentation requires segments of both “things” (countable object instances) and “stuff” (uncountable and amorphous regions) within a single output. A common approach fuses instance segmentation (for “things”) and semantic segmentation (for “stuff”) into a non-overlapping placement of segments, resolving overlaps in the process. However, ordering instances by detection confidence does not correlate well with natural occlusion relationships. To resolve this issue, we propose a branch that is tasked with modeling how two instance masks should overlap one another as a binary relation. Our method, named OCFusion, is lightweight but particularly effective in the instance fusion process. OCFusion is trained with ground truth relations derived automatically from the existing dataset annotations. We obtain state-of-the-art results on COCO and show competitive results on the Cityscapes panoptic segmentation benchmark.
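The fusion step described in the abstract can be illustrated with a minimal sketch. This is not the OCFusion implementation: the function name `fuse_instances`, the representation of masks as sets of pixel coordinates, and the `occludes` callback (standing in for the learned binary occlusion relation) are all assumptions made for illustration. Without the callback, the sketch falls back to plain confidence-ordered fusion.

```python
def fuse_instances(masks, scores, occludes=None):
    """Fuse possibly overlapping instance masks into a non-overlapping map.

    masks:    list of sets of (row, col) pixels, one set per instance
    scores:   list of detection confidences, one per instance
    occludes: hypothetical callable (i, j) -> bool that returns True when
              instance i should lie in front of instance j; if None, the
              fusion relies on detection confidence alone
    Returns a dict mapping each covered pixel to the owning instance index.
    """
    # Visit instances from the most to the least confident detection.
    order = sorted(range(len(masks)), key=lambda i: scores[i], reverse=True)
    owner = {}
    for i in order:
        for pixel in masks[i]:
            j = owner.get(pixel)
            if j is None:
                # Unclaimed pixel: the current instance takes it.
                owner[pixel] = i
            elif occludes is not None and occludes(i, j):
                # Contested pixel: the occlusion relation overrides the
                # confidence ordering.
                owner[pixel] = i
    return owner


# Toy scene: a small "tie" mask lying entirely inside a "person" mask.
person = {(r, c) for r in range(4) for c in range(4)}
tie = {(1, 1), (2, 1)}

# Confidence only: the person is pasted first and the tie receives no pixels.
baseline = fuse_instances([person, tie], [0.9, 0.6])

# With a relation stating that the tie (index 1) is in front of the
# person (index 0), the tie keeps its two pixels.
aware = fuse_instances(
    [person, tie], [0.9, 0.6], occludes=lambda i, j: (i, j) == (1, 0)
)
```

In the toy scene, the lower-confidence tie disappears under confidence-only fusion, which is the kind of failure the abstract attributes to ordering instances by detection confidence.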
Award ID(s):
1717431 1618477
Publication Date:
Journal Name:
IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Page Range or eLocation-ID:
Sponsoring Org:
National Science Foundation
More Like this
  1. In this paper, we explore the possibility of increasing the number of training examples for long-tailed instance segmentation without laborious data collection and annotation. We find that an abundance of instance segments can potentially be obtained freely from object-centric images, according to two insights: (i) an object-centric image usually contains one salient object in a simple background; (ii) objects from the same class often share similar appearances or similar contrasts to the background. Motivated by these insights, we propose a simple and scalable framework, FREESEG, for extracting and leveraging these “free” object segments to facilitate model training. Concretely, we investigate the similarity among object-centric images of the same class to propose candidate segments of foreground instances, followed by a novel ranking of segment quality. The resulting high-quality object segments can then be used to augment the existing long-tailed datasets, e.g., by copying and pasting the segments onto the original training images. Extensive experiments show that FREESEG yields substantial improvements on top of strong baselines and achieves state-of-the-art accuracy for segmenting rare object categories.
  2. The task of instance segmentation in videos aims to consistently identify objects at the pixel level throughout the entire video sequence. Existing state-of-the-art methods either follow the tracking-by-detection paradigm to employ multi-stage pipelines or directly train a complex deep model to process entire video clips as 3D volumes. However, these methods are typically slow and resource-consuming, such that they are often limited to offline processing. In this paper, we propose SRNet, a simple and efficient framework for joint segmentation and tracking of object instances in videos. The key to achieving both high efficiency and accuracy in our framework is to formulate the instance segmentation and tracking problem as a unified spatial-relation learning task where each pixel in the current frame relates to its object center, and each object center relates to its location in the previous frame. This unified learning formulation allows our framework to perform joint instance segmentation and tracking in a single stage while maintaining low overhead among different learning tasks. Our proposed framework can handle two different task settings and demonstrates comparable performance with state-of-the-art methods on two different benchmarks while running significantly faster.
  3. An important means of disseminating information on social media platforms is including URLs that point to external sources in user posts. On Twitter, we estimate that about 21% of the daily stream of English-language tweets contain URLs. We notice that NLP tools make little attempt at understanding the relationship between the content of the URL and the text surrounding it in a tweet. In this work, we study the structure of tweets with URLs relative to the content of the Web documents pointed to by the URLs. We identify several segment classes that may appear in a tweet with URLs, such as the title of a Web page and the user's original content. Our goals in this paper are to introduce, define, and analyze the segmentation problem of tweets with URLs, to develop an effective algorithm to solve it, and to show that our solution can benefit sentiment analysis on Twitter. We also show that the problem is an instance of the block edit distance problem, and thus an NP-hard problem.

  4. A good understanding of earthquake rupture segmentation is important to characterize fault geometries at depth for follow-up tectonic, stress-field or other analyses. We propose a data-driven strategy and develop pre-optimization methods to support finite fault inversions with independent prior estimates on earthquake source parameters. The first method we develop is a novel time-domain, multi-array, multiphase backprojection (BP) of teleseismic data. This method infers the spatio-temporal evolution of the rupture process, including a potential occurrence of rupture segmentation. Secondly, we apply image analysis methods to InSAR surface displacement maps to infer rupture characteristics (e.g. strike and length) and the number of potential segments. Both methods can provide model-independent constraints on fault location, dimension, orientation and rupture timing, which can be used to form priors of model parameters before detailed modelling. We demonstrate and test our methods based on synthetic tests and an application to the 25.11.2016 Muji Mw 6.6 earthquake. Our results indicate segmentation and bilateral rupturing for the 2016 Muji earthquake. In particular, the results of the BP of the Muji Mw 6.6 earthquake using high-frequency filtered teleseismic waveforms show the capability to illuminate the rupture history, with the potential to resolve the start and stop phases of individual fault segments.

  5. In this paper, we investigate the interesting yet challenging problem of camouflaged instance segmentation. To this end, we first annotate the available CAMO dataset at the instance level. We also apply data augmentation to increase the number of training samples. Then, we train different state-of-the-art instance segmentation methods on the CAMO-instance data. Last but not least, we develop an interactive user interface that demonstrates the performance of different state-of-the-art instance segmentation methods on the task of camouflaged instance segmentation. Users are able to compare the results of different methods on the given input images. Our work is expected to push the envelope of the camouflage analysis problem.