Training a semantic segmentation model requires large
densely-annotated image datasets that are costly to obtain.
Once training is complete, it is also difficult to add new object
categories to such segmentation models. In this paper,
we tackle the few-shot semantic segmentation problem,
which aims to segment unseen
object categories given only one or a few support
examples. The key to solving this few-shot segmentation
problem lies in effectively utilizing object information
from support examples to separate target objects from
the background in a query image. While existing methods
typically generate object-level representations by averaging
local features in support images, we demonstrate
that such object representations are typically noisy and less
discriminative. To solve this problem, we design an object
representation generator (ORG) module that effectively
aggregates local object features from the support image(s)
and produces a better object-level representation. The
ORG module can be embedded into the network and trained
end-to-end in a weakly-supervised fashion without extra human
annotation. We incorporate this design into a modified
encoder-decoder network to present a powerful and efficient
framework for few-shot semantic segmentation. Experimental
results on the Pascal-VOC and MS-COCO datasets show
that our approach outperforms
existing methods under both one-shot and five-shot settings.
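
For readers unfamiliar with the baseline mentioned above, the following is a minimal sketch of how an object-level representation can be obtained by averaging local support features inside the object mask and then compared against query features. It illustrates only the averaging baseline, not the ORG module, whose internals are not described here. The function names (`masked_average_pool`, `cosine_similarity`, `similarity_map`), the nested-list feature layout, and the use of cosine similarity for matching are assumptions made for illustration.

```python
def masked_average_pool(features, mask):
    """Average the feature vectors at positions where the mask is 1.

    features: H x W grid of C-dimensional feature vectors (nested lists).
    mask:     H x W grid of 0/1 labels marking the support object.
    Returns a single C-dimensional object-level vector.
    """
    channels = len(features[0][0])
    total = [0.0] * channels
    count = 0
    for feat_row, mask_row in zip(features, mask):
        for vec, m in zip(feat_row, mask_row):
            if m:
                count += 1
                for c in range(channels):
                    total[c] += vec[c]
    if count == 0:
        raise ValueError("mask selects no object pixels")
    return [t / count for t in total]


def cosine_similarity(a, b):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_map(query_features, prototype):
    """Score every query location against the object-level vector."""
    return [[cosine_similarity(vec, prototype) for vec in row]
            for row in query_features]


# Toy 2 x 2 feature map with 2 channels (assumed values, for illustration only).
support = [[[1.0, 0.0], [3.0, 0.0]],
           [[0.0, 2.0], [0.0, 4.0]]]
support_mask = [[1, 1],
                [0, 0]]

prototype = masked_average_pool(support, support_mask)
scores = similarity_map(support, prototype)
```

Because every masked location contributes equally to the average, background-like or atypical features inside the mask are mixed into the resulting vector, which is one way such representations can become noisy.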