Training a semantic segmentation model requires large, densely annotated image datasets that are costly to obtain. Once training is done, it is also difficult to add new object categories to such segmentation models. In this paper, we tackle the few-shot semantic segmentation problem, which aims to segment unseen object categories based on only one or a few support examples. The key to solving this few-shot segmentation problem lies in effectively using object information from the support examples to separate target objects from the background in a query image. While existing methods typically generate object-level representations by averaging local features in support images, we demonstrate that such object representations are often noisy and lack discriminative power. To address this problem, we design an object representation generator (ORG) module that effectively aggregates local object features from the support image(s) and produces a better object-level representation. The ORG module can be embedded into the network and trained end-to-end in a weakly supervised fashion without extra human annotation. We incorporate this design into a modified encoder-decoder network to present a powerful and efficient framework for few-shot semantic segmentation. Experimental results on the Pascal-VOC and MS-COCO datasets show that our approach outperforms existing methods under both one-shot and five-shot settings.
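The averaging baseline that the abstract contrasts against is commonly implemented as masked average pooling over the support feature map. A minimal NumPy sketch, with hypothetical tensor shapes (the actual encoder and ORG module are not specified here), might look like:

```python
import numpy as np

def masked_average_pooling(features, mask):
    """Baseline object-level representation (not the ORG module):
    average the feature vectors at spatial positions covered by the
    support object's foreground mask.

    features: (C, H, W) feature map from an encoder (hypothetical shape)
    mask:     (H, W) binary foreground mask for the support object
    """
    mask = mask.astype(features.dtype)
    # Zero out background positions, then normalize by the mask area.
    pooled = (features * mask[None]).sum(axis=(1, 2)) / (mask.sum() + 1e-8)
    return pooled  # (C,) object-level representation

# Toy example: a 4-channel feature map with the object in the left half.
feats = np.arange(4 * 8 * 8, dtype=np.float64).reshape(4, 8, 8)
support_mask = np.zeros((8, 8))
support_mask[:, :4] = 1.0
prototype = masked_average_pooling(feats, support_mask)
```

Because every foreground position contributes equally, background clutter or mislabeled pixels inside the mask dilute the pooled vector, which illustrates the noisiness the abstract attributes to this baseline.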