Search for: All records

Creators/Authors contains: "Ying, Xiaowen"

« Prev Next »

Total Resources

6

Resource Type
Conference Paper

6

Conference Proceeding

0

Dataset

0

Journal Article

0

Workshop Report

0

Availability
Full Text / Resource Available

6

Citation Only

0

Save Results
Excel (limit 2000)
CSV (limit 5000)
XML (limit 5000)

Have feedback or suggestions for a way to improve these results?
!

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Robustness of Trajectory Prediction Models Under Map-Based Attacks

https://doi.org/10.1109/WACV56688.2023.00452

Zheng, Zhihao ; Ying, Xiaowen ; Yao, Zhen ; Chuah, Mooi Choo ( January 2023 , IEEE)
UCTNet: Uncertainty-Aware Cross-Modal Transformer Network for Indoor RGB-D Semantic Segmentation

Ying, Xiaowen ; Chuah, Mooi Choo ( November 2022 , Springer)
Avidan, S. (Ed.)
In this paper, we tackle the problem of RGB-D Semantic Segmentation. The key challenges in solving this problem lie in 1) how to extract features from depth sensor data and 2) how to effectively fuse the features extracted from the two modalities. For the first challenge, we found that the depth information obtained from the sensor is not always reliable (e.g. objects with reflective or dark surfaces typically have inaccurate or void sensor readings), and existing methods that extract depth features using ConvNets did not explicitly consider the reliability of depth value at different pixel locations. To tackle this challenge, we propose a novel mechanism, namely Uncertainty-Aware Self-Attention that explicitly controls the information flow from unreliable depth pixels to confident depth pixels during feature extraction. For the second challenge, we propose an effective and scalable fusion module based on Cross-Attention that can adaptively fuse and exchange information between the RGB encoder and depth encoder. Our proposed framework, namely UCTNet, is an encoder-decoder network that naturally incorporates these two key designs for robust and accurate RGB-D Segmentation. Experimental results show that UCTNet outperforms existing works and achieves state-of-the-art performances on two RGB-D Semantic Segmentation benchmarks.
more » « less
Full Text Available
Delving into Light-Dark Semantic Segmentation for Indoor Scenes Understanding

https://doi.org/10.1145/3552482.3556556

Ying, Xiaowen ; Lang, Bo ; Zheng, Zhihao ; Chuah, Mooi Choo ( October 2022 , ACM)
SRNet: Spatial Relation Network for Efficient Single Stage Instance Segmentation in Videos

Ying, Xiaowen ; Li, Xin ; Chuah, Mooi Choo ( October 2021 , ACM Multimedia 2021)
null (Ed.)
The task of instance segmentation in videos aims to consistently identify objects at pixel level throughout the entire video sequence. Existing state-of-the-art methods either follow the tracking-bydetection paradigm to employ multi-stage pipelines or directly train a complex deep model to process the entire video clips as 3D volumes. However, these methods are typically slow and resourceconsuming such that they are often limited to offline processing. In this paper, we propose SRNet, a simple and efficient framework for joint segmentation and tracking of object instances in videos. The key to achieving both high efficiency and accuracy in our framework is to formulate the instance segmentation and tracking problem into a unified spatial-relation learning task where each pixel in the current frame relates to its object center, and each object center relates to its location in the previous frame. This unified learning framework allows our framework to perform join instance segmentation and tracking through a single stage while maintaining low overheads among different learning tasks. Our proposed framework can handle two different task settings and demonstrates comparable performance with state-of-the-art methods on two different benchmarks while running significantly faster.
more » « less
Full Text Available
Weakly-Supervised Object Representation Learning for Few-Shot Semantic Segmentation

Ying, Xiaowen ; Li, Xin ; Chuah, Mooi Choo ( January 2021 , IEEE Winter Conference on Applications of Computer Vision)
null (Ed.)
Training a semantic segmentation model requires large densely-annotated image datasets that are costly to obtain. Once the training is done, it is also difficult to add new ob- ject categories to such segmentation models. In this pa- per, we tackle the few-shot semantic segmentation prob- lem, which aims to perform image segmentation task on un- seen object categories merely based on one or a few sup- port example(s). The key to solving this few-shot segmen- tation problem lies in effectively utilizing object informa- tion from support examples to separate target objects from the background in a query image. While existing meth- ods typically generate object-level representations by av- eraging local features in support images, we demonstrate that such object representations are typically noisy and less distinguishing. To solve this problem, we design an ob- ject representation generator (ORG) module which can ef- fectively aggregate local object features from support im- age(s) and produce better object-level representation. The ORG module can be embedded into the network and trained end-to-end in a weakly-supervised fashion without extra hu- man annotation. We incorporate this design into a modified encoder-decoder network to present a powerful and efficient framework for few-shot semantic segmentation. Experimen- tal results on the Pascal-VOC and MS-COCO datasets show that our approach achieves better performance compared to existing methods under both one-shot and five-shot settings.
more » « less
Full Text Available
Weakly-Supervised Object Representation Learning for Few-Shot Semantic Segmentation

Ying, Xiaowen ; Li, Xin ; Chuah, Mooi Choo ( January 2021 , IEEE Winter Conference on Applications of Computer Vision)
null (Ed.)
Training a semantic segmentation model requires large densely-annotated image datasets that are costly to obtain. Once the training is done, it is also difficult to add new object categories to such segmentation models. In this paper, we tackle the few-shot semantic segmentation problem, which aims to perform image segmentation task on unseen object categories merely based on one or a few support example(s). The key to solving this few-shot segmentation problem lies in effectively utilizing object information from support examples to separate target objects from the background in a query image. While existing methods typically generate object-level representations by averaging local features in support images, we demonstrate that such object representations are typically noisy and less distinguishing. To solve this problem, we design an object representation generator (ORG) module which can effectively aggregate local object features from support image( s) and produce better object-level representation. The ORG module can be embedded into the network and trained end-to-end in a weakly-supervised fashion without extra human annotation. We incorporate this design into a modified encoder-decoder network to present a powerful and efficient framework for few-shot semantic segmentation. Experimental results on the Pascal-VOC and MS-COCO datasets show that our approach achieves better performance compared to existing methods under both one-shot and five-shot settings.
more » « less
Full Text Available