Search for: All records

Creators/Authors contains: "Le, Trung-Nghia"

  1. Image synthesis is the process of converting input text, a sketch, or another source, such as an image or a mask, into an image. It is an important problem in computer vision, and the research community has devoted considerable effort to generating photorealistic images, employing a variety of techniques and strategies. This paper provides a comprehensive review of image synthesis models covering several aspects. First, the image synthesis concept is introduced. We then review image synthesis methods in three categories: image generation from text, from sketches, and from other inputs. Each sub-category is introduced under its category based on the general framework, giving a broad view of existing image synthesis methods. Next, we briefly describe the benchmark datasets used in image synthesis and specify the models that leverage them. For evaluation, we summarize the metrics used to assess image synthesis models and provide a detailed analysis of the results of the reviewed methods based on those metrics. Finally, we discuss existing challenges and suggest possible future research directions.
  2. Public speaking is one of the most important ways to share ideas with many people in domains such as education, training, marketing, and healthcare. Mastering this skill allows speakers to advocate clearly for their subject and greatly influence others. However, most of the population reports having public speaking anxiety, or glossophobia, which prevents them from conveying their messages effectively. One of the best solutions is a safe and private space in which to practice speaking in front of others. This work therefore aims to provide people with virtual environments in which they can practice in front of simulated audiences. In addition, it aims to provide live audience feedback and speech analysis details that could be useful to users. A user study provides insights into the proposed public speaking simulator.
  3. Face recognition with wearable items is a challenging task in computer vision that includes the problem of identifying people wearing face masks. Masked face analysis via multi-task learning could effectively improve performance in many areas of face analysis. In this paper, we propose a unified framework for predicting the age, gender, and emotions of people wearing face masks. We first construct FGNET-MASK, a masked face dataset for the problem. We then propose a multi-task deep learning model to tackle it. In particular, the model takes the masked face data as input and shares weights across tasks to predict the age, expression, and gender of the masked face. Extensive experiments show that the proposed framework performs better than other existing methods.
  4. In this paper, we investigate the interesting yet challenging problem of camouflaged instance segmentation. To this end, we first annotate the available CAMO dataset at the instance level. We also apply data augmentation to increase the number of training samples. We then train different state-of-the-art instance segmentation methods on the CAMO-instance data. Finally, we develop an interactive user interface that demonstrates the performance of these methods on the task of camouflaged instance segmentation. Users can compare the results of different methods on given input images. Our work is expected to push the envelope of the camouflage analysis problem.
  5. In this paper, we introduce a practical system for interactive video object mask annotation that can support multiple back-end methods. To demonstrate the generality of our system, we introduce a novel approach for video object annotation. The proposed system takes scribbles drawn by end users at a chosen key-frame through a user-friendly interface and produces masks of the corresponding objects at that key-frame via the Control-Point-based Scribbles-to-Mask (CPSM) module. The object masks at the key-frame are then propagated to other frames and refined by the Multi-Referenced Guided Segmentation (MRGS) module. Finally, the user can correct wrong segmentation at some frames, and each corrected mask is propagated to the other frames in the video via the MRGS module to produce the object masks at all video frames.