

Search for: All records

Creators/Authors contains: "Nguyen, Tam"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

  1. Sketch-to-image is an important task to reduce the burden of creating a color image from scratch. Unlike previous sketch-to-image models, which synthesize the image end to end and tend to produce unnatural results, we propose a method that decomposes the problem into subproblems to generate a more naturalistic and reasonable image. The method first generates an intermediate output, a semantic mask map, from the input sketch through instance and semantic segmentation at two levels: background segmentation and foreground segmentation. The background segmentation is formed based on the context of the foreground objects. Then, the foreground segmentations are sequentially added to the created background segmentation. Finally, the generated mask map is fed into an image-to-image translation model to generate an image. Our proposed method works with 92 distinct classes. Our proposed method outperforms state-of-the-art sketch-to-image models and generates better images.
    Free, publicly-accessible full text available March 1, 2025
  2. In this paper, we meticulously examine the robustness of computer vision object detection frameworks within the intricate realm of real-world traffic scenarios, with a particular emphasis on challenging adverse weather conditions. Conventional evaluation methods often prove inadequate in addressing the complexities inherent in dynamic traffic environments—an increasingly vital consideration as global advancements in autonomous vehicle technologies persist. Our investigation delves specifically into the nuanced performance of these algorithms amidst adverse weather conditions like fog, rain, snow, sun flare, and more, acknowledging the substantial impact of weather dynamics on their precision. Significantly, we seek to underscore that an object detection framework excelling in clear weather may encounter significant challenges in adverse conditions. Our study incorporates in-depth ablation studies on dual modality architectures, exploring a range of applications including traffic monitoring, vehicle tracking, and object tracking. The ultimate goal is to elevate the safety and efficiency of transportation systems, recognizing the pivotal role of robust computer vision systems in shaping the trajectory of future autonomous and intelligent transportation technologies. 
    Free, publicly-accessible full text available March 13, 2025
  3. Creating thematic collections in industries demands innovative designs and cohesive concepts. Designers may face challenges in maintaining thematic consistency when drawing inspiration from existing objects, landscapes, or artifacts. While AI-powered graphic design tools offer help, they often fail to generate cohesive sets based on specific thematic concepts. In response, we introduce iCONTRA, an interactive CONcept TRAnsfer system. With a user-friendly interface, iCONTRA enables both experienced designers and novices to effortlessly explore creative design concepts and efficiently generate thematic collections. We also propose a zero-shot image editing algorithm, eliminating the need for fine-tuning models, which gradually integrates information from initial objects, ensuring consistency in the generation process without influencing the background. A pilot study suggests iCONTRA’s potential to reduce designers’ efforts. Experimental results demonstrate its effectiveness in producing consistent and high-quality object concept transfers. iCONTRA stands as a promising tool for innovation and creative exploration in thematic collection design. 
    Free, publicly-accessible full text available May 2, 2025
  4. Drawing is an art that enables people to express their imagination and emotions. However, individuals usually face challenges in drawing, especially when translating conceptual ideas into visually coherent representations and bridging the gap between mental visualization and practical execution. In response, we propose ARtVista, a novel system integrating AR and generative AI technologies. ARtVista not only recommends reference images aligned with users’ abstract ideas and generates sketches for users to draw, but also creates vibrant paintings in various painting styles. ARtVista also offers users an alternative way to create striking paintings by simulating the paint-by-number concept on reference images, so that users can create visually stunning artwork without advanced drawing skills. A pilot study yields positive feedback on its usability, emphasizing its effectiveness in visualizing user ideas and aiding the painting process.
    Free, publicly-accessible full text available May 11, 2025
  5. Sketch-to-image synthesis transforms a simple abstract black-and-white sketch into an image. Most sketch-to-image synthesis methods generate an image in an end-to-end manner, which often leads to unsatisfactory results. The reason is that end-to-end models generate images directly from the input sketches. Thus, with very abstract and complicated sketches, the models might struggle to generate naturalistic images because they must focus on two factors simultaneously: overall shape and fine-grained details. In this paper, we propose to divide the problem into subproblems. To this end, an intermediate output, which is a semantic mask map, is first generated from the input sketch via instance and semantic segmentation. In the instance segmentation stage, the objects' sizes might be modified depending on the surrounding environment and their respective size priors to reflect reality and produce more realistic images. In the semantic segmentation stage, a background segmentation is first constructed based on the context of the detected objects. Various natural scenes are implemented for both indoor and outdoor settings. Following this, a foreground segmentation process begins, in which each detected object is semantically added into the constructed segmented background. Then, in the next stage, an image-to-image translation model is leveraged to convert the semantic mask map into a colored image. Finally, a post-processing stage is incorporated to further enhance the image result. Extensive experiments demonstrate the superiority of our proposed method over state-of-the-art methods.
  6. Few-shot instance segmentation extends the few-shot learning paradigm to the instance segmentation task, which tries to segment instance objects from a query image with a few annotated examples of novel categories. Conventional approaches have attempted to address the task via prototype learning, known as point estimation. However, this mechanism depends on prototypes (e.g. mean of K-shot) for prediction, leading to performance instability. To overcome the disadvantage of the point estimation mechanism, we propose a novel approach, dubbed MaskDiff, which models the underlying conditional distribution of a binary mask, which is conditioned on an object region and K-shot information. Inspired by augmentation approaches that perturb data with Gaussian noise for populating low data density regions, we model the mask distribution with a diffusion probabilistic model. We also propose to utilize classifier-free guided mask sampling to integrate category information into the binary mask generation process. Without bells and whistles, our proposed method consistently outperforms state-of-the-art methods on both base and novel classes of the COCO dataset while simultaneously being more stable than existing methods. The source code is available at: https://github.com/minhquanlecs/MaskDiff. 
    Free, publicly-accessible full text available March 25, 2025
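
The point-estimation mechanism mentioned in the abstract above (a prototype taken as the mean of the K-shot examples) can be illustrated with a minimal sketch. This is not the MaskDiff method or any code from its repository; the two-dimensional embeddings, the class names, and the helper functions `prototype`, `squared_distance`, and `nearest_class` are all hypothetical and chosen only for illustration.

```python
from statistics import fmean


def prototype(support):
    """Point estimate of a class: element-wise mean of its K support embeddings."""
    return [fmean(dim) for dim in zip(*support)]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_class(query, prototypes):
    """Return the class whose prototype lies closest to the query embedding."""
    return min(prototypes, key=lambda name: squared_distance(query, prototypes[name]))


# Hypothetical 3-shot support sets with toy 2-D embeddings (assumed values).
support_sets = {
    "cat": [[1.0, 0.0], [0.8, 0.2], [1.2, -0.2]],
    "dog": [[0.0, 1.0], [0.2, 0.8], [-0.2, 1.2]],
}

prototypes = {name: prototype(shots) for name, shots in support_sets.items()}
label = nearest_class([0.9, 0.1], prototypes)
print(label)
```

Because each class is summarized by a single averaged vector, swapping one of the K shots moves the prototype and can change the prediction, which is the kind of instability the abstract attributes to point estimation.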
  7. Free, publicly-accessible full text available May 13, 2025
  8. The proliferation of Artificial Intelligence (AI) models such as Generative Adversarial Networks (GANs) has shown impressive success in image synthesis. GAN-synthesized images have spread widely over the Internet as generators have become capable of producing naturalistic and photo-realistic images. This may improve content and media; however, it also constitutes a threat with regard to legitimacy, authenticity, and security. Moreover, an automated system that is able to detect and recognize GAN-generated images is valuable as an evaluation tool for image synthesis models, regardless of the input modality. To this end, we propose a framework for reliably distinguishing AI-generated images from real ones using Convolutional Neural Networks (CNNs). First, GAN-generated images were collected from different tasks and different architectures to help with generalization. Then, transfer learning was applied. Finally, several Class Activation Maps (CAM) were integrated to determine the discriminative regions that guided the classification model in its decision. Our approach achieved 100% accuracy on our dataset, Real or Synthetic Images (RSI), and superior accuracy on other datasets and configurations. Hence, it can be used as an evaluation tool in image generation. Our best detector was a pre-trained EfficientNetB4 fine-tuned on our dataset with a batch size of 64 and an initial learning rate of 0.001 for 20 epochs. Adam was used as the optimizer, and learning rate reduction along with data augmentation was incorporated.
  9. Fusobacterium nucleatum, a gram-negative oral bacterium, has been consistently validated as a strong contributor to the progression of several types of cancer, including colorectal (CRC) and pancreatic cancer. While previous in vitro studies have shown that intracellular F. nucleatum enhances malignant phenotypes such as cell migration, the dependence of this regulation on features of the tumor microenvironment (TME) such as oxygen levels is wholly uncharacterized. Here we examine the influence of hypoxia in facilitating F. nucleatum invasion and its effects on host responses, focusing on changes in the global epigenome and transcriptome. Using a multiomic approach, we analyze epigenomic alterations of H3K27ac and global transcriptomic alterations sustained within a hypoxia- and normoxia-conditioned CRC cell line, HCT116, at 24 h following initial infection with F. nucleatum. Our findings reveal that intracellular F. nucleatum activates signaling pathways and biological processes in host cells similar to those induced upon hypoxia conditioning in the absence of infection. Furthermore, we show that a hypoxic TME favors F. nucleatum invasion and persistence, and therefore infection under hypoxia may amplify malignant transformation by exacerbating the effects induced by hypoxia alone. These results motivate future studies to investigate host-microbe interactions in tumor-tissue-relevant conditions that more accurately define parameters for targeted cancer therapies.
  10. Anomaly analysis is an important component of any surveillance system. In recent years, it has drawn the attention of the computer vision and machine learning communities. In this article, our overarching goal is thus to provide a coherent and systematic review of state-of-the-art techniques and a comprehensive review of the research works in anomaly analysis. We will provide a broad vision of computational models, datasets, metrics, extensive experiments, and what anomaly analysis can do in images and videos. Intensively covering nearly 200 publications, we review (i) anomaly related surveys, (ii) taxonomy for anomaly problems, (iii) the computational models, (iv) the benchmark datasets for studying abnormalities in images and videos, and (v) the performance of state-of-the-art methods in this research problem. In addition, we provide insightful discussions and pave the way for future work. 