NSF PAR Search | NSF Public Access Repository


COCO-Search18 fixation dataset for predicting goal-directed attention control

https://doi.org/10.1038/s41598-021-87715-9

Chen, Yupei; Yang, Zhibo; Ahn, Seoyoung; Samaras, Dimitris; Hoai, Minh; Zelinsky, Gregory (April 2021, Scientific Reports)

Attention control is a basic behavioral process that has been studied for decades. The currently best models of attention control are deep networks trained on free-viewing behavior to predict bottom-up attention control – saliency. We introduce COCO-Search18, the first dataset of laboratory-quality goal-directed behavior large enough to train deep-network models. We collected eye-movement behavior from 10 people searching for each of 18 target-object categories in 6202 natural-scene images, yielding ~300,000 search fixations. We thoroughly characterize COCO-Search18, and benchmark it using three machine-learning methods: a ResNet50 object detector, a ResNet50 trained on fixation-density maps, and an inverse-reinforcement-learning model trained on behavioral search scanpaths. Models were also trained/tested on images transformed to approximate a foveated retina, a fundamental biological constraint. These models, each having a different reliance on behavioral training, collectively comprise the new state-of-the-art in predicting goal-directed search fixations. Our expectation is that future work using COCO-Search18 will far surpass these initial efforts, finding applications in domains ranging from human-computer interactive systems that can anticipate a person’s intent and render assistance to the potentially early identification of attention-related clinical disorders (ADHD, PTSD, phobia) based on deviation from neurotypical fixation behavior.
Unifying Top-Down and Bottom-Up Scanpath Prediction Using Transformers

https://doi.org/10.1109/CVPR52733.2024.00166

Yang, Zhibo; Mondal, Sounak; Ahn, Seoyoung; Xue, Ruoyu; Zelinsky, Gregory; Hoai, Minh; Samaras, Dimitris (June 2024, IEEE)

Full Text Available
Patch-level Gaze Distribution Prediction for Gaze Following

https://doi.org/10.1109/WACV56688.2023.00094

Miao, Qiaomu; Hoai, Minh; Samaras, Dimitris (January 2023, IEEE Winter Conference on Applications of Computer Vision)

Full Text Available
Gazeformer: Scalable, Effective and Fast Prediction of Goal-Directed Human Attention

Mondal, Sounak; Yang, Zhibo; Ahn, Seoyoung; Samaras, Dimitris; Zelinsky, Gregory; Hoai, Minh (January 2023, IEEE Conference on Computer Vision and Pattern Recognition)

Full Text Available
Characterizing Target-absent Human Attention

Chen, Yupei; Yang, Zhibo; Chakraborty, Souradeep; Mondal, Sounak; Ahn, Seoyoung; Samaras, Dimitris; Hoai, Minh; Zelinsky, Gregory (June 2022, Proceedings of CVPR International Workshop on Gaze Estimation and Prediction in the Wild)

Human efficiency in finding a target in an image has attracted the attention of machine learning researchers, but what about when no target is there? Knowing how people search in the absence of a target, and when they stop, is important for Human-computer-interaction systems attempting to predict human gaze behavior in the wild. Here we report a rigorous evaluation of target-absent search behavior using the COCO-Search18 dataset to train state-of-the-art models. We focus on two specific aims. First, we characterize the presence of a target guidance signal in target-absent search behavior by comparing it to target-present guidance and free viewing. We do this by comparing how well a model trained on one type of fixation behavior (target-present, target-absent, free viewing) can predict behavior in either the same or different task. To compare target-absent search to free viewing behavior we created COCO-FreeView, a dataset of free-viewing fixations for the same images used in COCO-Search18. These comparisons revealed the existence of a target guidance signal in target-absent search, albeit one much less dominant compared to when a target actually appeared in an image, and that the target-absent guidance signal was similar to free viewing in that saliency and center bias were both weighted more than guidance from target features. Our second aim focused on the stopping criteria, a question intrinsic to target-absent search. Here we propose to train a foveated target detector whose target detection representation is sensitive to the relationship between distance from the fovea. Then combining the predicted target detection representation with other information such as fixation history and subject ID, our model outperforms the baselines in predicting when a person stops moving his attention during target-absent search.
Full Text Available
Characterizing Target-absent Human Attention

https://doi.org/10.1109/CVPRW56347.2022.00551

Chen, Yupei; Yang, Zhibo; Chakraborty, Souradeep; Mondal, Sounak; Ahn, Seoyoung; Samaras, Dimitris; Hoai, Minh; Zelinsky, Gregory (June 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW))

Full Text Available
Sequence-to-Segments Networks for Detecting Segments in Videos

https://doi.org/10.1109/TPAMI.2019.2940225

Wei, Zijun; Wang, Boyu; Hoai, Minh; Zhang, Jianming; Shen, Xiaohui; Lin, Zhe; Mech, Radomir; Samaras, Dimitris (March 2021, IEEE Transactions on Pattern Analysis and Machine Intelligence)
Full Text Available
Learning Visual Emotion Representations From Web Data

https://doi.org/10.1109/CVPR42600.2020.01312

Wei, Zijun; Zhang, Jianming; Lin, Zhe; Lee, Joon-Young; Balasubramanian, Niranjan; Hoai, Minh; Samaras, Dimitris (June 2020, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR))
Full Text Available
Predicting Goal-directed Human Attention Using Inverse Reinforcement Learning

Yang, Zhibo; Huang, Lihan; Chen, Yupei; Wei, Zijun; Ahn, S.; Samaras, Dimitris; Hoai, Minh (June 2020, IEEE Conference on Computer Vision and Pattern Recognition (CVPR))

Human gaze behavior prediction is important for behavioral vision and for computer vision applications. Most models mainly focus on predicting free-viewing behavior using saliency maps, but do not generalize to goal-directed behavior, such as when a person searches for a visual target object. We propose the first inverse reinforcement learning (IRL) model to learn the internal reward function and policy used by humans during visual search. We modeled the viewer’s internal belief states as dynamic contextual belief maps of object locations. These maps were learned and then used to predict behavioral scanpaths for multiple target categories. To train and evaluate our IRL model we created COCO-Search18, which is now the largest dataset of high-quality search fixations in existence. COCO-Search18 has 10 participants searching for each of 18 target-object categories in 6202 images, making about 300,000 goal-directed fixations. When trained and evaluated on COCO-Search18, the IRL model outperformed baseline models in predicting search fixation scanpaths, both in terms of similarity to human search behavior and search efficiency. Finally, reward maps recovered by the IRL model reveal distinctive target-dependent patterns of object prioritization, which we interpret as a learned object context.
Full Text Available
Predicting Goal-directed Attention Control Using Inverse-Reinforcement Learning

https://doi.org/10.51628/001c.22322

Zelinsky, Gregory J.; Ahn, Seoyoung; Chen, Yupei; Yang, Zhibo; Adeli, Hossein; Huang, Lihan; Samaras, Dimitrios; Hoai, Minh (January 2020, Neurons, Behavior, Data analysis, and Theory)
Full Text Available
