NSF PAR Search | NSF Public Access Repository


Online and Offline Evaluation in Search Clarification

https://doi.org/10.1145/3681786

Tavakoli, Leila; Trippas, Johanne R; Zamani, Hamed; Scholer, Falk; Sanderson, Mark (January 2025, ACM Transactions on Information Systems)

The effectiveness of clarification question models in engaging users within search systems is currently constrained, casting doubt on their overall usefulness. To improve the performance of these models, it is crucial to employ assessment approaches that encompass both real-time feedback from users (online evaluation) and the characteristics of clarification questions evaluated through human assessment (offline evaluation). However, the relationship between online and offline evaluations has been debated in information retrieval. This study aims to investigate how this discordance holds in search clarification. We use user engagement as ground truth and employ several offline labels to investigate to what extent the offline ranked lists of clarification resemble the ideal ranked lists based on online user engagement. Contrary to the current understanding that offline evaluations fall short of supporting online evaluations, we indicate that when identifying the most engaging clarification questions from the user’s perspective, online and offline evaluations correspond with each other. We show that the query length does not influence the relationship between online and offline evaluations, and reducing uncertainty in online evaluation strengthens this relationship. We illustrate that an engaging clarification needs to excel from multiple perspectives, and SERP quality and characteristics of the clarification are equally important. We also investigate if human labels can enhance the performance of Large Language Models (LLMs) and Learning-to-Rank (LTR) models in identifying the most engaging clarification questions from the user’s perspective by incorporating offline evaluations as input features. Our results indicate that LTR models do not perform better than individual offline labels. However, GPT, an LLM, emerges as the standout performer, surpassing all LTR models and offline labels.
Free, publicly-accessible full text available January 31, 2026
Understanding Modality Preferences in Search Clarification

Tavakoli, L; Castiglia, G; Calo, F; Deldjoo, Y; Zamani, H; Trippas, J (October 2024, Online Proceedings of the 1st Workshop on Multimodal Search and Recommendations (CIKM MMSR '24))

Full Text Available
Interactions with Generative Information Retrieval Systems

https://doi.org/10.1007/978-3-031-73147-1_3

Aliannejadi, Mohammad; Gwizdka, Jacek; Zamani, Hamed (September 2024, Springer Nature Switzerland)
Shah, Chirag; White, Ryen (Eds.)
Full Text Available
LaMP: When Large Language Models Meet Personalization

Salemi, A; Mysore, S; Bendersky, M; Zamani, H (August 2024, Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024))

Full Text Available
Towards a Search Engine for Machines: Unified Ranking for Multiple Retrieval-Augmented Large Language Models

https://doi.org/10.1145/3626772.3657733

Salemi, Alireza; Zamani, Hamed (July 2024, Proceedings of The 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2024))

Full Text Available
Third Workshop on Personalization and Recommendations in Search (PaRiS)

https://doi.org/10.1145/3626772.3657983

Lamkhede, Sudarshan; Zamani, Hamed; Bhattacharya, Moumita; Wang, Hongning (July 2024, Proceedings of The 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '24))

Full Text Available
ProCIS: A Benchmark for Proactive Retrieval in Conversations

https://doi.org/10.1145/3626772.3657869

Samarinas, Chris; Zamani, Hamed (July 2024, Proceedings of The 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '24))

Full Text Available
Stochastic RAG: End-to-End Retrieval-Augmented Generation through Expected Utility Maximization

https://doi.org/10.1145/3626772.3657923

Zamani, Hamed; Bendersky, Michael (July 2024, Proceedings of The 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '24))

Full Text Available
Multi-Modal Augmentation for Large Language Models with Applications to Task-Oriented Dialogues

Samarinas, Chris; Promthaw, Pracha; Lekhwani, Rahul; Mysore, Sheshera; Huang, Sung Ming; Nijasure, Atharva; Zeng, Hansi; Zamani, Hamed (October 2023, 2nd Proceedings of Alexa Prize TaskBot (Alexa Prize 2023))

We introduce MarunaBot V2, an advanced Task-Oriented Dialogue System (TODS) primarily aimed at aiding users in cooking and Do-It-Yourself tasks. We utilized large language models (LLMs) for data generation and inference, and implemented hybrid methods for intent classification, retrieval, and question answering, striking a balance between efficiency and performance. A key feature of our system is its multi-modal capabilities. We have incorporated a multi-modal enrichment technique that uses a fine-tuned CLIP model to supplement recipe instructions with pertinent images, a custom Diffusion model for image enhancement and generation, and a method for multi-modal option matching. A unique aspect of our system is its user-centric development approach, facilitated by a custom tool for tracking user interactions and swiftly integrating feedback. For a demonstration of our system, visit https://youtu.be/4MNI-puv_eE.
Full Text Available
