Title: Patterns of Gender-Specializing Query Reformulation
Users of search systems often reformulate their queries by adding query terms, either to reflect an evolving information need or to express that need more precisely when the system fails to surface relevant content. Analyzing these query reformulations can inform us about both system and user behavior. In this work, we study a special category of query reformulations that involve specifying demographic group attributes, such as gender, as part of the reformulated query (e.g., “olympic 2021 soccer results” → “olympic 2021 women’s soccer results”). There are many ways a query, the search results, and a demographic attribute such as gender may relate, leading us to hypothesize different causes for these reformulation patterns, such as under-representation on the original result page or markedness in the linguistic sense. This paper reports on an observational study of gender-specializing query reformulations, including their contexts and effects, as a lens on the relationship between system results and gender, based on large-scale search log data from Bing. We find that these reformulations sometimes correct for and other times reinforce gender representation on the original result page, but typically yield better access to the ultimately selected results. Both the prevalence of these reformulations and the gender they skew towards differ by topical context. However, we do not find evidence that either group under-representation or markedness alone adequately explains these reformulations. We hope that future research will use such reformulations as a probe for deeper investigation into gender (and other demographic) representation on the search result page.
Award ID(s):
1751278
NSF-PAR ID:
10423689
Journal Name:
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’23)
Sponsoring Org:
National Science Foundation
More Like this
  1. Searching for the meaning of an unfamiliar sign-language word in a dictionary is difficult for learners, but emerging sign-recognition technology will soon enable users to search by submitting a video of themselves performing the word they recall. However, sign-recognition technology is imperfect, and users may need to search through a long list of possible results when seeking a desired result. To speed this search, we present a hybrid-search approach, in which users begin with a video-based query and then filter the search results by linguistic properties, e.g., handshape. We interviewed 32 ASL learners about their preferences for the content and appearance of the search-results page and filtering criteria. A between-subjects experiment with 20 ASL learners revealed that our hybrid search system outperformed a video-based search system along multiple satisfaction and performance metrics. Our findings provide guidance for designers of video-based sign-language dictionary search systems, with implications for other search scenarios. 
  2. This study assesses the awareness and perceived utility of two features Google Search introduced in February 2021: “About this result” and “More about this page”. Google stated that the goal of these features is to help users vet unfamiliar web domains (or sources). We investigated whether the features were sufficiently prominent to be detected by frequent users of Google Search, and their perceived utility for making credibility judgments of sources, in one-on-one user studies with 25 undergraduate college students, who identify as frequent users of Google Search. Our results indicate a lack of adoption or awareness of these features by our participants and neutral-positive perceptions of their utility in evaluating web sources. We also examined the perceived usefulness of nine other domain credibility signals collected from the W3C. 
  3. Diversity, group representation, and similar needs often apply to query results, which in turn require constraints on the sizes of various subgroups in the result set. Traditional relational queries only specify conditions as part of the query predicate(s), and do not support such restrictions on the output. In this paper, we study the problem of modifying queries to have the result satisfy constraints on the sizes of multiple subgroups in it. This problem, in the worst case, cannot be solved in polynomial time. Yet, with the help of provenance annotation, we are able to develop a query refinement method that works quite efficiently, as we demonstrate through extensive experiments.

     
  4. We propose a multi-task learning framework to jointly learn document ranking and query suggestion for web search. It consists of two major components: a document ranker and a query recommender. The document ranker combines the current query with session information and compares the combined representation with the document representation to rank the documents. The query recommender tracks users’ query reformulation sequence, considering all previous in-session queries, using a sequence-to-sequence approach. As both tasks are driven by the users’ underlying search intent, we perform joint learning of these two components through session recurrence, which encodes search context and intent. Extensive comparisons against state-of-the-art document ranking and query suggestion algorithms are performed on the public AOL search log, and the promising results endorse the effectiveness of the joint learning framework.