This content will become publicly available on March 24, 2026

Title: Contextualizing the Role of Web Search In Creative Workflows: Insights from a Longitudinal Study
While creativity is often romanticized as a serendipitous 'aha' moment of insight, in reality it is an iterative process that often involves searching for information on the web. In this paper, we investigate the role of web search throughout the creative process. We conducted a longitudinal study involving 15 professionals engaged in creative work, such as scientific research, startup product design, and policy development, observing them throughout their one- to six-month-long projects. We developed Web ChronoLogger, a browser extension that logs web search and project document activity over the course of the project in an intuitive, transparent, and privacy-preserving manner. Additionally, we collected qualitative insights from participants reflecting on their logs through weekly surveys and a post-study interview. We find quantitative patterns in how participants search the web and work with information in working documents throughout their creative projects. Web search was used even when generating ideas and defining goals, stages often assumed to involve just mental processes. Further, patterns in the content, structure, and edit history of how participants work with information found on the web can encode signals about the user’s context, such as patterns and gaps in their knowledge, project goals and progress, and work style. This study’s longitudinal perspective provides a foundation for building the future of web search tools in ways that support the entire creative workflow.
Award ID(s):
2009003
PAR ID:
10654075
Author(s) / Creator(s):
 ;  
Publisher / Repository:
ACM
Date Published:
Page Range / eLocation ID:
208 to 218
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Given the importance of broadening participation in the field of computing, goals of supporting personal expression and developing a sense of belonging must live alongside the goals of conceptual knowledge and developing disciplinary expertise. Integrating opportunities for students to be creative in how they enact computing ideas plays an important role when designing curricula. We examine how student creativity, as expressed through theme and the use of costumes, backdrops, and narrative in Scratch projects, is affected by using a themed starter project. Starter projects are Scratch projects that include a set of sprites and backdrops aligned to a theme (e.g. baseball), but no code. Using within-group and between-group comparisons, we establish a baseline of what students do when they are given a starter project and explore how their projects differ in the absence of a starter project. This work contributes to our understanding of the impacts of structured elements within open-ended learning tasks and how we can design computer science learning experiences for students that promote opportunities for self-expression while engaging them in computing.
  2. Regular expressions are frequently found in programming projects. Studies have found that developers can accurately determine whether a string matches a regular expression. However, we still do not know the challenges associated with composing regular expressions. We conduct an exploratory case study to reveal the tools and strategies developers use during regular expression composition. In this study, 29 students were tasked with composing regular expressions that pass unit tests illustrating the intended behavior. The tasks were in Java, and the Eclipse IDE was set up with JUnit tests. Participants had one hour to work and could use any Eclipse tools, web search, or web-based tools they desired. Screen-capture software recorded all interactions with browsers and the IDE. We analyzed the videos quantitatively by transcribing logs and extracting personas. Our results show that participants were 30% successful (28 of 94 attempts) at achieving a 100% pass rate on the unit tests. When participants used tools frequently, as in the case of the novice tester and the knowledgeable tester personas, or when they guessed at a solution prior to searching, they were more likely to pass all the unit tests. We also found that compile errors often arose when participants searched for a result and copy/pasted the regular expression from another language into their Java files. These results point to future research into making regular expression composition easier for programmers, such as integrating visualization into the IDE to reduce context switching or providing language migration support when reusing regular expressions written in another language to reduce compile errors.
  3. Global teams frequently consist of language-based subgroups who put together complementary information to achieve common goals. Previous research outlines a two-step work communication flow in these teams. There are team meetings using a required common language (i.e., English); in preparation for those meetings, people have subgroup conversations in their native languages. Work communication at team meetings is often less effective than in subgroup conversations. In the current study, we investigate the idea of leveraging machine translation (MT) to facilitate global team meetings. We hypothesize that exchanging subgroup conversation logs before a team meeting offers contextual information that benefits teamwork at the meeting. MT can translate these logs, which enables comprehension at a low cost. To test our hypothesis, we conducted a between-subjects experiment where twenty quartets of participants performed a personnel selection task. Each quartet included two English native speakers (NS) and two non-native speakers (NNS) whose native language was Mandarin. All participants began the task with subgroup conversations in their native languages, then proceeded to team meetings in English. We manipulated the exchange of subgroup conversation logs prior to team meetings: with MT-mediated exchanges versus without. Analysis of participants' subjective experience, task performance, and depth of discussions as reflected through their conversational moves jointly indicates that team meeting quality improved when there were MT-mediated exchanges of subgroup conversation logs as opposed to no exchanges. We conclude with reflections on when and how MT could be applied to enhance global teamwork across a language barrier. 
  4. The volume, variety, and velocity of different data, e.g., simulation data, observation data, and social media data, are growing ever faster, posing grand challenges for data discovery. An increasing trend in data discovery is to mine hidden relationships among users and metadata from the web usage logs to support the data discovery process. Web usage log mining is the process of reconstructing sessions from raw logs and finding interesting patterns or implicit linkages. The mining results play an important role in improving the quality of search-related components, e.g., ranking, query suggestion, and recommendation. Although research has been done in the data discovery domain, collecting and analyzing logs efficiently remains a challenge because (1) the volume of web usage logs continues to grow as long as users access the data; (2) the dynamic volume of logs requires on-demand computing resources for mining tasks; and (3) the mining process is compute-intensive and time-intensive. To speed up the mining process, we propose a cloud-based log-mining framework using Apache Spark and Elasticsearch. In addition, a data partition paradigm, logPartitioner, is designed to solve the data imbalance problem in data parallelism. As a proof of concept, oceanographic data search and access logs are chosen to validate the performance of the proposed parallel log-mining framework.
  5. How do Google Search results change following an impactful real-world event, such as the U.S. Supreme Court decision on June 24, 2022 to overturn Roe v. Wade? And what do they tell us about the nature of event-driven content, generated by various participants in the online information environment? In this paper, we present a dataset of more than 1.74 million Google Search results pages collected between June 24 and July 17, 2022, intended to capture what Google Search surfaced in response to queries about this event of national importance. These search pages were collected for 65 locations in 13 U.S. states, a mix of red, blue, and purple states, with respect to their voting patterns. We describe the process of building a set of circa 1,700 phrases used for searching Google, how we gathered the search results for each location, and how these results were parsed to extract information about the most frequently encountered web domains. We believe that this dataset, which comprises raw data (search results as HTML files) and processed data (extracted links organized as CSV files) can be used to answer research questions that are of interest to computational social scientists as well as communication and media studies scholars. 