Title: Supercalifragilisticexpialidocious: Why Using the “Right” Readability Formula in Children’s Web Search Matters
Readability is a core component of information retrieval (IR) tools, as the complexity of a resource directly affects its relevance: a resource is only of use if the user can comprehend it. Even so, the link between readability and IR is often overlooked. As a step towards advancing knowledge on the influence of readability on IR, we focus on Web search for children. We explore how traditional formulas, which are simple, efficient, and portable, fare when applied to estimating the readability of Web resources for children written in English. We then present a formula well-suited for readability estimation of child-friendly Web resources. Lastly, we empirically show that readability can sway children’s information access. Outcomes from this work reveal that: (i) for Web resources targeting children, a simple formula suffices as long as it considers contemporary terminology and audience requirements, and (ii) instead of turning to Flesch-Kincaid, a popular formula, the use of the “right” formula can shape Web search tools to best serve children. The work we present herein builds on three pillars: Audience, Application, and Expertise. It serves as a blueprint to place readability estimation methods that best apply to and inform IR applications serving varied audiences.
Award ID(s):
1763649
NSF-PAR ID:
10337077
Author(s) / Creator(s):
Editor(s):
Hagen, Matthias and
Date Published:
Journal Name:
44th European Conference on Information Retrieval (ECIR)
Page Range / eLocation ID:
3-18
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
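For readers unfamiliar with the traditional formulas the abstract refers to, the following is a minimal sketch of the standard Flesch-Kincaid Grade Level computation, which combines average sentence length with average syllables per word. It is not the formula proposed in the paper, which this record does not describe. The function names and the vowel-group syllable counter are assumptions made for illustration; published implementations typically rely on pronunciation dictionaries or more careful rules.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate (an assumption for this sketch):
    count groups of consecutive vowels and discount a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Words such as "make" end in a silent 'e'; "simple" and "tree" do not.
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59."""
    # Naive sentence and word splitting; adequate for short, clean English text.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


if __name__ == "__main__":
    easy = "The cat sat on the mat."
    hard = "Readability estimation influences information retrieval applications."
    print(round(flesch_kincaid_grade(easy), 2))
    print(round(flesch_kincaid_grade(hard), 2))
```

Because the score depends only on word, sentence, and syllable counts, it is cheap to compute but blind to vocabulary familiarity, which is consistent with the abstract's point that a formula should account for contemporary terminology and audience requirements.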
More Like this
  1. Given the more widespread nature of natural language interfaces, it is increasingly important to understand who is accessing those interfaces and how those interfaces are being used. In this paper, we explore spellchecking in the context of web search with children as the target audience. In particular, via a literature review we show that, while widely used, popular search tools are ill-designed for children. We then use spellcheckers as a case study to highlight the need for an interdisciplinary approach that brings together natural language processing, education, and human-computer interaction to address a known information retrieval problem: query misspelling. We conclude that it is imperative that those for whom the interfaces are designed have a voice in the design process.
  2. We present a search for extremely red, dust-obscured, z > 7 galaxies with JWST/NIRCam+MIRI imaging over the first 20 arcmin² of publicly available Cycle 1 data from the COSMOS-Web, CEERS, and PRIMER surveys. Based on their red color in F277W−F444W (∼2.5 mag) and detection in MIRI/F770W (∼25 mag), we identify two galaxies, COS-z8M1 and CEERS-z7M1, that have best-fit photometric redshifts of z = 8.4 (+0.3/−0.4) and 7.6 (+0.1/−0.1), respectively. We perform spectral energy distribution fitting with a variety of codes (including bagpipes, prospector, beagle, and cigale) and find a >95% probability that these indeed lie at z > 7. Both sources are compact (R_eff ≲ 200 pc) and highly obscured (A_V ∼ 1.5–2.5) and, at our best-fit redshift estimates, likely have strong [O III]+Hβ emission contributing to their 4.4 μm photometry. We estimate stellar masses of ∼10¹⁰ M⊙ for both sources; by virtue of detection in MIRI at 7.7 μm, these measurements are robust to the inclusion of bright emission lines, for example, from an active galactic nucleus. We identify a marginal (2.9σ) Atacama Large Millimeter/submillimeter Array detection at 2 mm within 0.5″ of COS-z8M1, which, if real, would suggest a remarkably high IR luminosity of ∼10¹² L⊙. These two galaxies, if confirmed at z ∼ 8, would be extreme in their stellar and dust masses and may be representative of a substantial population of highly dust-obscured galaxies at cosmic dawn.
  3. Children use popular web search tools, which are generally designed for adult users. Because children have different developmental needs than adults, these tools may not always adequately support their search for information. Moreover, even though search tools offer support to help in query formulation, these too are aimed at adults and may hinder children rather than help them. This calls for the examination of existing technologies in this area, to better understand what remains to be done when it comes to facilitating query-formulation tasks for young users. In this paper, we investigate interaction elements of query formulation--including query suggestion algorithms--for children. The primary goals of our research efforts are to: (i) examine existing plug-ins and interfaces that explicitly aid children's query formulation; (ii) investigate children's interactions with suggestions offered by a general-purpose query suggestion strategy vs. a counterpart designed with children in mind; and (iii) identify, via participatory design sessions, their preferences when it comes to tools / strategies that can help children find information and guide them through the query formulation process. Our analysis shows that existing tools do not meet children's needs and expectations; the outcomes of our work can guide researchers and developers as they implement query formulation strategies for children. 
  4. One of the top priorities in observational astronomy is the direct imaging and characterization of extrasolar planets (exoplanets) and planetary systems. Direct images of rocky exoplanets are of particular interest in the search for life beyond the Earth, but they tend to be rather challenging targets since they are orders of magnitude dimmer than their host stars and are separated by small angular distances that are comparable to the classical λ/D diffraction limit, even for the coming generation of 30 m class telescopes. Current and planned efforts for ground-based direct imaging of exoplanets combine high-order adaptive optics (AO) with a stellar coronagraph observing at wavelengths ranging from the visible to the mid-IR. The primary barrier to achieving high contrast with current direct imaging methods is quasi-static speckles, caused largely by non-common path aberrations (NCPAs) in the coronagraph optical train. Recent work has demonstrated that millisecond imaging, which effectively “freezes” the atmosphere’s turbulent phase screens, should allow the wavefront sensor (WFS) telemetry to be used as a probe of the optical system to measure NCPAs. Starting with a realistic model of a telescope with an AO system and a stellar coronagraph, this paper provides simulations of several closely related regression models that take advantage of millisecond telemetry from the WFS and coronagraph’s science camera. The simplest regression model, called the naïve estimator, does not treat the noise and other sources of information loss in the WFS. Despite its flaws, in one of the simulations presented herein, the naïve estimator provides a useful estimate of an NCPA of ∼0.5 radian RMS (≈ λ/13), with an accuracy of ∼0.06 radian RMS in 1 min of simulated sky time on a magnitude 8 star. The bias-corrected estimator generalizes the regression model to account for the noise and information loss in the WFS.
A simulation of the bias-corrected estimator with 4 min of sky time included an NCPA of ∼0.05 radian RMS (≈ λ/130) and an extended exoplanet scene. The joint regression of the bias-corrected estimator simultaneously achieved an NCPA estimate with an accuracy of ∼5 × 10⁻³ radian RMS and an estimate of the exoplanet scene that was free of the self-subtraction artifacts typically associated with differential imaging. The 5σ contrast achieved by imaging of the exoplanet scene was ∼1.7 × 10⁻⁴ at a distance of 3λ/D from the star and ∼2.1 × 10⁻⁵ at 10λ/D. These contrast values are comparable to the very best on-sky results obtained from multi-wavelength observations that employ both angular differential imaging (ADI) and spectral differential imaging (SDI). This comparable performance is achieved even though our simulations are quasi-monochromatic, which makes SDI impossible, and have no diurnal field rotation, which makes ADI impossible. The error covariance matrix of the joint regression shows substantial correlations in the exoplanet and NCPA estimation errors, indicating that exoplanet intensity and NCPA need to be estimated self-consistently to achieve high contrast.
  5. We consider the problem of dividing limited resources among individuals arriving over T rounds. A random number of individuals arrive in each round, and individuals can be characterized by their type (i.e., preferences over the different resources). A standard notion of fairness in this setting is that an allocation simultaneously satisfy envy-freeness and efficiency. The former is an individual guarantee, requiring that each agent prefers the agent’s own allocation over the allocation of any other; in contrast, efficiency is a global property, requiring that the allocations clear the available resources. For divisible resources, when the number of individuals of each type is known up front, the desiderata are simultaneously achievable for a large class of utility functions. However, in an online setting where the number of individuals of each type is only revealed round by round, no policy can guarantee these desiderata simultaneously, and hence, the best one can do is to allocate so as to approximately satisfy the two properties. We show that, in the online setting, the two desired properties (envy-freeness and efficiency) are in direct contention in that any algorithm achieving additive counterfactual envy-freeness up to a factor of L_T necessarily suffers an efficiency loss of at least [Formula: see text]. We complement this uncertainty principle with a simple algorithm, Guarded-Hope, which allocates resources based on an adaptive threshold policy and is able to achieve any fairness–efficiency point on this frontier. Our results provide guarantees for fair online resource allocation with high probability for multiple resource and multiple type settings. In simulation results, our algorithm provides allocations close to the optimal fair solution in hindsight, motivating its use in practical applications as the algorithm is able to adapt to any desired fairness–efficiency trade-off.
Funding: This work was supported by the National Science Foundation [Grants ECCS-1847393, DMS-1839346, CCF-1948256, and CNS-1955997] and the Army Research Laboratory [Grant W911NF-17-1-0094]. Supplemental Material: The online appendix is available at https://doi.org/10.1287/opre.2022.2397.