skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: A Dataset of Noisy Typing on QWERTY Keyboards
Text entry is a common and important part of many intelligent user interfaces. However, inferring a user’s intended text from their input can be challenging: motor actions can be imprecise, input sensors can be noisy, and situations or disabilities can hamper a user’s perception of interface feedback. Numerous prior studies have explored input on touchscreen phones, smartwatches, in midair, and on desktop keyboards. Based on these prior studies, we are releasing a large and diverse data set of noisy typing input consisting of thousands of sentences written by hundreds of users on QWERTY-layout keyboards. This paper describes the various subsets contained in this new research dataset as well as the data format.  more » « less
Award ID(s):
1909248 1750193
PAR ID:
10404000
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Companion Proceedings of the 28th International Conference on Intelligent User Interfaces
Page Range / eLocation ID:
251 to 254
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Bottoni, Paolo; Panizzi, Emanuele (Ed.)
    Many questions regarding single-hand text entry on modern smartphones (in particular, large-screen smartphones) remain under-explored, such as, (i) will the existing prevailing single-handed keyboards fit for large-screen smartphone users? and (ii) will individual customization improve single-handed keyboard performance? In this paper we study single-handed typing behaviors on several representative keyboards on large-screen mobile devices.We found that, (i) the user-adaptable-shape curved keyboard performs best among all the studied keyboards; (ii) users’ familiarity with the Qwerty layout plays a significant role at the beginning, but after several sessions of training, the user-adaptable curved keyboard can have the best learning curve and performs best; (iii) generally the statistical decoding algorithms via spatial and language models can well handle the input noise from single-handed typing. 
    more » « less
  2. null (Ed.)
    Accessible onscreen keyboards require people who are blind to keep out their phone at all times to search for visual affordances they cannot see. Is it possible to re-imagine text entry without a reference screen? To explore this question, we introduce screenless keyboards as aural flows (keyflows): rapid auditory streams of Text-To-Speech (TTS) characters controllable by hand gestures. In a study, 20 screen-reader users experienced keyflows to perform initial text entry. Typing took inordinately longer than current screen-based keyboards, but most participants preferred screen-free text entry to current methods, especially for short messages on-the-go. We model navigation strategies that participants enacted to aurally browse entirely auditory keyboards and discuss their limitation and benefits for daily access. Our work points to trade-offs in user performance and user experience for situations when blind users may trade typing speed with the benefit of being untethered from the screen. 
    more » « less
  3. null (Ed.)
    Typing every character in a text message may require more time or effort than strictly necessary. Skipping spaces or other characters may be able to speed input and also reduce a user's physical input effort. This can be particularly important for people with motor impairments. In a large crowdsourced study, we found workers frequently abbreviated text by omitting mid-word vowels. We designed a recognizer optimized for noisy input where users often omit spaces and mid-word vowels. We show using neural language models for selecting training text and rescoring sentences improved accuracy. On noisy touchscreen data collected from hundreds of users, we found accurate abbreviated input was possible even if a third of characters were omitted. Finally, in a study where users had to dwell for a second on each key, sentence abbreviated input was competitive with a conventional keyboard with word predictions. After practice, users wrote abbreviated sentences at 9.6 words-per-minute versus word input at 9.9 words-per-minute. 
    more » « less
  4. Keystroke dynamics study the way in which users input text via their keyboards, which is unique to each individual, and can form a component of a behavioral biometric system to improve existing account security. Keystroke dynamics systems on free-text data use n-graphs that measure the timing between consecutive keystrokes to distinguish between users. Many algorithms require 500, 1,000, or more keystrokes to achieve EERs of below 10%. In this paper, we propose an instance-based graph comparison algorithm to reduce the number of keystrokes required to authenticate users. Commonly used features such as monographs and digraphs are investigated. Feature importance is determined and used to construct a fused classifier. Detection error tradeoff (DET) curves are produced with different numbers of keystrokes. The fused classifier outperforms the state-of-the-art with EERs of 7.9%, 5.7%, 3.4%, and 2.7% for test samples of 50, 100, 200, and 500 keystrokes. 
    more » « less
  5. Abstract We present a method for mining the web for text entered on mobile devices. Using searching, crawling, and parsing techniques, we locate text that can be reliably identified as originating from 300 mobile devices. This includes 341,000 sentences written on iPhones alone. Our data enables a richer understanding of how users type “in the wild” on their mobile devices. We compare text and error characteristics of different device types, such as touchscreen phones, phones with physical keyboards, and tablet computers. Using our mined data, we train language models and evaluate these models on mobile test data. A mixture model trained on our mined data, Twitter, blog, and forum data predicts mobile text better than baseline models. Using phone and smartwatch typing data from 135 users, we demonstrate our models improve the recognition accuracy and word predictions of a state-of-the-art touchscreen virtual keyboard decoder. Finally, we make our language models and mined dataset available to other researchers. 
    more » « less