In many situations, it may be impractical or impossible to enter text
by selecting precise locations on a physical or touchscreen keyboard.
We present an ambiguous keyboard with four character groups that
has potential applications for eyes-free text entry, as well as text entry
using a single switch or a brain-computer interface. We develop
a procedure for optimizing these character groupings based on a disambiguation
algorithm that leverages a long-span language model.
We produce both alphabetically-constrained and unconstrained
character groups in an offline optimization experiment and compare
them in a longitudinal user study. Our results did not show a
significant difference between the constrained and unconstrained
character groups after four hours of practice. As expected, participants
had significantly more errors with the unconstrained groups
in the first session, suggesting a higher barrier to learning the
technique. We therefore recommend the alphabetically-constrained
character groups, where participants were able to achieve an average
entry rate of 12.0 words per minute with a 2.03% character
error rate using a single hand and with no visual feedback.
Enhancing the Composition Task in Text Entry Studies: Eliciting Difficult Text and Improving Error Rate Calculation
Participants in text entry studies usually copy phrases or compose novel messages. A composition task mimics actual user behavior and can allow researchers to better understand how a system might perform in reality. A problem with composition is that participants may gravitate towards writing simple text, that is, text containing only common words. Such simple text is insufficient to explore all factors governing a text entry method, such as its error correction features. We contribute to enhancing composition tasks in two ways. First, we show participants can modulate the difficulty of their compositions based on simple instructions. While it took more time to compose difficult messages, they were longer, had more difficult words, and resulted in more use of error correction features. Second, we compare two methods for obtaining a participant’s intended text, comparing both methods with a previously proposed crowdsourced judging procedure. We found participant-supplied references were more accurate.
- Award ID(s):
- 1909248
- NSF-PAR ID:
- 10237296
- Date Published:
- Journal Name:
- Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems
- Page Range / eLocation ID:
- 1 to 8
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Readers find text difficult to consume for many reasons. Summarization can address some of these difficulties, but introduce others, such as omitting, misrepresenting, or hallucinating information, which can be hard for a reader to notice. One approach to addressing this problem is to instead modify how the original text is rendered to make important information more salient. We introduce Grammar-Preserving Text Saliency Modulation (GP-TSM), a text rendering method with a novel means of identifying what to de-emphasize. Specifically, GP-TSM uses a recursive sentence compression method to identify successive levels of detail beyond the core meaning of a passage, which are de-emphasized by rendering words in successively lighter but still legible gray text. In a lab study (n=18), participants preferred GP-TSM over pre-existing word-level text rendering methods and were able to answer GRE reading comprehension questions more efficiently.
-
Given that depression is one of the most prevalent mental illnesses, developing effective and unobtrusive diagnosis tools is of great importance. Recent work that screens for depression with text messages leverages models relying on lexical category features. Given the colloquial nature of text messages, the performance of these models may be limited by formal lexicons. We thus propose a strategy to automatically construct alternative lexicons that contain more relevant and colloquial terms. Specifically, we generate 36 lexicons from fiction, forum, and news corpora. These lexicons are then used to extract lexical category features from the text messages. We utilize machine learning models to compare the depression screening capabilities of these lexical category features. Out of our 36 constructed lexicons, 14 achieved statistically significantly higher average F1 scores than the pre-existing formal lexicon and basic bag-of-words approach. In comparison to the pre-existing lexicon, our best performing lexicon increased the average F1 scores by 10%. We thus confirm our hypothesis that less formal lexicons can improve the performance of classification models that screen for depression with text messages. By providing our automatically constructed lexicons, we aid future machine learning research that leverages less formal text.
-
Text entry makes up about one-fourth of the smartphone interaction events, and is known to be challenging and difficult. However, there has been little study about the characteristics of text entry in the context of smartphone app usage. In this paper, we present a mixed-method in-situ study conducted in 2016 with 17 active smartphone users to better understand text entry in smartphone app usage. Our results show 80% of text was entered into communication apps, with different apps exhibiting distinct usage patterns. We found that structured data such as URLs and email addresses are rarely typed but instead are auto-completed or replaced with search, copy-and-paste is rarely used, and sessions of smartphone usage with text entry involve more apps and last longer. We conclude with a discussion about the implications on the development of systems to better support mobile interaction.
-
As text generated by large language models proliferates, it becomes vital to understand how humans engage with such text, and whether or not they are able to detect when the text they are reading did not originate with a human writer. Prior work on human detection of generated text focuses on the case where an entire passage is either human-written or machine-generated. In this paper, we study a more realistic setting where text begins as human-written and transitions to being generated by state-of-the-art neural language models. We show that, while annotators often struggle at this task, there is substantial variance in annotator skill and that given proper incentives, annotators can improve at this task over time. Furthermore, we conduct a detailed comparison study and analyze how a variety of variables (model size, decoding strategy, fine-tuning, prompt genre, etc.) affect human detection performance. Finally, we collect error annotations from our participants and use them to show that certain textual genres influence models to make different types of errors and that certain sentence-level features correlate highly with annotator selection. We release the RoFT dataset: a collection of over 21,000 human annotations paired with error classifications to encourage future work in human detection and evaluation of generated text.