This content will become publicly available on July 1, 2026

Title: Typing in tandem: Language planning in multisentence text production is fundamentally parallel.
Classical serial models view the process of producing a text as a chain of discrete pauses during which the next span of text is planned, and bursts of activity during which this text is output onto the page or computer screen. In contrast, parallel models assume that by default planning of the next text unit is performed in parallel with previous execution. We instantiated these two views as Bayesian mixed-effects models across six sets of keystroke data from child and adult writers composing different types of multi-sentence text. We modelled interkey intervals with a single distribution, as hypothesised by the serial-processing account, and with a two-distribution mixture model, as hypothesised by the parallel-processing account. We analysed intervals occurring before sentences, before words, and within words. Model comparisons demonstrated strong evidence in favour of the parallel view across all datasets. When pausing occurred, sentence-initial interkey intervals were longer than word-initial pauses. This is consistent with the idea that edges of larger linguistic units are associated with higher-level planning. However, we found, across populations, that interkey intervals at word and even at sentence boundaries were often too brief to plausibly represent time to plan what was written next. Our results cannot be explained by the serial-processing account but are in line with the parallel view of multi-sentence text composition.
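The contrast between a single distribution and a two-distribution mixture can be illustrated with a much simpler stand-in than the Bayesian mixed-effects models described in the abstract. The sketch below is not the authors' method: it assumes log-normal interkey intervals, fits a single log-normal by maximum likelihood and a two-component log-normal mixture by expectation-maximisation, and compares them with BIC. The simulated data, the function names, and all parameter values (for example 80% short intervals around 150 ms) are hypothetical choices made for illustration only.

```python
import math
import random


def normal_logpdf(x, mu, sigma):
    """Log density of a normal distribution (applied to log intervals)."""
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * ((x - mu) / sigma) ** 2


def log_add(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def simulate_intervals(n_fluent=800, n_pause=200, seed=0):
    """Hypothetical data: mostly short intervals plus a minority of long pauses (ms)."""
    rng = random.Random(seed)
    fluent = [rng.lognormvariate(5.0, 0.3) for _ in range(n_fluent)]
    pauses = [rng.lognormvariate(7.0, 0.5) for _ in range(n_pause)]
    return fluent + pauses


def fit_single(logs):
    """Single-distribution model: closed-form maximum-likelihood fit on the log scale."""
    n = len(logs)
    mu = sum(logs) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in logs) / n)
    loglik = sum(normal_logpdf(x, mu, sigma) for x in logs)
    return {"mu": mu, "sigma": sigma, "loglik": loglik, "k": 2}


def fit_mixture(logs, iters=60):
    """Two-component mixture fitted by EM; 'long' is the pause component."""
    n = len(logs)
    ordered = sorted(logs)
    # Initialisation heuristic (an assumption): start the means at the 10th and 90th percentiles.
    mu_s, mu_l = ordered[n // 10], ordered[(9 * n) // 10]
    sd_s = sd_l = max(fit_single(logs)["sigma"] / 2, 1e-3)
    w = 0.5  # mixing weight of the long component
    for _ in range(iters):
        # E-step: posterior probability that each interval belongs to the long component.
        resp = []
        for x in logs:
            a = math.log(1 - w) + normal_logpdf(x, mu_s, sd_s)
            b = math.log(w) + normal_logpdf(x, mu_l, sd_l)
            resp.append(math.exp(b - log_add(a, b)))
        # M-step: responsibility-weighted means, standard deviations, and weight.
        r_l = sum(resp)
        r_s = n - r_l
        mu_s = sum((1 - r) * x for r, x in zip(resp, logs)) / r_s
        mu_l = sum(r * x for r, x in zip(resp, logs)) / r_l
        sd_s = max(math.sqrt(sum((1 - r) * (x - mu_s) ** 2 for r, x in zip(resp, logs)) / r_s), 1e-3)
        sd_l = max(math.sqrt(sum(r * (x - mu_l) ** 2 for r, x in zip(resp, logs)) / r_l), 1e-3)
        w = min(max(r_l / n, 1e-6), 1 - 1e-6)
    loglik = sum(
        log_add(math.log(1 - w) + normal_logpdf(x, mu_s, sd_s),
                math.log(w) + normal_logpdf(x, mu_l, sd_l))
        for x in logs
    )
    return {"mu_short": mu_s, "mu_long": mu_l, "sd_short": sd_s, "sd_long": sd_l,
            "weight_long": w, "loglik": loglik, "k": 5}


def bic(fit, n):
    """Bayesian information criterion; lower values indicate the preferred model."""
    return fit["k"] * math.log(n) - 2 * fit["loglik"]


log_intervals = [math.log(x) for x in simulate_intervals()]
single_fit = fit_single(log_intervals)
mixture_fit = fit_mixture(log_intervals)
print("BIC single :", round(bic(single_fit, len(log_intervals)), 1))
print("BIC mixture:", round(bic(mixture_fit, len(log_intervals)), 1))
print("Estimated proportion of long intervals:", round(mixture_fit["weight_long"], 3))
```

Both models are fitted to the same log-transformed intervals, so their BIC values are directly comparable. In this toy setting the mixing weight plays the role of a pausing probability: a serial account would expect essentially every boundary interval to come from the long component, whereas a parallel account allows many boundary intervals to be as short as within-word intervals.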
Award ID(s):
2302644
PAR ID:
10614280
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
APA
Date Published:
Journal Name:
Journal of Experimental Psychology: General
Volume:
154
Issue:
7
ISSN:
0096-3445
Page Range / eLocation ID:
1824 to 1854
Subject(s) / Keyword(s):
parallel processing writing mixture models language production keystroke logging
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Account recovery is ubiquitous across web applications but circumvents the username/password-based login step. Therefore, it deserves the same level of security as the user authentication process. A common simplistic procedure for account recovery requires that a user enter the same email used during registration, to which a password recovery link or a new username could be sent. Therefore, an impostor with access to a user’s registration email and other credentials can trigger an account recovery session to take over the user’s account. To prevent such attacks, beyond validating the email and other credentials entered by the user, our proposed recovery method utilizes keystroke dynamics to further secure the account recovery mechanism. Keystroke dynamics is a type of behavioral biometrics that uses the analysis of typing rhythm for user authentication. Using a new dataset with over 500,000 keystrokes collected from 44 students and university staff as they filled out a multi-field account recovery web form, we have evaluated the performance of five scoring algorithms on individual fields as well as feature-level fusion and weighted-score fusion. We achieve the best EER of 5.47% when keystroke dynamics from individual fields are used, 0% for a feature-level fusion of five fields, and 0% for a weighted-score fusion of seven fields. Our work represents a new kind of keystroke dynamics that we would like to call ‘medium fixed-text’, as it sits between the conventional (short) fixed-text and (long) free-text research.
  2. With the rapid improvement of large language model capabilities, there has been increasing interest in challenging constrained text generation problems. However, existing benchmarks for constrained generation usually focus on fixed constraint types (e.g. generate a sentence containing certain words) that have proved to be easy for state-of-the-art models like GPT-4. We present COLLIE, a grammar-based framework that allows the specification of rich, compositional constraints with diverse generation levels (word, sentence, paragraph, passage) and modeling challenges (e.g. language understanding, logical reasoning, counting, semantic planning). We also develop tools for automatic extraction of task instances given a constraint structure and a raw text corpus. Using COLLIE, we compile the COLLIE-v1 dataset with 2,080 instances comprising 13 constraint structures. We perform systematic experiments across five state-of-the-art instruction-tuned language models and analyze their performances to reveal shortcomings. COLLIE is designed to be extensible and lightweight, and we hope the community finds it useful to develop more complex constraints and evaluations in the future.
  3. Free-text keystroke dynamics is a form of behavioral biometrics that has great potential for addressing the security limitations of conventional one-time authentication by continuously monitoring the user's typing behaviors. This paper presents a new, enhanced continuous authentication approach by incorporating the dynamics of both keystrokes and wrist motions. Based upon two sets of features (free-text keystroke latency features and statistical wrist motion patterns extracted from wrist-worn smartwatches), two one-vs-all Random Forest Ensemble Classifiers (RFECs) are constructed and trained respectively. A Dynamic Trust Model (DTM) is then developed to fuse the two classifiers' decisions and realize non-time-blocked real-time authentication. In the free-text typing experiments involving 25 human subjects, an imposter/intruder can be detected within no more than one sentence (average 56 keystrokes) with an FRR of 1.82% and an FAR of 1.94%. Compared with the scheme relying on only keystroke latency, which has an FRR of 4.66% and an FAR of 17.92% and requires 162 keystrokes, the proposed authentication system shows significant improvements in terms of accuracy, efficiency, and usability.
  4. Speech processing is highly incremental. It is widely accepted that human listeners continuously use the linguistic context to anticipate upcoming concepts, words, and phonemes. However, previous evidence supports two seemingly contradictory models of how a predictive context is integrated with the bottom-up sensory input: Classic psycholinguistic paradigms suggest a two-stage process, in which acoustic input initially leads to local, context-independent representations, which are then quickly integrated with contextual constraints. This contrasts with the view that the brain constructs a single coherent, unified interpretation of the input, which fully integrates available information across representational hierarchies, and thus uses contextual constraints to modulate even the earliest sensory representations. To distinguish these hypotheses, we tested magnetoencephalography responses to continuous narrative speech for signatures of local and unified predictive models. Results provide evidence that listeners employ both types of models in parallel. Two local context models uniquely predict some part of early neural responses, one based on sublexical phoneme sequences, and one based on the phonemes in the current word alone; at the same time, even early responses to phonemes also reflect a unified model that incorporates sentence-level constraints to predict upcoming phonemes. Neural source localization places the anatomical origins of the different predictive models in nonidentical parts of the superior temporal lobes bilaterally, with the right hemisphere showing a relative preference for more local models. These results suggest that speech processing recruits both local and unified predictive models in parallel, reconciling previous disparate findings. Parallel models might make the perceptual system more robust, facilitate processing of unexpected inputs, and serve a function in language acquisition. 
  5. Research on keystroke dynamics has good potential to offer continuous authentication that complements conventional authentication methods in combating insider threats and identity theft before more harm can be done to genuine users. Unfortunately, the large amount of data required by free-text keystroke authentication often contains personally identifiable information (PII) and personally sensitive information, such as a user's first name and last name, username and password for an account, bank card numbers, and social security numbers. As a result, there are privacy risks associated with keystroke data that must be mitigated before the data are shared with other researchers. We conduct a systematic study to remove PII from a recent large keystroke dataset. We find substantial amounts of PII in the dataset, including names, usernames and passwords, social security numbers, and bank card numbers, which, if leaked, may lead to various harms to the user, including personal embarrassment, blackmail, financial loss, and identity theft. We thoroughly evaluate the effectiveness of our detection program for each kind of PII. We demonstrate that our PII detection program can achieve near-perfect recall at the expense of losing some useful information (lower precision). Finally, we demonstrate that the removal of PII from the original dataset has only negligible impact on the detection error tradeoff of the free-text authentication algorithm by Gunetti and Picardi. We hope that this experience report will be useful in informing the design of privacy removal in future keystroke-dynamics-based user authentication systems.