Title: Fast and Accurate Continuous User Authentication by Fusion of Instance-based, Free-text Keystroke Dynamics
Keystroke dynamics study the way in which users input text via their keyboards, which is unique to each individual, and can form a component of a behavioral biometric system to improve existing account security. Keystroke dynamics systems on free-text data use n-graphs that measure the timing between consecutive keystrokes to distinguish between users. Many algorithms require 500, 1,000, or more keystrokes to achieve EERs below 10%. In this paper, we propose an instance-based graph comparison algorithm to reduce the number of keystrokes required to authenticate users. Commonly used features such as monographs and digraphs are investigated. Feature importance is determined and used to construct a fused classifier. Detection error tradeoff (DET) curves are produced with different numbers of keystrokes. The fused classifier outperforms the state-of-the-art with EERs of 7.9%, 5.7%, 3.4%, and 2.7% for test samples of 50, 100, 200, and 500 keystrokes.
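As a rough illustration of the monograph and digraph features mentioned in the abstract, the following Python sketch groups timings by key and by consecutive key pair. It is not the paper's algorithm: the `KeyEvent` record, the millisecond timestamps, and the choice of press-to-press latency for digraphs are all assumptions made for this example.

```python
from collections import defaultdict
from typing import NamedTuple


class KeyEvent(NamedTuple):
    """Hypothetical record of one keystroke (times in milliseconds)."""
    key: str
    press_ms: float
    release_ms: float


def monographs(events):
    """Map each key to the list of its hold times (release minus press)."""
    out = defaultdict(list)
    for e in events:
        out[e.key].append(e.release_ms - e.press_ms)
    return dict(out)


def digraphs(events):
    """Map each consecutive key pair to its press-to-press latencies.

    Assumption: latency is measured between the two key presses; other
    definitions (e.g. release-to-press) are equally plausible.
    """
    out = defaultdict(list)
    for a, b in zip(events, events[1:]):
        out[(a.key, b.key)].append(b.press_ms - a.press_ms)
    return dict(out)


# Made-up sample: the user types "theth".
sample = [
    KeyEvent("t", 0.0, 80.0),
    KeyEvent("h", 120.0, 190.0),
    KeyEvent("e", 260.0, 330.0),
    KeyEvent("t", 500.0, 590.0),
    KeyEvent("h", 610.0, 700.0),
]
mono = monographs(sample)
di = digraphs(sample)
```

Each repeated key or key pair contributes one timing instance per occurrence, so longer samples yield more instances per graph to compare against a user's profile.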
Award ID(s):
1650503
NSF-PAR ID:
10136313
Journal Name:
International Conference of the Biometrics Special Interest Group
ISSN:
1617-5468
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Keystroke dynamics are a powerful behavioral biometric capable of determining user identity and supporting continuous authentication. The method is unobtrusive and can complement an existing security system such as a password scheme. Existing methods record all keystrokes and use n-graphs that measure the timing between consecutive keystrokes to distinguish between users. Current state-of-the-art algorithms report EERs of 7.5% or higher with 1000 characters. Collecting 1000 characters takes long enough that an imposter could do significant damage before being detected. In this paper, we investigate how quickly a user is authenticated, that is, how many digraphs are required to accurately detect an imposter in an uncontrolled free-text environment. We present and evaluate the effectiveness of three distance metrics individually and fused with each other. We show that with just 100 digraphs, about the length of a single sentence, we achieve an EER of 35.3%. At 200 digraphs the EER drops to 15.3%. With more digraphs, the performance continues to steadily improve. With 1000 digraphs the EER drops to 3.6%, which is an improvement over the state-of-the-art.
  2. Account recovery is ubiquitous across web applications but circumvents the username/password-based login step. Therefore, it deserves the same level of security as the user authentication process. A common simplistic procedure for account recovery requires the user to enter the same email used during registration, to which a password recovery link or a new username could be sent. Therefore, an impostor with access to a user’s registration email and other credentials can trigger an account recovery session to take over the user’s account. To prevent such attacks, beyond validating the email and other credentials entered by the user, our proposed recovery method utilizes keystroke dynamics to further secure the account recovery mechanism. Keystroke dynamics is a type of behavioral biometrics that uses the analysis of typing rhythm for user authentication. Using a new dataset with over 500,000 keystrokes collected from 44 students and university staff as they filled out a multi-field account recovery web form, we have evaluated the performance of five scoring algorithms on individual fields as well as feature-level fusion and weighted-score fusion. We achieve the best EER of 5.47% when keystroke dynamics from individual fields are used, 0% for a feature-level fusion of five fields, and 0% for a weighted-score fusion of seven fields. Our work represents a new kind of keystroke dynamics that we would like to call ‘medium fixed-text’, as it sits between the conventional (short) fixed-text and (long) free-text research.
  3. Free-text keystroke is a form of behavioral biometrics which has great potential for addressing the security limitations of conventional one-time authentication by continuously monitoring the user's typing behaviors. This paper presents a new, enhanced continuous authentication approach by incorporating the dynamics of both keystrokes and wrist motions. Based upon two sets of features (free-text keystroke latency features and statistical wrist motion patterns extracted from wrist-worn smartwatches), two one-vs-all Random Forest Ensemble Classifiers (RFECs) are constructed and trained respectively. A Dynamic Trust Model (DTM) is then developed to fuse the two classifiers' decisions and realize non-time-blocked real-time authentication. In the free-text typing experiments involving 25 human subjects, an imposter/intruder can be detected within no more than one sentence (average 56 keystrokes) with an FRR of 1.82% and an FAR of 1.94%. Compared with the scheme relying on only keystroke latency, which has an FRR of 4.66%, an FAR of 17.92%, and requires 162 keystrokes, the proposed authentication system shows significant improvements in terms of accuracy, efficiency, and usability.
  4. Revision plays an important role in writing, and as revisions break down the linearity of the writing process, they are crucial in describing writing process dynamics. Keystroke logging and analysis have been used to identify revisions made during writing. Previous approaches include the manual annotation of revisions, building nonlinear S-notations, and the automated extraction of backspace keypresses. However, these approaches are time-intensive, vulnerable to construct, or restricted. Therefore, this article presents a computational approach to the automatic extraction of full revision events from keystroke logs, including both insertions and deletions, as well as the characters typed to replace the deleted text. Within this approach, revision candidates are first automatically extracted, which allows for a simplified manual annotation of revision events. Second, machine learning is used to automatically detect revision events. For this, 7120 revision events were manually annotated in a dataset of keystrokes obtained from 65 students conducting a writing task. The results showed that revision events could be automatically predicted with a relatively high accuracy. In addition, a case study proved that this approach could be easily applied to a new dataset. To conclude, computational approaches can be beneficial in providing automated insights into revisions in writing. 
  5. Research on keystroke dynamics has good potential to offer continuous authentication that complements conventional authentication methods in combating insider threats and identity theft before more harm can be done to genuine users. Unfortunately, the large amount of data required by free-text keystroke authentication often contains personally identifiable information, or PII, and personally sensitive information, such as a user's first name and last name, username and password for an account, bank card numbers, and social security numbers. As a result, there are privacy risks associated with keystroke data that must be mitigated before the data are shared with other researchers. We conduct a systematic study to remove PII from a recent large keystroke dataset. We find substantial amounts of PII in the dataset, including names, usernames and passwords, social security numbers, and bank card numbers, which, if leaked, may lead to various harms to the user, including personal embarrassment, blackmail, financial loss, and identity theft. We thoroughly evaluate the effectiveness of our detection program for each kind of PII. We demonstrate that our PII detection program can achieve near-perfect recall at the expense of losing some useful information (lower precision). Finally, we demonstrate that the removal of PII from the original dataset has only negligible impact on the detection error tradeoff of the free-text authentication algorithm by Gunetti and Picardi. We hope that this experience report will be useful in informing the design of privacy removal in future keystroke dynamics based user authentication systems.
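Several of the abstracts above report FAR, FRR, and EER figures and mention weighted-score fusion. The Python sketch below shows one common way these quantities can be estimated from distance scores. It is a generic illustration, not the method of any listed paper: the function names, the accept-when-distance-is-low convention, and the sample scores are all assumptions.

```python
def far_frr(genuine, impostor, threshold):
    """Return (FAR, FRR) when a sample is accepted if its distance <= threshold."""
    frr = sum(s > threshold for s in genuine) / len(genuine)
    far = sum(s <= threshold for s in impostor) / len(impostor)
    return far, frr


def approx_eer(genuine, impostor):
    """Approximate the EER by scanning every observed score as a threshold.

    Returns (eer, threshold) at the point where |FAR - FRR| is smallest;
    the EER is taken as the mean of FAR and FRR at that threshold.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


def fuse(scores, weights):
    """Weighted-score fusion: weighted mean of per-classifier scores."""
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


# Made-up distance scores (lower means more similar to the claimed user).
genuine_scores = [0.1, 0.2, 0.3, 0.6]
impostor_scores = [0.4, 0.7, 0.8, 0.9]
eer, threshold = approx_eer(genuine_scores, impostor_scores)

# Fuse two hypothetical classifier scores, trusting the first three times more.
fused = fuse([0.2, 0.4], [3, 1])
```

With so few scores the estimate is coarse; in practice the threshold sweep would run over many genuine and impostor comparisons, and the fusion weights would be chosen on held-out data.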