skip to main content

Title: Robust fingerprinting of genomic databases
Abstract Motivation

Database fingerprinting has been widely used to discourage unauthorized redistribution of data by providing means to identify the source of data leakages. However, there is no fingerprinting scheme aiming at achieving liability guarantees when sharing genomic databases. Thus, we are motivated to fill in this gap by devising a vanilla fingerprinting scheme specifically for genomic databases. Moreover, since malicious genomic database recipients may compromise the embedded fingerprint (distort the steganographic marks, i.e. the embedded fingerprint bit-string) by launching effective correlation attacks, which leverage the intrinsic correlations among genomic data (e.g. Mendel’s law and linkage disequilibrium), we also augment the vanilla scheme by developing mitigation techniques to achieve robust fingerprinting of genomic databases against correlation attacks.


Via experiments using a real-world genomic database, we first show that correlation attacks against fingerprinting schemes for genomic databases are very powerful. In particular, the correlation attacks can distort more than half of the fingerprint bits by causing a small utility loss (e.g. database accuracy and consistency of SNP–phenotype associations measured via P-values). Next, we experimentally show that the correlation attacks can be effectively mitigated by our proposed mitigation techniques. We validate that the attacker can hardly compromise a large portion of the fingerprint bits even if it pays a higher cost in terms of degradation of the database utility. For example, with around 24% loss in accuracy and 20% loss in the consistency of SNP–phenotype associations, the attacker can only distort about 30% fingerprint bits, which is insufficient for it to avoid being accused. We also show that the proposed mitigation techniques also preserve the utility of the shared genomic databases, e.g. the mitigation techniques only lead to around 3% loss in accuracy.

Availability and implementation

more » « less
Award ID(s):
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Oxford University Press
Date Published:
Journal Name:
Medium: X Size: p. i143-i152
["p. i143-i152"]
Sponsoring Org:
National Science Foundation
More Like this
  1. Universal Serial Bus (USB) ports are a ubiquitous feature in computer systems and offer a cheap and efficient way to provide power and data connectivity between a host and peripheral devices. Even with the rise of cloud and off-site computing, USB has played a major role in enabling data transfer between devices. Its usage is especially prevalent in high-security environments where systems are ‘air-gapped’ and not connected to the Internet. However, recent research has demonstrated that USB is not nearly as secure as once thought, with different attacks showing that modified firmware on USB mass storage devices can compromise a host system. While many defenses have been proposed, they require user interaction, advanced hardware support (incompatible with legacy devices), or utilize device identifiers that can be subverted by an attacker. In this paper, we present Time-Print, a novel timing-based fingerprinting method, for identifying USB mass storage devices. We create a fingerprint by timing a series of read operations from different locations on a drive, as the timing variations are unique enough to identify individual USB devices. Time-Print is low overhead, completely software-based, and does not require any extra or specialized hardware. To validate the efficacy of Time-Print, we examine more than 40 USB flash drives and conduct experiments in multiple authentication scenarios. The experimental results show that Time-Print can (1) identify known/unknown brand/model USB devices with greater than 99.5% accuracy, (2) identify seen/unseen devices of the same brand/model with 95% accuracy, and (3) classify USB devices from the same brand/model with an average accuracy of 98.7%. 
    more » « less
  2. Polymorphic gates are reconfigurable devices that deliver multiple functionalities at different temperature, supply voltage or external inputs. Capable of working in different modes, polymorphic gate is a promising candidate for embedding secret information such as fingerprints. In this paper, we report five polymorphic gates whose functionality varies in response to specific control input and propose a circuit fingerprinting scheme based on these gates. The scheme selectively replaces standard logic cells by polymorphic gates whose functionality differs with the standard cells only on Satisfiability Don’t Care conditions. Additional dummy fingerprint bits are also introduced to enhance the fingerprint’s robustness against attacks such as fingerprint removal and modification. Experimental results on ISCAS and MCNC benchmark circuits demonstrate that our scheme introduces low overhead. More specifically, the average overhead in area, speed and power are 4.04%, 6.97% and 4.15% respectively when we embed 64-bit fingerprint that consists of 32 real fingerprint bits and 32 dummy bits. This is only half of the overhead of the other known approach when they create 32-bit fingerprints. 
    more » « less
  3. The Schnorr signature scheme is an efficient digital signature scheme with short signature lengths, i.e., $4k$-bit signatures for $k$ bits of security. A Schnorr signature $\sigma$ over a group of size $p\approx 2^{2k}$ consists of a tuple $(s,e)$, where $e \in \{0,1\}^{2k}$ is a hash output and $s\in \mathbb{Z}_p$ must be computed using the secret key. While the hash output $e$ requires $2k$ bits to encode, Schnorr proposed that it might be possible to truncate the hash value without adversely impacting security. In this paper, we prove that \emph{short} Schnorr signatures of length $3k$ bits provide $k$ bits of multi-user security in the (Shoup's) generic group model and the programmable random oracle model. We further analyze the multi-user security of key-prefixed short Schnorr signatures against preprocessing attacks, showing that it is possible to obtain secure signatures of length $3k + \log S + \log N$ bits. Here, $N$ denotes the number of users and $S$ denotes the size of the hint generated by our preprocessing attacker, e.g., if $S=2^{k/2}$, then we would obtain secure $3.75k$-bit signatures for groups of up to $N \leq 2^{k/4}$ users. Our techniques easily generalize to several other Fiat-Shamir-based signature schemes, allowing us to establish analogous results for Chaum-Pedersen signatures and Katz-Wang signatures. As a building block, we also analyze the $1$-out-of-$N$ discrete-log problem in the generic group model, with and without preprocessing. 
    more » « less
  4. null (Ed.)
    Security of machine learning is increasingly becoming a major concern due to the ubiquitous deployment of deep learning in many security-sensitive domains. Many prior studies have shown external attacks such as adversarial examples that tamper the integrity of DNNs using maliciously crafted inputs. However, the security implication of internal threats (i.e., hardware vulnerabilities) to DNN models has not yet been well understood. In this paper, we demonstrate the first hardware-based attack on quantized deep neural networks–DeepHammer–that deterministically induces bit flips in model weights to compromise DNN inference by exploiting the rowhammer vulnerability. DeepHammer performs an aggressive bit search in the DNN model to identify the most vulnerable weight bits that are flippable under system constraints. To trigger deterministic bit flips across multiple pages within a reasonable amount of time, we develop novel system-level techniques that enable fast deployment of victim pages, memory-efficient rowhammering and precise flipping of targeted bits. DeepHammer can deliberately degrade the inference accuracy of the victim DNN system to a level that is only as good as random guess, thus completely depleting the intelligence of targeted DNN systems. We systematically demonstrate our attacks on real systems against 11 DNN architectures with 4 datasets corresponding to different application domains. Our evaluation shows that DeepHammer is able to successfully tamper DNN inference behavior at run-time within a few minutes. We further discuss several mitigation techniques from both algorithm and system levels to protect DNNs against such attacks. Our work highlights the need to incorporate security mechanisms in future deep learning systems to enhance the robustness against hardware-based deterministic fault injections. 
    more » « less
  5. The website fingerprinting attack allows a low-resource attacker to compromise the privacy guarantees provided by privacy enhancing tools such as Tor. In response, researchers have proposed defenses aimed at confusing the classification tools used by attackers. As new, more powerful attacks are frequently developed, raw attack accuracy has proven inadequate as the sole metric used to evaluate these defenses. In response, two security metrics have been proposed that allow for evaluating defenses based on hand-crafted features often used in attacks. Recent state-of-the-art attacks, however, use deep learning models capable of automatically learning abstract feature representations, and thus the proposed metrics fall short once again. In this study we examine two security metrics and (1) show how these methods can be extended to evaluate deep learning-based website fingerprinting attacks, and (2) compare the security metrics and identify their shortcomings. 
    more » « less