Title: Explaining the Bag Gain in Batch Steganography
In batch steganography, the sender distributes the secret payload among multiple images from a “bag” to decrease the chance of being caught. Recent work on this topic described an experimentally discovered phenomenon, which we call the “bag gain”: for fixed communication rate, pooled detectors experience a decrease in statistical detectability for initially increasing bag sizes, providing an opportunity for the sender to gain in security. The bag gain phenomenon is universal in the sense of manifesting under a wide spectrum of conditions. In this paper, we explain this experimental observation by adopting a statistical model of detector response. Despite the simplicity of the model, it does capture observed trends in detectability as a function of the bag size, the rate, and cover source properties. Additionally, and surprisingly, the model predicts that in certain cover sources the sender should avoid bag sizes that are too small as this can lead to a bag loss.
Award ID(s):
2028119
PAR ID:
10356416
Author(s) / Creator(s):
;
Date Published:
Journal Name:
IEEE Transactions on Information Forensics and Security
ISSN:
1556-6013
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper deals with the problem of batch steganography and pooled steganalysis when the sender uses a steganography detector to spread chunks of the payload across a bag of cover images while the Warden uses a possibly different detector for her pooled steganalysis. We investigate how much information can be communicated with increasing bag size at a fixed statistical detectability of Warden’s detector. Specifically, we are interested in the scaling exponent of the secure payload. We approach this problem both theoretically from a statistical model of the soft output of a detector and practically using experiments on real datasets when giving both actors different detectors implemented as convolutional neural networks and a classifier with a rich model. While the effect of the detector mismatch depends on the payload allocation algorithm and the type of mismatch, in general the mismatch decreases the constant of proportionality as well as the exponent. This stays true independently of who has the superior detector. Many trends observed in experiments qualitatively match the theoretical predictions derived within our model. Finally, we summarize our most important findings as lessons for the sender and for the Warden. 
  2. The bag gain relates to a gain in security due to spreading payload among multiple covers when the steganographer maintains a positive communication rate. This gain is maximal for a certain optimal bag size, which depends on the embedding method, payload spreading strategy, communication rate, and the cover source. Originally discovered and analyzed in the spatial domain, in this paper we study this phenomenon for JPEG images across quality factors. Our experiments and theoretical analysis indicate that the bag gain is more pronounced for higher JPEG qualities, more aggressive batch senders, and for senders maintaining a fixed payload per bag in terms of bits per DCT rather than per non-zero AC DCT.
  3. In batch steganography, the sender spreads the secret payload among multiple cover images forming a bag. The question investigated in this paper is how many and what kind of images the sender should select for her bag. We show that by forming bags with a bias towards selecting images that are more difficult to steganalyze, the sender can either lower the probability of being detected or save on bandwidth by sending a smaller bag. These improvements can be quite substantial. Our study begins with theoretical reasoning within a suitably simplified model. The findings are confirmed on experiments with real images and modern steganographic and steganalysis techniques. 
  4. In batch steganography, the sender communicates a secret message by hiding it in a bag of cover objects. The adversary performs the so-called pooled steganalysis in that she inspects the entire bag to detect the presence of secrets. This is typically realized by using a detector trained to detect secrets within a single object, applying it to all objects in the bag, and feeding the detector outputs to a pooling function to obtain the final detection statistic. This paper deals with the problem of building the pooler while keeping in mind that the Warden will need to be able to detect steganography in variable size bags carrying variable payload. We propose a flexible machine learning solution to this challenge in the form of a Transformer Encoder Pooler, which is easily trained to be agnostic to the bag size and payload and offers a better detection accuracy than previously proposed poolers.
  5. The steganographic field is nowadays dominated by heuristic approaches to data hiding. While there exist a few model-based steganographic algorithms designed to minimize statistical detectability within the underlying model, many more algorithms based on the costs of changing a specific pixel or DCT coefficient have been introduced over the last decade. These costs are purely heuristic, as they are designed with feedback from detectors implemented as machine learning classifiers. For this reason, there is no apparent relation to statistical detectability, even though in practice they provide security comparable to model-based algorithms. Clearly, the security of such algorithms rests only on the assumption that the detector used to assess the security is the best one possible. Such an assumption is of course completely unrealistic. Similarly, steganalysis is mainly implemented with empirical machine learning detectors, which either use hand-crafted features computed from images or take the form of deep learning detectors (convolutional neural networks). The biggest drawback of this approach is that the steganalyst, despite having very good detection power, has little to no knowledge about what part of the image or the embedding algorithm contributes to the detection, because the detector is used as a black box. In this work, we try to leave the heuristics behind and move towards statistical models. First, we introduce statistical models for current heuristic algorithms, which help us understand and predict their security trends. Furthermore, this allows us to improve the security of such algorithms. Next, we focus on steganalysis exploiting universal properties of JPEG images. Under certain realistic conditions, this leads to a very powerful attack against any steganography, because embedding even a very small secret message breaks the statistical model. Lastly, we show how we can improve the security of JPEG compressed images through additional compression.
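
The pooled steganalysis pipeline described in item 4 (a single-image detector applied to every object in the bag, with the outputs fed to a pooling function) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the method of any paper listed here: `pooled_statistic`, `mean_pooler`, `max_pooler`, and `toy_detector` are hypothetical names, and simple average and maximum poolers stand in for a learned pooler such as the Transformer Encoder Pooler.

```python
from statistics import fmean
from typing import Callable, Sequence

# Hypothetical type aliases: a detector maps one object to a soft output,
# a pooler maps all soft outputs of a bag to one detection statistic.
Detector = Callable[[object], float]
Pooler = Callable[[Sequence[float]], float]


def mean_pooler(outputs: Sequence[float]) -> float:
    """Average the single-object detector outputs (assumed simple pooler)."""
    return fmean(outputs)


def max_pooler(outputs: Sequence[float]) -> float:
    """Take the largest single-object detector output (assumed simple pooler)."""
    return max(outputs)


def pooled_statistic(bag: Sequence[object],
                     detector: Detector,
                     pooler: Pooler = mean_pooler) -> float:
    """Apply the detector to every object in the bag, then pool the outputs."""
    if not bag:
        raise ValueError("bag must contain at least one object")
    outputs = [detector(obj) for obj in bag]
    return pooler(outputs)


def toy_detector(obj: object) -> float:
    """Stand-in for a trained detector: each 'object' here is already a
    precomputed soft output, so the detector only converts it to float."""
    return float(obj)


if __name__ == "__main__":
    bag = [0.1, 0.2, 0.9, 0.4]  # made-up soft outputs, for illustration only
    print(pooled_statistic(bag, toy_detector))              # average pooling
    print(pooled_statistic(bag, toy_detector, max_pooler))  # maximum pooling
```

Because both poolers accept a sequence of any length, the same function handles bags of different sizes, which mirrors the requirement in item 4 that the pooler cope with variable bag sizes. A learned pooler would replace `mean_pooler` without changing the rest of the pipeline.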