Title: STATISTICAL MODELS FOR IMAGE STEGANOGRAPHY EXPLAINING AND REPLACING HEURISTICS
The field of steganography is currently dominated by heuristic approaches to data hiding. While a few model-based steganographic algorithms are designed to minimize statistical detectability within an underlying model, many more algorithms introduced over the last decade are based on costs of changing a specific pixel or DCT coefficient. These costs are purely heuristic, as they are designed with feedback from detectors implemented as machine learning classifiers. For this reason, they have no apparent relation to statistical detectability, even though in practice they provide security comparable to that of model-based algorithms. Clearly, the security of such algorithms rests only on the assumption that the detector used to assess the security is the best one possible. Such an assumption is, of course, completely unrealistic. Similarly, steganalysis is mainly implemented with empirical machine learning detectors, either built on hand-crafted features computed from images or realized as deep learning detectors, namely convolutional neural networks. The biggest drawback of this approach is that the steganalyst, despite having very good detection power, has little to no knowledge about which part of the image or the embedding algorithm contributes to the detection, because the detector is used as a black box. In this work, we try to leave the heuristics behind and move towards statistical models. First, we introduce statistical models for current heuristic algorithms, which help us understand and predict their security trends. Furthermore, this allows us to improve the security of such algorithms. Next, we focus on steganalysis exploiting universal properties of JPEG images. Under certain realistic conditions, this leads to a very powerful attack against any steganography, because embedding even a very small secret message breaks the statistical model. Lastly, we show how the security of JPEG compressed images can be improved through additional compression.
Award ID(s):
2028119
NSF-PAR ID:
10301787
Author(s) / Creator(s):
Date Published:
Journal Name:
Binghamton University magazine
ISSN:
1936-7066
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The JPEG compatibility attack is a steganalysis method for detecting messages embedded in the spatial representation of an image under the assumption that the cover image was a decompressed JPEG. This paper addresses a number of open problems in previous art, namely the lack of theoretical insight into how and why the attack works, low detection accuracy for high JPEG qualities, robustness to the JPEG compressor and DCT coefficient quantizer, and real-life performance evaluation. To explain the main mechanism responsible for detection and to understand the trends exhibited by heuristic detectors, we adopt a model of quantization errors of DCT coefficients in the recompressed image, and within a simplified setup, we analyze the behavior of the most powerful detector. Empowered by our analysis, we resolve the performance deficiencies using an SRNet trained on a two-channel input consisting of the image and its SQ error. This detector is compared with the previous state of the art on four content-adaptive stego methods and for a wide range of payloads and quality factors. The last sections of this paper are devoted to studying the robustness of this detector with respect to JPEG compressors, quantizers, and errors in estimating the JPEG quantization table. Finally, to demonstrate the practical usability of this attack, we test our detector on stego images produced by real steganographic tools available on the Internet.
  2. While deep learning has revolutionized image steganalysis in terms of performance, little is known about how much modern data-driven detectors can still be improved. In this paper, we approach this difficult and currently wide-open question by working with artificial but realistic-looking images with a known statistical model that allows us to compute the detectability of modern content-adaptive algorithms with respect to the most powerful detectors. Multiple artificial image datasets are crafted with different levels of content complexity and noise power to assess their influence on the gap between both types of detectors. Experiments with SRNet as the heuristic detector indicate that independent noise contributes less to the performance gap than content of the same MSE. While this loss is rather small for smooth images, it can be quite large for textured images. A network trained on many realizations of a fixed textured scene will, however, recuperate most of the loss, suggesting that networks have the capacity to approximately learn the parameters of a cover source narrowed to a fixed scene.
  3. While deep learning has revolutionized image steganalysis in terms of performance, little is known about how much modern data-driven detectors can still be improved. In this paper, we approach this difficult and currently wide-open question by working with artificial but realistic-looking images with a known statistical model that allows us to compute the detectability of modern content-adaptive algorithms with respect to the most powerful detectors. Multiple artificial image datasets are crafted with different levels of content complexity and noise power to assess their influence on the gap between both types of detectors. Experiments with SRNet as the heuristic detector indicate that independent noise contributes less to the performance gap than content of the same MSE. While this loss is rather small for smooth images, it can be quite large for textured images. A network trained on many realizations of a fixed textured scene will, however, recuperate most of the loss, suggesting that networks have the capacity to approximately learn the parameters of a cover source narrowed to a fixed scene.
  4. The JPEG compatibility attack is a steganalysis method for detecting messages embedded in the spatial representation of images under the assumption that the cover is a decompressed JPEG. This paper focuses on improving the detection accuracy for the difficult case of high JPEG qualities and content-adaptive stego algorithms. Close attention is paid to the robustness of the detection with respect to the JPEG compressor and DCT coefficient quantizer. A likelihood ratio detector derived from a model of quantization errors of DCT coefficients in the recompressed image is used to explain the main mechanism responsible for detection and to understand the results of experiments. The most accurate detector is an SRNet trained on a two-channel input consisting of the image and its SQ error. The detection performance is contrasted with the state of the art on four content-adaptive stego methods and a wide range of payloads and quality factors.
  5. This paper addresses how to fairly compare ROCs of ad hoc (or data-driven) detectors with tests derived from statistical models of digital media. We argue that the ways ROCs are typically drawn for each detector type correspond to different hypothesis testing problems with different optimality criteria, making the ROCs incomparable. To understand the problem and why it occurs, we model a source of natural images as a mixture of scene oracles and derive optimal detectors for the task of image steganalysis. Our goal is to guarantee that, when the data follows the statistical model adopted for the hypothesis test, the ROC of the optimal detector bounds the ROC of the ad hoc detector. While the results are applicable beyond the field of image steganalysis, we use this setup to point out possible inconsistencies when comparing both types of detectors and explain guidelines for their proper comparison. Experiments on an artificial cover source with a known model with real steganographic algorithms and deep learning detectors are used to confirm our claims.