This content will become publicly available on June 30, 2024

Title: Advancing the JPEG Compatibility Attack: Theory, Performance, Robustness, and Practice
The JPEG compatibility attack is a steganalysis method for detecting messages embedded in the spatial representation of an image under the assumption that the cover image was a decompressed JPEG. This paper addresses a number of open problems in previous art, namely the lack of theoretical insight into how and why the attack works, low detection accuracy for high JPEG qualities, robustness to the JPEG compressor and DCT coefficient quantizer, and real-life performance evaluation. To explain the main mechanism responsible for detection and to understand the trends exhibited by heuristic detectors, we adopt a model of quantization errors of DCT coefficients in the recompressed image, and within a simplified setup, we analyze the behavior of the most powerful detector. Empowered by our analysis, we resolve the performance deficiencies using an SRNet trained on a two-channel input consisting of the image and its SQ error. This detector is compared with the previous state of the art on four content-adaptive stego methods and for a wide range of payloads and quality factors. The last sections of this paper are devoted to studying the robustness of this detector with respect to JPEG compressors, quantizers, and errors in estimating the JPEG quantization table. Finally, to demonstrate the practical usability of this attack, we test our detector on stego images produced by real steganographic tools available on the Internet.
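As a rough illustration of the mechanism the abstract refers to, the Python sketch below simulates decompressed JPEG blocks, recompresses them, and measures how far the resulting DCT coefficients fall from the nearest quantization level, before and after random ±1 pixel changes. Everything here is an assumption made for illustration: the helper names (`decompressed_jpeg_block`, `embed_pm1`, `mean_rounding_error`), the single uniform quantization step in place of a real JPEG quantization table, the random block content, and the 20% change rate. It is not the paper's likelihood ratio detector, its SRNet, or its definition of the SQ error.

```python
import math
import random

N = 8  # JPEG block size

# Orthonormal 8x8 DCT-II basis matrix.
D = [[(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
      * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
      for n in range(N)] for k in range(N)]


def matmul(a, b):
    """Multiply two 8x8 matrices stored as lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


DT = [list(row) for row in zip(*D)]  # transpose of D


def dct2(block):
    """2D DCT of an 8x8 block."""
    return matmul(matmul(D, block), DT)


def idct2(coeffs):
    """Inverse 2D DCT of an 8x8 block."""
    return matmul(matmul(DT, coeffs), D)


def decompressed_jpeg_block(rng, step):
    """Simulated cover: quantize DCT coefficients, decompress, round pixels.

    Assumption: one uniform step for all DCT modes; mid-gray content keeps
    pixels far from 0 and 255, so clipping is ignored.
    """
    raw = [[128 + rng.uniform(-40, 40) for _ in range(N)] for _ in range(N)]
    quantized = [[step * round(c / step) for c in row] for row in dct2(raw)]
    return [[round(p) for p in row] for row in idct2(quantized)]


def embed_pm1(block, rng, rate):
    """Toy embedding: change each pixel by +1 or -1 with probability `rate`."""
    return [[p + rng.choice((-1, 1)) if rng.random() < rate else p
             for p in row] for row in block]


def mean_rounding_error(block, step):
    """Mean distance of recompressed DCT coefficients from the nearest
    quantization level, in units of the step."""
    errs = [abs(c / step - round(c / step))
            for row in dct2(block) for c in row]
    return sum(errs) / len(errs)


rng = random.Random(0)
step = 8  # hypothetical uniform quantization step
covers = [decompressed_jpeg_block(rng, step) for _ in range(200)]
stegos = [embed_pm1(b, rng, 0.2) for b in covers]

cover_err = sum(mean_rounding_error(b, step) for b in covers) / len(covers)
stego_err = sum(mean_rounding_error(b, step) for b in stegos) / len(stegos)
print(f"mean rounding error: cover={cover_err:.4f}, stego={stego_err:.4f}")
```

In this toy setup the cover blocks recompress to coefficients that sit close to quantization levels, because the only perturbation is pixel rounding, while the ±1 changes push them further away. Shrinking `step`, which loosely corresponds to a higher JPEG quality, narrows that margin, in line with the abstract's remark that high qualities are the harder case.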
Award ID(s):
2028119
NSF-PAR ID:
10451219
Author(s) / Creator(s):
Date Published:
Journal Name:
ACM Information Hiding and Multimedia Security Workshop
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The JPEG compatibility attack is a steganalysis method for detecting messages embedded in the spatial representation of images under the assumption that the cover is a decompressed JPEG. This paper focuses on improving the detection accuracy for the difficult case of high JPEG qualities and content-adaptive stego algorithms. Close attention is paid to the robustness of the detection with respect to the JPEG compressor and DCT coefficient quantizer. A likelihood ratio detector derived from a model of quantization errors of DCT coefficients in the recompressed image is used to explain the main mechanism responsible for detection and to understand the results of experiments. The most accurate detector is an SRNet trained on a two-channel input consisting of the image and its SQ error. The detection performance is contrasted with the state of the art on four content-adaptive stego methods and a wide range of payloads and quality factors.
  2. The steganographic field is nowadays dominated by heuristic approaches to data hiding. While there exist a few model-based steganographic algorithms designed to minimize statistical detectability of the underlying model, many more algorithms based on costs of changing a specific pixel or DCT coefficient have been introduced over the last decade. These costs are purely heuristic, as they are designed with feedback from detectors implemented as machine learning classifiers. For this reason, there is no apparent relation to statistical detectability, even though in practice they provide security comparable to that of model-based algorithms. Clearly, the security of such algorithms rests only on the assumption that the detector used to assess the security is the best one possible. Such an assumption is of course completely unrealistic. Similarly, steganalysis is mainly implemented with empirical machine learning detectors, which either use hand-crafted features computed from images or take the form of deep learning detectors (convolutional neural networks). The biggest drawback of this approach is that the steganalyst, despite having very good detection power, has little to no knowledge about what part of the image or the embedding algorithm contributes to the detection, because the detector is used as a black box. In this work, we try to leave the heuristics behind and move towards statistical models. First, we introduce statistical models for current heuristic algorithms, which help us understand and predict their security trends. Furthermore, this allows us to improve the security of such algorithms. Next, we focus on steganalysis exploiting universal properties of JPEG images. Under certain realistic conditions, this leads to a very powerful attack against any steganography, because embedding even a very small secret message breaks the statistical model. Lastly, we show how we can improve the security of JPEG compressed images through additional compression.
  3. In this work, we revisit Perturbed Quantization steganography with modern tools available to the steganographer today, including near-optimal ternary coding and content-adaptive embedding with side-information. In PQ, side-information in the form of rounding errors is manufactured by recompressing a JPEG image with a judiciously selected quality factor. This side-information, however, cannot be used in the same fashion as in conventional side-informed schemes nowadays as this leads to highly detectable embedding. As a remedy, we utilize the steganographic Fisher information to allocate the payload among DCT modes. In particular, we show that the embedding should not be constrained to contributing coefficients only as in the original PQ but should be expanded to the so-called “contributing DCT modes.” This approach is extended to color images by slightly modifying the SI-UNIWARD algorithm. Using the best detectors currently available, it is shown that by manufacturing side information with double compression, one can embed the same amount of information into the doubly-compressed cover image with significantly better security than applying J-UNIWARD directly in the single-compressed image. At the end of the paper, we show that double compression with the same quality makes side-informed steganography extremely detectable and should be avoided.
  4. In this article, we study a recently proposed method for improving empirical security of steganography in JPEG images in which the sender starts with an additive embedding scheme with symmetrical costs of ±1 changes and then decreases the cost of one of these changes based on an image obtained by applying a deblocking (JPEG dequantization) algorithm to the cover JPEG. This approach provides rather significant gains in security at negligible embedding complexity overhead for a wide range of quality factors and across various embedding schemes. Challenging the original explanation of the inventors of this idea, which is based on interpreting the dequantized image as an estimate of the precover (uncompressed) image, we provide alternative arguments. The key observation and the main reason why this approach works is how the polarizations of individual DCT coefficients work together. By using a MiPOD model of content complexity of the uncompressed cover image, we show that the cost polarization technique decreases the chances of “bad” combinations of embedding changes that would likely be introduced by the original scheme with symmetric costs. This statement is quantified by computing the likelihood of the stego image w.r.t. the multivariate Gaussian precover distribution in DCT domain. Furthermore, it is shown that the cost polarization decreases spatial discontinuities between blocks (blockiness) in the stego image and enforces desirable correlations of embedding changes across blocks. To further prove the point, it is shown that in a source that adheres to the precover model, a simple Wiener filter can serve equally well as a deep-learning based deblocker. 
  5. The robustness and vulnerability of Deep Neural Networks (DNNs) are quickly becoming a critical area of interest since these models are in widespread use across real-world applications (e.g., image and audio analysis, recommendation systems, natural language analysis). A DNN's vulnerability is exploited by an adversary to generate data to attack the model; however, the majority of adversarial data generators have focused on image domains, with far less work on audio domains. More recently, audio analysis models (e.g., speech command classification, automatic speech recognition) were shown to be vulnerable to adversarial audio examples. Thus, one urgent open problem is to detect adversarial audio reliably. In this contribution, we incorporate a separate yet related DNN technique, namely model quantization, to detect adversarial audio. We then propose an algorithm that detects adversarial audio using a DNN's quantization error. Specifically, we demonstrate that adversarial audio typically exhibits a larger activation quantization error than benign audio. The quantization error is measured using character error rates. We use the difference in errors to discriminate adversarial audio. Experiments with three state-of-the-art audio attack algorithms against the DeepSpeech model show that our detection algorithm achieves high accuracy on the Mozilla dataset.