PDF, one of the most popular document file formats, is frequently used by attackers as a vector to convey malware because of its flexible file structure and its ability to embed different kinds of content. In this paper, we propose a new learning-based method to detect PDF malware using image processing and machine learning techniques. PDF files are first converted to grayscale images using image visualization techniques. Then, various image features representing the distinct visual characteristics of malicious and benign PDF files are extracted. Finally, learning algorithms are applied to build classification models that classify a new PDF file as malicious or benign. The performance of the proposed method was evaluated using the Contagio PDF malware dataset. The results show that the proposed method is a viable solution for PDF malware detection. They also show that the proposed method is more robust against reverse mimicry attacks than the state-of-the-art learning-based method.
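The first two steps of the pipeline described above (file to grayscale image, then image features) might be sketched as follows. The abstract does not say which visualization technique or which features are used, so this sketch assumes a simple byte plot (one byte per pixel) and a toy intensity histogram. The names `bytes_to_grayscale` and `intensity_histogram` and the `width` and `bins` parameters are hypothetical and are not taken from the paper.

```python
import math


def bytes_to_grayscale(data: bytes, width: int = 64) -> list[list[int]]:
    """Assumed byte-plot mapping: each byte (0-255) becomes one pixel intensity."""
    if width <= 0:
        raise ValueError("width must be positive")
    height = math.ceil(len(data) / width)
    # Pad the tail with zero bytes (black pixels) so the last row is complete.
    padded = data + bytes(height * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


def intensity_histogram(image: list[list[int]], bins: int = 16) -> list[float]:
    """Toy feature vector (an assumption): normalized histogram of intensities."""
    counts = [0] * bins
    total = 0
    for row in image:
        for pixel in row:
            counts[pixel * bins // 256] += 1
            total += 1
    if total == 0:
        return [0.0] * bins
    return [c / total for c in counts]


if __name__ == "__main__":
    # A tiny stand-in for real file contents; a real run would read a PDF
    # with open(path, "rb").read().
    sample = b"%PDF-1.4\n"
    image = bytes_to_grayscale(sample, width=4)
    print(len(image), "rows;", intensity_histogram(image, bins=4))
```

The resulting feature vectors could then be passed to any standard classifier; the choice of classifier is left open here because the abstract does not name one.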
Bayesian Attention Modules
More Like this
-
Abstract We present progress towards the first unpolarized gluon quasi-parton distribution function (PDF) from lattice quantum chromodynamics using high-statistics measurements for hadrons at two valence pion masses, Mπ ≈ 310 and 690 MeV, computed on an a ≈ 0.12 fm ensemble with 2+1+1 flavors of highly improved staggered quarks generated by the MILC collaboration. In this study, we consider two gluon operators for which the hybrid-ratio renormalization matching kernels have been recently derived and a third operator that has been used in prior pseudo-PDF studies of the gluon PDFs. We compare the matrix elements for each operator for both the nucleon and pion, at both pion masses, and using two gauge-smearing techniques. Focusing on the more phenomenologically studied nucleon gluon PDF, we compare the ratio and hybrid-ratio renormalized matrix elements at both pion masses and both smearings to those reconstructed from the nucleon gluon PDF from the CT18 global analysis. We identify the best choice of operator to study the gluon PDF and present the first gluon quasi-PDF under some caveats. Additionally, we explore the recent idea of Coulomb gauge fixing to improve the signal at large Wilson-line displacement and find it could be a major help in improving the signal in the gluon matrix elements. This work helps identify the best operator for studying the gluon quasi-PDF, shows that higher hadron boost momentum is needed to implement hybrid-ratio renormalization reliably, and suggests the need to study a more diverse set of operators, with their corresponding perturbative calculations for hybrid-ratio renormalization, to further gluon quasi-PDF study.
-
PDF is a popular document file format with a flexible file structure that can embed diverse types of content, including images and JavaScript code. However, these features make it a favored vehicle for attackers delivering malware. In this paper, we propose an image-based PDF malware detection method that utilizes pre-trained deep neural networks (DNNs). Specifically, we convert PDF files into fixed-size grayscale images using an image visualization technique. These images are then fed into pre-trained DNN models to classify them as benign or malicious. We investigated four classical pre-trained DNN models in our study. We evaluated the performance of the proposed method using the publicly available Contagio PDF malware dataset. Our results demonstrate that MobileNetv3 achieves the best detection performance with an accuracy of 0.9969 and exhibits low computational complexity, making it a promising solution for image-based PDF malware detection.
-
A novel automated high-throughput screening approach, ClusterFinder, is reported for finding candidate structures for atomic pair distribution function (PDF) structural refinements. Finding starting models for PDF refinements is notoriously difficult when the PDF originates from nanoclusters or small nanoparticles. The reported ClusterFinder algorithm can screen 10^4 to 10^5 candidate structures from structural databases such as the Inorganic Crystal Structure Database (ICSD) in minutes, using the crystal structures as templates in which it looks for atomic clusters that result in a PDF similar to the target measured PDF. The algorithm returns a rank-ordered list of clusters for further assessment by the user. The algorithm has performed well for simulated and measured PDFs of metal–oxido clusters such as Keggin clusters. This is therefore a powerful approach to finding structural cluster candidates in a modelling campaign for PDFs of nanoparticles and nanoclusters.
-
Abstract In this Letter we investigate the scale dependence of the empirical probability distribution functions (PDFs) of Elsasser increments using large sets of WIND data (collected between 1995 and 2017) near 1 au. The empirical PDFs are compared to those obtained from high-resolution numerical simulations of steadily driven, homogeneous reduced MHD turbulence on a 2048^3 rectangular mesh. A large statistical sample of Alfvénic increments is obtained by using conditional analysis based on the solar wind average properties. The PDF tails obtained from observations and numerical simulations are found to have exponential behavior in the inertial range, with an exponential decrement that satisfies power laws of the form α_l ∝ l^(−μ), where l is the scale size, with μ between 0.17 and 0.25 for observations and 0.43 for simulations. PDF tails were extrapolated, assuming their exponential behavior extends to arbitrarily large increments, in order to determine structure function scaling laws at very high orders. Our results point to potentially universal scaling laws governing the PDFs of Elsasser increments and to an alternative approach to investigating high-order statistics in solar wind observations.
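As an illustration of how an exponent μ in a power law of the form α_l ∝ l^(−μ) can be estimated, the sketch below fits a straight line to log(α) versus log(l) by ordinary least squares. This is a generic fitting approach, not necessarily the one used in the Letter. The function name `fit_power_law_exponent` is hypothetical, and the input values are synthetic, generated only to exercise the code.

```python
import math


def fit_power_law_exponent(scales, decrements):
    """Estimate mu in alpha ∝ l^(-mu) as minus the log-log least-squares slope."""
    if len(scales) != len(decrements) or len(scales) < 2:
        raise ValueError("need at least two (scale, decrement) pairs")
    xs = [math.log(l) for l in scales]
    ys = [math.log(a) for a in decrements]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("scales must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


if __name__ == "__main__":
    # Synthetic decrements generated from an exact power law (not real data).
    scales = [1.0, 2.0, 4.0, 8.0, 16.0]
    decrements = [3.0 * l ** -0.25 for l in scales]
    print(fit_power_law_exponent(scales, decrements))
```

With noisy measurements the recovered exponent depends on which range of scales is included, so the fit would normally be restricted to the inertial range.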

