Network Telescopes, often referred to as darknets, capture unsolicited traffic directed toward advertised but unused IP spaces, enabling researchers and operators to monitor malicious, Internet-wide network phenomena such as vulnerability scanning, botnet propagation, and DoS backscatter. Detecting these events, however,has become increasingly challenging due to the growing traffic volumes that telescopes receive. To address this, we introduce DarkSim,a novel analytic framework that utilizes Dynamic Time Warping to measure similarities within the high-dimensional time series of network traffic. DarkSim combines traditional raw packet processing with statistical approaches, identifying traffic anomalies and enabling rapid time-to-insight. We evaluate our framework against DarkGLASSO, an existing method based on the GraphicalLASSO algorithm, using data from the UCSD Network Telescope.Based on our manually classified detections, DarkSim showcased perfect precision and an overlap of up to 91% of DarkGLASSO’s detections in contrast to DarkGLASSO’s maximum of 73.3% precision and detection overlap of 37.5% with the former. We further demonstrate DarkSim’s capability to detect two real-world events in our case studies: (1) an increase in scanning activities surrounding CVE public disclosures, and (2) shifts in country and network-level scanning patterns that indicate aggressive scanning. DarkSim provides a detailed and interpretable analysis framework for time-series anomalies, representing a new contribution to network security analytics.
more »
« less
Renewal model for anomalous traffic in Internet2 links
We propose and estimate an alternating renewal model describing the propagation of anomalies in a backbone internet network in the United States. Internet anomalies, either caused by equipment malfunction, news events or malicious attacks, have been a focus of research in network engineering since the advent of the internet over 30 years ago. This article contributes to the understanding of statistical properties of the times between the arrivals of the anomalies, their duration and stochastic structure. Anomalous, or active, time periods are modelled as periods containing clusters or 1s, where 1 indicates a presence of an anomaly. The inactive periods consisting entirely of 0s dominate the 0–1 time series in every link. Since the active periods contain 0s, a separation parameter is introduced and estimated jointly with all other parameters of the model. Our statistical analysis shows that the integer-valued separation parameter and five other non-negative, scalar parameters satisfactorily describe all statistical properties of the observed 0–1 series.
more »
« less
- PAR ID:
- 10331916
- Date Published:
- Journal Name:
- Statistical Modelling
- ISSN:
- 1471-082X
- Page Range / eLocation ID:
- 1471082X1998314
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Internet traffic load is not uniformly distributed through the day; it is significantly higher during peak-periods, and comparatively idle during off-peak periods. In this context, we present CacheFlix, a time-shifted edge-caching solution that prefetches Netflix content during off-peak periods of network connectivity. We specifically focus on Netflix since it contributes to the largest percentage of global Internet traffic by a single application. We analyze a real-world dataset of Netflix viewing activity that we collected from 1060 users spanning a 1-year period and comprised of over 2.2 million Netflix TV shows and documentary series; we restrict the scope of our study to Netflix series that account for 65% of a typical user's Netflix load in terms of bytes fetched. We present insights on users' viewing behavior, and develop an accurate and efficient prediction algorithm using LSTM networks that caches episodes of Netflix series on storage constrained edge nodes, based on the user's past viewing activity. We evaluate CacheFlix on the collected dataset over various cache eviction policies, and find that CacheFlix is able to shift 70% of Netflix series traffic to off-peak hours.more » « less
-
null (Ed.)The recycling of used integrated circuits (ICs) has raised serious problems in ensuring the integrity of today's globalized semiconductor supply chain. This poses a serious threat to critical infrastructure due to potentially shorter lifetime, lower reliability, and poorer performance from these counterfeit new chips. Recently, we have proposed a highly effective approach for detecting such chips by exploiting the power-up state of on-chip SRAMs. Due to the symmetry of the memory array layout, an equal number of cells power-up to the 0 and 1 logic states in a new unused SRAM; this ratio gets skewed in time due to uneven NBTI aging from normal usage in the field. Although this solution is very effective in detecting recycled ICs, its applicability is somewhat limited as a large number older designs do not have large on-chip memories. In this paper, we propose an alternate approach based on the initial power-up state of scan flip-flops, which are present in virtually every digital circuit. Since the flip-flops, unlike SRAM cells, are generally not perfectly symmetrical in layout, an equal number of scan cells will not power-up to 0 or 1 logic states in most designs. Consequently, a stable time zero reference of 50% logic 0s and 1s cannot be used for determining the subsequent usage of a chip. To overcome this key limitation, we propose a novel solution in this paper that reliably identifies used ICs from testing the part alone, without the need for any additional reference data or even the netlist of the circuit. Through scan testing of the IC, we first identify a significant number of asymmetrically stressed flip-flops in the design, divided into two groups. One group of flip-flops is selected such that it mostly experiences the 1 logic state during functional operation, while the other group mostly experiences the 0 state. The resulting differential stress during operation causes growing disparity over time in the number of 0s (and 1s) observed in these two groups at power-up. When new and unaged, these two groups behave similarly, with similar percentage of 1s (or 0s). However, over time the differential stress makes these counts diverge. We show that this changing count can be a measure of operational aging. Our simulation results show that it is possible to reliably detect used ICs after as little as three months of operation.more » « less
-
We investigate the asymptotic and finite sample behavior of the Hill estimator applied to time series contaminated by measurement or other errors. We show that for all discrete time models used in practice, whose non‐contaminated marginal distributions are regularly varying, the Hill estimator is consistent. Essentially, the only assumption on the errors is that they have lighter tails than the underlying unobservable process. The asymptotic justification however depends on the specific class of models assumed for the underlying unobservable process. We show by means of a simulation study that the asymptotic robustness of the Hill estimator is clearly manifested in finite samples. We further illustrate this robustness by a numerical study of the interarrival times of anomalies in a backbone internet network, the Internet2 in the United States; the anomalies arrival times are computed with a roundoff error.more » « less
-
null (Ed.)Concerning power systems, real-time monitoring of cyber–physical security, false data injection attacks on wide-area measurements are of major concern. However, the database of the network parameters is just as crucial to the state estimation process. Maintaining the accuracy of the system model is the other part of the equation, since almost all applications in power systems heavily depend on the state estimator outputs. While much effort has been given to measurements of false data injection attacks, seldom reported work is found on the broad theme of false data injection on the database of network parameters. State-of-the-art physics-based model solutions correct false data injection on network parameter database considering only available wide-area measurements. In addition, deterministic models are used for correction. In this paper, an overdetermined physics-based parameter false data injection correction model is presented. The overdetermined model uses a parameter database correction Jacobian matrix and a Taylor series expansion approximation. The method further applies the concept of synthetic measurements, which refers to measurements that do not exist in the real-life system. A machine learning linear regression-based model for measurement prediction is integrated in the framework through deriving weights for synthetic measurements creation. Validation of the presented model is performed on the IEEE 118-bus system. Numerical results show that the approximation error is lower than the state-of-the-art, while providing robustness to the correction process. Easy-to-implement model on the classical weighted-least-squares solution, highlights real-life implementation potential aspects.more » « less
An official website of the United States government

