skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Numerical study of missing survival data: Supplementary Materials
This is published by "Computational Statistics and Data Analysis", vol. 198, pp.1-34 results of numerical stuy and analysis of real life examples.  more » « less
Award ID(s):
1915845
PAR ID:
10574236
Author(s) / Creator(s):
;
Publisher / Repository:
Computational Statistical and Data Analysis
Date Published:
Volume:
198
Page Range / eLocation ID:
1-34
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    This tutorial provides a review of the state-of-the-art research and the applications of Artificial Intelligence and Machine Learning for malware analysis. We will provide an overview, background and results with respect to the three main malware analysis approaches: static malware analysis, dynamic malware analysis and online malware analysis. Further, we will provide a simplified hands-on tutorial of applying ML algorithm for dynamic malware analysis in cloud IaaS. 
    more » « less
  2. null (Ed.)
    When working with real world programs, dynamic analyses often must be run on a whole-system instead of just a single binary. Existing whole-system dynamic analysis platforms generally require analyses to be written in compiled languages, a suboptimal choice for many iterative analysis tasks. Furthermore, these platforms leave analysts with a split view between the behavior of the system under analysis and the analysis itself---in particular the system being analyzed must commonly be controlled manually while analysis scripts are run. To improve this process, we designed and implemented PyPANDA, a Python interface to the PANDA dynamic analysis platform. PyPANDA unifies the gap between guest virtual machines behavior and analysis tasks; enables painless integrations with other program analysis tools; and greatly lowers the barrier of entry to whole-system dynamic analysis. The capabilities of PyPANDA are demonstrated by using it to dynamically evaluate the accuracy of three binary analysis frameworks, track heap allocations across multiple processes, and synchronize state between PANDA and a binary analysis platform. Significant challenges were overcome to integrate a scripting language into PANDA with minimal performance impact. 
    more » « less
  3. Software vulnerabilities in emerging systems, such as the Internet of Things (IoT), allow for multiple attack vectors that are exploited by adversaries for malicious intents. One of such vectors is malware, where limited efforts have been dedicated to IoT malware analysis, characterization, and understanding. In this paper, we analyze recent IoT malware through the lenses of static analysis. Towards this, we reverse-engineer and perform a detailed analysis of almost 2,900 IoT malware samples of eight different architectures across multiple analysis directions. We conduct string analysis, unveiling operation, unique textual characteristics, and network dependencies. Through the control flow graph analysis, we unveil unique graph-theoretic features. Through the function analysis, we address obfuscation by function approximation. We then pursue two applications based on our analysis: 1) Combining various analysis aspects, we reconstruct the infection lifecycle of various prominent malware families, and 2) using multiple classes of features obtained from our static analysis, we design a machine learning-based detection model with features that are robust and an average detection rate of 99.8%. 
    more » « less
  4. There are two approaches to automatically deriving symbolic worst-case resource bounds for programs: static analysis of the source code and data-driven analysis of cost measurements obtained by running the program. Static resource analysis is usually sound but incomplete. Data-driven analysis can always return a result, but its lack of robustness often leads to unsound results. This paper presents the design, implementation, and empirical evaluation of hybrid resource bound analyses that tightly integrate static analysis and data-driven analysis. The static analysis part builds on automatic amortized resource analysis (AARA), a state-of-the-art type-based resource analysis method that performs cost bound inference using linear optimization. The data-driven part is rooted in novel Bayesian modeling and inference techniques that improve upon previous data-driven analysis methods by reporting an entire probability distribution over likely resource cost bounds. A key innovation is a new type inference system calledHybrid AARAthat coherently integrates Bayesian inference into conventional AARA, combining the strengths of both approaches. Hybrid AARA is proven to be statistically sound under standard assumptions on the runtime cost data. An experimental evaluation on a challenging set of benchmarks shows that Hybrid AARA (i) effectively mitigates the incompleteness of purely static resource analysis; and (ii) is more accurate and robust than purely data-driven resource analysis. 
    more » « less
  5. Multiverse analyses involve conducting all combinations of reasonable choices in a data analysis process. A reader of a study containing a multiverse analysis might question—are all the choices included in the multiverse reasonable and equally justifiable? How much do results vary if we make different choices in the analysis process? In this work, we identify principles for validating the composition of, and interpreting the uncertainty in, the results of a multiverse analysis. We present Milliways, a novel interactive visualisation system to support principled evaluation of multiverse analyses. Milliways provides interlinked panels presenting result distributions, individual analysis composition, multiverse code specification, and data summaries. Milliways supports interactions to sort, filter and aggregate results based on the analysis specification to identify decisions in the analysis process to which the results are sensitive. To represent the two qualitatively different types of uncertainty that arise in multiverse analyses—probabilistic uncertainty from estimating unknown quantities of interest such as regression coefficients, and possibilistic uncertainty from choices in the data analysis—Milliways uses consonance curves and probability boxes. Through an evaluative study with five users familiar with multiverse analysis, we demonstrate how Milliways can support multiverse analysis tasks, including a principled assessment of the results of a multiverse analysis. 
    more » « less