skip to main content

Search for: All records

Creators/Authors contains: "Hannig, Jan"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. String edit distances have been used for decades in applications ranging from spelling correction and web search suggestions to DNA analysis. Most string edit distances are variations of the Levenshtein distance and consider only single-character edits. In forensic applications polymorphic genetic markers such as short tandem repeats (STRs) are used. At these repetitive motifs the DNA copying errors consist of more than just single base differences. More often the phenomenon of “stutter” is observed, where the number of repeated units differs (by whole units) from the template. To adapt the Levenshtein distance to be suitable for forensic applications where DNA sequence similarity is of interest, a generalized string edit distance is defined that accommodates the addition or deletion of whole motifs in addition to single-nucleotide edits. A dynamic programming implementation is developed for computing this distance between sequences. The novelty of this algorithm is in handling the complex interactions that arise between multiple- and single-character edits. Forensic examples illustrate the purpose and use of the Restricted Forensic Levenshtein (RFL) distance measure, but applications extend to sequence alignment and string similarity in other biological areas, as well as dynamic programming algorithms more broadly.
    Free, publicly-accessible full text available July 1, 2023
  2. Abstract Zero-inflated and hurdle models are widely applied to count data possessing excess zeros, where they can simultaneously model the process from how the zeros were generated and potentially help mitigate the effects of overdispersion relative to the assumed count distribution. Which model to use depends on how the zeros are generated: zero-inflated models add an additional probability mass on zero, while hurdle models are two-part models comprised of a degenerate distribution for the zeros and a zero-truncated distribution. Developing confidence intervals for such models is challenging since no closed-form function is available to calculate the mean. In this study, generalized fiducial inference is used to construct confidence intervals for the means of zero-inflated Poisson and Poisson hurdle models. The proposed methods are assessed by an intensive simulation study. An illustrative example demonstrates the inference methods.
  3. Abstract

    The use of likelihood ratios for quantifying the strength of forensic evidence in criminal cases is gaining widespread acceptance in many forensic disciplines. Although some forensic scientists feel that subjective likelihood ratios are a reasonable way of expressing expert opinion regarding strength of evidence in criminal trials, legal requirements of reliability of expert evidence in the United Kingdom, United States and some other countries have encouraged researchers to develop likelihood ratio systems based on statistical modelling using relevant empirical data. Many such systems exhibit exceptional power to discriminate between the scenario presented by the prosecution and an alternate scenario implying the innocence of the defendant. However, such systems are not necessarily well calibrated. Consequently, verbal explanations to triers of fact, by forensic experts, of the meaning of the offered likelihood ratio may be misleading. In this article, we put forth a statistical approach for testing the calibration discrepancy of likelihood ratio systems using ground truth known empirical data. We provide point estimates as well as confidence intervals for the calibration discrepancy. Several examples, previously discussed in the literature, are used to illustrate our method. Results from a limited simulation study concerning the performance of the proposed approach are alsomore »provided.

    « less