

Title: Building knowledge graphs of homicide investigation chronologies
Homicide investigations generate large volumes of diverse data in the form of witness interview transcripts, physical evidence, photographs, DNA, etc. Homicide case chronologies are summaries of these data created by investigators that consist of short text-based entries documenting specific steps taken in the investigation. A chronology tracks the evolution of an investigation, including when and how persons involved and items of evidence became part of a case. In this article we discuss a framework for creating knowledge graphs of case chronologies that may aid investigators in analyzing homicide case data and also allow for post hoc analysis of the key features that determine whether a homicide is ultimately solved. Our method consists of 1) performing named entity recognition to identify witnesses, suspects, and detectives in chronology entries, 2) using keyword expansion to identify documentary, physical, and forensic evidence in each entry, and 3) linking entities and evidence to construct a homicide investigation knowledge graph. We compare the performance of several choices of methodologies for these sub-tasks using homicide investigation chronologies from Los Angeles, California. We then analyze the association between network statistics of the knowledge graphs and homicide solvability.
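The three-step pipeline described in the abstract can be sketched on a toy chronology. Everything below is illustrative: the regex patterns stand in for a real NER model, the small seed lexicon stands in for keyword expansion, and the entries are invented, not actual Los Angeles case data.

```python
# Sketch of the three-step pipeline on a toy chronology (illustrative only).
import re
from collections import defaultdict

# Step 1 stand-in for NER: simple role-cue patterns (a real system would use
# a trained named entity recognizer).
ROLE_PATTERNS = {
    "detective": re.compile(r"\bDet\.\s+([A-Z][a-z]+)"),
    "witness":   re.compile(r"\bwitness\s+([A-Z][a-z]+)", re.IGNORECASE),
    "suspect":   re.compile(r"\bsuspect\s+([A-Z][a-z]+)", re.IGNORECASE),
}

# Step 2 stand-in: a seed lexicon that keyword expansion would grow.
EVIDENCE_KEYWORDS = {"casing", "dna", "photograph", "report", "firearm"}

def build_graph(entries):
    """Step 3: link every entity/evidence pair that co-occurs in an entry."""
    graph = defaultdict(set)
    for entry in entries:
        nodes = []
        for role, pattern in ROLE_PATTERNS.items():
            nodes += [f"{role}:{name}" for name in pattern.findall(entry)]
        nodes += [f"evidence:{word}" for word in EVIDENCE_KEYWORDS
                  if word in entry.lower()]
        for a in nodes:                 # undirected co-occurrence edges
            for b in nodes:
                if a != b:
                    graph[a].add(b)
    return graph

entries = [
    "Det. Rivera interviewed witness Malone about the recovered casing.",
    "Suspect Ortiz linked to firearm by DNA report.",
]
g = build_graph(entries)
print(sorted(g["detective:Rivera"]))   # -> ['evidence:casing', 'witness:Malone']
```

Network statistics of the resulting graph (degree, connected components, etc.) could then be computed per case, as the article's solvability analysis suggests.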
Award ID(s): 1737585
NSF-PAR ID: 10276739
Journal Name: IEEE International Conference on Data Mining Workshops
Sponsoring Org: National Science Foundation
More Like this
  1. Abstract (100 words): Jurors are increasingly exposed to scientific information in the courtroom. To determine whether providing jurors with gist information would assist in their ability to make well-informed decisions, the present experiment utilized a Fuzzy Trace Theory-inspired intervention and tested it against traditional legal safeguards (i.e., judge instructions) by varying the scientific quality of the evidence. The results indicate that jurors who viewed high quality evidence rated the scientific evidence significantly higher than those who viewed low quality evidence, but were unable to moderate the credibility of the expert witness and apply damages appropriately, resulting in poor calibration. Summary (<1000 words): Jurors and juries are increasingly exposed to scientific information in the courtroom, and it remains unclear when they will base their decisions on a reasonable understanding of the relevant scientific information. Without such knowledge, the ability of jurors and juries to make well-informed decisions may be at risk, increasing chances of unjust outcomes (e.g., false convictions in criminal cases). Therefore, there is a critical need to understand conditions that affect jurors’ and juries’ sensitivity to the qualities of scientific information and to identify safeguards that can assist with scientific calibration in the courtroom. The current project addresses these issues with an ecologically valid experimental paradigm, making it possible to assess causal effects of evidence quality and safeguards as well as the role of a host of individual difference variables that may affect perceptions of testimony by scientific experts as well as liability in a civil case. Our main goal was to develop a simple, theoretically grounded tool to enable triers of fact (individual jurors) with a range of scientific reasoning abilities to appropriately weigh scientific evidence in court. 
We did so by testing a Fuzzy Trace Theory-inspired intervention in court, and testing it against traditional legal safeguards. Appropriate use of scientific evidence reflects good calibration – which we define as being influenced more by strong scientific information than by weak scientific information. Inappropriate use reflects poor calibration – defined as relative insensitivity to the strength of scientific information. Fuzzy Trace Theory (Reyna & Brainerd, 1995) predicts that techniques for improving calibration can come from presentation of easy-to-interpret, bottom-line “gist” of the information. Our central hypothesis was that laypeople’s appropriate use of scientific information would be moderated both by external situational conditions (e.g., quality of the scientific information itself, a decision aid designed to convey clearly the “gist” of the information) and individual differences among people (e.g., scientific reasoning skills, cognitive reflection tendencies, numeracy, need for cognition, attitudes toward and trust in science). Identifying factors that promote jurors’ appropriate understanding of and reliance on scientific information will contribute to general theories of reasoning based on scientific evidence, while also providing an evidence-based framework for improving the courts’ use of scientific information. All hypotheses were preregistered on the Open Science Framework. Method Participants completed six questionnaires (counterbalanced): Need for Cognition Scale (NCS; 18 items), Cognitive Reflection Test (CRT; 7 items), Abbreviated Numeracy Scale (ABS; 6 items), Scientific Reasoning Scale (SRS; 11 items), Trust in Science (TIS; 29 items), and Attitudes towards Science (ATS; 7 items). Participants then viewed a video depicting a civil trial in which the defendant sought damages from the plaintiff for injuries caused by a fall. 
The defendant (bar patron) alleged that the plaintiff (bartender) pushed him, causing him to fall and hit his head on the hard floor. Participants were informed at the outset that the defendant was liable; therefore, their task was to determine if the plaintiff should be compensated. Participants were randomly assigned to 1 of 6 experimental conditions: 2 (quality of scientific evidence: high vs. low) x 3 (safeguard to improve calibration: gist information, no-gist information [control], jury instructions). An expert witness (neuroscientist) hired by the court testified regarding the scientific strength of fMRI data (high [90 to 10 signal-to-noise ratio] vs. low [50 to 50 signal-to-noise ratio]) and gist or no-gist information both verbally (i.e., fairly high/about average) and visually (i.e., a graph). After viewing the video, participants were asked if they would like to award damages. If they indicated yes, they were asked to enter a dollar amount. Participants then completed the Positive and Negative Affect Schedule-Modified Short Form (PANAS-MSF; 16 items), expert Witness Credibility Scale (WCS; 20 items), Witness Credibility and Influence on damages for each witness, manipulation check questions, Understanding Scientific Testimony (UST; 10 items), and 3 additional measures were collected, but are beyond the scope of the current investigation. Finally, participants completed demographic questions, including questions about their scientific background and experience. The study was completed via Qualtrics, with participation from students (online vs. in-lab), MTurkers, and non-student community members. After removing those who failed attention check questions, 469 participants remained (243 men, 224 women, 2 did not specify gender) from a variety of racial and ethnic backgrounds (70.2% White, non-Hispanic). Results and Discussion There were three primary outcomes: quality of the scientific evidence, expert credibility (WCS), and damages. 
During initial analyses, each dependent variable was submitted to a separate 3 (Gist Safeguard: gist, no gist, judge instructions) x 2 (Scientific Quality: high, low) Analysis of Variance (ANOVA). Consistent with hypotheses, there was a significant main effect of scientific quality on strength of evidence, F(1, 463)=5.099, p=.024; participants who viewed the high quality evidence rated the scientific evidence significantly higher (M=7.44) than those who viewed the low quality evidence (M=7.06). There were no significant main effects or interactions for witness credibility, indicating that the expert who provided scientific testimony was seen as equally credible regardless of scientific quality or gist safeguard. Finally, for damages, consistent with hypotheses, there was a marginally significant interaction between Gist Safeguard and Scientific Quality, F(2, 273)=2.916, p=.056. However, post hoc t-tests revealed that significantly higher damages were awarded for low (M=11.50) versus high (M=10.51) scientific quality evidence, F(1, 273)=3.955, p=.048, in the no-gist-with-judge-instructions condition, which was contrary to hypotheses. The data suggest that the judge instructions alone reversed the pattern: although the effect was nonsignificant, those in the no-gist condition without judge instructions awarded higher damages for high (M=11.34) versus low (M=10.84) scientific quality evidence, F(1, 273)=1.059, p=.30. Together, these provide promising initial results indicating that participants were able to effectively differentiate between high and low scientific quality of evidence, though they inappropriately utilized the scientific evidence through their inability to discern expert credibility and apply damages, resulting in poor calibration. 
These results will provide the basis for more sophisticated analyses including higher order interactions with individual differences (e.g., need for cognition) as well as tests of mediation using path analyses. [References omitted but available by request] Learning Objective: Participants will be able to determine whether providing jurors with gist information would assist in their ability to award damages in a civil trial. 
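The 2 (scientific quality) x 3 (safeguard) factorial structure underlying these analyses can be illustrated with a minimal marginal-means computation. The ratings below are invented placeholder data on an assumed 1-9 scale and do not reproduce the study's results.

```python
# Toy illustration of the 2 (evidence quality) x 3 (safeguard) design:
# marginal means over invented evidence-strength ratings.
from statistics import mean

# ratings[(quality, safeguard)] -> list of per-juror ratings (invented)
ratings = {
    ("high", "gist"): [8, 7, 8],
    ("high", "control"): [7, 7, 8],
    ("high", "judge_instructions"): [7, 8, 7],
    ("low", "gist"): [7, 6, 7],
    ("low", "control"): [6, 7, 7],
    ("low", "judge_instructions"): [7, 6, 6],
}

def marginal_mean(level, axis):
    """Mean over all cells sharing one factor level (axis 0=quality, 1=safeguard)."""
    vals = [v for cell, scores in ratings.items() if cell[axis] == level
            for v in scores]
    return mean(vals)

# A main effect of quality shows up as a difference in the marginal means.
print(round(marginal_mean("high", 0), 2), round(marginal_mean("low", 0), 2))
```

A full ANOVA would additionally partition cell variance into main effects and the interaction, which is what the reported F tests do.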
  2. Over the last decade, userland memory forensics techniques and algorithms have gained popularity among practitioners, as they have proven to be useful in real forensics and cybercrime investigations. These techniques analyze and recover objects and artifacts from process memory space that are of critical importance in investigations. Nonetheless, the major drawback of existing techniques is that they cannot determine the origin and context within which the recovered object exists without prior knowledge of the application logic. Thus, in this research, we present a solution to close the gap between application-specific and application-generic techniques. We introduce OAGen, a post-execution and app-agnostic semantic analysis approach designed to help investigators establish concrete evidence by identifying the provenance and relationships between in-memory objects in a process memory image. OAGen utilizes Points-to analysis to reconstruct a runtime’s object allocation network. The resulting graph is then fed as an input into our semantic analysis algorithms to determine objects’ origin, context, and scope in the network. The results of our experiments exhibit OAGen’s ability to effectively create an allocation network even for memory-intensive applications with thousands of objects, like Facebook. The performance evaluation of our approach across fourteen different Android apps shows OAGen can efficiently search and decode nodes, and identify their references with a modest throughput rate. Further practical application of OAGen demonstrated in two case studies shows that our approach can aid investigators in the recovery of deleted messages and the detection of malware functionality in post-execution program analysis. 
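The core idea can be sketched with a toy allocation network, setting aside OAGen's actual Points-to machinery: recovered in-memory objects become nodes, reference fields become directed edges, and an object's provenance is the set of objects that can reach it. The object names and edges below are invented for illustration.

```python
# Toy object allocation network: referrer -> referenced objects, as might be
# recovered from a process memory image (names are invented).
from collections import deque

alloc_net = {
    "Activity@0x1a": ["MessageStore@0x2b"],
    "MessageStore@0x2b": ["Message@0x3c", "Message@0x3d"],
    "Service@0x4e": ["MessageStore@0x2b"],
}

def provenance(target, net):
    """Return all nodes from which `target` is reachable (its possible origins)."""
    reverse = {}
    for src, dsts in net.items():          # invert the reference edges
        for dst in dsts:
            reverse.setdefault(dst, []).append(src)
    seen, queue = set(), deque([target])
    while queue:                            # BFS over the reversed graph
        node = queue.popleft()
        for parent in reverse.get(node, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen

print(sorted(provenance("Message@0x3c", alloc_net)))
```

In this sketch a recovered message's provenance includes both the store holding it and the components referencing that store, which is the kind of context the abstract says application-generic recovery alone cannot supply.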
  3. Abstract. Volcanic fallout in polar ice sheets provides important opportunities to date and correlate ice-core records as well as to investigate the environmental impacts of eruptions. Only the geochemical characterization of volcanic ash (tephra) embedded in the ice strata can confirm the source of the eruption, however, and is a requisite if historical eruption ages are to be used as valid chronological checks on annual ice layer counting. Here we report the investigation of ash particles in a Greenland ice core that are associated with a volcanic sulfuric acid layer previously attributed to the 79 CE eruption of Vesuvius. Major and trace element composition of the particles indicates that the tephra does not derive from Vesuvius but most likely originates from an unidentified eruption in the Aleutian arc. Using ash dispersal modeling, we find that only an eruption large enough to include stratospheric injection is likely to account for the sizable (24–85 µm) ash particles observed in the Greenland ice at this time. Despite its likely explosivity, this event does not appear to have triggered significant climate perturbations, unlike some other large extratropical eruptions. In light of a recent re-evaluation of the Greenland ice-core chronologies, our findings further challenge the previous assignation of this volcanic event to 79 CE. We highlight the need for the revised Common Era ice-core chronology to be formally accepted by the wider ice-core and climate modeling communities in order to ensure robust age linkages to precisely dated historical and paleoclimate proxy records. 
  4. Claydon, John A. (Ed.)
    Reef fishes support important fisheries throughout the Caribbean, but a combination of factors in the tropics makes otolith microstructure difficult to interpret for age estimation. Therefore, validation of ageing methods, via application of Δ14C, is a major research priority. Utilizing known-age otolith material from north Caribbean fishes, we determined that a distinct regional Δ14C chronology exists, differing from coral-based chronologies compiled for ageing validation from a wide-ranging area of the Atlantic and from an otolith-based chronology from the Gulf of Mexico. Our north Caribbean Δ14C chronology established a decline series with narrow prediction intervals that proved successful in ageing validation of three economically important reef fish species. In examining why our north Caribbean Δ14C chronology differed from some of the coral-based Δ14C data reported from the region, we determined that differences among study objectives and research design impact Δ14C temporal relationships. This resulted in establishing the first of three important considerations relevant to applying Δ14C chronologies for ageing validation: 1) evaluation of the applicability of the original goals/objectives and study design of potential Δ14C reference studies. Next, we determined that differences between our Δ14C chronology and those from Florida and the Gulf of Mexico were explained by differences in regional patterns of oceanic upwelling, resulting in the second consideration for future validation work: 2) evaluation of the applicability of Δ14C reference data to the region/location where fish samples were obtained. Lastly, we emphasize that application of our north Caribbean Δ14C chronology should be limited to ageing validation studies of fishes from this region known to inhabit shallow-water coral habitat as juveniles. Thus, we note the final consideration to strengthen findings of future age validation studies: 3) use of Δ14C analysis for age validation should be limited to species whose juvenile habitat is known to reflect the regional Δ14C reference chronology. 
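The use of a decline series for age validation can be sketched with a toy least-squares fit: fit a line to known-age reference Δ14C measurements, then check whether a fish's estimated birth year is consistent with its measured Δ14C. The reference pairs and tolerance below are invented; the study itself uses formal prediction intervals rather than this simple threshold.

```python
# (birth year, Δ14C in per mil) reference pairs from hypothetical known-age
# otoliths -- invented values illustrating a post-bomb decline series.
ref = [(1990, 110.0), (1995, 90.0), (2000, 72.0), (2005, 55.0), (2010, 38.0)]

def fit_line(points):
    """Ordinary least-squares slope and intercept."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    slope = (sum((x - mx) * (y - my) for x, y in points)
             / sum((x - mx) ** 2 for x, _ in points))
    return slope, my - slope * mx

def predicted_d14c(year, points=ref):
    slope, intercept = fit_line(points)
    return slope * year + intercept

# Validation check: is the fish's measured Δ14C near the decline series at
# its estimated birth year? (crude tolerance in place of prediction intervals)
estimated_birth_year = 1997
measured = 83.0
print(abs(measured - predicted_d14c(estimated_birth_year)) < 10.0)
```

The decline of the series with birth year is what makes it usable as a clock; the regional caveats in the abstract amount to requiring that `ref` actually comes from the same water mass as the fish being validated.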
  5. Biehl, Peter F. (Ed.)
    The timeframe of Indigenous settlements in Northeast North America in the 15th–17th centuries CE has until very recently been largely described in terms of European material culture and history. An independent chronology was usually absent. Radiocarbon dating has recently begun to change this conventional model radically. The challenge, if an alternative, independent timeframe and history is to be created, is to articulate a high-resolution chronology appropriate and comparable with the lived histories of the Indigenous village settlements of the period. Improving substantially on previous initial work, we report here high-resolution defined chronologies for the three most extensively excavated and iconic ancestral Kanienʼkehá꞉ka (Mohawk) village sites in New York (Smith-Pagerie, Klock and Garoga), and a fourth early historic Indigenous site, Brigg’s Run, and re-assess the wider chronology of the Mohawk River Valley in the mid-15th to earlier 17th centuries. This new chronology confirms initial suggestions from radiocarbon that a wholesale reappraisal of past assumptions is necessary, since our dates conflict completely with past dates and the previously presumed temporal order of these three iconic sites. In turn, a wider reassessment of northeastern North American early history and re-interpretation of Atlantic connectivities in the later 15th through early 17th centuries is required. Our new closely defined date ranges are achieved employing detailed archival analysis of excavation records to establish the contextual history for radiocarbon-dated samples from each site, tree-ring defined short time series from wood charcoal samples fitted against the radiocarbon calibration curve (‘wiggle-matching’), and Bayesian chronological modelling for each of the individual sites integrating all available prior knowledge and radiocarbon dating probabilities. 
We define (our preferred model) most likely (68.3% highest posterior density) village occupation ranges for Smith-Pagerie of ~1478–1498, Klock of ~1499–1521, Garoga of ~1550–1582, and Brigg’s Run of ~1619–1632. 
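The wiggle-matching step can be illustrated schematically: a short, tree-ring-ordered series of radiocarbon ages is slid along the calibration curve, and the calendar placement minimizing the misfit is retained. The "calibration curve" below is a synthetic stand-in (a linear decline with a square-wave wiggle), not IntCal data, and the misfit is a plain sum of squares rather than the Bayesian model the authors use.

```python
# Synthetic "calibration curve": calendar year -> 14C age (BP), a linear
# decline plus a +/-5 yr square-wave wiggle (stand-in for IntCal).
cal = {y: 400 + 0.5 * (1600 - y) + (5 if y % 20 < 10 else -5)
       for y in range(1400, 1651)}

# Three rings, 10 calendar years apart, sampled (noise-free here) from the
# curve at a "true" but pretend-unknown start year.
true_start = 1517
sample = [(off, cal[true_start + off]) for off in (0, 10, 20)]

def misfit(start):
    """Sum of squared differences between sample ages and the curve."""
    return sum((age - cal[start + off]) ** 2 for off, age in sample)

# Slide the series along the curve and keep the best-fitting placement.
best = min(range(1400, 1631), key=misfit)
print(best)   # -> 1517
```

Because the ring offsets fix the relative spacing of the dates, even a wiggly curve usually pins the series to a narrow calendar window, which is what makes the technique useful for tightening site chronologies.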