
Title: The Machine Learning landscape of top taggers
Based on the established task of identifying boosted, hadronically decaying top quarks, we compare a wide range of modern machine learning approaches. Unlike most established methods, they rely on low-level input, for instance calorimeter output. While their network architectures are vastly different, their performance is comparatively similar. In general, we find that these new approaches are extremely powerful and great fun.
Award ID(s):
1836650
NSF-PAR ID:
10167451
Journal Name:
SciPost Physics
Volume:
7
Issue:
1
ISSN:
2542-4653
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract
    These data are from four separate projects undertaken between 1997 and 2017. The first of these are two snow manipulation (freeze) projects: 1) In 1997, as part of a study of the relationships between snow depth, soil freezing and nutrient cycling, we established eight 10-m x 10-m plots located within four stands: two dominated (80%) by sugar maple (SM1 and SM2) and two dominated by yellow birch (YB1 and YB2), with one snow reduction (shoveling) and one reference plot in each stand. 2) In 2001, we established eight new 10-m x 10-m plots (4 treatment, 4 reference) in four new sites: two high elevation, north facing (East Kineo and West Kineo) and two low elevation, south facing (Upper Valley and Lower Valley) maple-beech-birch stands. To establish plots, we cleared minor amounts of understory vegetation from all plots, both treatment and reference, to facilitate shoveling. Treatments (keeping plots snow free by shoveling through the end of January) were applied in the winters of 1997/98, 1998/99, 2002/2003 and 2003/2004. The Climate Gradient Project was established in October 2010. Here we evaluated relationships between snow depth, soil freezing and nutrient cycling along an elevation/aspect gradient that created variation in climate with little variation...
  2. Abstract

    Since the first Spanish settlers brought horses to America centuries ago, several local varieties and breeds have been established in the New World. These were generally a consequence of the admixture of the different breeds arriving from Europe. In some instances, local horses have been selectively bred for specific traits, such as appearance, endurance, strength, and gait. We looked at the genetics of two breeds, the Puerto Rican Non-Purebred (PRNPB) (also known as the “Criollo”) horses and the Puerto Rican Paso Fino (PRPF), from the Caribbean Island of Puerto Rico. While it is reasonable to assume that there was a historic connection between the two, the genetic link between them has never been established. In our study, we started by looking at the genetic ancestry and diversity of current Puerto Rican horse populations using a 668 bp fragment of the mitochondrial DNA D-loop (HVR1) in 200 horses from 27 locations on the island. We then genotyped all 200 horses in our sample for the “gait-keeper” DMRT3 mutant allele previously associated with the paso gait especially cherished in this island breed. We also genotyped a subset of 24 samples with the Illumina Neogen Equine Community genome-wide array (65,000 SNPs). This data was further combined with the publicly available PRPF genomes from other studies. Our analysis shows an undeniable genetic connection between the two varieties in Puerto Rico, consistent with the hypothesis that PRNPB horses represent the descendants of the original genetic pool, a mix of horses imported from the Iberian Peninsula and elsewhere in Europe. Some of the original founders of the PRNPB population must have carried the “gait-keeper” DMRT3 allele upon arrival to the island. From this admixture, the desired traits were selected by the local people over the span of centuries.
We propose that the frequency of the mutant “gait-keeper” allele originally increased in the local horses due to the selection for the smooth ride and other characters, long before the PRPF breed was established. To support this hypothesis, we demonstrate that PRNPB horses, and not the purebred PRPF, carry a signature of selection in the genomic region containing the DMRT3 locus to this day. The lack of a detectable signature of selection associated with DMRT3 in the PRPF would be expected if this native breed was originally derived from the genetic pool of PRNPB horses established earlier and most of the founders already had the mutant allele. Consequently, selection specific to PRPF later focused on alleles in other genes (including CHRM5, CYP2E1, MYH7, SRSF1, PAM, PRN and others) that have not been previously associated with the prized paso gait phenotype in Puerto Rico or anywhere else.

  3. Abstract Alignment is a crucial issue in molecular phylogenetics because different alignment methods can potentially yield very different topologies for individual genes. But it is unclear if the choice of alignment methods remains important in phylogenomic analyses, which incorporate data from dozens, hundreds, or thousands of genes. For example, problematic biases in alignment might be multiplied across many loci, whereas alignment errors in individual genes might become irrelevant. The issue of alignment trimming (i.e., removing poorly aligned regions or missing data from individual genes) is also poorly explored. Here, we test the impact of 12 different combinations of alignment and trimming methods on phylogenomic analyses. We compare these methods using published phylogenomic data from ultraconserved elements (UCEs) from squamate reptiles (lizards and snakes), birds, and tetrapods. We compare the properties of alignments generated by different alignment and trimming methods (e.g., length, informative sites, missing data). We also test whether these datasets can recover well-established clades when analyzed with concatenated (RAxML) and species-tree methods (ASTRAL-III), using the full data (∼5,000 loci) and subsampled datasets (10% and 1% of loci). We show that different alignment and trimming methods can significantly impact various aspects of phylogenomic datasets (e.g., length, informative sites). However, these different methods generally had little impact on the recovery and support values for well-established clades, even across very different numbers of loci. Nevertheless, our results suggest several “best practices” for alignment and trimming. Intriguingly, the choice of phylogenetic methods impacted the results most strongly, with concatenated analyses recovering significantly more well-established clades (with stronger support) than the species-tree analyses.
  4. Maize (Zea mays ssp. mays) is a popular genetic model due to its ease of crossing, well-established toolkits, and its status as a major global food crop. Recent technology developments for precise manipulation of the genome are further impacting both basic biological research and biotechnological application in agriculture. Crop gene editing often requires a process of genetic transformation in which the editing reagents are introduced into plant cells. In maize, this procedure is well-established for a limited number of public lines that are amenable to genetic transformation. Fast-Flowering Mini-Maize (FFMM) lines A and B were recently developed as an open-source tool for maize research by reducing the space requirements and the generation time. Neither line of FFMM was competent for genetic transformation using traditional protocols, a necessity for its status as a complete toolkit for public maize genetic research. Here we report the development of new lines of FFMM that have been bred for amenability to genetic transformation. By hybridizing a transformable maize genotype, high Type-II callus parent A (Hi-II A), with line A of FFMM, we introgressed the ability to form embryogenic callus from Hi-II A into the FFMM-A genetic background. Through multiple generations of iterative self-hybridization or the doubled-haploid method, we established maize lines that have a strong ability to produce embryogenic callus from immature embryos and maintain resemblance to FFMM-A in flowering time and stature. Using a standard Agrobacterium-mediated transformation method, we successfully introduced the CRISPR-Cas9 reagents into immature embryos and generated transgenic and mutant lines displaying the expected mutant phenotypes and genotypes. The transformation frequencies of the tested genotypes, defined as the number of transgenic events producing T1 seeds per 100 infected embryos, ranged from 0 to 17.1%.
Approximately 80% of transgenic plants analyzed in this study showed various mutation patterns at the target site. The transformable FFMM line, FFMM-AT, can serve as a useful genetic and genomic resource for the maize community.
  5. The goal of motion tomography is to recover a description of a vector flow field using measurements along the trajectory of a sensing unit. In this paper, we develop a predictor-corrector algorithm designed to recover vector flow fields from trajectory data with the use of occupation kernels developed by Rosenfeld et al. [9,10]. Specifically, we use the occupation kernels as an adaptive basis; that is, the trajectories defining our occupation kernels are iteratively updated to improve the estimation in the next stage. Initial estimates are established; then, under mild assumptions such as relatively straight trajectories, convergence is proven using the Contraction Mapping Theorem. We then compare the developed method with the established method by Chang et al. [5] by defining a set of error metrics. We found that for simulated data, where a ground truth is available, our method offers a marked improvement over [5]. For a real-world example, where ground truth is not available, our results are similar to those of the established method.