Title: Classifying the toxicity of pesticides to honey bees via support vector machines with random walk graph kernels
Pesticides benefit agriculture by increasing crop yield, quality, and security. However, pesticides may inadvertently harm bees, which are valuable as pollinators. Thus, candidate pesticides in development pipelines must be assessed for toxicity to bees. Leveraging a dataset of 382 molecules with toxicity labels from honey bee exposure experiments, we train a support vector machine (SVM) to predict the toxicity of pesticides to honey bees. We compare two representations of the pesticide molecules: (i) a random walk feature vector listing counts of length-L walks on the molecular graph with each vertex- and edge-label sequence and (ii) the Molecular ACCess System (MACCS) structural key fingerprint (FP), a bit vector indicating the presence/absence of a list of pre-defined subgraph patterns in the molecular graph. We explicitly construct the MACCS FPs but rely on the fixed-length-L random walk graph kernel (RWGK) in place of the dot product for the random walk representation. The L-RWGK-SVM achieves an accuracy, precision, recall, and F1 score (mean over 2000 runs) of 0.81, 0.68, 0.71, and 0.69, respectively, on the test data set, with L = 4 being the mode optimal walk length. The MACCS-FP-SVM performs on par with or marginally better than the L-RWGK-SVM and lends more interpretability, but varies more in performance. We interpret the MACCS-FP-SVM by illuminating which subgraph patterns in the molecules tend to strongly push them toward the toxic/non-toxic side of the separating hyperplane.
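The fixed-length-L random walk graph kernel mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation; it assumes the standard direct-product-graph formulation, in which the kernel value equals the number of pairs of length-L walks (one in each molecule) that share the same vertex- and edge-label sequence. The function name `rwgk` and the graph encoding (a list of atom labels plus a dictionary mapping undirected bonds to bond labels) are hypothetical choices made for this example.

```python
from itertools import product


def rwgk(graph_a, graph_b, L):
    """Count pairs of label-matched length-L walks in two labeled graphs.

    Each graph is (labels, bonds): labels is a list of vertex labels and
    bonds maps an undirected edge (i, j) to its edge label. This encoding
    is an assumption made for the sketch.
    """
    labels_a, bonds_a = graph_a
    labels_b, bonds_b = graph_b

    # Vertices of the direct product graph: vertex pairs with equal labels.
    nodes = [
        (i, j)
        for i, j in product(range(len(labels_a)), range(len(labels_b)))
        if labels_a[i] == labels_b[j]
    ]
    neighbors = {n: set() for n in nodes}

    # Product edges: one edge from each graph, with equal edge labels and
    # label-matched endpoints. Undirected edges can be paired both ways.
    for ((i, k), bond_a), ((j, l), bond_b) in product(
        bonds_a.items(), bonds_b.items()
    ):
        if bond_a != bond_b:
            continue
        for p, q in (((i, j), (k, l)), ((i, l), (k, j))):
            if p in neighbors and q in neighbors:
                neighbors[p].add(q)
                neighbors[q].add(p)

    # walks[n] holds the number of walks of the current length that start
    # at product vertex n; after L updates the total is the kernel value.
    walks = {n: 1 for n in nodes}
    for _ in range(L):
        walks = {n: sum(walks[m] for m in neighbors[n]) for n in nodes}
    return sum(walks.values())


# Toy molecular graphs (hydrogens omitted): a C-O fragment and a C-C-O chain.
co = (["C", "O"], {(0, 1): "single"})
cco = (["C", "C", "O"], {(0, 1): "single", (1, 2): "single"})
print(rwgk(cco, co, 1))
```

A matrix of such kernel values over all pairs of training molecules could then be handed to an SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.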
Journal Name:
The Journal of Chemical Physics
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Each year, millions of kilograms of insecticides are applied to crops in the US. While insecticide use supports food, fuel, and fiber production, it can also threaten non-target organisms, a concern underscored by mounting evidence of widespread decline of pollinator populations. Here, we integrate several public datasets to generate county-level annual estimates of total ‘bee toxic load’ (honey bee lethal doses) for insecticides applied in the US from 1997 to 2012, calculated separately for oral and contact toxicity. To explore the underlying components of the observed changes, we divide bee toxic load into extent (area treated) and intensity (application rate × potency). We show that while contact-based bee toxic load remained relatively steady, oral-based bee toxic load increased roughly 9-fold, with reductions in application rate outweighed by disproportionate increases in potency (toxicity/kg) and extent. This pattern varied markedly by region, with the greatest increase seen in the Heartland (121-fold increase), likely driven by use of neonicotinoid seed treatments in corn and soybean. In this “potency paradox”, farmland in the central US has become more hazardous to bees despite lower volumes of insecticides applied, raising concerns about insect conservation and highlighting the importance of integrative approaches to pesticide use monitoring.

  2. Statistics of small subgraph counts such as triangles, four-cycles, and s-t paths of short lengths reveal important structural properties of the underlying graph. These problems have been widely studied in social network analysis. In most relevant applications, the graphs are not only massive but also change dynamically over time. Most of these problems become hard in the dynamic setting when considering the worst case. In this paper, we ask whether small subgraph counting over dynamic graphs is also hard in the average case. We consider the simplest possible average-case model, where the updates follow an Erdős-Rényi graph: each update selects a pair of vertices (u, v) uniformly at random and flips the existence of the edge (u, v). We develop new lower bounds and matching algorithms in this model for counting four-cycles, counting triangles through a specified point s or a random queried point, and counting s-t paths of length 3, 4, and 5. Our results indicate that while counting s-t paths of length 3 and 4 is easy in the average case, with O(1) update time (note that these problems are hard in the worst case), it becomes hard for s-t paths of length 5. We introduce new techniques that allow us to derive average-case hardness for these graph problems from the worst-case hardness of the Online Matrix-vector problem (OMv). Our techniques rely on recent advances in fine-grained average-case complexity and advance this literature by making it possible to prove new lower bounds on average-case dynamic algorithms.
  3. Abstract Background

    Aluminum is the third most prevalent element in the earth’s crust. Under most conditions, it is tightly bound in inaccessible compounds; however, at low soil pH, the ionized form of aluminum can be taken up by plant roots and distributed throughout the plant tissue. Following this uptake, nectar and pollen concentrations in low soil pH regions can reach nearly 300 mg/kg. Inhibition of acetylcholinesterase (AChE) has been demonstrated following aluminum exposure in mammal and aquatic invertebrate species. In honey bees, behaviors consistent with AChE inhibition have been previously recorded; however, the physiological mechanism has not been tested, nor has aversive conditioning.


    This article presents results of ingested aqueous aluminum chloride exposure on AChE as well as acute exposure effects on aversive conditioning in an Apis mellifera ligustica hive. Contrary to previous findings, AChE activity significantly increased as compared to controls following exposure to 300 mg/L Al³⁺. In aversive conditioning studies, using an automated shuttlebox, there were time and dose-dependent effects on learning and reduced movement following 75 and 300 mg/L exposures.


    These findings, in comparison to previous studies, suggest that aluminum toxicity in honey bees may depend on exposure period, subspecies, and study metrics. Further studies are encouraged at moderate-to-high exposure concentrations, as multiple variables may affect toxicity and should be teased apart.

  4. Abstract

    Human‐mediated species introductions provide real‐time experiments in how communities respond to interspecific competition. For example, managed honey bees Apis mellifera (L.) have been widely introduced outside their native range and may compete with native bees for pollen and nectar. Indeed, multiple studies suggest that honey bees and native bees overlap in their use of floral resources. However, for resource overlap to negatively impact resource collection by native bees, resource availability must also decline, and few studies investigate impacts of honey bee competition on native bee floral visits and floral resource availability simultaneously.

    In this study, we investigate impacts of increasing honey bee abundance on native bee visitation patterns, pollen diets, and nectar and pollen resource availability in two Californian landscapes: wildflower plantings in the Central Valley and montane meadows in the Sierra.

    We collected data on bee visits to flowers, pollen and nectar availability, and pollen carried on bee bodies across multiple sites in the Sierra and Central Valley. We then constructed plant‐pollinator visitation networks to assess how increasing honey bee abundance impacted perceived apparent competition (PAC), a measure of niche overlap, and pollinator specialization (d'). We also compared PAC values against null expectations to address whether observed changes in niche overlap were greater or less than what we would expect given the relative abundances of interacting partners.

    We find clear evidence of exploitative competition in both ecosystems based on the following results: (1) honey bee competition increased niche overlap between honey bees and native bees, (2) increased honey bee abundance led to decreased pollen and nectar availability in flowers, and (3) native bee communities responded to competition by shifting their floral visits, with some becoming more specialized and others becoming more generalized depending on the ecosystem and bee taxon considered.

    Although native bees can adapt to honey bee competition by shifting their floral visits, the coexistence of honey bees and native bees is tenuous and will depend on floral resource availability. Preserving and augmenting floral resources is therefore essential in mitigating negative impacts of honey bee competition. In two California ecosystems, honey bee competition decreases pollen and nectar resource availability in flowers and alters native bee diets with potential implications for bee conservation and wildlands management.

  5. Abstract

    Conflict between genes inherited from the mother (matrigenes) and the father (patrigenes) is predicted to arise during social interactions among offspring if these genes are not evenly distributed among offspring genotypes. This intragenomic conflict drives parent-specific transcription patterns in offspring resulting from parent-specific epigenetic modifications. Previous tests of the kinship theory of intragenomic conflict in honey bees (Apis mellifera) provided evidence in support of theoretical predictions for variation in worker reproduction, which is associated with extreme variation in morphology and behavior. However, more subtle behaviors, such as aggression, have not been extensively studied. Additionally, the canonical epigenetic mark (DNA methylation) associated with parent-specific transcription in plant and mammalian model species does not appear to play the same role in honey bees, and thus the molecular mechanisms underlying intragenomic conflict in this species remain an open area of investigation. Here, we examined the role of intragenomic conflict in shaping aggression in honey bee workers through a reciprocal cross design and Oxford Nanopore direct RNA sequencing. We attempted to probe the underlying regulatory basis of this conflict through analyses of parent-specific RNA m6A and alternative splicing patterns. We report evidence that intragenomic conflict occurs in the context of honey bee aggression, with increased paternal and maternal allele-biased transcription in aggressive compared to non-aggressive bees, and higher paternal allele-biased transcription overall. However, we found no evidence to suggest that RNA m6A or alternative splicing mediate intragenomic conflict in this species.
