Harris County, Texas, remains at continuous risk of mosquito-borne diseases due to its geographic landscape and abundance of medically important mosquito vectors. Targeted mitigation requires accurate identification of these mosquito taxa. Currently, there is a paucity of genetic information to inform molecular identification and phylogenetic relationships beyond well-studied mosquito species. Here we used a genome skimming approach based on shallow shotgun sequencing to generate data and assemble the mitochondrial genomes of 37 mosquito species collected in Harris County, Texas. This report includes the complete mitochondrial genome for 25 newly sequenced species spanning 10 genera; the genomes were consistent with reference genomes in the GenBank database, having 37 genes (13 protein-coding, 2 rRNA and 22 tRNA) and an average AT content of 78.74%. Bayesian and maximum likelihood tree topologies using just the easily aligned 13 concatenated protein-coding genes confirmed the phylogenetic placement of species, with the Aedes, Anopheles and Culex genera clustering in single clades as expected. Furthermore, this approach provided more robust phylogenetic placement and identification of study taxa than the traditional cytochrome oxidase I partial gene barcode sequence. This study demonstrates the utility of genome skimming as a cost-effective alternative approach to generate reference sequences for the validation of mosquito identification and taxonomic rectification, knowledge necessary for guiding targeted vector interventions.
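The AT content reported above is the percentage of adenine and thymine bases in a sequence. A minimal sketch of that calculation follows; the function name `at_content` and the choice to ignore ambiguous bases such as N are assumptions for illustration, not the authors' pipeline.

```python
def at_content(sequence):
    """Return the percentage of A and T among the unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    # Count only unambiguous bases; ambiguity codes such as N are ignored (assumption).
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return 100.0 * (counts["A"] + counts["T"]) / total


if __name__ == "__main__":
    # Toy sequence, not real mitochondrial data.
    print(f"{at_content('AATTGC'):.2f}%")
```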
Mosquito species identification using convolutional neural networks with a multitiered ensemble model for novel species detection
Abstract With over 3500 mosquito species described, accurate species identification of the few implicated in disease transmission is critical to mosquito-borne disease mitigation. Yet this task is hindered by limited global taxonomic expertise and specimen damage consistent across common capture methods. Convolutional neural networks (CNNs) are promising with limited sets of species, but image database requirements restrict practical implementation. Using an image database of 2696 specimens from 67 mosquito species, we address the practical open-set problem with a detection algorithm for novel species. Closed-set classification of 16 known species achieved 97.04 ± 0.87% accuracy independently, and 89.07 ± 5.58% when cascaded with novelty detection. Closed-set classification of 39 species produced a macro F1-score of 86.07 ± 1.81%. This demonstrates an accurate, scalable, and practical computer vision solution to identify wild-caught mosquitoes for implementation in biosurveillance and targeted vector control programs, without the need for extensive image database development for each new target region.
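The macro F1-score cited above is the unweighted mean of per-class F1 scores, so rare species count as much as common ones. The sketch below shows the standard definition in plain Python; the function name `macro_f1` and the toy labels are illustrative assumptions and do not reproduce the paper's evaluation code.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy labels, not data from the study.
    truth = ["aegypti", "aegypti", "culex", "culex"]
    preds = ["aegypti", "culex", "culex", "culex"]
    print(round(macro_f1(truth, preds), 4))
```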
- Award ID(s): 2039534
- PAR ID: 10385838
- Date Published:
- Journal Name: Scientific Reports
- Volume: 11
- Issue: 1
- ISSN: 2045-2322
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract The ability to distinguish between the abdominal conditions of adult female mosquitoes has important utility for the surveillance and control of mosquito-borne diseases. However, doing so requires entomological training and time-consuming manual effort. Here, we design computer vision techniques to determine stages in the gonotrophic cycle of female mosquitoes from images. Our dataset was collected from 139 adult female mosquitoes across three medically important species (Aedes aegypti, Anopheles stephensi, and Culex quinquefasciatus) and all four gonotrophic stages of the cycle (unfed, fully fed, semi-gravid, and gravid). From these mosquitoes and stages, a total of 1959 images were captured on a plain background via multiple smartphones. Subsequently, we trained four distinct AI model architectures (ResNet50, MobileNetV2, EfficientNet-B0, and ConvNeXtTiny), validated them using unseen data, and compared their overall classification accuracies. Additionally, we analyzed t-SNE plots to visualize the formation of decision boundaries in a lower-dimensional space. Notably, ResNet50 and EfficientNet-B0 demonstrated outstanding performance with an overall accuracy of 97.44% and 93.59%, respectively. EfficientNet-B0 demonstrated the best overall performance considering computational efficiency, model size, training speed, and t-SNE decision boundaries. We also assessed the explainability of this EfficientNet-B0 model by implementing Grad-CAMs, a technique that highlights pixels in an image that were prioritized for classification. We observed that the highest weight was for those pixels representing the mosquito abdomen, demonstrating that our AI model has indeed learned correctly. Our work has significant practical impact. First, image datasets for gonotrophic stages of mosquitoes are not yet available. Second, our algorithms can be integrated with existing citizen science platforms that enable the public to record and upload biological observations. With such integration, our algorithms will enable the public to contribute to mosquito surveillance and gonotrophic stage identification. Finally, we are aware of work today that uses computer vision techniques for automated mosquito species identification, and our algorithms in this paper can augment these efforts by enabling the automated detection of gonotrophic stages of mosquitoes as well.
-
Abstract Mosquitoes have profoundly affected human history and continue to threaten human health through the transmission of a diverse array of pathogens. The phylogeny of mosquitoes has remained poorly characterized due to difficulty in taxonomic sampling and limited availability of genomic data beyond the most important vector species. Here, we used phylogenomic analysis of 709 single copy ortholog groups from 256 mosquito species to produce a strongly supported phylogeny that resolves the position of the major disease vector species and the major mosquito lineages. Our analyses support an origin of mosquitoes in the early Triassic (217 MYA [highest posterior density region: 188–250 MYA]), considerably older than previous estimates. Moreover, we utilize an extensive database of host associations for mosquitoes to show that mosquitoes have shifted to feeding upon the blood of mammals numerous times, and that mosquito diversification and host-use patterns within major lineages appear to coincide in earth history both with major continental drift events and with the diversification of vertebrate classes.
-
Context Land use change and deforestation drive both biodiversity loss and zoonotic disease transmission in tropical countrysides. For mosquito communities that can include disease vectors, forest loss has been linked to reduced biodiversity and increased vector presence. The spatial scales at which land use and tree cover shape mosquito communities present a knowledge gap relevant to both biodiversity and public health. Objectives We investigated the responses of mosquito species richness and Aedes albopictus disease vector presence to land use and to tree cover surrounding survey sites at different spatial scales. We also investigated species compositional turnover across land uses and along environmental gradients. Methods We paired a field survey of mosquito communities in agricultural, residential, and forested lands in rural southern Costa Rica with remotely sensed tree cover data. We compared mosquito richness and vector presence responses to tree cover measured across scales from 30 to 1000 m, and across land uses. We analyzed mosquito community compositional turnover between land uses and along environmental gradients of tree cover, temperature, elevation, and geographic distance. Results Tree cover was both positively correlated with mosquito species richness and negatively correlated with the presence of the common invasive dengue vector Ae. albopictus at small spatial scales of 90–250 m. Land use predicted community composition and Ae. albopictus presence. Conclusions The results suggest that local tree cover preservation and expansion can support mosquito species richness and reduce disease vector presence. The identified spatial range at which tree cover shapes mosquito communities can inform the development of land management practices to protect both ecosystem and public health.
-
The spatial distribution of forest stands is one of the fundamental properties of forests. Timely and accurate stand distribution data can help people better understand, manage, and utilize forests. The development of remote sensing technology has made it possible to map the distribution of tree species in a timely and accurate manner. At present, a large amount of remote sensing data has been accumulated, including high-spatial-resolution images, time-series images, light detection and ranging (LiDAR) data, etc. However, these data have not been fully utilized. To accurately identify the tree species of forest stands, various and complementary data need to be synthesized for classification. A curve-matching-based method called the fusion of spectral image and point data (FSP) algorithm was developed to fuse high-spatial-resolution images, time-series images, and LiDAR data for forest stand classification. In this method, the multispectral Sentinel-2 image and high-spatial-resolution aerial images were first fused. Then, the fused images were segmented to derive forest stands, which are the basic unit for classification. To extract features from forest stands, the gray histogram of each band was extracted from the aerial images. The average reflectance in each stand was calculated and stacked for the time-series images. The profile curve of forest structure was generated from the LiDAR data. Finally, the features of forest stands were compared with training samples using curve matching methods to derive the tree species. The developed method was tested in a forest farm to classify 11 tree species. The average accuracy of the FSP method over ten runs was between 0.900 and 0.913, and the maximum accuracy was 0.945. The experiments demonstrate that the FSP method is more accurate and stable than traditional machine learning classification methods.

