

Title: CellFMCount: A Fluorescence Microscopy Dataset, Benchmark, and Methods for Cell Counting
Abstract—Accurate cell counting is essential in various biomedical research and clinical applications, including cancer diagnosis, stem cell research, and immunology. Manual counting is labor-intensive and error-prone, motivating automation through deep learning techniques. However, training reliable deep learning models requires large amounts of high-quality annotated data, which is difficult and time-consuming to produce manually. Consequently, existing cell-counting datasets are often limited, frequently containing fewer than 500 images. In this work, we introduce a large-scale annotated dataset comprising 3,023 images from immunocytochemistry experiments related to cellular differentiation, containing over 430,000 manually annotated cell locations. The dataset presents significant challenges: high cell density, overlapping and morphologically diverse cells, a long-tailed distribution of cell counts per image, and variation in staining protocols. We benchmark three categories of existing methods (regression-based, crowd-counting, and cell-counting techniques) on a test set with cell counts ranging from 10 to 2,126 cells per image. We also evaluate how the Segment Anything Model (SAM) can be adapted for microscopy cell counting using only dot-annotated datasets. As a case study, we implement a density-map-based adaptation of SAM (SAM-Counter) and report a mean absolute error (MAE) of 22.12, which outperforms existing approaches (second-best MAE of 27.46). Our results underscore the value of the dataset and the benchmarking framework for driving progress in automated cell counting and provide a robust foundation for future research and development.
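The abstract does not include code, but the density-map formulation it benchmarks is standard: dot annotations are spread with a small Gaussian so that the map integrates to the cell count, and predicted counts are scored with MAE. A minimal sketch of that convention is shown below; the function names and the sigma value are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dots_to_density_map(points, shape, sigma=4.0):
    """Turn dot annotations (row, col) into a density map whose sum equals the cell count."""
    density = np.zeros(shape, dtype=np.float64)
    for r, c in points:
        r_i = min(int(round(r)), shape[0] - 1)
        c_i = min(int(round(c)), shape[1] - 1)
        density[r_i, c_i] += 1.0
    # Spread each dot with a Gaussian; the total sum (the count) is preserved.
    return gaussian_filter(density, sigma=sigma)

def mean_absolute_error(pred_density_maps, true_counts):
    """MAE between predicted counts (sum of each density map) and ground-truth counts."""
    pred_counts = np.array([m.sum() for m in pred_density_maps])
    return np.abs(pred_counts - np.array(true_counts)).mean()
```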
Award ID(s):
2152117
PAR ID:
10678709
Author(s) / Creator(s):
Publisher / Repository:
IEEE
Date Published:
ISBN:
979-8-3315-9599-9
Page Range / eLocation ID:
613 to 622
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We present a new annotated microscopic cellular image dataset to improve the effectiveness of machine learning methods for cellular image analysis. Cell counting is an important step in cell analysis. Typically, domain experts manually count cells in a microscopic image. Automated cell counting can potentially eliminate this tedious, time-consuming process. However, a good, labeled dataset is required for training an accurate machine learning model. Our dataset includes microscopic images of cells and, for each image, the cell count and the locations of individual cells. The data were collected as part of an ongoing study investigating the potential of electrical stimulation to modulate stem cell differentiation and possible applications for neural repair. Compared to existing publicly available datasets, our dataset has more images of cells stained with a greater variety of antibodies (protein components of immune responses against invaders) typically used for cell analysis. The experimental results on this dataset indicate that none of the five existing models under this study are able to achieve sufficiently accurate counts to replace the manual methods. The dataset is available at https://figshare.com/articles/dataset/Dataset/21970604.
  2. Cell counting in immunocytochemistry is vital for biomedical research, supporting the diagnosis and treatment of diseases such as neurological disorders, autoimmune conditions, and cancer. However, traditional counting methods are manual, time-consuming, and error-prone, while deep learning solutions require costly labeled datasets, limiting scalability. We introduce the Immunocytochemistry Dataset Cell Counting with Segment Anything Model (IDCC-SAM), a novel application of the Segment Anything Model (SAM) that adapts it for zero-shot cell counting in fluorescent microscopic immunocytochemistry datasets. IDCC-SAM leverages Meta AI's SAM, pre-trained on 11 million images, to eliminate the need for annotations, enhancing scalability and efficiency. Evaluated on three public datasets (IDCIA, ADC, and VGG), IDCC-SAM achieved the lowest Mean Absolute Error (26, 28, 52) on VGG and ADC and the highest Acceptable Absolute Error (28%, 26%, 33%) across all datasets, outperforming state-of-the-art supervised models such as U-Net and Mask R-CNN, as well as zero-shot benchmarks such as NP-SAM and SAM4Organoid. These results demonstrate IDCC-SAM's potential to improve cell-counting accuracy while reducing reliance on specialized models and manual annotations.
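The exact IDCC-SAM pipeline is not spelled out in the abstract, but the underlying idea of zero-shot counting with SAM can be sketched with the public segment_anything package: generate automatic masks for an image and treat each sufficiently large mask as one cell. A minimal sketch under that assumption follows; the checkpoint name, thresholds, and filtering rule are illustrative, not the paper's settings.

```python
import cv2
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

def count_cells_zero_shot(image_path, checkpoint="sam_vit_h.pth", min_area=50):
    """Rough zero-shot cell count: number of SAM automatic masks above a size threshold."""
    image = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2RGB)
    sam = sam_model_registry["vit_h"](checkpoint=checkpoint)
    generator = SamAutomaticMaskGenerator(sam, points_per_side=32, pred_iou_thresh=0.86)
    masks = generator.generate(image)
    # Discard tiny masks unlikely to be whole cells (the area threshold is an assumption).
    return sum(1 for m in masks if m["area"] >= min_area)
```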
  3. Abstract—Generating realistic synthetic microscopy images is critical for training deep learning models in label-scarce environments, such as cell counting with many cells per image. However, traditional domain adaptation methods often struggle to bridge the domain gap when synthetic images lack the complex textures and visual patterns of real samples. In this work, we adapt the Inversion-Based Style Transfer (InST) framework, originally designed for artistic style transfer, to biomedical microscopy images. Our method combines latent-space Adaptive Instance Normalization with stochastic inversion in a diffusion model to transfer the style from real fluorescence microscopy images to synthetic ones, while weakly preserving content structure. We evaluate the effectiveness of our InST-based synthetic dataset for downstream cell counting by pre-training and fine-tuning EfficientNet-B0 models on various data sources, including real data, hard-coded synthetic data, and the public Cell200-s dataset. Models trained with our InST-synthesized images achieve up to 37% lower Mean Absolute Error (MAE) compared to models trained on hard-coded synthetic data, and a 52% reduction in MAE compared to models trained on Cell200-s (from 53.70 to 25.95 MAE). Notably, our approach also outperforms models trained on real data alone (25.95 vs. 27.74 MAE). Further improvements are achieved when combining InST-synthesized data with lightweight domain adaptation techniques such as DACS with CutMix. These findings demonstrate that InST-based style transfer most effectively reduces the domain gap between synthetic and real microscopy data. Our approach offers a scalable path for enhancing cell counting performance while minimizing manual labeling effort. The source code and resources are publicly available at: https://github.com/MohammadDehghan/InST-Microscopy
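The full InST pipeline is in the linked repository, but the latent-space Adaptive Instance Normalization step the abstract mentions is compact enough to sketch. The PyTorch snippet below shows standard AdaIN applied to latent feature maps; the tensor layout (N, C, H, W) and epsilon value are assumptions for illustration, not details taken from the paper.

```python
import torch

def adain(content, style, eps=1e-5):
    """Adaptive Instance Normalization: align per-channel mean/std of the content
    latents (N, C, H, W) with those of the style latents."""
    c_mean = content.mean(dim=(2, 3), keepdim=True)
    c_std = content.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style.mean(dim=(2, 3), keepdim=True)
    s_std = style.std(dim=(2, 3), keepdim=True) + eps
    return s_std * (content - c_mean) / c_std + s_mean
```

In the setting the abstract describes, the same operation would be applied to the diffusion model's latent representations of a synthetic (content) image and a real (style) microscopy image before the stochastic inversion step.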
  4. Plant counting is a critical aspect of crop management, providing farmers with valuable insights into seed germination success and within-field variation in crop population density, both of which are key indicators of crop yield and quality. Recent advancements in Unmanned Aerial System (UAS) technology, coupled with deep learning techniques, have facilitated the development of automated plant counting methods. Various computer vision models based on UAS images are available for detecting and classifying crop plants. However, their accuracy depends largely on the availability of substantial manually labeled training datasets. The objective of this study was to develop a robust corn counting model by building and integrating an automatic image annotation framework. This study used high-spatial-resolution images collected with a DJI Mavic Pro 2 at the V2–V4 growth stage of corn plants from a field in Wooster, Ohio. The automated image annotation process involved extracting corn rows and applying image enhancement techniques to automatically annotate images as either corn or non-corn, resulting in 80% accuracy in identifying corn plants. The accuracy of corn stand identification was further improved by training four deep learning (DL) models, including InceptionV3, VGG16, VGG19, and Vision Transformer (ViT), on the annotated images across various datasets. Notably, VGG16 outperformed the other three models, achieving an F1 score of 0.955. When the corn counts were compared to ground truth data across five test regions, VGG16 achieved an R² of 0.94 and an RMSE of 9.95. The integration of an automated image annotation process into the training of the DL models provided notable benefits in terms of model scaling and consistency. The developed framework can efficiently manage large-scale data generation, streamlining the process for the rapid development and deployment of corn counting DL models.
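The study's own training code is not shown, but fine-tuning an ImageNet-pretrained VGG16 backbone for binary corn / non-corn classification, as described, follows a common Keras pattern. The sketch below is a minimal illustration under that assumption; the image size, head layers, and optimizer settings are hypothetical, not the study's hyperparameters.

```python
import tensorflow as tf

def build_corn_classifier(input_shape=(224, 224, 3)):
    """Binary corn / non-corn classifier: frozen VGG16 backbone plus a small trainable head."""
    base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                       input_shape=input_shape)
    base.trainable = False  # train only the new head first; unfreeze later to fine-tune
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # corn vs. non-corn
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model
```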
  5. ABSTRACT Cochlear hair cell stereocilia bundles are key organelles required for normal hearing. Often, deafness mutations cause aberrant stereocilia heights or morphology that are visually apparent but challenging to quantify. As actin-based structures, stereocilia are most often labeled with phalloidin and then imaged with 3D confocal microscopy. Unfortunately, phalloidin non-specifically labels all the actin in the tissue and cells and therefore results in a challenging segmentation task wherein the stereocilia phalloidin signal must be separated from the rest of the tissue. This can require many hours of manual human effort for each 3D confocal image stack. Currently, there are no existing software pipelines that provide an end-to-end automated solution for 3D stereocilia bundle instance segmentation. Here we introduce VASCilia, a Napari plugin designed to automatically generate 3D instance segmentation and analysis of 3D confocal images of cochlear hair cell stereocilia bundles stained with phalloidin. This plugin combines user-friendly manual controls with advanced deep learning-based features to streamline analyses. With VASCilia, users can begin their analysis by loading image stacks. The software automatically preprocesses these samples and displays them in Napari. At this stage, users can select their desired range of z-slices, adjust their orientation, and initiate 3D instance segmentation. After segmentation, users can remove any undesired regions and obtain measurements including volume, centroids, and surface area. VASCilia introduces unique features that measure bundle heights, determine their orientation with respect to the planar polarity axis, and quantify the fluorescence intensity within each bundle. The plugin is also equipped with trained deep learning models that differentiate between inner hair cells and outer hair cells and predict their tonotopic position within the cochlear spiral. Additionally, the plugin includes a training section that allows other laboratories to fine-tune our model with their own data, provides responsive mechanisms for manual corrections through event-handlers that check user actions, and allows users to share their analyses by uploading a pickle file containing all intermediate results. We believe this software will become a valuable resource for the cochlea research community, which has traditionally lacked specialized deep learning-based tools for obtaining high-throughput image quantitation. Furthermore, we plan to release our code along with a manually annotated dataset that includes approximately 55 3D stacks featuring instance segmentation. This dataset comprises a total of 1,870 instances of hair cells, distributed between 410 inner hair cells and 1,460 outer hair cells, all annotated in 3D. As the first open-source dataset of its kind, we aim to establish a foundational resource for constructing a comprehensive atlas of cochlea hair cell images. Together, this open-source tool will greatly accelerate the analysis of stereocilia bundles and demonstrate the power of deep learning-based algorithms for challenging segmentation tasks in biological imaging research. Ultimately, this initiative will support the development of foundational models adaptable to various species, markers, and imaging scales to advance and accelerate research within the cochlea research community.
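VASCilia's own implementation is not reproduced here, but the kind of per-bundle measurement the abstract describes (volume and centroid of each segmented instance) can be illustrated with scikit-image's regionprops over a 3D integer label volume. This is a generic sketch, not the plugin's code; the function name and the voxel_volume parameter are hypothetical.

```python
import numpy as np
from skimage import measure

def bundle_measurements(labels, voxel_volume=1.0):
    """Per-instance volume and centroid from a 3D label volume where 0 is background
    and each stereocilia bundle carries a unique integer label."""
    results = []
    for region in measure.regionprops(labels):
        results.append({
            "label": region.label,
            "volume": region.area * voxel_volume,  # 'area' is the voxel count in 3D
            "centroid_zyx": tuple(region.centroid),
        })
    return results
```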