<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Accurately Identifying Sound vs. Rotten Cranberries Using Convolutional Neural Network</title></titleStmt>
			<publicationStmt>
				<publisher>MDPI</publisher>
				<date>11/01/2024</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10616042</idno>
					<idno type="doi">10.3390/info15110731</idno>
					<title level='j'>Information</title>
<idno>2078-2489</idno>
<biblScope unit="volume">15</biblScope>
<biblScope unit="issue">11</biblScope>					

					<author>Sayed Mehedi Azim</author><author>Austin Spadaro</author><author>Joseph Kawash</author><author>James Polashock</author><author>Iman Dehzangi</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[<p>Cranberries, native to North America, are known for their nutritional value and human health benefits. One hurdle to commercial production is losses due to fruit rot. Cranberry fruit rot results from a complex of more than ten filamentous fungi, challenging breeding for resistance. Nonetheless, our collaborative breeding program has fruit rot resistance as a significant target. This program currently relies heavily on manual sorting of sound vs. rotten cranberries. This process is labor-intensive and time-consuming, prompting the need for an automated classification (sound vs. rotten) system. Although many studies have focused on classifying different fruits and vegetables, no such approach has been developed for cranberries yet, partly because datasets are lacking for conducting the necessary image analyses. This research addresses this gap by introducing a novel image dataset comprising sound and rotten cranberries to facilitate computational analysis. In addition, we developed CARP (Cranberry Assessment for Rot Prediction), a convolutional neural network (CNN)-based model to distinguish sound cranberries from rotten ones. With an accuracy of 97.4%, a sensitivity of 97.2%, and a specificity of 97.2% on the training dataset and 94.8%, 95.4%, and 92.7% on the independent dataset, respectively, our proposed CNN model shows its effectiveness in accurately differentiating between sound and rotten cranberries.</p>]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Fruits significantly impact human health due to their high fiber content, vitamins, minerals, and antioxidants <ref type="bibr">[1]</ref>. Indigenous to North America, the American cranberry (Vaccinium macrocarpon Aiton) is a healthy fruit that contributes nutritional value. Previous research indicates that cranberry nutrients help prevent various cancers (including oral, liver, and stomach cancer), cardiovascular diseases, infections involving the urinary tract, and Helicobacter pylori-induced stomach ulcers <ref type="bibr">[2]</ref><ref type="bibr">[3]</ref><ref type="bibr">[4]</ref><ref type="bibr">[5]</ref>. Furthermore, cranberries are valued for their bioactive functional properties, making them popular with consumers seeking a well-balanced diet.</p><p>Breeding for superior cranberry varieties is in its infancy relative to most other crops. The first cranberry breeding program was started in the U.S. in 1929 in response to a disease outbreak <ref type="bibr">[6]</ref>. The first selections from this breeding program were made available in 1950 <ref type="bibr">[7]</ref>. Until this time, the commercial industry relied entirely on wild varieties. Public breeding programs established in Wisconsin and New Jersey have released several improved varieties since 2003. These programs are still active today.</p><p>Cranberry fruit rot, caused by a complex of 10 to 15 filamentous fungi, is economically the most important disease of cranberries <ref type="bibr">[8]</ref>. Without protection by fungicides, rot can potentially destroy 50% to 100% of the fruit in the field <ref type="bibr">[8]</ref>. Storage rot is less of a concern since most of the cranberries harvested are processed and not sold fresh. 
The cranberry industry in New Jersey ranked fruit rot resistance as the most vital trait to develop in cranberry, while Wisconsin listed fruit rot resistance as second in priority <ref type="bibr">[9]</ref>. Climate change may exacerbate cranberry fruit rot, especially in Wisconsin and the Pacific Northwest, where it is currently less of an issue <ref type="bibr">[10]</ref>. In addition, there is pressure to reduce pesticide use for human and environmental health reasons, and bans on many effective fungicides have greatly limited the chemical control options <ref type="bibr">[11]</ref>.</p><p>In the face of the issues noted above, a sustainable approach to reducing fruit rot in cranberries is through breeding for resistance. A critical component of the breeding process is phenotyping traits of interest. Cranberries are woody perennials, requiring multi-year analyses. Thousands of plots are typically planted in the field and evaluated over 3-5 years <ref type="bibr">[12]</ref>. Fruits are collected from a specified plot size and manually assessed to determine the amount afflicted by rot. This process is very time-consuming and labor-intensive. We propose a new machine learning model that automatically identifies sound vs. rotten cranberries from image data to speed up this process and reduce the hands-on labor requirements. Such a system will significantly improve the efficiency of the sorting process, reduce the need for manual labor, and streamline cranberry breeding programs.</p><p>Over the past decades, advancements in image analysis and data-centric methods have shown tremendous success in agriculture <ref type="bibr">[13]</ref><ref type="bibr">[14]</ref><ref type="bibr">[15]</ref><ref type="bibr">[16]</ref><ref type="bibr">[17]</ref><ref type="bibr">[18]</ref><ref type="bibr">[19]</ref>. 
Due to its quick and reproducible information processing for a specific product, red-green-blue (RGB) image analysis is becoming increasingly popular in industrial applications. The flexibility of performing a detailed analysis of each pixel in an image gives information about a sample's local and global color features, making it suitable for evaluating color-related aspects of highly heterogeneous matrices like those of most food products <ref type="bibr">[20]</ref><ref type="bibr">[21]</ref><ref type="bibr">[22]</ref>.</p><p>Thus, a substantial amount of research has been conducted on the classification of various fruits and vegetables using RGB images. Giraudo et al. developed an automated method to detect defective hazelnuts from RGB images using a tree-structure hierarchical classification approach <ref type="bibr">[16]</ref>. Liang et al. proposed a grading method for defective apples using BiSeNet v2 and a pruned YOLO v4 network, which was further applied to a separate fruit tray machine to sort the apples based on the area of defects in the images <ref type="bibr">[17]</ref>. Foong et al. used a convolutional-neural-network-based model, ResNet50, to detect rotten fruits, specifically bananas, apples, and oranges <ref type="bibr">[18]</ref>. Parashar et al. developed a VGG-16 architecture to classify sound and rotten bananas, apples, and oranges with an accuracy of 93.52% <ref type="bibr">[19]</ref>.</p><p>In addition, researchers have studied the filtering of various fruits and vegetables using images of different visible spectra. El-Bendary et al. used a support vector machine (SVM) to examine tomato ripeness using 250 visible spectrum photos and achieved 90.8% accuracy <ref type="bibr">[13]</ref>. Elhariri et al. experimented with 175 visible spectrum images and applied SVM to determine tomato ripeness with an accuracy of 92.72% <ref type="bibr">[14]</ref>. Nguyen et al. 
developed a deep-learning-based tool for detecting viral diseases of grapevines using hyperspectral images captured by a SPECIM IQ 400-1000 nm hyperspectral sensor <ref type="bibr">[15]</ref>.</p><p>However, as no such datasets are available for cranberries, filtering rotten cranberries using machine learning (ML) techniques from images has yet to be investigated. This research presents first-of-their-kind image datasets from rotten and sound cranberry images. The main purpose of this research is to develop CARP (Cranberry Assessment for Rot Prediction), a convolutional-neural-network-based model, as a proof of concept to accurately and efficiently distinguish between sound and rotten cranberries, thereby supporting cranberry breeding and quality control efforts. Our generated dataset and standalone tool, CARP, are publicly available at: <ref type="url">https://mlbclab.camden.rutgers.edu/usda-collaboration/</ref> (accessed on 1 November 2024).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Materials and Methods</head><p>In this section, we will introduce our newly generated benchmark dataset and explain the data collection, classification, and evaluation techniques used in this study.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Data Collection</head><p>The cranberries used in this study were collected from commercial farms in and around Chatsworth, NJ. We captured the images of cranberry samples using a Micro-Hyperspec Extended VNIR device in the visible light range (manufacturer: Headwall Photonics, Inc., Bolton, MA, USA). The camera used in the study is a Micro-Hyperspec XVNIR R640 (Headwall Photonics, Bolton, MA, USA), with a wavelength range of 600 nm to 1700 nm, up to 267 spectral bands and up to 640 spatial bands, a pixel pitch of 15 &#181;m (the center-to-center distance between two adjacent pixels), and frame rates up to 120 Hz.</p><p>This device collects full hyperspectral data for every pixel. The data are collected as spectral images and later converted to RGB images. This wavelength range is significant for detecting specific biochemical characteristics and assessing the health of plant tissues and fruits. Additionally, images taken in this wavelength range can be useful for identifying pigments present in fruits and vegetables, such as carotenoids, which play a crucial role in coloration and are associated with ripeness and decay processes <ref type="bibr">[23,</ref><ref type="bibr">24]</ref>. To capture the images, the camera was positioned perpendicularly above the tray containing the cranberries. A total of 40 trays containing cranberries were imaged to create the dataset.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Benchmark Dataset</head><p>To classify rotten vs. sound cranberries, we first segmented the scanned RGB images to separate the individual cranberries. To do this, we used Otsu's thresholding method to find the best threshold value to distinguish between our region of interest and the background in the image <ref type="bibr">[25]</ref><ref type="bibr">[26]</ref><ref type="bibr">[27]</ref>. This approach is a nonparametric and unsupervised method for automatic threshold selection, which selects an optimal threshold based on a discriminant criterion <ref type="bibr">[28]</ref>. The primary criterion is to maximize the discriminant measure &#951;. The method works on the image histogram, which contains the distribution of pixel intensities, and obtains the threshold value by maximizing the between-class variance, where a class means a set of pixels in a region. This technique produces satisfactory results in bimodal images, as the histogram of such images has two clearly expressed peaks with different intensity values <ref type="bibr">[29]</ref><ref type="bibr">[30]</ref><ref type="bibr">[31]</ref>. After calculating the threshold value, we segment the cranberries in the image by obtaining the contours using OpenCV (Version 3.4) <ref type="bibr">[32]</ref>.</p><p>From the 40 scanned images (a sample is shown in Figure <ref type="figure">1</ref>), we extracted a total of 1140 images of individual cranberries, consisting of 874 sound cranberries and 266 rotten cranberries. We created a training set consisting of 910 samples, of which 699 were sound and 211 were rotten. The remaining 230 images form the testing set, with 175 sound and 55 rotten samples. 
The higher ratio of sound images reflects the actual harvest results.</p><p>To feed the data into our model for the classification task and ensure consistency across different models, we utilized the original images for model training. For the CNN model, the images are used directly as input, enabling convolution operations. For the other machine learning methods (such as SVM, RF, and KNN), the images are flattened into one-dimensional vectors. This process involves reshaping each image into a one-dimensional vector. This standardized representation ensures that the same data are used across all models, allowing for a fair performance comparison.</p></div>
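<div xmlns="http://www.tei-c.org/ns/1.0"><p>The threshold selection described above can be illustrated with a minimal sketch. It is not the pipeline used in this study, which relied on OpenCV; it is a dependency-free version of Otsu's criterion, assuming 8-bit grayscale intensities supplied as a flat list. The function name otsu_threshold and the toy pixel values are hypothetical and chosen only for illustration.</p>
<p>
```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level that maximizes the between-class variance."""
    # Histogram of pixel intensities (integers in 0..levels-1).
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1

    total = len(pixels)
    total_sum = sum(level * count for level, count in enumerate(hist))

    best_threshold, best_variance = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for t in range(levels):
        weight_bg += hist[t]            # pixels with intensity at or below t
        if weight_bg == 0:
            continue
        weight_fg = total - weight_bg   # pixels with intensity above t
        if weight_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / weight_bg
        mean_fg = (total_sum - sum_bg) / weight_fg
        # Between-class variance, up to a constant factor of 1 / total**2.
        variance = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, t
    return best_threshold


# Toy bimodal data (hypothetical): dark background and bright fruit pixels.
toy_pixels = [12, 15, 10, 14, 200, 210, 205, 198]
threshold = otsu_threshold(toy_pixels)
mask = [value > threshold for value in toy_pixels]  # True marks the region of interest
```
</p>
<p>On a bimodal histogram such as the toy data, the selected threshold falls between the two intensity peaks, so the resulting mask separates the brighter pixels from the background. Contours of the masked regions would then be extracted to crop individual cranberries.</p></div>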
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.">Classifier</head><p>The convolutional neural network (CNN) is a commonly used technique in image classification problems <ref type="bibr">[33]</ref>. From high-complexity images such as histopathology images to biological sequence data, CNNs have been highly successful <ref type="bibr">[34]</ref><ref type="bibr">[35]</ref><ref type="bibr">[36]</ref>. A CNN extracts features automatically through convolution operations and has shown superior results in image classification tasks compared to other machine learning and deep learning (DL) techniques. Previous studies have demonstrated that deeper CNN models commonly provide better results <ref type="bibr">[37]</ref>. However, other studies have shown that increasing the number of convolutional layers does not necessarily improve prediction accuracy, especially for smaller datasets like ours <ref type="bibr">[38]</ref>. Additionally, using a small CNN architecture reduces the chance of overfitting and requires fewer samples for training <ref type="bibr">[39,</ref><ref type="bibr">40]</ref>.</p><p>The CNN architecture we use is depicted in Figure <ref type="figure">2</ref>. Our CNN classifier consists of three Conv2D layers with the number of filters and kernel sizes of [16, (3,3)], [32, (3,3)], and [16, (3,3)], respectively. After each of these layers, we added a MaxPooling2D layer. The output from the flattening layer is passed directly to a fully connected layer and then to the prediction layer. We used the rectified linear unit (ReLU) as the activation function for each intermediate layer, as it is widely used in machine learning for its simplicity and effectiveness <ref type="bibr">[41,</ref><ref type="bibr">42]</ref>. In the output layer, sigmoid was used as the activation function.</p><p>To optimize the model during the training process, we use the Adam optimizer with a learning rate of 0.001 <ref type="bibr">[43]</ref>. We also use binary cross-entropy as the loss function <ref type="bibr">[44]</ref>, a standard formula of which is given in Equation <ref type="bibr">(1)</ref>.</p><p><formula notation="tex">L = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i \log(\hat{y}_i) + (1 - y_i)\log(1 - \hat{y}_i)\right] \qquad (1)</formula></p><p>where N is the number of samples, y_i is the true label of sample i, and \hat{y}_i is the predicted probability for that sample.</p></div>
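<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the size of this architecture concrete, the following sketch traces a square RGB image through the three convolution and max-pooling stages and counts the convolution parameters. Only the filter counts and kernel sizes come from the text. The 64 x 64 input size, the unpadded stride-1 convolutions, and the 2 x 2 pooling windows are assumptions made for illustration, and the helper names are hypothetical.</p>
<p>
```python
def conv2d_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a 2D convolution layer."""
    kernel_height, kernel_width = kernel
    return (kernel_height * kernel_width * in_channels + 1) * filters


def trace_network(input_size, channels, layers, pool=2):
    """Follow a square image through Conv2D + max-pooling stages.

    Assumes unpadded convolutions with stride 1 and non-overlapping pooling.
    """
    size, report = input_size, []
    for filters, kernel in layers:
        params = conv2d_params(kernel, channels, filters)
        size = size - kernel[0] + 1   # convolution without padding
        size = size // pool           # max pooling halves the spatial size
        channels = filters
        report.append((filters, size, params))
    flattened = size * size * channels  # length of the vector fed to the dense layer
    return report, flattened


# Filters and kernel sizes as stated in the text.
LAYERS = [(16, (3, 3)), (32, (3, 3)), (16, (3, 3))]

# The 64 x 64 RGB input is an assumption; the paper does not state the input size here.
report, flattened = trace_network(input_size=64, channels=3, layers=LAYERS)
for filters, size, params in report:
    print(f"{filters} filters: output {size} x {size}, {params} parameters")
print("flattened length:", flattened)
```
</p>
<p>Under these assumptions the three convolution layers hold 448, 4640, and 4624 parameters, which illustrates why such a compact network suits a dataset of roughly one thousand images.</p></div>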
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4.">Evaluation Methods</head><p>We use two different evaluation methods to ensure the effectiveness and generality of our proposed model. One of the most commonly used techniques to evaluate the machine learning pipeline is k-fold cross-validation, where we split the data into k subsets and use the k-1 subsets to train the model and the remaining subset to evaluate the model. We repeat this process until all k subsets are used once and only once for evaluation. This way, we test on all the subsets, so the model is not biased toward a particular batch of samples <ref type="bibr">[45]</ref>. In this study, we used 5-fold cross-validation to report the model's performance on the training dataset.</p><p>Additionally, the most widely used technique for testing computer vision models is to test the model on unseen data <ref type="bibr">[46,</ref><ref type="bibr">47]</ref>. Here, we evaluate our model's performance on the independent test set we prepared, which the model did not use during training.</p></div>
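<div xmlns="http://www.tei-c.org/ns/1.0"><p>The k-fold procedure described above can be sketched as follows. This is a minimal, dependency-free illustration rather than the code used in the study. The function name k_fold_indices and the fixed random seed are hypothetical, and the split is not stratified by class, which is an assumption since the text does not say how folds were formed.</p>
<p>
```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        # Every fold except the i-th one is used for training.
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


# 910 is the size of the training set reported in the text.
splits = list(k_fold_indices(910, k=5))
for fold_number, (train_idx, test_idx) in enumerate(splits, start=1):
    print(f"fold {fold_number}: {len(train_idx)} training, {len(test_idx)} evaluation samples")
```
</p>
<p>Each sample appears in exactly one evaluation fold, so the reported cross-validation metrics cover the whole training set without any sample being evaluated by a model that was trained on it.</p></div>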
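<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.5.">Evaluation Metrics</head><p>Here we use accuracy, sensitivity, specificity, and the Matthews correlation coefficient (MCC) evaluation metrics, which are widely used in the literature <ref type="bibr">[48]</ref> and are calculated as follows:</p><p><formula notation="tex">\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}</formula></p><p><formula notation="tex">\mathrm{Sensitivity} = \frac{TP}{TP + FN}</formula></p><p><formula notation="tex">\mathrm{Specificity} = \frac{TN}{TN + FP}</formula></p><p><formula notation="tex">\mathrm{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}</formula></p><p>where TP represents the number of true positives, TN is the number of true negatives, FP represents the number of false positives, and FN denotes the number of false negatives.</p></div>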
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Results</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Comparison with Different Machine Learning Models</head><p>To the best of our knowledge, this study is the first to use machine learning methods to identify rotten and sound cranberries from image data. The lack of a viable dataset is one of the main reasons for this gap, which we address in this study by generating such a dataset. Since CARP is the first predictive model for this task, comparing its performance with other studies is impossible. Hence, to investigate the efficacy of our proposed approach, we applied traditional machine learning algorithms to our dataset and compared the performance of our proposed model with them.</p><p>Here, we use several of the most popular conventional machine learning models, such as support vector machine (SVM), naive Bayes (NB), logistic regression (LR), decision tree (DT), and K-nearest neighbor (KNN). The results achieved using these methods compared to CARP, for five-fold cross-validation and an independent test set, are presented in Table <ref type="table">1</ref>. As shown in Table <ref type="table">1</ref>, CARP outperforms all the other traditional machine learning methods for five-fold cross-validation and independent test sets by a significant margin. CARP achieves an accuracy of 97.4%, a sensitivity of 97.9%, a specificity of 97.2%, and an MCC of 0.92 in the five-fold cross-validation. For our independent test set, CARP achieves 94.8%, 95.4%, 92.7%, and 0.86 in terms of accuracy, sensitivity, specificity, and MCC, respectively. This demonstrates the strength of our proposed model in distinguishing between sound and rotten cranberries. Higher sensitivity and specificity in five-fold cross-validation (imbalanced data) and independent test (balanced data) sets show the proposed method's capability to provide generalizable prediction performance.</p><p>Achieving 94.8% prediction accuracy on the independent test set demonstrates the effectiveness of CARP in predicting sound vs. 
rotten cranberries. In Figure <ref type="figure">3</ref>, the receiver operating characteristic curves (ROC curves) clearly depict the capability of CARP in distinguishing between the rotten and sound cranberries. Such promising results demonstrate the possibility of further improvement using more complex models in the future. However, using more complex models requires larger datasets to avoid overfitting. Hence, our future direction is to add more samples to our current dataset and generate a more extensive benchmark. In addition, sharing our dataset paves the way for future studies to tackle this problem.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><table><head>Table 1. Performance of the traditional machine learning models for five-fold cross-validation and the independent test set.</head>
<row role="label"><cell>Model</cell><cell>Accuracy (%)</cell><cell>Sensitivity (%)</cell><cell>Specificity (%)</cell><cell>MCC</cell><cell>Accuracy (%)</cell><cell>Sensitivity (%)</cell><cell>Specificity (%)</cell><cell>MCC</cell></row>
<row><cell>SVM</cell><cell>86.5</cell><cell>95.8</cell><cell>85.4</cell><cell>0.59</cell><cell>86.1</cell><cell>92.6</cell><cell>85.2</cell><cell>0.59</cell></row>
<row><cell>KNN</cell><cell>94.3</cell><cell>98.1</cell><cell>93.4</cell><cell>0.84</cell><cell>95.7</cell><cell>94.1</cell><cell>96.1</cell><cell>0.87</cell></row>
<row><cell>DT</cell><cell>90.1</cell><cell>79.2</cell><cell>93.3</cell><cell>0.72</cell><cell>93.5</cell><cell>83.3</cell><cell>97.1</cell><cell>0.83</cell></row>
<row><cell>LR</cell><cell>92.3</cell><cell>87.7</cell><cell>93.4</cell><cell>0.77</cell><cell>94.7</cell><cell>89.1</cell><cell>96.6</cell><cell>0.85</cell></row>
<row><cell>NB</cell><cell>89.1</cell><cell>73.1</cell><cell>94.9</cell><cell>0.71</cell><cell>92.6</cell><cell>81.6</cell><cell>96.5</cell><cell>0.80</cell></row>
</table></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Discussion</head><p>Indigenous to North America, cranberries are known for their nutritional value and significant human health benefits due to their high fiber content, vitamins, minerals, and antioxidants. Previous studies indicate that cranberry nutrients help prevent various diseases, and the bioactive functional properties in cranberries make them popular among consumers seeking a well-balanced diet, which has made cranberry production a billion-dollar industry. One major hurdle to commercial production is losses due to fruit rot. Cranberry fruit rot, caused by a complex of 10 to 15 filamentous fungi, is economically the most important disease of cranberries. To address this issue, the United States Department of Agriculture is pursuing breeding for resistance as a sustainable approach. However, this process requires a multi-year analysis (typically 3-5 years) wherein the cranberries are manually assessed to determine the amount afflicted by rot. This process is very time-consuming and labor-intensive.</p><p>In this research, we propose a new machine learning model that automatically identifies sound vs. rotten cranberries from image data to speed up this process and reduce the hands-on labor requirements to streamline cranberry breeding programs. For this study, we created an RGB image dataset of sound and rotten cranberries taken using a hyperspectral camera to automatically sort sound and rotten cranberries. Because the dataset is small, we focused on building a simple yet effective CNN-based method to classify the sound and rotten cranberries. Our proposed method (CARP) performed well even with a limited number of samples in both the training dataset and the independent test set. CARP achieves an accuracy of 97.4% on the training dataset and 94.8% on the independent dataset, with a balanced performance in predicting both rotten and sound samples. 
This performance indicates the feasibility of building an automated system for rotten cranberry filtration in the field. As a result of this study, we present the first machine learning method for this task, which is publicly available as a standalone tool. We also have generated a publicly available benchmark to use in future studies. In our future work, we aim to incorporate more samples into the dataset and build more complex deep learning models to identify sound vs. rotten cranberries with a better prediction performance.</p><p>Even though our model provides noteworthy performance in sorting rotten and sound cranberries utilizing RGB images, RGB imaging has several limitations. RGB cameras can only provide phenotypic information. Thus, RGB imaging cannot detect fruit rot caused by internal fungal infections that do not yet show any external phenotype. As a result, we are utilizing the hyperspectral camera as it provides access to spectral and spatial information <ref type="bibr">[48]</ref>. Hyperspectral imaging has shown tremendous success in tasks where the focus is to obtain information related to the structural damage inside fruits or objects <ref type="bibr">[49,</ref><ref type="bibr">50]</ref>. Our aim for the future is to work on much more complex images, for which hyperspectral data will be of great help. In the next phase, we aim to work on long trays where cranberries are adjacent and difficult to detect and segment (a large number of adjacent cranberries on a long tray). Later, we aim to identify rotten cranberries due to fungal infection with no external phenotype. Finally, we aim to identify sound vs. rotten cranberries from field images. For all these cases, having hyperspectral information is necessary and important. 
Considering this direction, we use the hyperspectral camera for imaging.</p><p>The overall goal of this work is to provide a proof of concept, demonstrating the feasibility of sorting rotten and sound cranberries. This approach will significantly aid the cranberry breeding program by reducing the labor required for sorting. In the future, we aim to advance this technology to production applications in the field, optimizing the sorting process by leveraging the strengths of machine learning and hyperspectral imaging techniques.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Practical Applications</head><p>In this study, we examine the feasibility of sorting rotten and sound cranberries from tray images, with the ultimate goal of deploying this technology in the field for practical use. The next phase of our research is designed to filter out rotten cranberries from a conveyor belt, which will play a crucial role in enhancing the efficiency of the cranberry breeding program. To achieve this, we will employ hyperspectral imaging technology, which allows us to capture detailed spectral information that can identify the quality of the cranberries. The system will be configured to scan the conveyor belt beneath the camera, enabling real-time detection and sorting of the cranberries as they move along the belt.</p><p>Furthermore, we plan to integrate the developed deep learning models into this process to enhance the accuracy and speed of sorting cranberries directly from field images. These models will be trained to recognize specific visual indicators of quality, leveraging the strengths of hyperspectral imaging to detect structural damages caused by fungal infections in the cranberries. By utilizing this advanced technology, our integrated system aims to significantly reduce the need for manual labor and the time required for sorting, thereby streamlining operations in the cranberry breeding program.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusions</head><p>In this study, we addressed the pressing challenge of cranberry fruit rot by proposing a novel approach to filtering rotten cranberries by harnessing the power of machine learning and image recognition technology. We presented an image dataset consisting of rotten and sound cranberries captured using a Micro-Hyperspec Extended VNIR device in the visible light range. Additionally, we developed CARP, a convolutional-neural-network-based tool for distinguishing between rotten and sound cranberries.</p><p>Results achieved by our proposed model demonstrate that CARP can be adapted to filter rotten cranberries from the field. Furthermore, we can leverage its ability to distinguish rotten cranberries to filter other rotten fruits, saving harvesters time and money. Despite CARP's significant performance on RGB data, it still lacks the ability to classify internal rot in cranberries. In future iterations of our work, we aim to use hyperspectral images with all the bands available to predict fruit rot more accurately, even when the rot is not visible.</p></div></body>
		</text>
</TEI>
