skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: A No-Reference Image Quality Model for Object Detection on Embedded Cameras
Automatic video analysis tools are an indispensable component in imaging applications. Object detection, the first and the most important step for automatic video analysis, is implemented in many embedded cameras. The accuracy of object detection relies on the quality of images that are processed. This paper proposes a new image quality model for predicting the performance of object detection on embedded cameras. A video data set is constructed that considers different factors for quality degradation in the imaging process, such as reduced resolution, noise, and blur. The performances of commonly used low-complexity object detection algorithms are obtained for the data set. A no-reference regression model based on a bagging ensemble of regression trees is built to predict the accuracy of object detection using observable features in an image. Experimental results show that the proposed model provides more accurate predictions of image quality for object detection than commonly known image quality measures.  more » « less
Award ID(s):
1644946
PAR ID:
10123736
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
International Journal of Multimedia Data Engineering and Management
Volume:
10
Issue:
1
ISSN:
1947-8534
Page Range / eLocation ID:
22 to 39
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Occupancy detection systems are commonly equipped with high quality cameras and a processor with high computational power to run detection algorithms. This paper presents a human occupancy detection system that uses battery-free cameras and a deep learning model implemented on a low-cost hub to detect human presence. Our low-resolution camera harvests energy from ambient light and transmits data to the hub using backscatter communication. We implement the state-of-the-art YOLOv5 network detection algorithm that offers high detection accuracy and fast inferencing speed on a Raspberry Pi 4 Model B. We achieve an inferencing speed of ∼100ms per image and an overall detection accuracy of >90% with only 2GB CPU RAM on the Raspberry Pi. In the experimental results, we also demonstrate that the detection is robust to noise, illuminance, occlusion, and angle of depression. 
    more » « less
  2. Cameras are increasingly being deployed in cities, enterprises and roads world-wide to enable many applications in public safety, intelligent transportation, retail, healthcare and manufacturing. Often, after initial deployment of the cameras, the environmental conditions and the scenes around these cameras change, and our experiments show that these changes can adversely impact the accuracy of insights from video analytics. This is because the camera parameter settings, though optimal at deployment time, are not the best settings for good-quality video capture as the environmental conditions and scenes around a camera change during operation. Capturing poor-quality video adversely affects the accuracy of analytics. To mitigate the loss in accuracy of insights, we propose a novel, reinforcement-learning based system APT that dynamically, and remotely (over 5G networks), tunes the camera parameters, to ensure a high-quality video capture, which mitigates any loss in accuracy of video analytics. As a result, such tuning restores the accuracy of insights when environmental conditions or scene content change. APT uses reinforcement learning, with no-reference perceptual quality estimation as the reward function. We conducted extensive real-world experiments, where we simultaneously deployed two cameras side-by-side overlooking an enterprise parking lot (one camera only has manufacturer-suggested default setting, while the other camera is dynamically tuned by APT during operation). Our experiments demonstrated that due to dynamic tuning by APT, the analytics insights are consistently better at all times of the day: the accuracy of object detection video analytics application was improved on average by ∼ 42%. Since our reward function is independent of any analytics task, APT can be readily used for different video analytics tasks. 
    more » « less
  3. In Video Analytics Pipelines (VAP), Analytics Units (AUs) such as object detection and face recognition running on remote servers critically rely on surveillance cameras to capture high-quality video streams in order to achieve high accuracy. Modern IP cameras come with a large number of camera parameters that directly affect the quality of the video stream capture. While a few of such parameters, e.g., exposure, focus, white balance are automatically adjusted by the camera internally, the remaining ones are not. We denote such camera parameters as non-automated (NAUTO) parameters. In this paper, we first show that environmental condition changes can have significant adverse effect on the accuracy of insights from the AUs, but such adverse impact can potentially be mitigated by dynamically adjusting NAUTO camera parameters in response to changes in environmental conditions. We then present CamTuner, to our knowledge, the first framework that dynamically adapts NAUTO camera parameters to optimize the accuracy of AUs in a VAP in response to adverse changes in environmental conditions. CamTuner is based on SARSA reinforcement learning and it incorporates two novel components: a light-weight analytics quality estimator and a virtual camera that drastically speed up offline RL training. Our controlled experiments and real-world VAP deployment show that compared to a VAP using the default camera setting, CamTuner enhances VAP accuracy by detecting 15.9% additional persons and 2.6%--4.2% additional cars (without any false positives) in a large enterprise parking lot and 9.7% additional cars in a 5G smart traffic intersection scenario, which enables a new usecase of accurate and reliable automatic vehicle collision prediction (AVCP). CamTuner opens doors for new ways to significantly enhance video analytics accuracy beyond incremental improvements from refining deep-learning models. 
    more » « less
  4. null (Ed.)
    Due to the importance of object detection in video analysis and image annotation, it is widely utilized in a number of computer vision tasks such as face recognition, autonomous vehicles, activity recognition, tracking objects and identity verification. Object detection does not only involve classification and identification of objects within images, but also involves localizing and tracing the objects by creating bounding boxes around the objects and labelling them with their respective prediction scores. Here, we leverage and discuss how connected vision systems can be used to embed cameras, image processing, Edge Artificial Intelligence (AI), and data connectivity capabilities for flood label detection. We favored the engineering definition of label detection that a label is a sequence of discrete measurable observations obtained using a capturing device such as web cameras, smart phone, etc. We built a Big Data service of around 1000 images (image annotation service) including the image geolocation information from various flooding events in the Carolinas (USA) with a total of eight different object categories. Our developed platform has several smart AI tools and task configurations that can detect objects’ edges or contours which can be manually adjusted with a threshold setting so as to best segment the image. The tool has the ability to train the dataset and predict the labels for large scale datasets which can be used as an object detector to drastically reduce the amount of time spent per object particularly for real-time image-based flood forecasting. This research is funded by the US National Science Foundation (NSF). 
    more » « less
  5. Cracks and pores are two common defects in metallic additive manufacturing (AM) parts. In this paper, deep learning-based image analysis is performed for defect (cracks and pores) classification/detection based on SEM images of metallic AM parts. Three different levels of complexities, namely, defect classification, defect detection and defect image segmentation, are successfully achieved using a simple CNN model, the YOLOv4 model and the Detectron2 object detection library, respectively. The tuned CNN model can classify any single defect as either a crack or pore at almost 100% accuracy. The other two models can identify more than 90% of the cracks and pores in the testing images. In addition to the application of static image analysis, defect detection is also successfully applied on a video which mimics the AM process control images. The trained Detectron2 model can identify almost all the pores and cracks that exist in the original video. This study lays a foundation for future in situ process monitoring of the 3D printing process. 
    more » « less