In recent years, satellites capable of capturing videos have been developed and launched to provide high definition satellite videos that enable applications far beyond the capabilities of remotely sensed imagery. Moving object detection and moving object tracking are among the most essential and challenging tasks, but existing studies have mainly focused on vehicles. To accurately detect and then track more complex moving objects, specifically airplanes, we need to address the challenges posed by the new data. First, slow-moving airplanes may cause foreground aperture problem during detection. Second, various disturbances, especially parallax motion, may cause false detection. Third, airplanes may perform complex motions, which requires a rotation-invariant and scale-invariant tracking algorithm. To tackle these difficulties, we first develop an Improved Gaussian-based Background Subtractor (IPGBBS) algorithm for moving airplane detection. This algorithm adopts a novel strategy for background and foreground adaptation, which can effectively deal with the foreground aperture problem. Then, the detected moving airplanes are tracked by a Primary Scale Invariant Feature Transform (P-SIFT) keypoint matching algorithm. The P-SIFT keypoint of an airplane exhibits high distinctiveness and repeatability. More importantly, it provides a highly rotation-invariant and scale-invariant feature vector that can be used in the matching process to determine the new locations of the airplane in the frame sequence. The method was tested on a satellite video with eight moving airplanes. Compared with state-of-the-art algorithms, our IPGBBS algorithm achieved the best detection accuracy with the highest F1 score of 0.94 and also demonstrated its superiority on parallax motion suppression. The P-SIFT keypoint matching algorithm could successfully track seven out of the eight airplanes. Based on the tracking results, movement trajectories of the airplanes and their dynamic properties were also estimated.
more »
« less
Detecting and Tracking Moving Airplanes from Space Based on Normalized Frame Difference Labeling and Improved Similarity Measures
The emerging satellite videos provide the opportunity to detect moving objects and track their trajectories, which were not possible for remotely sensed imagery with limited temporal resolution. So far, most studies using satellite video data have been concentrated on traffic monitoring through detecting and tracking moving cars, whereas the studies on other moving objects such as airplanes are limited. In this paper, an integrated method for monitoring moving airplanes from a satellite video is proposed. First, we design a normalized frame difference labeling (NFDL) algorithm to detect moving airplanes, which adopts a non-recursive strategy to deliver stable detection throughout the whole video. Second, the template matching (TM) technique is utilized for tracking the detected moving airplanes in the frame sequence by improved similarity measures (ISMs) with better rotation invariance and model drift suppression ability. Template matching with improved similarity measures (TM-ISMs) is further implemented to handle the leave-the-scene problem. The developed method is tested on a satellite video to detect and track eleven moving airplanes. Our NFDL algorithm successfully detects all the moving airplanes with the highest F1 score of 0.88 among existing algorithms. The performance of TM-ISMs is compared with both its traditional counterparts and other state-of-the-art tracking algorithms. The experimental results show that TM-ISMs can handle both rotation and leave-the-scene problems. Moreover, TM-ISMs achieve a very high tracking accuracy of 0.921 and the highest tracking speed of 470.62 frames per second.
more »
« less
- Award ID(s):
- 1826839
- PAR ID:
- 10287806
- Date Published:
- Journal Name:
- Remote Sensing
- Volume:
- 12
- Issue:
- 21
- ISSN:
- 2072-4292
- Page Range / eLocation ID:
- 3589
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Integral imaging has proven useful for three-dimensional (3D) object visualization in adverse environmental conditions such as partial occlusion and low light. This paper considers the problem of 3D object tracking. Two-dimensional (2D) object tracking within a scene is an active research area. Several recent algorithms use object detection methods to obtain 2D bounding boxes around objects of interest in each frame. Then, one bounding box can be selected out of many for each object of interest using motion prediction algorithms. Many of these algorithms rely on images obtained using traditional 2D imaging systems. A growing literature demonstrates the advantage of using 3D integral imaging instead of traditional 2D imaging for object detection and visualization in adverse environmental conditions. Integral imaging’s depth sectioning ability has also proven beneficial for object detection and visualization. Integral imaging captures an object’s depth in addition to its 2D spatial position in each frame. A recent study uses integral imaging for the 3D reconstruction of the scene for object classification and utilizes the mutual information between the object’s bounding box in this 3D reconstructed scene and the 2D central perspective to achieve passive depth estimation. We build over this method by using Bayesian optimization to track the object’s depth in as few 3D reconstructions as possible. We study the performance of our approach on laboratory scenes with occluded objects moving in 3D and show that the proposed approach outperforms 2D object tracking. In our experimental setup, mutual information-based depth estimation with Bayesian optimization achieves depth tracking with as few as two 3D reconstructions per frame which corresponds to the theoretical minimum number of 3D reconstructions required for depth estimation. To the best of our knowledge, this is the first report on 3D object tracking using the proposed approach.more » « less
-
Despite recent progress in Multiple Object Tracking (MOT), several obstacles such as occlusions, similar objects, and complex scenes remain an open challenge. Meanwhile, a systematic study of the cost-performance tradeoff for the popular tracking-by-detection paradigm is still lacking. This paper introduces SMILEtrack, an innovative object tracker that effectively addresses these challenges by integrating an efficient object detector with a Siamese network-based Similarity Learning Module (SLM). The technical contributions of SMILETrack are twofold. First, we propose an SLM that calculates the appearance similarity between two objects, overcoming the limitations of feature descriptors in Separate Detection and Embedding (SDE) models. The SLM incorporates a Patch Self-Attention (PSA) block inspired by the vision Transformer, which generates reliable features for accurate similarity matching. Second, we develop a Similarity Matching Cascade (SMC) module with a novel GATE function for robust object matching across consecutive video frames, further enhancing MOT performance. Together, these innovations help SMILETrack achieve an improved trade-off between the cost (e.g., running speed) and performance (e.g., tracking accuracy) over several existing state-of-the-art benchmarks, including the popular BYTETrack method. SMILETrack outperforms BYTETrack by 0.4-0.8 MOTA and 2.1-2.2 HOTA points on MOT17 and MOT20 datasets. Code is available at http://github.com/pingyang1117/SMILEtrack_official.more » « less
-
Abstract Object tracking in microscopy videos is crucial for understanding biological processes. While existing methods often require fine-tuning tracking algorithms to fit the image dataset, here we explored an alternative paradigm: augmenting the image time-lapse dataset to fit the tracking algorithm. To test this approach, we evaluated whether generative video frame interpolation can augment the temporal resolution of time-lapse microscopy and facilitate object tracking in multiple biological contexts. We systematically compared the capacity of Latent Diffusion Model for Video Frame Interpolation (LDMVFI), Real-time Intermediate Flow Estimation (RIFE), Compression-Driven Frame Interpolation (CDFI), and Frame Interpolation for Large Motion (FILM) to generate synthetic microscopy images derived from interpolating real images. Our testing image time series ranged from fluorescently labeled nuclei to bacteria, yeast, cancer cells, and organoids. We showed that the off-the-shelf frame interpolation algorithms produced bio-realistic image interpolation even without dataset-specific retraining, as judged by high structural image similarity and the capacity to produce segmentations that closely resemble results from real images. Using a simple tracking algorithm based on mask overlap, we confirmed that frame interpolation significantly improved tracking across several datasets without requiring extensive parameter tuning and capturing complex trajectories that were difficult to resolve in the original image time series. Taken together, our findings highlight the potential of generative frame interpolation to improve tracking in time-lapse microscopy across diverse scenarios, suggesting that a generalist tracking algorithm for microscopy could be developed by combining deep learning segmentation models with generative frame interpolation.more » « less
-
This paper proposes a system architecture for tracking multiple ground-based objects using a team of unmanned air systems (UAS). In the architecture pipeline, video data is processed by each UAS to detect motion in the image frame. The ground-based location of the detected motion is estimated using a geolocation algorithm. The subsequent data points are then process by the recently introduced Recursive RANSAC (R-RANSASC) algorithm to produce a set of tracks. These tracks are then communicated over the network and the error in the coordinate frames between vehicles must be estimated. After the tracks have been placed in the same coordinate frame, a track-to-track association algorithm is used to determine which tracks in each camera correspond to tracks in other cameras. Associated tracks are then fused using a distributed information filter. The proposed method is demonstrated on data collected from two multi-rotors tracking a person walking on the ground.more » « less
An official website of the United States government

