skip to main content


Title: SWARM: Adaptive Load Balancing in Distributed Streaming Systems for Big Spatial Data
The proliferation of GPS-enabled devices has led to the development of numerous location-based services. These services need to process massive amounts of streamed spatial data in real-time. The current scale of spatial data cannot be handled using centralized systems. This has led to the development of distributed spatial streaming systems. Existing systems are using static spatial partitioning to distribute the workload. In contrast, the real-time streamed spatial data follows non-uniform spatial distributions that are continuously changing over time. Distributed spatial streaming systems need to react to the changes in the distribution of spatial data and queries. This article introduces SWARM, a lightweight adaptivity protocol that continuously monitors the data and query workloads across the distributed processes of the spatial data streaming system and redistributes and rebalances the workloads as soon as performance bottlenecks get detected. SWARM is able to handle multiple query-execution and data-persistence models. A distributed streaming system can directly use SWARM to adaptively rebalance the system’s workload among its machines with minimal changes to the original code of the underlying spatial application. Extensive experimental evaluation using real and synthetic datasets illustrate that, on average, SWARM achieves 2 improvement in throughput over a static grid partitioning that is determined based on observing a limited history of the data and query workloads. Moreover, SWARM reduces execution latency on average 4 compared with the other technique.  more » « less
Award ID(s):
1815796 1910216
NSF-PAR ID:
10301806
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
ACM Transactions on Spatial Algorithms and Systems
Volume:
7
Issue:
3
ISSN:
2374-0353
Page Range / eLocation ID:
1 to 43
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. null (Ed.)
    The wide spread of GPS-enabled devices and the Internet of Things (IoT) has increased the amount of spatial data being generated every second. The current scale of spatial data cannot be handled using centralized systems. This has led to the development of distributed spatial data streaming systems that scale to process in real-time large amounts of streamed spatial data. The performance of distributed streaming systems relies on how even the workload is distributed among their machines. However, it is challenging to estimate the workload of each machine because spatial data and query streams are skewed and rapidly change with time and users' interests. Moreover, a distributed spatial streaming system often does not maintain a global system workload state because it requires high network and processing overheads to be collected from the machines in the system. This paper introduces TrioStat; an online workload estimation technique that relies on a probabilistic model for estimating the workload of partitions and machines in a distributed spatial data streaming system. It is infeasible to collect and exchange statistics with a centralized unit because it requires high network overhead. Instead, TrioStat uses a decentralised technique to collect and maintain the required statistics in real-time locally in each machine. TrioStat enables distributed spatial data streaming systems to compare the workloads of machines as well as the workloads of data partitions. TrioStat requires minimal network and storage overhead. Moreover, the required storage is distributed across the system's machines. 
    more » « less
  2. Obeid, Iyad Selesnick (Ed.)
    Electroencephalography (EEG) is a popular clinical monitoring tool used for diagnosing brain-related disorders such as epilepsy [1]. As monitoring EEGs in a critical-care setting is an expensive and tedious task, there is a great interest in developing real-time EEG monitoring tools to improve patient care quality and efficiency [2]. However, clinicians require automatic seizure detection tools that provide decisions with at least 75% sensitivity and less than 1 false alarm (FA) per 24 hours [3]. Some commercial tools recently claim to reach such performance levels, including the Olympic Brainz Monitor [4] and Persyst 14 [5]. In this abstract, we describe our efforts to transform a high-performance offline seizure detection system [3] into a low latency real-time or online seizure detection system. An overview of the system is shown in Figure 1. The main difference between an online versus offline system is that an online system should always be causal and has minimum latency which is often defined by domain experts. The offline system, shown in Figure 2, uses two phases of deep learning models with postprocessing [3]. The channel-based long short term memory (LSTM) model (Phase 1 or P1) processes linear frequency cepstral coefficients (LFCC) [6] features from each EEG channel separately. We use the hypotheses generated by the P1 model and create additional features that carry information about the detected events and their confidence. The P2 model uses these additional features and the LFCC features to learn the temporal and spatial aspects of the EEG signals using a hybrid convolutional neural network (CNN) and LSTM model. Finally, Phase 3 aggregates the results from both P1 and P2 before applying a final postprocessing step. The online system implements Phase 1 by taking advantage of the Linux piping mechanism, multithreading techniques, and multi-core processors. To convert Phase 1 into an online system, we divide the system into five major modules: signal preprocessor, feature extractor, event decoder, postprocessor, and visualizer. The system reads 0.1-second frames from each EEG channel and sends them to the feature extractor and the visualizer. The feature extractor generates LFCC features in real time from the streaming EEG signal. Next, the system computes seizure and background probabilities using a channel-based LSTM model and applies a postprocessor to aggregate the detected events across channels. The system then displays the EEG signal and the decisions simultaneously using a visualization module. The online system uses C++, Python, TensorFlow, and PyQtGraph in its implementation. The online system accepts streamed EEG data sampled at 250 Hz as input. The system begins processing the EEG signal by applying a TCP montage [8]. Depending on the type of the montage, the EEG signal can have either 22 or 20 channels. To enable the online operation, we send 0.1-second (25 samples) length frames from each channel of the streamed EEG signal to the feature extractor and the visualizer. Feature extraction is performed sequentially on each channel. The signal preprocessor writes the sample frames into two streams to facilitate these modules. In the first stream, the feature extractor receives the signals using stdin. In parallel, as a second stream, the visualizer shares a user-defined file with the signal preprocessor. This user-defined file holds raw signal information as a buffer for the visualizer. The signal preprocessor writes into the file while the visualizer reads from it. Reading and writing into the same file poses a challenge. The visualizer can start reading while the signal preprocessor is writing into it. To resolve this issue, we utilize a file locking mechanism in the signal preprocessor and visualizer. Each of the processes temporarily locks the file, performs its operation, releases the lock, and tries to obtain the lock after a waiting period. The file locking mechanism ensures that only one process can access the file by prohibiting other processes from reading or writing while one process is modifying the file [9]. The feature extractor uses circular buffers to save 0.3 seconds or 75 samples from each channel for extracting 0.2-second or 50-sample long center-aligned windows. The module generates 8 absolute LFCC features where the zeroth cepstral coefficient is replaced by a temporal domain energy term. For extracting the rest of the features, three pipelines are used. The differential energy feature is calculated in a 0.9-second absolute feature window with a frame size of 0.1 seconds. The difference between the maximum and minimum temporal energy terms is calculated in this range. Then, the first derivative or the delta features are calculated using another 0.9-second window. Finally, the second derivative or delta-delta features are calculated using a 0.3-second window [6]. The differential energy for the delta-delta features is not included. In total, we extract 26 features from the raw sample windows which add 1.1 seconds of delay to the system. We used the Temple University Hospital Seizure Database (TUSZ) v1.2.1 for developing the online system [10]. The statistics for this dataset are shown in Table 1. A channel-based LSTM model was trained using the features derived from the train set using the online feature extractor module. A window-based normalization technique was applied to those features. In the offline model, we scale features by normalizing using the maximum absolute value of a channel [11] before applying a sliding window approach. Since the online system has access to a limited amount of data, we normalize based on the observed window. The model uses the feature vectors with a frame size of 1 second and a window size of 7 seconds. We evaluated the model using the offline P1 postprocessor to determine the efficacy of the delayed features and the window-based normalization technique. As shown by the results of experiments 1 and 4 in Table 2, these changes give us a comparable performance to the offline model. The online event decoder module utilizes this trained model for computing probabilities for the seizure and background classes. These posteriors are then postprocessed to remove spurious detections. The online postprocessor receives and saves 8 seconds of class posteriors in a buffer for further processing. It applies multiple heuristic filters (e.g., probability threshold) to make an overall decision by combining events across the channels. These filters evaluate the average confidence, the duration of a seizure, and the channels where the seizures were observed. The postprocessor delivers the label and confidence to the visualizer. The visualizer starts to display the signal as soon as it gets access to the signal file, as shown in Figure 1 using the “Signal File” and “Visualizer” blocks. Once the visualizer receives the label and confidence for the latest epoch from the postprocessor, it overlays the decision and color codes that epoch. The visualizer uses red for seizure with the label SEIZ and green for the background class with the label BCKG. Once the streaming finishes, the system saves three files: a signal file in which the sample frames are saved in the order they were streamed, a time segmented event (TSE) file with the overall decisions and confidences, and a hypotheses (HYP) file that saves the label and confidence for each epoch. The user can plot the signal and decisions using the signal and HYP files with only the visualizer by enabling appropriate options. For comparing the performance of different stages of development, we used the test set of TUSZ v1.2.1 database. It contains 1015 EEG records of varying duration. The any-overlap performance [12] of the overall system shown in Figure 2 is 40.29% sensitivity with 5.77 FAs per 24 hours. For comparison, the previous state-of-the-art model developed on this database performed at 30.71% sensitivity with 6.77 FAs per 24 hours [3]. The individual performances of the deep learning phases are as follows: Phase 1’s (P1) performance is 39.46% sensitivity and 11.62 FAs per 24 hours, and Phase 2 detects seizures with 41.16% sensitivity and 11.69 FAs per 24 hours. We trained an LSTM model with the delayed features and the window-based normalization technique for developing the online system. Using the offline decoder and postprocessor, the model performed at 36.23% sensitivity with 9.52 FAs per 24 hours. The trained model was then evaluated with the online modules. The current performance of the overall online system is 45.80% sensitivity with 28.14 FAs per 24 hours. Table 2 summarizes the performances of these systems. The performance of the online system deviates from the offline P1 model because the online postprocessor fails to combine the events as the seizure probability fluctuates during an event. The modules in the online system add a total of 11.1 seconds of delay for processing each second of the data, as shown in Figure 3. In practice, we also count the time for loading the model and starting the visualizer block. When we consider these facts, the system consumes 15 seconds to display the first hypothesis. The system detects seizure onsets with an average latency of 15 seconds. Implementing an automatic seizure detection model in real time is not trivial. We used a variety of techniques such as the file locking mechanism, multithreading, circular buffers, real-time event decoding, and signal-decision plotting to realize the system. A video demonstrating the system is available at: https://www.isip.piconepress.com/projects/nsf_pfi_tt/resources/videos/realtime_eeg_analysis/v2.5.1/video_2.5.1.mp4. The final conference submission will include a more detailed analysis of the online performance of each module. ACKNOWLEDGMENTS Research reported in this publication was most recently supported by the National Science Foundation Partnership for Innovation award number IIP-1827565 and the Pennsylvania Commonwealth Universal Research Enhancement Program (PA CURE). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the official views of any of these organizations. REFERENCES [1] A. Craik, Y. He, and J. L. Contreras-Vidal, “Deep learning for electroencephalogram (EEG) classification tasks: a review,” J. Neural Eng., vol. 16, no. 3, p. 031001, 2019. https://doi.org/10.1088/1741-2552/ab0ab5. [2] A. C. Bridi, T. Q. Louro, and R. C. L. Da Silva, “Clinical Alarms in intensive care: implications of alarm fatigue for the safety of patients,” Rev. Lat. Am. Enfermagem, vol. 22, no. 6, p. 1034, 2014. https://doi.org/10.1590/0104-1169.3488.2513. [3] M. Golmohammadi, V. Shah, I. Obeid, and J. Picone, “Deep Learning Approaches for Automatic Seizure Detection from Scalp Electroencephalograms,” in Signal Processing in Medicine and Biology: Emerging Trends in Research and Applications, 1st ed., I. Obeid, I. Selesnick, and J. Picone, Eds. New York, New York, USA: Springer, 2020, pp. 233–274. https://doi.org/10.1007/978-3-030-36844-9_8. [4] “CFM Olympic Brainz Monitor.” [Online]. Available: https://newborncare.natus.com/products-services/newborn-care-products/newborn-brain-injury/cfm-olympic-brainz-monitor. [Accessed: 17-Jul-2020]. [5] M. L. Scheuer, S. B. Wilson, A. Antony, G. Ghearing, A. Urban, and A. I. Bagic, “Seizure Detection: Interreader Agreement and Detection Algorithm Assessments Using a Large Dataset,” J. Clin. Neurophysiol., 2020. https://doi.org/10.1097/WNP.0000000000000709. [6] A. Harati, M. Golmohammadi, S. Lopez, I. Obeid, and J. Picone, “Improved EEG Event Classification Using Differential Energy,” in Proceedings of the IEEE Signal Processing in Medicine and Biology Symposium, 2015, pp. 1–4. https://doi.org/10.1109/SPMB.2015.7405421. [7] V. Shah, C. Campbell, I. Obeid, and J. Picone, “Improved Spatio-Temporal Modeling in Automated Seizure Detection using Channel-Dependent Posteriors,” Neurocomputing, 2021. [8] W. Tatum, A. Husain, S. Benbadis, and P. Kaplan, Handbook of EEG Interpretation. New York City, New York, USA: Demos Medical Publishing, 2007. [9] D. P. Bovet and C. Marco, Understanding the Linux Kernel, 3rd ed. O’Reilly Media, Inc., 2005. https://www.oreilly.com/library/view/understanding-the-linux/0596005652/. [10] V. Shah et al., “The Temple University Hospital Seizure Detection Corpus,” Front. Neuroinform., vol. 12, pp. 1–6, 2018. https://doi.org/10.3389/fninf.2018.00083. [11] F. Pedregosa et al., “Scikit-learn: Machine Learning in Python,” J. Mach. Learn. Res., vol. 12, pp. 2825–2830, 2011. https://dl.acm.org/doi/10.5555/1953048.2078195. [12] J. Gotman, D. Flanagan, J. Zhang, and B. Rosenblatt, “Automatic seizure detection in the newborn: Methods and initial evaluation,” Electroencephalogr. Clin. Neurophysiol., vol. 103, no. 3, pp. 356–362, 1997. https://doi.org/10.1016/S0013-4694(97)00003-9. 
    more » « less
  3. The constant flux of data and queries alike has been pushing the boundaries of data analysis systems. The increasing size of raw data files has made data loading an expensive operation that delays the data-to-insight time. To alleviate the loading cost, in situ query processing systems operate directly over raw data and offer instant access to data. At the same time, analytical workloads have increasing number of queries. Typically, each query focuses on a constantly shifting—yet small—range. As a result, minimizing the workload latency requires the benefits of indexing in in situ query processing. In this paper, we present an online partitioning and indexing scheme, along with a partitioning and indexing tuner tailored for in situ querying engines. The proposed system design improves query execution time by taking into account user query patterns, to (i) partition raw data files logically and (ii) build lightweight partition-specific indexes for each partition. We build an in situ query engine called Slalom to showcase the impact of our design. Slalom employs adaptive partitioning and builds non-obtrusive indexes in different partitions on-the-fly based on lightweight query access pattern monitoring. As a result of its lightweight nature, Slalom achieves efficient query processing over raw data with minimal memory consumption. Our experimentation with both microbenchmarks and real-life workloads shows that Slalom outperforms state-of-the-art in situ engines and achieves comparable query response times with fully indexed DBMS, offering lower cumulative query execution times for query workloads with increasing size and unpredictable access patterns. 
    more » « less
  4. Aref, Walid G. (Ed.)

    The proliferation of mobile phones and location-based services has given rise to an explosive growth in spatial data. In order to enable spatial data analytics, spatial data needs to be streamed into a data stream warehouse system that can provide real-time analytical results over the most recent and historical spatial data in the warehouse. Existing data stream warehouse systems are not tailored for spatial data. In this paper, we introduce theSTARsystem.STARis a distributed in-memory data stream warehouse system that provides low-latency and up-to-date analytical results over a fast-arriving spatial data stream.STARsupports both snapshot and continuous queries that are composed of aggregate functions and ad hoc query constraints over spatial, textual, and temporal data attributes.STARimplements a cache-based mechanism to facilitate the processing of snapshot queries that collectively utilizes the techniques of query-based caching (i.e., view materialization) and object-based caching. Moreover, to speed-up processing continuous queries,STARproposes a novel index structure that achieves high efficiency in both object checking and result updating. Extensive experiments over real data sets demonstrate the superior performance ofSTARover existing systems.

     
    more » « less
  5. We are witnessing a race to meet the ever-growing computation requirements of emerging AI applications to provide perception and control in autonomous vehicles — e.g., self-driving cars and UAVs. To remain competitive, vendors are packing more processing units (CPUs, programmable logic, GPUs, and hardware accelerators) into next-generation multiprocessor systems-on-a-chip (MPSoC). As a result, modern embedded platforms are achieving new heights in peak computational capacity. Unfortunately, however, the collateral and inevitable increase in complexity represents a major obstacle for the development of correct-by-design safety-critical real-time applications. Due to the ever-growing gap between fast-paced hardware evolution and comparatively slower evolution of real-time operating systems (RTOS), there is a need for real-time oriented full-platform management frameworks to complement traditional RTOS designs. In this work, we propose one such framework, namely the X-Stream framework, for the definition, synthesis, and analysis of real-time workloads targeting state-of-the-art accelerator-augmented embedded platforms. Our X-Stream framework is designed around two cardinal principles. First, computation and data movements are orchestrated to achieve predictability by design. For this purpose, iterative computation over large data chunks is divided into subsequent segments. These segments are then streamed leveraging the three-phase execution model (load, execute and unload). Second, the framework is workflow-centric: system designers can specify their workflow and the necessary code for workflow orchestration is automatically generated. In addition to automating the deployment of user-defined hardware-accelerated workloads, X-Stream supports the deployment of some computation segments on traditional CPUs. Finally, X-Stream allows the definition of real-time partitions. Each partition groups applications belonging to the same criticality level and that share the same set of hardware resources, with support for preemptive priority-driven scheduling. Conversely, freedom from interference for applications deployed in different partitions is guaranteed by design. We provide a full-system implementation that includes RTOS integration and showcase the proposed X-Stream framework on a Xilinx Ultrascale+ platform by focusing on a matrix-multiplication and addition kernel use-case. 
    more » « less