Title: Efficient Data Diffusion and Elimination Control Method for Spatio-Temporal Data Retention System
With the development and spread of Internet of Things (IoT) technologies, various types of data for IoT applications can be generated anywhere and at any time. Among such data, some depend heavily on their generation time and location. We define these data as spatio-temporal data (STD). In previous studies, we proposed an STD retention system using vehicular networks to achieve the “local production and consumption of STD” paradigm. The system can quickly provide STD to users within a specific area by retaining the STD within that area. However, the system does not take into account that each type of STD has different retention requirements. In particular, the lifetime of STD and the time needed to diffuse STD across the entire area directly influence retention performance. We therefore propose an efficient diffusion and elimination control method for retention based on the requirements of each type of STD. Simulation results demonstrate that the proposed method satisfies these requirements while maintaining a high coverage rate in the area.
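The control idea in the abstract — diffuse aggressively until an item's required diffusion time has elapsed, then throttle rebroadcasts in dense neighborhoods, and eliminate the item when its lifetime expires — can be sketched as a toy policy. The specific rule and all names below are illustrative assumptions, not the paper's actual algorithm:

```python
def rebroadcast_probability(elapsed, diffusion_deadline, neighbor_count):
    """Probability that a vehicle rebroadcasts an STD item.

    Hypothetical control rule: forward unconditionally while the item
    still has to spread (before its diffusion deadline), then throttle
    in dense neighborhoods to avoid redundant transmissions.
    """
    if elapsed < diffusion_deadline:
        return 1.0                                   # diffusion phase
    return min(1.0, 1.0 / max(1, neighbor_count))    # retention phase

def should_retain(elapsed, lifetime):
    """Eliminate the item once its application-defined lifetime expires."""
    return elapsed < lifetime
```

The key point mirrored from the abstract is that diffusion time and lifetime are per-item parameters, so different STD types get different behavior from the same rule.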
Award ID(s):
1818884
PAR ID:
10289949
Author(s) / Creator(s):
Date Published:
Journal Name:
IEICE transactions on communications
Volume:
E104-B
ISSN:
0916-8516
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1.
    Producing high-resolution near-real-time forecasts of fire behavior and smoke impact that are useful for fire and air quality management requires accurate initialization of the fire location. One common representation of the fire progression is the fire arrival time, which defines the time the fire arrives at a given location. Estimating the fire arrival time is critical for initializing the fire location within coupled fire-atmosphere models. We present a new method that utilizes machine learning to estimate the fire arrival time from satellite data in the form of burning/not burning/no data rasters. The proposed method, based on a support vector machine (SVM), is tested on the 10 largest California wildfires of the 2020 fire season and evaluated using independent observed data from airborne infrared (IR) fire perimeters. The SVM results indicate good agreement with airborne fire observations in terms of fire growth and the spatial representation of fire extent. A 12% burned area absolute percentage error, a 5% total burned area mean percentage error, a 0.21 False Alarm Ratio average, a 0.86 Probability of Detection average, and a 0.82 Sørensen’s coefficient average suggest that this method can be used to monitor wildfires in near-real-time and provide accurate fire arrival times for improving fire modeling even in the absence of IR fire perimeters.
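The arrival-time idea can be illustrated with a toy single-pixel estimator. The paper fits an SVM surface in (x, y, t) separating burning from not-burning detections; the stand-in below merely brackets the arrival between the last clear "not_burning" and the first "burning" observation. All names are hypothetical:

```python
def arrival_time(observations):
    """Estimate fire arrival time at one pixel from satellite detections.

    observations: list of (time, state) pairs sorted by time, with state
    in {"burning", "not_burning", "no_data"}.  Toy stand-in for the
    paper's SVM boundary: take the midpoint between the last clear
    "not_burning" detection and the first "burning" detection.
    """
    last_clear, first_fire = None, None
    for t, state in observations:
        if state == "not_burning":
            last_clear = t
        elif state == "burning":
            first_fire = t
            break
    if first_fire is None:
        return None                # never observed burning
    if last_clear is None:
        return first_fire          # burning from the first observation
    return (last_clear + first_fire) / 2
```

The "no_data" state is simply skipped, which is why satellite gaps widen the bracket rather than breaking the estimate.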
  2. Data privacy requirements are a complex and quickly evolving part of the data management domain. Especially in healthcare (e.g., the United States Health Insurance Portability and Accountability Act and Veterans Affairs requirements), there has been a strong emphasis on data privacy and protection. Data storage is governed by multiple sources of policy requirements, including internal policies and legal requirements imposed by external governing organizations. Within a database, a single value can be subject to multiple requirements on how long it must be preserved and when it must be irrecoverably destroyed, which often results in a complex set of overlapping and potentially conflicting policies. Existing storage systems lack sufficient support for these critical and evolving rules, making compliance an underdeveloped aspect of data management. As a result, many organizations must implement manual ad-hoc solutions to ensure compliance. As long as organizations depend on manual approaches, there is an increased risk of non-compliance and threat to customer data privacy. In this paper, we detail and implement an automated, comprehensive data management compliance framework facilitating retention and purging compliance within a database management system. This framework can be integrated into existing databases without requiring changes to existing business processes. Our proposed implementation uses SQL to set policies and automate compliance. We validate this framework on a Postgres database and measure the factors that contribute to its modest performance overhead (13% in a simulated real-world workload).
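A minimal sketch of SQL-driven purging, using an in-memory SQLite database for self-containment. The schema, the `purge_after_days` policy field, and the purge rule below are invented for illustration and are not the paper's actual framework:

```python
import sqlite3

# Hypothetical schema: each record carries a creation day and a policy id;
# a policy defines the age at which records must be destroyed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE policy (id INTEGER PRIMARY KEY, purge_after_days INTEGER);
    CREATE TABLE record (id INTEGER PRIMARY KEY, policy_id INTEGER,
                         created_day INTEGER, payload TEXT);
""")
conn.execute("INSERT INTO policy VALUES (1, 30)")
conn.executemany("INSERT INTO record VALUES (?, ?, ?, ?)",
                 [(1, 1, 0, "old"), (2, 1, 80, "new")])

def purge(conn, today):
    """Irrecoverably delete records whose policy's purge age has passed."""
    conn.execute("""
        DELETE FROM record WHERE id IN (
            SELECT r.id FROM record r JOIN policy p ON r.policy_id = p.id
            WHERE ? - r.created_day >= p.purge_after_days)
    """, (today,))

purge(conn, 100)
remaining = [row[0] for row in conn.execute("SELECT id FROM record ORDER BY id")]
```

Expressing the policy as data in a table, rather than application code, is what lets compliance be automated without changing existing business processes.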
  3. Modern last-level caches are partitioned into slices that are spread across the chip, giving rise to varying access latencies dictated by the physical location of the accessing core and the cache slice being accessed. Although prior work has shown that dynamically determining the best location for blocks within such Non-Uniform Cache Access architectures can provide significant performance benefits, current hardware does not implement this functionality. Instead, modern processors hash blocks across the LLC slices, obscuring the non-uniform architecture of the underlying cache and forfeiting the performance benefits of placing data in the nearest cache slices. Moreover, while prior work advocated improving performance by delegating control over block placement to the operating system at page granularity, modern processor hardware thwarts these approaches by hashing cache slice selection at cache block granularity. In this work, we make two observations that enable us to improve software performance on modern NUCA architectures. First, we find that software can undo the hashing performed by hardware and efficiently manage data placement at cache block granularity. Second, we find that the complexity of fine-grained data placement can be hidden from the developer by embedding it in the dynamic memory allocator. Leveraging these observations, we design a new specialized memory allocator, NUCAlloc, suitable for use with C++ containers such as std::map and std::set. NUCAlloc handles the complexity of NUCA-aware block placement, improving the performance of containers by placing their data into the nearest LLC slices. We demonstrate that our NUCAlloc prototype consistently outperforms std::allocator and jemalloc for LLC-resident containers, improving performance by up to 20% in both single-threaded and multi-threaded software.
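The first observation — software can undo the hardware's slice hash and place data at cache-block granularity — can be sketched with a toy hash. Real slice-selection hashes are undocumented XOR trees over physical address bits; the bit-fold below is a simplified stand-in, and both function names are hypothetical:

```python
def slice_of(addr, n_slices=8):
    """Toy stand-in for the hardware slice-selection hash: XOR-fold the
    cache-line address bits (real hashes are more involved)."""
    line = addr >> 6                 # 64-byte cache lines
    h = 0
    while line:
        h ^= line & (n_slices - 1)
        line >>= 3                   # fold 3 bits at a time for 8 slices
    return h

def alloc_in_slice(target_slice, pool_base, pool_lines=4096):
    """Return the first line in a (hypothetical) free pool that the hash
    maps to the desired slice -- the allocator-side 'undo' of the hash."""
    for i in range(pool_lines):
        addr = pool_base + 64 * i
        if slice_of(addr) == target_slice:
            return addr
    return None
```

Because the hash is deterministic, the allocator never needs to invert it algebraically: searching a pool of candidate lines for one that maps to the nearest slice is enough, which is essentially what embedding the logic in the allocator buys.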
  4. In Location-Based Services (LBS), users are required to disclose their precise location information to query a service provider. An untrusted service provider can abuse those queries to infer sensitive information about a user through spatio-temporal and historical data analyses. After outlining the drawbacks of existing privacy-preserving approaches in LBS, we propose a user-centric obfuscation approach, called KLAP, based on three fundamental obfuscation requirements: k number of locations, l-diversity, and privacy area preservation. Considering a user's sensitivity to different locations and utilizing Real-Time Traffic Information (RTTI), KLAP generates a convex Concealing Region (CR) to hide the user's location such that the locations forming the CR share similar sensitivity and are resilient against a wide range of inferences in the spatio-temporal domain. For the first time, a novel CR pruning technique is proposed to significantly improve the delay between successive CR submissions. We carry out an experiment with a real dataset to show the approach's effectiveness for sporadic, frequent, and continuous service use cases.
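A toy sketch of the concealment idea: select the k candidate locations that share the user's sensitivity class and are nearest to the user, from which a convex concealing region would then be formed. This omits KLAP's RTTI weighting, l-diversity check, and CR pruning, and all names are illustrative:

```python
def concealing_set(user_loc, candidates, k, sensitivity):
    """Choose k candidate locations sharing the user's sensitivity class,
    nearest first -- a stand-in for the locations that would form the CR.

    candidates: list of ((x, y), sensitivity_label) pairs.
    """
    same = [(p, s) for p, s in candidates if s == sensitivity]
    same.sort(key=lambda ps: (ps[0][0] - user_loc[0]) ** 2
                             + (ps[0][1] - user_loc[1]) ** 2)
    return [p for p, _ in same[:k]]
```

Filtering on the sensitivity label before picking nearest neighbors mirrors the requirement that locations forming the CR share similar sensitivity, so an adversary cannot single out the true location by its semantic class.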
  5.
    RNA sequencing data have been abundantly generated in biomedical research for biomarker discovery and other studies. Such data at the exon level are usually heavy-tailed and correlated. Conventional statistical tests based on the mean or median difference for differential expression likely suffer from low power when the between-group difference occurs mostly in the upper or lower tail of the distribution of gene expression. We propose a tail-based test to make comparisons between groups in terms of a specific distribution area rather than a single location. The proposed test, which is derived from quantile regression, adjusts for covariates and accounts for within-sample dependence among the exons through a specified correlation structure. Through Monte Carlo simulation studies, we show that the proposed test is generally more powerful and robust in detecting differential expression than commonly used tests based on the mean or a single quantile. An application to TCGA lung adenocarcinoma data demonstrates the promise of the proposed method in terms of biomarker discovery.
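A toy stand-in for a tail-based comparison: average the between-group differences at several upper-tail quantiles and assess the statistic with a permutation test. The paper's actual test is quantile-regression based, adjusts for covariates, and models exon correlation; none of that appears in this sketch, and all names are illustrative:

```python
import random
from statistics import mean

def quantile(xs, q):
    """Simple nearest-rank empirical quantile."""
    s = sorted(xs)
    return s[min(len(s) - 1, int(q * len(s)))]

def tail_stat(a, b, qs=(0.8, 0.9, 0.95)):
    """Average group difference across several upper-tail quantiles,
    comparing a distribution area rather than a single location."""
    return mean(quantile(a, q) - quantile(b, q) for q in qs)

def permutation_pvalue(a, b, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the tail statistic."""
    rng = random.Random(seed)
    observed = abs(tail_stat(a, b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(tail_stat(pooled[:len(a)], pooled[len(a):])) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Averaging over a band of upper quantiles is what gives a test like this power against differences concentrated in the tail, where a mean- or median-based test would see little signal.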