Title: Efficient Data Diffusion and Elimination Control Method for Spatio-Temporal Data Retention System
With the development and spread of Internet of Things (IoT) technologies, various types of data for IoT applications can be generated anywhere and at any time. Some of these data depend heavily on the time and location at which they are generated. We define these data as spatiotemporal data (STD). In previous studies, we proposed an STD retention system using vehicular networks to achieve the “Local production and consumption of STD” paradigm. The system can quickly provide STD to users within a specific location by retaining the STD within the area. However, this system does not take into account that each type of STD has different retention requirements. In particular, the lifetime of STD and the time needed to diffuse it to the entire area directly influence the performance of STD retention. Therefore, we propose an efficient diffusion and elimination control method for retention based on the requirements of STD. Simulation results demonstrate that the proposed method can satisfy the requirements of STD while maintaining a high coverage rate in the area.
Award ID(s):
1818884
PAR ID:
10289949
Journal Name:
IEICE transactions on communications
Volume:
E104-B
ISSN:
0916-8516
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Producing high-resolution near-real-time forecasts of fire behavior and smoke impact that are useful for fire and air quality management requires accurate initialization of the fire location. One common representation of the fire progression is through the fire arrival time, which defines the time that the fire arrives at a given location. Estimating the fire arrival time is critical for initializing the fire location within coupled fire-atmosphere models. We present a new method that utilizes machine learning to estimate the fire arrival time from satellite data in the form of burning/not burning/no data rasters. The proposed method, based on a support vector machine (SVM), is tested on the 10 largest California wildfires of the 2020 fire season, and evaluated using independent observed data from airborne infrared (IR) fire perimeters. The SVM method results indicate a good agreement with airborne fire observations in terms of the fire growth and a spatial representation of the fire extent. A 12% burned area absolute percentage error, a 5% total burned area mean percentage error, a 0.21 False Alarm Ratio average, a 0.86 Probability of Detection average, and a 0.82 Sørensen’s coefficient average suggest that this method can be used to monitor wildfires in near-real-time and provide accurate fire arrival times for improving fire modeling even in the absence of IR fire perimeters.
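The three agreement scores quoted in the abstract above (False Alarm Ratio, Probability of Detection, Sørensen's coefficient) are not defined there. The sketch below assumes the standard contingency-table definitions and hypothetical pixel counts. The function name `detection_scores` and the way pixels are counted are illustrative assumptions, not the paper's evaluation code.

```python
def detection_scores(tp: int, fp: int, fn: int) -> dict[str, float]:
    """Agreement scores between a predicted and an observed burned area.

    Assumed (standard) definitions, not taken from the paper:
      tp: pixels burned in both prediction and observation
      fp: pixels burned only in the prediction
      fn: pixels burned only in the observation
    """
    if tp + fp == 0 or tp + fn == 0:
        raise ValueError("need at least one predicted and one observed burned pixel")
    return {
        # Share of predicted burned pixels that were not observed as burned.
        "false_alarm_ratio": fp / (tp + fp),
        # Share of observed burned pixels that the prediction captured.
        "probability_of_detection": tp / (tp + fn),
        # Sørensen (Dice) coefficient: overlap relative to the two areas' sizes.
        "sorensen": 2 * tp / (2 * tp + fp + fn),
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
scores = detection_scores(tp=80, fp=20, fn=20)
```

A lower False Alarm Ratio and higher values of the other two scores indicate closer agreement between the predicted and the observed perimeter.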
  2. Data privacy requirements are a complex and quickly evolving part of the data management domain. Especially in healthcare (e.g., United States Health Insurance Portability and Accountability Act and Veterans Affairs requirements), there has been a strong emphasis on data privacy and protection. Data storage is governed by multiple sources of policy requirements, including internal policies and legal requirements imposed by external governing organizations. Within a database, a single value can be subject to multiple requirements on how long it must be preserved and when it must be irrecoverably destroyed. This often results in a complex set of overlapping and potentially conflicting policies. Existing storage systems lack sufficient support for these critical and evolving rules, making compliance an underdeveloped aspect of data management. As a result, many organizations must implement manual ad-hoc solutions to ensure compliance. As long as organizations depend on manual approaches, there is an increased risk of non-compliance and a threat to customer data privacy. In this paper, we detail and implement an automated, comprehensive data management compliance framework that facilitates retention and purging compliance within a database management system. This framework can be integrated into existing databases without requiring changes to existing business processes. Our proposed implementation uses SQL to set policies and automate compliance. We validate this framework on a Postgres database, and measure the factors that contribute to our reasonable performance overhead (13% in a simulated real-world workload).
  3. Modern last-level caches (LLCs) are partitioned into slices that are spread across the chip, giving rise to varying access latencies dictated by the physical location of the accessing core and the cache slice being accessed. Although prior work has shown that dynamically determining the best location for blocks within such Non-Uniform Cache Access (NUCA) architectures can provide significant performance benefits, current hardware does not implement this functionality. Instead, modern processors hash blocks across the LLC slices, obscuring the non-uniform architecture of the underlying cache and forfeiting the performance benefits of placing data in the nearest cache slices. Moreover, while prior work advocated improving performance by delegating control over block placement to the operating system at page granularity, modern processor hardware thwarts these approaches by hashing cache slice selection at cache block granularity. In this work, we make two observations that enable us to improve software performance on modern NUCA architectures. First, we find that software can undo the hashing performed by hardware and efficiently manage data placement at cache block granularity. Second, we find that the complexity of fine-grained data placement can be hidden from the developer by embedding it in the dynamic memory allocator. Leveraging these observations, we design a new specialized memory allocator, NUCAlloc, suitable for use with C++ containers such as std::map and std::set. NUCAlloc handles the complexity of NUCA-aware block placement, improving the performance of containers by placing their data into the nearest LLC slices. We demonstrate that our NUCAlloc prototype consistently outperforms std::allocator and jemalloc for LLC-resident containers, improving performance by up to 20% in both single-threaded and multi-threaded software.
  4. 115 privacy policies from the OPP-115 corpus have been re-annotated with the specific data retention periods disclosed, in line with the GDPR requirements of Art. 13(2)(a). The retention periods have been categorized into the following 6 distinct cases:
    - C0: No data retention period is indicated in the privacy policy/segment.
    - C1: A specific data retention period is indicated (e.g., days, weeks, months...).
    - C2: It is indicated that the data will be stored indefinitely.
    - C3: A criterion is given from which the period during which the data will be stored can be understood (e.g., as long as the user has an active account).
    - C4: It is indicated that personal data will be stored for an unspecified period, for fraud prevention, legal, or security reasons.
    - C5: It is indicated that personal data will be stored for an unspecified period, for purposes other than fraud prevention, legal, or security.
    Note: If the privacy policy or segment accounts for more than one case, the case with the highest value was annotated (e.g., if case C2 and case C4 apply, C4 is annotated). The ground truth dataset then served as validation for our proposed ChatGPT-based method, the results of which have also been included in this dataset.
    Columns description:
    - policy_id: ID of the policy in the OPP-115 dataset
    - policy_name: Domain of the privacy policy
    - policy_text: Privacy policy collected at the time of OPP-115 dataset creation
    - info_type_value: Type of personal data to which data retention refers
    - retention_period: Period of retention annotated by OPP-115 annotators
    - actual_case: Our annotated case ranging from C0-C5
    - GPT_case: ChatGPT classification of the case identified in the segment
    - actual_Comply_GDPR: Boolean denoting True if they apparently comply with GDPR (cases C1-C5) or False if not (case C0)
    - GPT_Comply_GDPR: Boolean denoting True if they apparently comply with GDPR (cases C1-C5) or False if not (case C0)
    - paragraphs_retention_period: List containing the paragraphs annotated as Data Retention by OPP-115 annotators and our red text describing the relevant information used for our annotation decision
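The annotation rule in the dataset description above (annotate the highest-valued case when several apply; cases C1-C5 count as apparently compliant, C0 does not) can be sketched as follows. The function names `annotate_case` and `complies_with_gdpr` are hypothetical and are not part of the dataset or its tooling. Treating an empty set as C0 is also an assumption.

```python
# Case labels in ascending order of value, as listed in the dataset description.
CASES = ("C0", "C1", "C2", "C3", "C4", "C5")


def annotate_case(applicable: set[str]) -> str:
    """Return the single case to annotate for a policy or segment.

    When more than one case applies, the highest-valued one is chosen.
    An empty set is treated as C0 (no retention period indicated); this
    fallback is an assumption made for the sketch.
    """
    if not applicable:
        return "C0"
    # CASES.index raises ValueError for labels outside C0-C5.
    return max(applicable, key=CASES.index)


def complies_with_gdpr(case: str) -> bool:
    """True for cases C1-C5, False for C0, mirroring the *_Comply_GDPR columns."""
    if case not in CASES:
        raise ValueError(f"unknown case label: {case}")
    return case != "C0"


# Example from the description: C2 and C4 both apply, so C4 is annotated.
chosen = annotate_case({"C2", "C4"})
```

Ordering the labels in a tuple keeps the "highest value wins" rule explicit and rejects labels outside C0-C5.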
  5. In Location-Based Services (LBS), users are required to disclose their precise location information to query a service provider. An untrusted service provider can abuse those queries to infer sensitive information about a user through spatio-temporal and historical data analyses. After describing the drawbacks of existing privacy-preserving approaches in LBS, we propose a user-centric obfuscation approach, called KLAP, based on three fundamental obfuscation requirements: k number of locations, l-diversity, and privacy area preservation. Considering the user's sensitivity to different locations and utilizing Real-Time Traffic Information (RTTI), KLAP generates a convex Concealing Region (CR) to hide the user's location such that the locations forming the CR have similar sensitivity and are resilient against a wide range of inferences in the spatio-temporal domain. For the first time, a novel CR pruning technique is proposed to significantly improve the delay between successive CR submissions. We carry out an experiment with a real dataset to show its effectiveness for sporadic, frequent, and continuous service use cases.