Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
Kumar, Amit ; Ron-Zewi, Noga (Ed.)Estimating the size of the union of a stream of sets S₁, S₂, …, S_M where each set is a subset of a known universe Ω is a fundamental problem in data streaming. This problem naturally generalizes the well-studied 𝖥₀ estimation problem in the streaming literature, where each set contains a single element from the universe. We consider the general case when the sets S_i can be succinctly represented and allow efficient membership, cardinality, and sampling queries (called a Delphic family of sets). A notable example in this framework is the Klee’s Measure Problem (KMP), where every set S_i is an axis-parallel rectangle in d-dimensional spaces (Ω = [Δ]^d where [Δ] := {1, … ,Δ} and Δ ∈ ℕ). Recently, Meel, Chakraborty, and Vinodchandran (PODS-21, PODS-22) designed a streaming algorithm for (ε,δ)-estimation of the size of the union of set streams over Delphic family with space and update time complexity O((log³|Ω|)/ε² ⋅ log 1/δ) and Õ((log⁴|Ω|)/ε² ⋅ log 1/(δ)), respectively. This work presents a new, sampling-based algorithm for estimating the size of the union of Delphic sets that has space and update time complexity Õ((log²|Ω|)/ε² ⋅ log 1/(δ)). This improves the space complexity bound by a log|Ω| factor and update time complexity bound by a log² |Ω| factor. A critical question is whether quadratic dependence of log|Ω| on space and update time complexities is necessary. Specifically, can we design a streaming algorithm for estimating the size of the union of sets over Delphic family with space and complexity linear in log|Ω| and update time poly(log|Ω|)? While this appears technically challenging, we show that establishing a lower bound of ω(log|Ω|) with poly(log|Ω|) update time is beyond the reach of current techniques. Specifically, we show that under certain hard-to-prove computational complexity hypothesis, there is a streaming algorithm for the problem with optimal space complexity O(log|Ω|) and update time poly(log(|Ω|)). Thus, establishing a space lower bound of ω(log|Ω|) will lead to break-through complexity class separation results.more » « lessFree, publicly-accessible full text available September 16, 2025
-
In today's digital age, it is becoming increasingly prevalent to retain digital footprints in the cloud indefinitely. Nonetheless, there is a valid argument that entities should have the authority to decide whether their personal data remains within a specific database or is expunged. Indeed, nations across the globe are increasingly enacting legislation to uphold the Right To Be Forgotten for individuals. Investigating computational challenges, including the formalization and implementation of this notion, is crucial due to its relevance in the domains of data privacy and management.
This work introduces a new streaming model: the 'Right to be Forgotten Data Streaming Model' (RFDS model). The main feature of this model is that any element in the stream has the right to have its history removed from the stream. Formally, the input is a stream of updates of the form (a, Δ) where Δ ∈ {+, ⊥} and a is an element from a universe U. When the update Δ=+ occurs, the frequency of a, denoted as fa, is incremented to fa+1. When the update Δ=⊥, occurs, fais set to 0. This feature, which represents the forget request, distinguishes the present model from existing data streaming models.
This work systematically investigates computational challenges that arise while incorporating the notion of the right to be forgotten. Our initial considerations reveal that even estimating F1(sum of the frequencies of elements) of the stream is a non-trivial problem in this model. Based on the initial investigations, we focus on a modified model which we call α-RFDS where we limit the number of forget operations to be at most α fraction. In this modified model, we focus on estimating F0(number of distinct elements) and F1. We present algorithms and establish almost-matching lower bounds on the space complexity for these computational tasks.
Free, publicly-accessible full text available May 10, 2025 -
Given a data stream 𝒟 = ⟨ a₁, a₂, …, a_m ⟩ of m elements where each a_i ∈ [n], the Distinct Elements problem is to estimate the number of distinct elements in 𝒟. Distinct Elements has been a subject of theoretical and empirical investigations over the past four decades resulting in space optimal algorithms for it. All the current state-of-the-art algorithms are, however, beyond the reach of an undergraduate textbook owing to their reliance on the usage of notions such as pairwise independence and universal hash functions. We present a simple, intuitive, sampling-based space-efficient algorithm whose description and the proof are accessible to undergraduates with the knowledge of basic probability theory.more » « less
-
Given a family of sets (S1, S2,... SM) over a universe Ω, estimating the size of their union in the data streaming model is a fundamental computational problem with a wide variety of applications. The holy grail in the field of streaming is to seek design of algorithms that achieve (ε, δ)-approximation with poly(log |Ω|, ε-1, log δ-1) space and update time complexity. Earlier investigations achieve algorithms with desired space and update time complexity for restricted cases such as singletons (Distinct Elements problem), one-dimensional ranges, arithmetic progressions, and sub-cubes. However, techniques used in these works fail for many other simple structured sets. A prominent example is that of Klee's Measure Problem (KMP), wherein every set Si is represented by an axis-parallel rectangle in d-dimensional spaces. Despite extensive prior work, the best-known streaming algorithms for many of these cases depend on the size of the stream, and therefore the problem of whether there exists a streaming algorithm for estimations of size of the union of sets with poly(log |Ω|, ε-1, log δ-1) space and update time complexity has remained open. In this work, we focus on certain general families of sets called Delphic families (which allows efficient membership, sampling, and cardinality queries). Such families of sets capture several well-known problems, including KMP, test coverage, and hypervolume estimation. The primary contribution of our work is to resolve the above-mentioned open problem for streams over Delphic families. In particular, we design the first streaming algorithm for estimating |⋃i=1M Si| with poly(log |Ω|, ε-1, log δ-1) space and update time complexity (independent of M, the length of the stream) when each Si is a member from a Delphic family of sets. We further generalize our results to larger families of sets, called approximate-Delphic families, for which the size of a set can be known approximately but not exactly. Our results resolve two of the open problems listed in Meel, Vinodchandran, Chakraborty (PODS-21).more » « less
-
Here, we present the first example of reductive silylation for oxygen defect formation at the surface of a polyoxometalate. Upon addition of 1,4-bis(trimethylsilyl)dihydropyrazine (Pyz(SiMe 3 ) 2 ) to [V 6 O 7 (OMe) 12 ] 1− , quantitative formation of the oxygen-deficient vanadium oxide assembly, [V 6 O 6 (OMe) 12 ] 1− was observed. Substoichiometric reactions of Pyz(SiMe 3 ) 2 with the parent cluster revealed the mechanism of defect formation; addition of 0.5 equiv. of Pyz(SiMe 3 ) 2 to [V 6 O 7 (OMe) 12 ] 1− results in isolation of [V 6 O 6 (OSiMe 3 )(OMe) 12 ] 1− . This reactivity was extended to reduced and oxidized forms of the cluster, [V 6 O 7 (OMe) 12 ] n ( n = 2-, 0), revealing the consequences of modifying the oxidation states of remote transition metal ions on the stability of the siloxide functional group, and thus the extent of reactivity of the cluster surface with Pyz(SiMe 3 ) 2 . The work offers a new understanding of the mechanisms of surface activation of reducible metal oxides via reductive silylation, and reveals new chemical routes for the formation of oxygen atom vacancies in polyoxometalate ions.more » « less
-
We report the synthesis and characterisation of a series of siloxide-functionalised polyoxovanadate–alkoxide (POV–alkoxide) clusters, [V 6 O 6 (OSiMe 3 )(OMe) 12 ] n ( n = 1−, 2−), that serve as molecular models for proton and hydrogen-atom uptake in vanadium dioxide, respectively. Installation of a siloxide moiety on the surface of the Lindqvist core was accomplished via addition of trimethylsilyl trifluoromethylsulfonate to the fully-oxygenated cluster [V 6 O 7 (OMe) 12 ] 2− . Characterisation of [V 6 O 6 (OSiMe 3 )(OMe) 12 ] 1− by X-ray photoelectron spectroscopy reveals that the incorporation of the siloxide group does not result in charge separation within the hexavanadate assembly, an observation that contrasts directly with the behavior of clusters bearing substitutional dopants. The reduced assembly, [V 6 O 6 (OSiMe 3 )(OMe) 12 ] 2− , provides an isoelectronic model for H-doped VO 2 , with a vanadium( iii ) ion embedded within the cluster core. Notably, structural analysis of [V 6 O 6 (OSiMe 3 )(OMe) 12 ] 2− reveals bond perturbations at the siloxide-functionalised vanadium centre that resemble those invoked upon H-atom uptake in VO 2 through ab initio calculations. Our results offer atomically precise insight into the local structural and electronic consequences of the installation of hydrogen-atom-like dopants in VO 2 , and challenge current perspectives of the operative mechanism of electron–proton co-doping in these materials.more » « less
-
null (Ed.)Polyoxovanadate (POV) clusters are an important subclass of polyoxometalates with a broad range of molecular compositions and physicochemical properties. One relatively underdeveloped application of these polynuclear assemblies involves their use as atomically precise, homogenous molecular models for bulk metal oxides. Given the structural and electronic similarities of POVs and extended vanadium oxide materials, as well as the relative ease of modifying the homogenous congeners, investigation of the chemical and physical properties of pristine and modified cluster complexes presents a method toward understanding the influence of structural modifications ( e.g. crystal structure/phase, chemical makeup of surface ligands, elemental dopants) on the properties of extended solids. This review summarises recent advances in the use of POV clusters as atomically precise models for bulk metal oxides, with particular focus on the assembly of vanadium oxide clusters and the consequences of altering the molecular composition of the assembly via organofunctionalization and the incorporation of elemental “dopants”.more » « less