In this paper, we study streaming algorithms that minimize the number of changes made to their internal state (i.e., memory contents). While the design of streaming algorithms typically focuses on minimizing space and update time, these metrics fail to capture the asymmetric costs, inherent in modern hardware and database systems, of reading versus writing to memory. In fact, most streaming algorithms write to their memory on every update, which is undesirable when writing is significantly more expensive than reading. This raises the question of whether streaming algorithms with small space and number of memory writes are possible. We first demonstrate that, for the fundamental Fpmoment estimation problem with p ≥ 1, any streaming algorithm that achieves a constant factor approximation must make Ω(n1-1/p) internal state changes, regardless of how much space it uses. Perhaps surprisingly, we show that this lower bound can be matched by an algorithm which also has near-optimal space complexity. Specifically, we give a (1+ε)-approximation algorithm for Fpmoment estimation that use a near-optimal ~Oε(n1-1/p) number of state changes, while simultaneously achieving near-optimal space, i.e., for p∈[1,2), our algorithm uses poly(log n,1/ε) bits of space for, while for p>2, the algorithm uses ~Oε(n1-1/p) space. We similarly design streaming algorithms that are simultaneously near-optimal in both space complexity and the number of state changes for the heavy-hitters problem, sparse support recovery, and entropy estimation. Our results demonstrate that an optimal number of state changes can be achieved without sacrificing space complexity.
more »
« less
Single-lot, lot-streaming problem for a 1 + m hybrid flow shop
Abstract In this paper, we consider an application of lot-streaming for processing a lot of multiple items in a hybrid flow shop (HFS) for the objective of minimizing makespan. The HFS that we consider consists of two stages with a single machine available for processing in Stage 1 andmidentical parallel machines in Stage 2. We call this problem a 1 + mTSHFS-LSP (two-stage hybrid flow shop, lot streaming problem), and show it to be NP-hard in general, except for the case when the sublot sizes are treated to be continuous. The novelty of our work is in obtaining closed-form expressions for optimal continuous sublot sizes that can be solved in polynomial time, for a given number of sublots. A fast linear search algorithm is also developed for determining the optimal number of sublots for the case of continuous sublot sizes. For the case when the sublot sizes are discrete, we propose a branch-and-bound-based heuristic to determine both the number of sublots and sublot sizes and demonstrate its efficacy by comparing its performance against that of a direct solution of a mixed-integer formulation of the problem by CPLEX®.
more »
« less
- Award ID(s):
- 2034503
- PAR ID:
- 10515768
- Publisher / Repository:
- Springer
- Date Published:
- Journal Name:
- Journal of Global Optimization
- Volume:
- 89
- Issue:
- 2
- ISSN:
- 0925-5001
- Page Range / eLocation ID:
- 435 to 455
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
This paper models the crowdsourced labeling/classification problem as a sparsely encoded source coding problem, where each query answer, regarded as a code bit, is the XOR of a small number of labels, as source information bits. In this paper we leverage the connections between this problem and well-studied codes with sparse representations for the channel coding problem to provide querying schemes with almost optimal number of queries, each of which involving only a constant number of labels. We also extend this scenario to the case where some workers can be unresponsive. For this case, we propose querying schemes where each query involves only $$\log n$$ items, where $$n$$ is the total number of items to be labeled. Furthermore, we consider classification of two correlated labeling systems and provide two-stage querying schemes with almost optimal number of queries each involving a constant number of labels.more » « less
-
Stratified random sampling (SRS) is a widely used sampling technique for approximate query processing. We consider SRS on continuously arriving data streams, and make the following contributions. We present a lower bound that shows that any streaming algorithm for SRS must have (in the worst case) a variance that is Ω(r) factor away from the optimal, where r is the number of strata. We present S-VOILA, a streaming algorithm for SRS that is locally variance-optimal. Results from experiments on real and synthetic data show that S-VOILA results in a variance that is typically close to an optimal offline algorithm, which was given the entire input beforehand. We also present a variance-optimal offline algorithm VOILA for stratified random sampling. VOILA is a strict generalization of the well-known Neyman allocation, which is optimal only under the assumption that each stratum is abundant, i.e. has a large number of data points to choose from. Experiments show that VOILA can have significantly smaller variance (1.4x to 50x) than Neyman allocation on real-world data.more » « less
-
Abstract We study how supersonic streaming velocities of baryons relative to dark matter—a large-scale effect imprinted at recombination and coherent over ∼3 Mpc scales—affect the formation of dwarf galaxies atz≳ 5. We perform cosmological hydrodynamic simulations, including and excluding streaming velocities, in regions centered on halos withMvir(z= 0) ≈ 1010M⊙; the simulations are part of the Feedback In Realistic Environments (FIRE) project and run with FIRE-3 physics. Our simulations comprise many thousands of systems with halo masses betweenMvir= 2 × 105M⊙and 2 × 109M⊙in the redshift rangez= 20–5. A few hundred of these galaxies form stars and have stellar masses ranging from 100 to 107M⊙. While star formation is globally delayed by approximately 50 Myr in the streaming relative to nonstreaming simulations and the number of luminous galaxies is correspondingly suppressed at high redshift in the streaming runs, these effects decay with time. Byz= 5, the properties of the simulated galaxies are nearly identical in the streaming versus nonstreaming runs, indicating that any effects of streaming velocities on the properties of galaxies at the mass scale of classical dwarfs and larger do not persist toz= 0.more » « less
-
null (Ed.)This paper addresses a critical question pertaining to manufacturing sustainability: is it economically viable to implement an island microgrid to power a flow shop system under power demand and supply uncertainty? Though many studies on microgrid sizing are available, the majority assume the microgrid is interconnected with main grid. This paper aims to size wind turbine, photovoltaic and battery storage to energize a multi-stage flow shop system in island mode. A mixed-integer, non-linear programming model is formulated to optimize the renewable portfolio and capacity with the goal of minimizing the levelized cost of energy. The island microgrid is tested in three locations with diverse climate profiles. The results show that net zero energy flow shop production is economically feasible in the areas where the average wind speed exceed 8 m/s at 80-meter tower height, or the battery cost drops below $100,000/MWh. Sensitivity analyses are further carried out with respect to installation cost, demand response program, production scalability, and weather seasonality.more » « less
An official website of the United States government

