In this paper, we consider transmission scheduling in a status update system, where updates are generated periodically and transmitted over a Gilbert-Elliott fading channel. The goal is to minimize the long-run average age of information (AoI) under a long-run average energy constraint. We consider two practical cases for obtaining channel state information (CSI): (i) without channel sensing and (ii) with delayed channel sensing. For (i), CSI is revealed only by the feedback (ACK/NACK) of a transmission; when no transmission occurs, no CSI is revealed. Thus, we have to balance tradeoffs across energy, AoI, channel exploration, and channel exploitation. The problem is formulated as a constrained partially observable Markov decision process (POMDP). We show that the optimal policy is a randomized mixture of no more than two stationary deterministic policies, each of which is of threshold type in the belief on the channel. For (ii), (delayed) CSI is available via channel sensing, so the tradeoff is only between AoI and energy. The problem is formulated as a constrained MDP. The optimal policy is shown to have a structure similar to that in (i), but with a threshold on the AoI. Building on these results, we develop an optimal structure-aware algorithm for each case.
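To make the belief-threshold idea in case (i) concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a two-state channel with hypothetical transition probabilities `p_gg` (good to good) and `p_bg` (bad to good), that a transmission succeeds exactly when the channel is good, that a fresh update is available in every slot, and that a single fixed belief threshold is used. All names and parameter values are illustrative.

```python
import random


def next_belief(belief, p_gg, p_bg, transmitted, ack=None):
    """Return P(channel is good in the next slot) given this slot's action."""
    if transmitted:
        # Assumption: ACK means the channel was good, NACK means it was bad,
        # so the feedback reveals the state of the slot just used.
        return p_gg if ack else p_bg
    # No transmission, hence no feedback: push the belief through the chain.
    return belief * p_gg + (1.0 - belief) * p_bg


def simulate(threshold, p_gg=0.9, p_bg=0.2, slots=100_000, seed=0):
    """Simulate 'transmit iff belief >= threshold'.

    Returns (average age, fraction of slots with a transmission).
    """
    rng = random.Random(seed)
    good = True      # true channel state, hidden from the scheduler
    belief = 1.0     # the initial state is assumed known to be good
    age = 1
    total_age = 0
    transmissions = 0
    for _ in range(slots):
        if belief >= threshold:
            transmissions += 1
            ack = good
            age = 1 if ack else age + 1
            belief = next_belief(belief, p_gg, p_bg, True, ack)
        else:
            age += 1
            belief = next_belief(belief, p_gg, p_bg, False)
        total_age += age
        # The channel evolves as a two-state Markov chain.
        good = rng.random() < (p_gg if good else p_bg)
    return total_age / slots, transmissions / slots
```

Under these assumptions, raising the threshold makes the scheduler wait longer after a NACK before probing the channel again, which lowers the transmission rate (the energy proxy here) and typically raises the average age.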
Age Minimization with Energy and Distortion Constraints
In this paper, we consider a status update system in which an access point collects measurements from multiple sensors that monitor a common physical process, fuses them, and transmits the aggregated sample to the destination over an erasure channel. Under a typical information fusion scheme, the distortion of the fused sample is inversely proportional to the number of measurements received. Our goal is to minimize the long-term average age while satisfying the average energy and general age-based distortion requirements. Specifically, we focus on the setting in which the distortion requirement becomes stricter as the age of the update increases. We show that the optimal policy is a mixture of two stationary, deterministic, threshold-based policies, each of which is optimal for a parameterized problem that minimizes the weighted sum of the age and energy under the distortion constraint. We then derive analytically the associated optimal average age-cost function and characterize its performance in the large-threshold regime; these results shed light on the tradeoff among the age, energy, and distortion of the samples. We also develop a closed-form solution for the special case in which the distortion requirement is independent of the age, arguably the most important setting for practical applications.
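The mixture of two threshold policies can be illustrated with the standard construction for constrained average-cost problems. This is a sketch under stated assumptions, not the paper's derivation: one policy uses less energy than the budget allows, the other uses more, and the long-run averages of the mixture are assumed to combine linearly in the mixing probability. The function names and the numbers in the usage lines are hypothetical.

```python
def mixing_weight(energy_low, energy_high, energy_budget):
    """Probability of using the low-energy policy so the mixture meets the budget."""
    if not energy_low <= energy_budget <= energy_high:
        raise ValueError("budget must lie between the two policies' energies")
    if energy_high == energy_low:
        return 1.0  # both policies use the same energy; either one is fine
    return (energy_high - energy_budget) / (energy_high - energy_low)


def mixed_performance(age_low, energy_low, age_high, energy_high, energy_budget):
    """Return (mixing probability, average age, average energy) of the mixture."""
    mu = mixing_weight(energy_low, energy_high, energy_budget)
    # Assumption: long-run averages are linear in the mixing probability.
    age = mu * age_low + (1.0 - mu) * age_high
    energy = mu * energy_low + (1.0 - mu) * energy_high
    return mu, age, energy


# Hypothetical numbers: a sparse policy (age 6.0, energy 0.2) and a
# dense policy (age 3.0, energy 0.5) mixed to meet a budget of 0.3.
mu, age, energy = mixed_performance(6.0, 0.2, 3.0, 0.5, 0.3)
print(round(mu, 4), round(age, 4), round(energy, 4))
```

With these illustrative values the low-energy policy is used two thirds of the time, and the energy constraint holds with equality, which is where the randomization is needed.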
- Award ID(s):
- 2107363
- PAR ID:
- 10532074
- Publisher / Repository:
- ACM
- Date Published:
- ISBN:
- 9781450399265
- Page Range / eLocation ID:
- 101 to 110
- Subject(s) / Keyword(s):
- Age of information Markov decision process Threshold-type policy
- Format(s):
- Medium: X
- Location:
- Washington DC USA
- Sponsoring Org:
- National Science Foundation
More Like this
-
In this paper, we consider a status update system in which update packets are sent to the destination via a wireless medium that allows for multiple rates, where a higher rate naturally corresponds to a higher error probability. Data freshness is measured using the age of information, defined as the age of the most recent update at the destination. A packet transmitted at a higher rate encounters a shorter delay and a higher error probability. Thus, the choice of transmission rate affects the age at the destination. We design a low-complexity scheduler that selects between two different (transmission rate, error probability) pairs at each transmission epoch. This problem can be cast as a Markov decision process. We show that there exists a threshold-type policy that is age-optimal. More importantly, we show that the objective function is either quasi-convex or non-decreasing in the threshold, depending on the system parameter values. This enables us to devise a low-complexity algorithm to minimize the age. These results reveal an interesting phenomenon: while choosing the rate with minimum mean delay is delay-optimal, it does not necessarily minimize the age.
-
In this paper, we study the problem of minimizing the age of information when a source can transmit status updates over two heterogeneous channels. Our work is motivated by recent developments in 5G mmWave technology, where transmissions may occur over an unreliable but fast (e.g., mmWave) channel or a slow but reliable (e.g., sub-6GHz) channel. The unreliable channel is modeled as a time-correlated Gilbert-Elliott channel, where information can be transmitted at a high rate when the channel is in the "ON" state. The reliable channel provides a deterministic but lower data rate. The scheduling strategy determines the channel to be used for transmission, with the aim of minimizing the time-average age of information (AoI). The optimal scheduling problem is formulated as a Markov decision process (MDP), which in our setting poses significant challenges because, for example, supermodularity does not hold for part of the state space. We show that there exists a multi-dimensional threshold-based scheduling policy that is optimal for minimizing the age. A low-complexity bisection algorithm is further devised to compute the optimal thresholds. Numerical simulations are provided to compare different scheduling policies.
-
We consider the problem of timely exchange of updates between a central station and a set of ground terminals V, via a mobile agent that traverses the ground terminals along a mobility graph G = (V, E). We design the trajectory of the mobile agent to minimize peak and average age of information (AoI), two newly proposed metrics for measuring timeliness of information. We consider randomized trajectories, in which the mobile agent travels from terminal i to terminal j with probability P_{i,j}. For the information gathering problem, we show that a randomized trajectory is peak age optimal and factor-8H average age optimal, where H is the mixing time of the randomized trajectory on the mobility graph G. We also show that the average age minimization problem is NP-hard. For the information dissemination problem, we prove that the same randomized trajectory is factor-O(H) peak and average age optimal. Moreover, we propose an age-based trajectory, which utilizes information about current age at terminals, and show that it is factor-2 average age optimal in a symmetric setting.
-
Achieving sample efficiency in online episodic reinforcement learning (RL) requires optimally balancing exploration and exploitation. When it comes to a finite-horizon episodic Markov decision process with $$S$$ states, $$A$$ actions and horizon length $$H$$, substantial progress has been achieved toward characterizing the minimax-optimal regret, which scales on the order of $$\sqrt{H^2SAT}$$ (modulo log factors) with $$T$$ the total number of samples. While several competing solution paradigms have been proposed to minimize regret, they are either memory-inefficient, or fall short of optimality unless the sample size exceeds an enormous threshold (e.g. $$S^6A^4 \,\mathrm{poly}(H)$$ for existing model-free methods). To overcome such a large sample size barrier to efficient RL, we design a novel model-free algorithm, with space complexity $$O(SAH)$$, that achieves near-optimal regret as soon as the sample size exceeds the order of $$SA\,\mathrm{poly}(H)$$. In terms of this sample size requirement (also referred to as the initial burn-in cost), our method improves, by at least a factor of $$S^5A^3$$, upon any prior memory-efficient algorithm that is asymptotically regret-optimal. Leveraging the recently introduced variance reduction strategy (also called reference-advantage decomposition), the proposed algorithm employs an early-settled reference update rule, with the aid of two Q-learning sequences with upper and lower confidence bounds. The design principle of our early-settled variance reduction method might be of independent interest to other RL settings that involve intricate exploration-exploitation trade-offs.