Agricultural irrigation is a significant contributor to freshwater consumption, yet the irrigation systems currently used in the field are inefficient. They rely mainly on soil moisture sensors and the experience of growers, and do not account for future soil moisture loss. Predicting soil moisture loss is challenging because it is influenced by numerous factors, including soil texture, weather conditions, and plant characteristics. This article proposes DRLIC (deep reinforcement learning for irrigation control), a solution for improving irrigation efficiency. DRLIC is an irrigation system that uses deep reinforcement learning (DRL) to optimize its performance. The system employs a neural network, the DRL control agent, which learns an optimal control policy that considers both the current soil moisture measurement and the future soil moisture loss. We introduce an irrigation reward function that enables our control agent to learn from past experience. However, the output of the DRL control agent may sometimes be unsafe, such as irrigating too much or too little. To avoid harming plant health, we implement a safety mechanism that uses a soil moisture predictor to estimate the outcome of each action; if the predicted outcome is deemed unsafe, we perform a relatively conservative action instead. To demonstrate the real-world applicability of our approach, we develop an irrigation system comprising sprinklers, sensing and control nodes, and a wireless network. We evaluate DRLIC by deploying it in a testbed of six almond trees. In a 15-day in-field experiment, we compare the water consumption of DRLIC with that of a widely used irrigation scheme. Our results indicate that DRLIC outperforms this traditional method, achieving water savings of up to 9.52%.
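A minimal sketch of the safety-checked control loop described in the abstract, assuming hypothetical names and thresholds (`predict_moisture`, `SAFE_RANGE`, `conservative_action`); the paper's actual predictor model and safety criteria are not specified here.

```python
# Hypothetical sketch of DRLIC's safety mechanism: the DRL agent proposes an
# irrigation action, a soil-moisture predictor estimates its outcome, and an
# unsafe prediction triggers a conservative fallback. All names, coefficients,
# and thresholds are illustrative, not taken from the paper.

SAFE_RANGE = (0.20, 0.35)  # assumed acceptable volumetric soil-moisture band

def predict_moisture(current_moisture, irrigation_amount, weather):
    """Stand-in for the learned soil-moisture predictor (assumed linear)."""
    loss = 0.01 * weather["et0"]      # assumed evapotranspiration-driven loss
    gain = 0.05 * irrigation_amount   # assumed infiltration gain
    return current_moisture - loss + gain

def safe_action(agent, state, weather, conservative_action=0.5):
    action = agent.act(state)         # DRL control agent's proposed action
    predicted = predict_moisture(state["moisture"], action, weather)
    if SAFE_RANGE[0] <= predicted <= SAFE_RANGE[1]:
        return action
    return conservative_action        # fall back if the prediction is unsafe
```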
PnP-DRL: A Plug-and-Play Deep Reinforcement Learning Approach for Experience-Driven Networking
While deep reinforcement learning (DRL) has emerged as a de facto approach to many complex experience-driven networking problems, it remains challenging to deploy DRL in real systems. Due to random exploration or half-trained deep neural networks during online training, a DRL agent may make unexpected decisions that degrade system performance or even crash the system. In this paper, we propose PnP-DRL, an offline-trained, plug-and-play DRL solution that leverages batch reinforcement learning to learn the best control policy from pre-collected transition samples without interacting with the system. After being trained without interaction with the system, our plug-and-play DRL agent starts working seamlessly, without additional exploration or possible disruption of the running system. We implement and evaluate PnP-DRL on a prevalent experience-driven networking problem, Dynamic Adaptive Streaming over HTTP (DASH). Extensive experimental results show that 1) the existing batch reinforcement learning method has its limits; 2) PnP-DRL significantly outperforms classical adaptive bitrate algorithms in average user Quality of Experience (QoE); and 3) unlike state-of-the-art online DRL methods, PnP-DRL can be off and running without learning gaps while achieving comparable performance.
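As a rough illustration of the offline-training idea, here is a minimal fitted-Q-style update computed purely from pre-collected transitions; it is a generic batch RL sketch under assumed names and hyperparameters, not PnP-DRL's actual algorithm.

```python
import random

import torch
import torch.nn as nn

# Generic batch (offline) RL sketch: improve a Q-network using only
# pre-collected (state, action, reward, next_state) transitions, with no
# interaction with the running system. Illustrative only; PnP-DRL's exact
# method differs.

GAMMA = 0.99  # assumed discount factor

def batch_q_update(q_net, target_net, optimizer, transitions, batch_size=64):
    batch = random.sample(transitions, batch_size)
    s, a, r, s2 = (torch.stack([torch.as_tensor(t[i], dtype=torch.float32)
                                for t in batch]) for i in range(4))
    a = a.long().unsqueeze(1)                       # discrete action indices
    with torch.no_grad():                           # bootstrapped TD target
        target = r + GAMMA * target_net(s2).max(dim=1).values
    q = q_net(s).gather(1, a).squeeze(1)            # Q-values of taken actions
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```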
- Award ID(s): 1704662
- PAR ID: 10300535
- Date Published:
- Journal Name: IEEE Journal on Selected Areas in Communications
- Volume: 39
- Issue: 8
- ISSN: 1558-0008
- Page Range / eLocation ID: 2476-2486
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like This
Mostafa Sahraei-Ardakani; Mingxi Liu (Eds.)

This paper explores the application of deep reinforcement learning (DRL) to create a coordinating mechanism between synchronous generators (SGs) and distributed energy resources (DERs) for improved primary frequency regulation. Renewable energy sources, such as wind and solar, can aid in frequency regulation of the grid. Without proper coordination between the sources, however, their participation only delays the SG governor response and the frequency deviation. The proposed DRL application uses a deep deterministic policy gradient (DDPG) agent to create a generalized coordinating signal for DERs. The coordinating signal communicates the degree of distributed participation to the SG governor, resolving the delayed governor response and reducing the system rate of change of frequency (ROCOF). The validity of the coordinating signal is demonstrated on a single-machine finite-bus system, and the use of DRL for signal creation is explored in an under-frequency event. While further exploration is needed for validation in large systems, the development of this concept shows promising results toward increased power grid stabilization.
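A hedged sketch of the coordination idea: a trained DDPG-style actor maps frequency measurements to a bounded participation signal sent to DERs and the SG governor. The network shape, inputs, and output scaling below are assumptions for illustration, not the paper's design.

```python
import torch
import torch.nn as nn

# Illustrative DDPG-style actor: maps the frequency deviation and its rate of
# change (ROCOF) to a coordinating signal in [0, 1] indicating the degree of
# DER participation. Architecture and inputs are assumed, not from the paper.

class CoordinatingActor(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),  # signal bounded to [0, 1]
        )

    def forward(self, freq_deviation, rocof):
        x = torch.stack([freq_deviation, rocof], dim=-1)
        return self.net(x).squeeze(-1)

# Example: an under-frequency event with negative ROCOF.
actor = CoordinatingActor()
signal = actor(torch.tensor([-0.3]), torch.tensor([-0.5]))
```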
In this work, we propose an energy-adaptive monitoring system for a solar-sensor-based smart animal farm (e.g., cattle). The proposed smart farm system aims to maintain high-quality monitoring services with solar sensors that have limited and fluctuating energy, against a full set of cyberattack behaviors including false data injection, message dropping, and protocol non-compliance. We leverage Subjective Logic (SL) as the belief model to consider different types of uncertainty in opinions about sensed data. We develop two deep reinforcement learning (DRL) schemes that leverage the design concept of uncertainty maximization in SL for DRL agents running on gateways, so as to collect high-quality sensed data with low uncertainty and high freshness. We assess the performance of the proposed energy-adaptive smart farm system in terms of accumulated reward, monitoring error, system overload, and battery maintenance level. We compare the two DRL schemes developed (multi-agent deep Q-learning, MADQN, and multi-agent proximal policy optimization, MAPPO) against greedy and random baselines in choosing which sensed data to update, with the goal of collecting high-quality data and achieving resilience against attacks. Our experiments demonstrate that MAPPO with the uncertainty maximization technique outperforms its counterparts.
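For readers unfamiliar with Subjective Logic, a binomial opinion can be formed from positive and negative evidence counts. The sketch below follows the standard SL mapping with prior weight W = 2; the framing of "evidence" as agreeing versus conflicting sensor readings is an assumption, not the paper's exact formulation.

```python
from dataclasses import dataclass

# Standard Subjective Logic mapping from evidence counts (r positive,
# s negative) to a binomial opinion. Belief, disbelief, and uncertainty
# always sum to 1; uncertainty shrinks as total evidence grows.

W = 2.0  # non-informative prior weight in Subjective Logic

@dataclass
class Opinion:
    belief: float       # b = r / (r + s + W)
    disbelief: float    # d = s / (r + s + W)
    uncertainty: float  # u = W / (r + s + W)

def opinion_from_evidence(r: float, s: float) -> Opinion:
    total = r + s + W
    return Opinion(r / total, s / total, W / total)

# Example (illustrative): 8 readings consistent with other sensors,
# 1 conflicting reading.
op = opinion_from_evidence(8, 1)
```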
Network slicing allows mobile network operators to virtualize infrastructures and provide customized slices supporting various use cases with heterogeneous requirements. Online deep reinforcement learning (DRL) has shown promising potential in solving network problems and eliminating the simulation-to-reality discrepancy. Optimizing cross-domain resources with online DRL is, however, challenging, as the random exploration of DRL violates the service level agreements (SLAs) of slices and the resource constraints of infrastructures. In this paper, we propose OnSlicing, an online end-to-end network slicing system, to achieve minimal resource usage while satisfying slices' SLAs. OnSlicing allows individualized learning for each slice and maintains its SLA by using a novel constraint-aware policy update method and a proactive baseline switching mechanism. OnSlicing complies with the resource constraints of infrastructures by using a unique design of action modification in slices and parameter coordination in infrastructures. OnSlicing further mitigates the poor performance of online learning during the early learning stage by offline imitating a rule-based solution. Besides, we design four new domain managers to enable dynamic resource configuration in radio access, transport, core, and edge networks, respectively, at a timescale of subseconds. We implement OnSlicing on an end-to-end slicing testbed built on OpenAirInterface with both 4G LTE and 5G NR, the OpenDayLight SDN platform, and the OpenAir-CN core network. The experimental results show that OnSlicing achieves a 61.3% usage reduction compared to the rule-based solution and maintains nearly zero violation (0.06%) throughout the online learning phase. Once online learning has converged, OnSlicing reduces usage by 12.5% without any violations compared to the state-of-the-art online DRL solution.
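The action-modification idea, sketched under assumptions: before executing the DRL agent's requested per-domain allocation, cap it to the infrastructure's available capacity so the executed action never violates resource constraints. Variable names and the elementwise-capping rule are illustrative; OnSlicing's actual mechanism is more involved.

```python
import numpy as np

# Illustrative action modification for constrained online DRL: clip the
# agent's requested allocation per domain so that what is actually executed
# never exceeds infrastructure capacity. This only conveys the idea.

def modify_action(requested: np.ndarray, capacity: np.ndarray) -> np.ndarray:
    requested = np.clip(requested, 0.0, None)  # no negative allocations
    return np.minimum(requested, capacity)     # cap each domain at capacity

# Example (hypothetical): radio/transport/core/edge requests vs. capacity.
executed = modify_action(np.array([0.6, 0.3, 0.9, 0.2]),
                         np.array([0.5, 0.4, 1.0, 0.25]))
```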
While reinforcement learning (RL), especially deep RL (DRL), has shown outstanding performance in video games, little evidence has shown that DRL can be successfully applied to human-centric tasks, where the ultimate RL goal is to make human-agent interactions productive and fruitful. In real-life, complex, human-centric tasks such as education and healthcare, data can be noisy and limited. Batch RL is designed for handling such situations, where data is limited yet noisy and building simulations is challenging. In two consecutive empirical studies, we investigated batch DRL for pedagogical policy induction, i.e., choosing student learning activities in an Intelligent Tutoring System. In Fall 2018 (F18), we compared the batch DRL policy to an Expert policy but found no significant difference between the two. In Spring 2019 (S19), we augmented the batch DRL-induced policy with a simple act of explanation, showing a message such as "The AI agent thinks you should view this problem as a Worked Example to learn how some new rules work." We compared this policy against two conditions: the Expert policy and a student decision-making policy. Our results show that 1) the batch DRL policy with explanations improved student learning performance significantly more than the Expert policy; and 2) no significant differences were found between the Expert policy and student decision making. Overall, our results suggest that pairing simple explanations with a batch DRL policy can be an important and effective technique for applying RL to real-life, human-centric tasks.
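A toy sketch of the S19 condition: pair each induced-policy decision with a templated explanation message. The first template is quoted from the study; the action names, the second template, and the function shape are hypothetical.

```python
# Hypothetical sketch: attach a templated explanation to each pedagogical
# decision, as in the S19 explanation condition. Only the first message text
# is from the study; everything else is illustrative.

EXPLANATIONS = {
    "worked_example": ("The AI agent thinks you should view this problem as a "
                       "Worked Example to learn how some new rules work."),
    "problem_solving": ("The AI agent thinks you should solve this problem "
                        "yourself to practice the rules you have learned."),
}

def decide_with_explanation(policy, state):
    action = policy(state)  # e.g., "worked_example" or "problem_solving"
    return action, EXPLANATIONS.get(action, "")
```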