<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Demand-side and Utility-side Management Techniques for Increasing EV Charging Load</title></titleStmt>
			<publicationStmt>
				<publisher></publisher>
				<date>01/10/2023</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10418531</idno>
					<idno type="doi">10.1109/TSG.2023.3235903</idno>
					<title level='j'>IEEE Transactions on Smart Grid</title>
<idno>1949-3053</idno>
<biblScope unit="volume"></biblScope>
<biblScope unit="issue"></biblScope>					

					<author>Salman Sadiq Shuvo</author><author>Yasin Yilmaz</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[Electricity authorities need capacity assessment and expansion plans for efficiently charging the growing Electric vehicle (EV) fleet. Specifically, the distribution grid needs significant capacity expansion as it faces the most impact to accommodate the high variant residential EV charging load. Utility companies employ different scheduling policies for the maintenance of their distribution transformers (hereinafter, XFR). However, they lack scenario-based plans to cope with the varying EV penetration across locations and time. The contributions of this paper are twofold. First, we propose a customer feedback-based EV charging scheduling to simultaneously minimize the peak load for the distribution XFR and satisfy the customer needs. Second, we present a deep reinforcement learning (DRL) method for XFR maintenance, which focuses on the XFR's effective age and loading to periodically choose the best candidate XFR for replacement. Our case study for a distribution feeder shows the adaptability and success of our EV load scheduling method in reducing the peak demand to extend the XFR life. Furthermore, our DRL-based XFR replacement policy outperforms the existing rule-based policies. Together, the two approaches provide a complete capacity planning tool for efficient XFR maintenance to cope with the increasing EV charging load.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>I. INTRODUCTION</head><p>Technological development over the previous decades has paved the way for electric vehicles (EVs) to replace gasoline-based vehicles at an increasing rate. Specifically, battery capacity and cost, which are the major impediments to EV adoption, have improved significantly.</p><p>As a result, governments, manufacturers, and customers are now more convinced of EVs' environmental, commercial, and economic benefits, which is accelerating EV popularity and adoption. According to Bloomberg New Energy Finance, which provides a comprehensive analysis of predictions from different entities such as oil manufacturing companies and independent research groups <ref type="bibr">[1]</ref>, there are already 13 million EVs on the road globally, with 2.7 million sales in 2021. Following the planned expansion of charging infrastructure, EV growth predictions are mostly optimistic. For instance, the International Energy Agency predicts that the total number of EVs will exceed 250 million by 2030, up from an estimated 5 million on the streets globally in 2018 <ref type="bibr">[2]</ref>.</p><p>The authors are with the Department of Electrical Engineering at the University of South Florida, Tampa, FL 33620, USA ({salmansadiq, yasiny}@usf.edu).</p><p>While expanding the charging infrastructure is critical for large-scale EV adoption, a significant portion of daily EV charging occurs at home and creates stress on distribution transformers (XFRs). Since EV charging requires significantly higher power than the other loads in a household, a combination of effective demand-side management (DSM) techniques for EV charge scheduling and utility-side management (USM) policies for XFR maintenance is needed to cope with the increasing stress on the XFRs. 
Utility companies try to flatten the electricity demand curve and decrease the stress on the distribution XFRs by offering day-ahead or hour-ahead dynamic electricity pricing schemes to customers <ref type="bibr">[3]</ref>. Numerous existing works propose scheduling techniques for the time-shiftable appliances of a household (e.g., dishwasher, washer-dryer, EV charging) to capitalize on dynamic pricing <ref type="bibr">[4]</ref>-<ref type="bibr">[6]</ref>.</p><p>Although such DSM techniques can flatten the demand curve to an extent, they do not sufficiently address the increasing stress of EV charging on the distribution XFRs since they lack utility-side management of the problem. Motivated by this research gap, we take a comprehensive look at the problem of increasing EV charging stress on the distribution XFRs. Specifically, we consider both the demand-side (i.e., EV charge scheduling) and the utility-side (i.e., XFR maintenance) management of the problem. While the proposed DSM technique helps with load flattening to minimize transformer aging, the proposed USM technique enables timely (proactive) maintenance of distribution transformers to prevent costly transformer failures and blackouts.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. DSM for EV Charge Scheduling</head><p>Centralized collaboration among EV users served by the same distribution XFR may provide the most effective way of minimizing the peak demand of the XFR <ref type="bibr">[7]</ref>-<ref type="bibr">[9]</ref>. The work <ref type="bibr">[7]</ref> shows that coordination among the EV chargers under a distribution XFR minimizes peak demand and extends the XFR lifetime at the expense of consumers' arbitrage benefit. However, this approach does not account for consumer comfort, which delayed EV charging may compromise. Another work <ref type="bibr">[8]</ref> opts to minimize the EV owner's energy arbitrage benefit and distribution network maintenance cost through an optimal charging schedule. However, its objective function also omits the user discomfort caused by delayed charging. The paper <ref type="bibr">[9]</ref> proposes a fuzzy logic system for the demand-side operator to devise a centralized EV charging schedule. This approach is too strict to accommodate user preferences and needs more adaptability to serve different types of customers.</p><p>In short, these techniques do not integrate customer preferences into their objective functions and hence may struggle in real-life implementation. We address this shortcoming by directly considering customer preference for charging duration and amount, and by introducing a monetary incentive to the customers based on their charging preferences (see Section II-C for details).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. USM for XFR Maintenance</head><p>The distribution grid, especially the customer-end XFRs, is susceptible to overloading and costly maintenance. Replacing gasoline cars with EVs calls for charging stations in place of gas stations as well as home charging arrangements. Consequently, the power system needs more generation, transmission, and distribution capacity at all levels. Many works provide charging station assessment, capacity, installation, and optimization techniques <ref type="bibr">[10]</ref>, <ref type="bibr">[11]</ref>. In this work, we focus on EV charging at home, which EV users typically prefer due to its convenience and cheaper charging cost <ref type="bibr">[12]</ref>. Home charging may significantly burden the customer-end distribution XFRs, as modern EVs draw more than 7 kW from type-2 home chargers, which is higher than the average cumulative demand from all other loads in a household. Overloading an XFR leads to overheating and breakdown of its electrical insulation <ref type="bibr">[13]</ref>. IEEE guidelines provide an estimate of the effective aging due to overheating of the insulation <ref type="bibr">[14]</ref>. The work <ref type="bibr">[15]</ref> develops a probabilistic failure distribution that depends on the effective age of an XFR.</p><p>Transformer selection for replacement/upgrade is naturally a sequential decision-making problem, requiring a solution that adapts to the observed states. Hence, dynamic programming (DP) techniques, which can optimize transformer selections according to changing environmental factors such as EV charging stress, are better suited to this problem than static optimization techniques. 
Since it is not tractable to model the future state transitions (probabilistically or deterministically), as the network consists of many transformers and each action creates another branch of possible states, model-based DP techniques such as value iteration and policy iteration are not suitable. Reinforcement learning (RL) is a model-free DP approach that uses a data-driven technique of approximating a solution through sampling. Furthermore, deep RL (DRL) methods capitalize on neural network-based function approximation to deal with a large, continuous-valued input state (i.e., the current age and load of each transformer in our problem). Recent advances in neural network-based DRL algorithms have led to widespread applications, including gaming <ref type="bibr">[16]</ref>, finance <ref type="bibr">[17]</ref>, energy systems <ref type="bibr">[18]</ref>, transportation <ref type="bibr">[19]</ref>, communications <ref type="bibr">[20]</ref>, environmental systems <ref type="bibr">[21]</ref>, and healthcare systems <ref type="bibr">[22]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>C. Contributions</head><p>We propose an EV integration policy for the utility company that aims to minimize the long-term maintenance costs for the electrical distribution grid. Our contributions can be summarized as follows.</p><p>&#8226; A combination of DSM and USM techniques is proposed for flattening the load curve and making timely maintenance of the distribution XFRs, respectively.</p><p>&#8226; A novel utility-driven EV charging scheme to flatten the load curve of the XFR. Different from the existing EV charging methods, our method directly considers customer preference for charging duration and amount, and a proportional monetary incentive.</p><p>&#8226; A novel DRL-based policy for XFR replacement and capacity upgrade to minimize the maintenance cost.</p><p>The remainder of the paper is organized as follows. Section II presents the proposed utility-driven EV charging method. Section III formulates the Markov decision process (MDP) for the proposed DRL-based XFR maintenance policy. Experimental results and analysis are presented in Section IV for a distribution XFR feeder. We conclude the paper in Section V.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>II. EV CHARGE SCHEDULING FOR DSM</head><p>Our utility-driven EV charge scheduling offers a reasonable balance between peak load reduction and customer satisfaction. Utility companies offer lower electricity prices during off-peak hours to encourage consumers to shift their load towards those hours. However, this can create extensive peak demand during "off-peak" hours for a distribution XFR that serves many EVs, especially when EV owners employ smart charging to exploit low tariffs. Overloading the XFR results in expedited aging and a subsequent risk of expensive XFR maintenance and power outages. So, we propose a utility-driven charging technique that aims to minimize the maintenance cost by flattening the load curve for the XFR while ensuring customer satisfaction. The proposed DSM considers the other household devices as base loads and schedules EV charging based on the power that remains available after supplying the base loads. As a result, the utility company faces lower maintenance costs thanks to peak load reduction. It uses the profit from the reduced maintenance costs to incentivize consumers to participate in the scheduling program.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Proposed Technique</head><p>In our proposed technique, as shown in Fig. <ref type="figure">1</ref> (left blue box), the utility employs one charging agent for each XFR to schedule and control the charging of the EVs. Whenever an EV is plugged in for charging (EV arrival), the agent collects the battery charge level E n , the target charge level E tgt , and calculates the charging time window,</p><p><formula notation="tex">T_w = \left\lceil \frac{E_{tgt} - E_n}{P_{max}\,\tau} \right\rceil + T_b,</formula></p><p>where P max is the maximum charging capacity of the particular EV and &#964; is the duration of a timestep in hours. The ceiling &#8968;&#8226;&#8969; of the fraction indicates the minimum number of time units for completing the target charging, and T b provides buffer time for the agent to do the scheduling. Then the agent updates its memory M by removing the information of the departed and fully charged (E n = E tgt ) EVs and placing the arrived EV at the last position, V . Here, V indicates the number of EVs awaiting charge. Next, the agent proceeds to charge scheduling if there is any EV in the pool (V &gt; 0). The agent gathers the electricity price and household load forecasts for the next H timesteps, a decision time horizon that is longer than the charging time window of any EV (e.g., H = 16 hours). The long short-term memory (LSTM) algorithm is well suited to sequential household load prediction <ref type="bibr">[23]</ref>. We take the temperature forecast &#952; n , the holiday flag H n &#8712; {0, 1}, and the load data of the last m time steps {L n-m , . . . , L n-2 , L n-1 } to produce the XFR load forecast { L̂ n , L̂ n+1 , . . . , L̂ n+H }. Similarly, the agent uses LSTM to forecast the electricity prices { R̂ n , R̂ n+1 , . . . , R̂ n+H } from the data of the last m time steps {R n-m , . . . , R n-2 , R n-1 }.</p><p>The agent follows the first-come, first-served approach and starts scheduling the first EV (S = 1) in the pool. 
Algorithm 1 shows the EV charge scheduling technique for the S th EV in the pool. If the charge time window is not over, i.e., T S w &gt; 0, the agent sorts the electricity prices R̂ in ascending order and stores the indices in the vector I. The agent schedules charging for the next T S w timesteps, starting from the cheapest electricity tariff hours and moving to the costlier ones. The available power forecast L is the difference between the load capacity L cap and the load forecast of the XFR for the corresponding hour (Line 5). Notably, the utility decides on the load capacity L cap of the XFR, typically between 100% and 120% of the nameplate kVA rating (e.g., 25-30 kVA for a 25 kVA rated XFR). The agent reads the battery charge status E t from M and calculates the required charging E R (Line 6). So, the EV charge allocation for the cheapest hour is</p><p><formula notation="tex">P^{S}_{n+I(1)} = \min\{P^{S}_{max},\, E_R,\, L\}.</formula></p><p>The agent updates the charge level E L for the following schedule step (Line 8). This process continues for the charge allocation of all the remaining time steps, from the second cheapest, P S n+I(2) , to the costliest one, P S n+I(T S w ) . Finally, the algorithm outputs the charge allocation for the next T S w time steps as {P S n+1 , P S n+2 , . . . , P S n+T S w } (Line 10). However, if the charging window is over (i.e., T S w &#8804; 0), the agent implements charging {P S n+1 } for the immediate time step, as explained next.</p><p>If T S w &#8804; 0 but the target charge level is not achieved (Line 12), the agent offers compensation charging at a fixed rate based on the battery charge status E S n . We define two more user inputs, E S safe and E S crit , that each consumer can set and update as required. As the EV is expected to leave at any moment (T S w &#8804; 0), Algorithm 1 outputs the compensation charging allocated for the next time step according to Eq. ( <ref type="formula">1</ref>), where L = L cap - L̂ n+1 is the estimated available power.</p><p>After the completion of charge scheduling for each EV, the agent updates the load forecast by adding the scheduled EV charges. 
This charge scheduling continues until all the V EVs are scheduled through Algorithm 1. Upon completion of scheduling, the charging for the n th time step is implemented.</p><p>Although the actual load may differ from the prediction, with an appropriate forecasting method the prediction error stays within a range that causes an insignificant difference in XFR aging. So, the agent implements the charging as scheduled and updates the memory accordingly.</p><p>The agent then moves to the next time step with its updated memory, and this recursive loop continues as shown in Fig. <ref type="figure">1</ref>.</p></div>
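<div xmlns="http://www.tei-c.org/ns/1.0"><p>The charging time window and the price-ordered allocation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names charging_window and schedule_ev are hypothetical, the window is taken as the ceiling of the remaining energy over P max times the timestep, plus the buffer T b, and the remaining energy is converted to power through the timestep length (with hourly timesteps this reduces to the min{P max, E R, L} rule of Algorithm 1).</p><p><![CDATA[
```python
import math


def charging_window(e_now, e_tgt, p_max, tau, t_buffer):
    """Timesteps needed to reach the target at full power, plus the buffer.

    Assumption: energies in kWh, p_max in kW, tau in hours.
    """
    return math.ceil((e_tgt - e_now) / (p_max * tau)) + t_buffer


def schedule_ev(prices, load_forecast, l_cap, e_now, e_tgt, p_max, t_w, tau=1.0):
    """Greedy allocation over the next t_w timesteps, cheapest hours first.

    prices and load_forecast are forecasts for the next t_w timesteps.
    Returns the charging power allocated to each timestep.
    """
    alloc = [0.0] * t_w
    e_level = e_now
    # Indices of the window sorted by ascending price (the vector I).
    order = sorted(range(t_w), key=lambda i: prices[i])
    for i in order:
        available = max(l_cap - load_forecast[i], 0.0)  # headroom of the XFR
        remaining = (e_tgt - e_level) / tau             # power to finish in one step
        power = max(min(p_max, remaining, available), 0.0)
        alloc[i] = power
        e_level += power * tau                          # charge level for next step
    return alloc


if __name__ == "__main__":
    window = charging_window(10.0, 20.0, 7.0, 1.0, 3)
    plan = schedule_ev([0.30, 0.10, 0.20, 0.15], [20.0, 10.0, 24.0, 22.0],
                       l_cap=25.0, e_now=10.0, e_tgt=20.0, p_max=7.0, t_w=4)
    print(window, plan)
```
]]></p><p>In this toy run the cheapest hour receives the full 7 kW and the second-cheapest hour receives the remaining 3 kW, which is also all the headroom the XFR has in that hour. The compensation branch of Eq. (1) is not covered by the sketch.</p></div>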
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. Consumer Incentive</head><p>The consumers receive free smart EV charging service and a monetary incentive for participating in the DSM. The monetary incentive depends on their preferences for T b and E tgt . Customers who provide a longer buffer time T b and a smaller target charge level E tgt get more incentive. In contrast, customers who prioritize comfort by selecting T b = 0 get the fastest possible EV charging without any incentive. EV owners set T b and E tgt during system setup and can update their choices from time to time. This setup gives customers control over their preferences and gives our method an edge over the existing techniques. The utility selects the incentive coefficient &#954; based on the savings in maintenance cost.</p><p>Algorithm 1 Utility-Driven EV Charge Scheduling</p><p>2: if T S w &gt; 0 then</p><p>3: Sort electricity prices R̂ in ascending order and store the indices in vector I.</p><p>4: for &#964; = 1, 2, ..., T S w do</p><p>5: Estimate available power, L = L cap - L̂ n+I(&#964;) .</p><p>6: Remaining charge, E R = E S tgt - E L .</p><p>7: Scheduled charge, P S n+I(&#964;) = min{P S max , E R , L}.</p><p>8: Update the charge level E L for the following schedule step.</p><p>9: end for</p><p>10: Output: EV charge allocation {P S n+1 , P S n+2 , . . . , P S n+T S w }.</p><p>11: else</p><p>12: Output: EV charge compensation {P S n+1 } from Eq. ( <ref type="formula">1</ref>).</p><p>13: end if</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>III. DRL BASED XFR REPLACEMENT FOR USM</head><p>We propose an RL framework, shown in Fig. <ref type="figure">1</ref> (right blue box), in which the electric utility company is the RL agent that makes replacement and upgrade decisions for the distribution XFRs.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Environment</head><p>The RL environment is the distribution feeder with X customer-end XFRs and their connected loads. The XFRs can have different capacities (kVA ratings) and serve different numbers of households. The environment provides the peak load, L x t , and the loss of life, &#8710;D x t , for the x th XFR during the t th time step. We calculate the effective aging as per the IEEE standard <ref type="bibr">[14]</ref> in Eq. ( <ref type="formula">3</ref>), where T H denotes the hotspot temperature of the XFR, which depends on the ambient temperature and the electrical load. We approximate the effective aging integral through fine-granularity (per-minute) estimation. Apart from the scheduled maintenance, the utility also bears unscheduled interruption costs, mainly due to XFR failure and fuse-blowing events. XFR failure occurs due to insulation breakdown, which depends on the used life of the XFR, i.e., its initial age D x 0 in days plus the loss of life accumulated since. The Weibull distribution is popular for forecasting the insulation failure of a XFR <ref type="bibr">[13]</ref>. Our preliminary work <ref type="bibr">[18]</ref> gives the XFR failure probability during the t th timestep in Eq. ( <ref type="formula">4</ref>), where &#945; and &#946; are the scale and the shape parameters of the Weibull distribution, respectively. XFR failure brings an interruption cost C x t to cover XFR replacement, required labor, and unplanned outages.</p><p>Fuse-blowing events are deterministic and protect the XFR by disconnecting the circuit whenever the load exceeds the rating of the fuse, typically 180% of the XFR's rated load <ref type="bibr">[24]</ref>. Since the fuse is meant to protect the XFR, its replacement brings only minor labor and outage costs. Hence, the interruption cost for fuse replacement is smaller than that of XFR failure. While the monetary value of C x t varies with time and place, a utility company can obtain a proper estimate of C x t for an XFR failure or a blown fuse.</p></div>
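<div xmlns="http://www.tei-c.org/ns/1.0"><p>The aging and failure quantities provided by the environment can be illustrated with a short Python sketch. It rests on assumptions that should be checked against the cited sources: the aging acceleration factor uses the form commonly quoted from the IEEE loading guide, with a 110 degree Celsius reference hotspot temperature, the per-minute averaging mirrors the fine-granularity approximation mentioned above, and the failure probability is the Weibull probability of failing within a step conditioned on surviving up to the current age, which may differ in detail from Eq. (4). All function names and the parameter values in the example are hypothetical.</p><p><![CDATA[
```python
import math


def aging_acceleration(hotspot_c):
    """Aging acceleration factor for a hotspot temperature in Celsius.

    Assumed form: exp(15000/383 - 15000/(T_H + 273)), equal to 1 at 110 C.
    """
    return math.exp(15000.0 / 383.0 - 15000.0 / (hotspot_c + 273.0))


def loss_of_life_days(hotspot_per_minute):
    """Effective aging in days from per-minute hotspot temperature samples."""
    minutes_per_day = 24 * 60
    return sum(aging_acceleration(t) for t in hotspot_per_minute) / minutes_per_day


def failure_probability(age_days, delta_days, alpha, beta):
    """P(fail within delta_days | survived to age_days) under a Weibull law.

    alpha is the scale and beta the shape parameter (values are assumptions).
    """
    h_start = (age_days / alpha) ** beta
    h_end = ((age_days + delta_days) / alpha) ** beta
    return 1.0 - math.exp(h_start - h_end)


if __name__ == "__main__":
    # One day at the reference hotspot temperature ages the XFR by one day.
    print(loss_of_life_days([110.0] * 1440))
    # Hypothetical Weibull parameters, for illustration only.
    print(failure_probability(6000.0, 30.0, alpha=7500.0, beta=3.0))
```
]]></p><p>With a shape parameter above one, the conditional failure probability grows with the used life, which is why overloaded XFRs that age faster than calendar time become replacement candidates earlier.</p></div>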
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. State, S t</head><p>The agent makes replacement decisions based on the used life (hereinafter, age) and peak load of a XFR. However, XFRs with low age and peak load are not suitable candidates for change; hence eliminating them from the RL decision process creates a smaller state space and faster algorithm convergence without performance compromise. So, the agent takes the most loaded Y l and most aged Y a XFRs to make a pool of size Y = Y l + Y a . The load and age of these XFRs create the state for time step t,</p><p>The percentile load L y t , which is the ratio of peak load and capacity of the y th XFR, does not require normalization. We divide the age by the IEEE recommended lifetime of a XFR (7500 days) for normalization.</p></div>
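<div xmlns="http://www.tei-c.org/ns/1.0"><p>The pool selection and state construction can be sketched as follows. This is an illustrative example rather than the authors' code: the function name build_state is hypothetical, the pool sizes of three most loaded and three most aged XFRs are taken from the experimental description later in the paper, and the rule for handling an XFR that qualifies for both lists (it is counted once, among the most loaded) is an assumption.</p><p><![CDATA[
```python
LIFETIME_DAYS = 7500.0  # lifetime used in the text to normalize the age


def build_state(peak_load, capacity, age_days, y_load=3, y_age=3):
    """Return (pool indices, state vector) for one time step.

    The state interleaves the load ratio and the normalized age of each
    XFR in the pool. Assumption: an XFR already picked for its load is
    not picked again for its age.
    """
    n = len(peak_load)
    ratio = [peak_load[i] / capacity[i] for i in range(n)]
    by_load = sorted(range(n), key=lambda i: ratio[i], reverse=True)[:y_load]
    rest = [i for i in range(n) if i not in by_load]
    by_age = sorted(rest, key=lambda i: age_days[i], reverse=True)[:y_age]
    pool = by_load + by_age
    state = []
    for i in pool:
        state.append(ratio[i])
        state.append(age_days[i] / LIFETIME_DAYS)
    return pool, state


if __name__ == "__main__":
    loads = [30.0, 10.0, 20.0, 26.0, 5.0, 12.0, 40.0, 8.0]
    caps = [25.0] * 8
    ages = [1000.0, 7000.0, 3000.0, 500.0, 6500.0, 200.0, 100.0, 6000.0]
    pool, state = build_state(loads, caps, ages)
    print(pool, len(state))
```
]]></p><p>With a pool of six XFRs the state has twelve entries, which keeps the input of the neural networks small regardless of how many XFRs the feeder contains.</p></div>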
<div xmlns="http://www.tei-c.org/ns/1.0"><head>C. Action, A t</head><p>The utility needs to replace the overloaded and older XFRs to avoid failure and outage-related costs. However, under budgetary constraints, the RL agent chooses one XFR for replacement from the pool at each time step. Replacing the XFR with a bigger one is more cost-effective if the existing peak load is significantly higher than the capacity. So, our RL agent's action includes replacing the XFR with a same-sized or a double-sized (kVA) one. If there is no overloaded or old XFR in the fleet, the optimal action might be to skip replacement (A t = 0). As a result, our action space contains 2Y + 1 options,</p><p><formula notation="tex">A_t \in \{0,\; 2y-1,\; 2y\},</formula></p><p>where y &#8712; {1, 2, . . . , Y } is the index of the XFR in the pool; 2y - 1 and 2y represent replacing the y th XFR with a same-capacity and a double-capacity one, respectively.</p><p>Notably, as failure brings emergency labor and longer unplanned outages, XFR failure is far costlier than a scheduled replacement with a same-sized XFR. Furthermore, an undersized XFR brings huge maintenance costs through multiple interruptions caused by fuse blows and eventual failure, which can be avoided by upgrading its size.</p></div>
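<div xmlns="http://www.tei-c.org/ns/1.0"><p>The action encoding described above maps an integer between 0 and 2Y to a decision. A small Python sketch of the decoding is given below; the function name decode_action and the returned tuple layout are hypothetical choices made for illustration.</p><p><![CDATA[
```python
def decode_action(action, pool_size):
    """Map an action index in {0, ..., 2Y} to a replacement decision.

    Returns None for "no replacement", otherwise (y, upgrade) where y is
    the 1-based position in the pool and upgrade is True for a
    double-capacity replacement and False for a same-capacity one.
    """
    if not 0 <= action <= 2 * pool_size:
        raise ValueError("action outside the 2Y + 1 options")
    if action == 0:
        return None
    y = (action + 1) // 2       # actions 2y-1 and 2y both refer to XFR y
    upgrade = action % 2 == 0   # even index means double capacity
    return y, upgrade


if __name__ == "__main__":
    for a in range(2 * 6 + 1):
        print(a, decode_action(a, pool_size=6))
```
]]></p><p>For a pool of Y = 6 XFRs this gives 13 actions, so the actor network needs a softmax output of the same size.</p></div>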
<div xmlns="http://www.tei-c.org/ns/1.0"><head>E. Next State, S t+1</head><p>The selected action installs a new XFR (zero aged) in place of the previous one. The new XFR inherits the index of the replaced one. For the next time step, the age of a replaced XFR is reset to that of a new unit, while the age of every other XFR grows by the effective aging in time step t, &#8710;D x t , which the environment provides from Eq. ( <ref type="formula">3</ref>). Furthermore, the environment provides the maximum load of the x th XFR during time step t, which is used to estimate the peak load of the XFR as the ratio</p><p><formula notation="tex">\frac{\text{Maximum Load}}{\text{Rated Capacity}}.</formula></p><p>The rated capacity of the XFR is updated whenever it is upgraded to a double-sized one.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>F. Solution Approach</head><p>The RL agent aims to minimize the discounted cumulative cost in T time steps:</p><p>where &#955; is the discount factor, a critical parameter that represents the weight of future cost in the current decision. In a data-driven approach, we minimize the expected cost E[C T ] by selecting the best actions {A t }. Central to this problem is the following Bellman equation. In a recursive way, the agent's value function is given by</p><p>Algorithm 2 [...] Eqs. ( <ref type="formula">6</ref>) and ( <ref type="formula">7</ref>). Update value network &#952; through backpropagation. end for end for</p><p>For this challenging problem with a continuous state space and a high-dimensional action space, policy gradient deep RL (DRL) methods provide an effective solution approach. Specifically, the Advantage Actor-Critic (A2C) algorithm is known to provide quick convergence in such problems <ref type="bibr">[4]</ref>, <ref type="bibr">[18]</ref>, <ref type="bibr">[19]</ref>, <ref type="bibr">[22]</ref>, <ref type="bibr">[25]</ref>, <ref type="bibr">[26]</ref>. Using two neural networks (actor and critic), A2C reduces the variance of its predecessor, the REINFORCE policy gradient algorithm <ref type="bibr">[25]</ref>.</p><p>The actor network, also known as the policy network, outputs the probability of each action through a softmax function. To that end, it finds the gradient of the expected return J(&#960; &#981; ) of the policy &#960; &#981; with respect to the weights &#981; of the neural network through the following equation</p><p>where the advantage function is given by</p><p>The critic network, also known as the value network, learns the value function V (S t ; &#952;) by updating the weights &#952; of its neural network. The pseudo-code for the proposed method is given in Algorithm 2.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>IV. EXPERIMENTS</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>A. Experimental Setup</head><p>From the 2009 RECS dataset for the Midwest region of the United States <ref type="bibr">[28]</ref>, Muratori generates 200 household load profiles, along with 348 predicted EV charging loads connected to those households, in <ref type="bibr">[29]</ref>. The households vary in size, occupancy, and electricity consumption. Lisha et al. <ref type="bibr">[30]</ref> present an EV diffusion model for a feeder-level distribution system that considers different socioeconomic factors of the neighborhood. They provide an EV inclusion model for a 30-year timeline based on the car age, neighborhood, economy, and other critical features for an urban distribution feeder in North Carolina <ref type="bibr">[30]</ref>. We combine the load and EV charging profiles of <ref type="bibr">[29]</ref> with the EV diffusion model of <ref type="bibr">[30]</ref> to obtain the load profile in our case study. Distribution feeder data are summarized in Table <ref type="table">I</ref>.</p><p>In our setup, the distribution feeder maintenance includes scheduled and unscheduled (failure-driven) replacement of the XFRs and protective fuses. Based on our study of the equipment cost and labor, we set the total cost for the different types of maintenance as shown in Table <ref type="table">II</ref>. Fault-based maintenance brings emergency outages and a customer inconvenience cost. We take the inconvenience cost for XFR failure and for fuse blows as $1.3 per kWh and $2 per kWh, respectively, according to the service value assessment in <ref type="bibr">[31]</ref>. Since the customers are notified beforehand, the inconvenience cost for scheduled replacement is zero.</p><p>For the neural networks, we take the discount factor &#955; = 0.95 and the learning rate 3 &#215; 10^{-4}. The actor and the critic networks each have three hidden layers with 30, 120, and 48 neurons, respectively. The LSTM network for the load prediction has two LSTM layers. We use the Adam optimizer for both the LSTM and DRL networks.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>B. EV Charging (DSM)</head><p>We select the buffer time T b = 3 and the target charge level E tgt = 0.9 for all the customers. The work <ref type="bibr">[29]</ref> shows the impact of uncoordinated charging on the distribution grid, in which the EVs are charged to full capacity without any schedule. Our proposed DSM reduces the peak load significantly, as shown in Fig. <ref type="figure">2</ref>. Out of the 232 XFRs, XFR-4 receives the most EVs (14) during the 30-year timeline. For the 1st year, the proposed and uncoordinated load profiles are the same, as there is no EV inclusion in the beginning. As time passes and EV inclusion grows, the peak load reduction achieved by the proposed charging method increases. For the first week of the 30th year, uncoordinated charging results in a peak load above 33 kVA, compared to a peak load of around 21.1 kVA with the proposed utility-driven charging. Similarly, across all the XFRs over the 30-year timeline, uncoordinated charging yields a load as high as 49.73 kVA, compared to the 32.07 kVA maximum load of the proposed charging method. This indicates that uncoordinated charging incurs a significantly higher cost for XFR replacement and upgrade than the proposed charging method.</p><p>Apart from uncoordinated charging, we examine the following smart EV charging techniques from the literature.</p><p>(1) Rule-based in <ref type="bibr">[7]</ref>: Sarker et al. <ref type="bibr">[7]</ref> present a centralized strategy for EV charging based on the co-optimization of distribution XFR aging and energy arbitrage. The objective is to minimize the total cost of electricity consumption and the damage cost to the XFR. They estimate the damage cost by multiplying the price of the XFR by its loss of life, using Eq. ( <ref type="formula">3</ref>). The utility pays incentives to compensate the customers, as charging often happens during higher tariffs to minimize damage costs to the XFR. 
This constrained optimization strategy represents user preference only through constraints on the battery's state of charge, which is too basic to capture a user's driving traits and routine.</p><p>(2) MARL in <ref type="bibr">[27]</ref>: Li et al. <ref type="bibr">[27]</ref> propose a Multi-Agent Reinforcement Learning (MARL) based EV charging strategy. Each EV under a distribution XFR is an individual agent that minimizes the total cost due to electricity bills and XFR damage under a central agent, i.e., the distribution XFR. The MARL state is defined by the real-time electricity price, XFR hotspot temperature, load forecast, EV state of charge, and other parameters. The reward function includes the customer's EV range anxiety cost, representing the inconvenience cost of delaying charging to utilize lower tariff hours. The authors model three different types of range anxiety (RA) cost as a function of the EV's state of charge at departure time, of which we select Type-1 RA for the comparative analysis.</p><p>(3) CIBECS in <ref type="bibr">[23]</ref>: The consumer input based EV charge scheduling (CIBECS) <ref type="bibr">[23]</ref> for a residential home can be obtained by following Algorithm 1 with one modification, namely removing the estimated available power L from the scheduled charging in Line 7:</p><p><formula notation="tex">P^{S}_{n+I(\tau)} = \min\{P^{S}_{max},\, E_R\}.</formula></p><p>Table <ref type="table">III</ref> shows the cost comparison among the different charging methods for XFR-4 for two representative years, the 15th and 30th years. The customer cost represents the electricity cost, and the utility cost represents the XFR loss of life, fuse-blowing costs, and customer incentive (if any). The proposed charging method estimates the maintenance savings with respect to the utility cost of the uncoordinated charging method. We select the incentive coefficient &#954; = 1 for the 30th year, which corresponds to a 3.33% ($524) discount on the customers' EV charging bill. 
Uncoordinated charging <ref type="bibr">[29]</ref> and CIBECS <ref type="bibr">[23]</ref> prioritize EV charging and hence result in high utility costs (due to frequent fuse blows). In contrast, the MARL method in <ref type="bibr">[27]</ref> and the Rule-based method in <ref type="bibr">[7]</ref> maintain strict peak load constraints to minimize the utility cost. However, they are susceptible to undercharged EVs, which is not a desirable outcome for customers. The Rule-based method in <ref type="bibr">[7]</ref> provides the customer with an incentive from its maintenance savings, which adds to its utility cost. Our proposed DSM method capitalizes on low-price hours, accommodates customer preference, and maintains load flattening simultaneously. As a result, the customer cost for the proposed DSM technique is the lowest among all the methods, and the utility cost is only marginally higher than that of the MARL method in <ref type="bibr">[27]</ref>. Lastly, as there are no fuse blow events and negligible utility cost savings for the 15th year, the proposed DSM offers zero incentive for that year.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>C. DRL based maintenance (USM)</head><p>Based on the comparative analysis for EV charging in the previous section, we focus on the proposed utility-driven charging technique to implement our DRL-based XFR replacement policy. Fig. <ref type="figure">3</ref> shows that our method learns the optimal policy within 3000 episodes. Table <ref type="table">IV</ref> shows the computation time for the proposed method. The proposed EV charge scheduling takes 2.4 seconds on a computer with an Intel&#174; Core i7 3.60 GHz processor and 16 GB of RAM. The DRL algorithm needs 180 minutes to run the 3000 episodes required for convergence. Notably, the computation time for each decision is 0.01 seconds, which is negligible compared to the one-month interval between maintenance decisions.</p><p>We compare our method with an idle policy, two rule-based methods from <ref type="bibr">[32]</ref> and <ref type="bibr">[33]</ref>, and the popular statistical Markov Chain Monte Carlo (MCMC) <ref type="bibr">[34]</ref> method.</p><p>(1) Idle policy: In this policy, the utility replaces an XFR only after it fails. The failed XFR is replaced with a double-capacity unit if it endured more than five fuse-blowing events during the previous twelve months; otherwise, it is replaced with a unit of the same capacity.</p><p>(2) Ranking-based method <ref type="bibr">[32]</ref>: Vasquez et al. <ref type="bibr">[32]</ref> propose a ranking-based approach for XFR replacement. The ranking score is calculated from the XFR's probability of failure <formula notation="TeX">P_x^t</formula> (from Eq. (<ref type="formula">4</ref>)) and its failure replacement cost <formula notation="TeX">\xi_x^t</formula> (from Table <ref type="table">II</ref>). The ranking score of the x-th XFR at the t-th time step is given by <formula notation="TeX">R_x^t = P_x^t \times \xi_x^t</formula>. The highest-ranked XFR is replaced if its ranking score exceeds a threshold set through trial and error.
The new XFR is double-sized if the peak load is more than 1.5 times the capacity of the replaced XFR; otherwise, it is the same size as the replaced one. Notably, this method replaces XFRs aggressively and hence behaves opposite to the above-mentioned idle policy.</p><p>(3) Risk score based method <ref type="bibr">[33]</ref>: The following equation is used in <ref type="bibr">[33]</ref>  Since all the XFRs operate under similar environmental factors (EF) and have similar characteristic conditions (Cond), we remove these two parameters when estimating the risk factor &#8476; for each XFR. At the end of each month, the XFR with the highest risk factor &#8476; is replaced, unless that risk factor is lower than a threshold, in which case no replacement occurs. We found 1.85 to be the optimal threshold in our experiments. If the XFR's peak load is more than 150% of its capacity, it is replaced with a double-sized one; otherwise, with a same-sized XFR.</p><p>(4) MCMC <ref type="bibr">[34]</ref>: Markov Chain Monte Carlo (MCMC) simulation is a popular tabular RL technique for problems with discrete and tractable state and action spaces. We discretize the state space (as opposed to the continuous-valued DRL states) because MCMC uses a tabular method to learn the value function of each state. The granularity of the discretization is a trade-off between the optimization results and computation time. We discretize each input variable into m = 10 equally spaced states for a manageable computational burden, which requires convergence over <formula notation="TeX">m^r = 10^{12}</formula> states, where r = 12 is the number of input variables (i.e., the age and load of the 3 oldest and the 3 most loaded XFRs in the network).</p></div>
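<div xmlns="http://www.tei-c.org/ns/1.0"><p>The ranking-based selection in item (2) and the sizing rule shared by the rule-based policies can be sketched as follows. This is a minimal sketch under stated assumptions rather than the implementation of <ref type="bibr">[32]</ref>: the Transformer fields, the function names, and the example numbers are hypothetical, and the 1.5 factor is assumed to be relative to the rated capacity of the replaced XFR. The last helper only reproduces the count of tabular states from item (4).</p>
<p><![CDATA[
```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Transformer:
    name: str
    capacity_kva: float    # rated capacity
    peak_load_kva: float   # observed peak load
    failure_prob: float    # P_x^t, probability of failure
    failure_cost: float    # xi_x^t, failure replacement cost


def ranking_score(xfr: Transformer) -> float:
    """R_x^t = P_x^t * xi_x^t."""
    return xfr.failure_prob * xfr.failure_cost


def select_for_replacement(
    fleet: Sequence[Transformer], threshold: float
) -> Optional[Transformer]:
    """Return the highest-ranked XFR if its score exceeds the threshold."""
    if not fleet:
        return None
    best = max(fleet, key=ranking_score)
    return best if ranking_score(best) > threshold else None


def new_capacity(xfr: Transformer, overload_ratio: float = 1.5) -> float:
    """Double the size when the peak load exceeds overload_ratio * capacity."""
    if xfr.peak_load_kva > overload_ratio * xfr.capacity_kva:
        return 2 * xfr.capacity_kva
    return xfr.capacity_kva


def tabular_state_count(m: int, r: int) -> int:
    """Number of table entries when r inputs are each split into m bins."""
    return m ** r


if __name__ == "__main__":
    # Hypothetical fleet, for illustration only.
    fleet = [
        Transformer("A", 50.0, 60.0, 0.10, 1000.0),
        Transformer("B", 50.0, 90.0, 0.30, 1000.0),
    ]
    chosen = select_for_replacement(fleet, threshold=200.0)
    if chosen is not None:
        print(chosen.name, new_capacity(chosen))
    print(tabular_state_count(10, 12))
```
]]></p>
<p>Because the score depends only on failure probability and replacement cost, peak load enters this policy solely through the sizing of the new unit, not through the choice of which XFR to replace.</p></div>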
<div xmlns="http://www.tei-c.org/ns/1.0"><head>D. Comparative Analysis</head><p>We implement the above-mentioned maintenance policies for both the uncoordinated and the proposed utility-driven charging as an ablation study. Table <ref type="table">V</ref> shows the cumulative maintenance cost to the utility for different combinations of EV charging and maintenance policies for the distribution feeder over a 30-year timeline. Table VI further elaborates the results in terms of the following metrics.</p><p>1) Fuse Blow: As the load (EV charging) grows, fuse blow events occur more frequently during the late part of the simulation timeline. Without any planned capacity upgrade (as in the Idle policy), 237 such events accumulate over the 30-year timeline. The Ranking method <ref type="bibr">[32]</ref> ignores the peak load in its decision criteria and performs worse than the other methods. The Risk score method <ref type="bibr">[33]</ref> puts significance on peak load and reduces fuse blow events through XFR upgrades. The proposed DRL method learns this relationship between peak load and fuse blows and minimizes the fuse blows; however, the MCMC method lags due to its discretized state space. The proposed DSM approach flattens the load to such an extent that none of the policies experience any fuse-blowing events.</p><p>2) XFR Failure: XFR failure events cannot be eliminated because they follow the Weibull distribution in (4). However, the proposed DRL method reduces XFR failures by approximately 30%, followed by the MCMC method. The Ranking method performs well as it prioritizes XFR age in its maintenance decisions. In contrast, the Risk score method underweights XFR age in its risk calculation and therefore does little to reduce XFR failures.</p><p>3) Planned Maintenance: The DRL method implements 23 replacements and 3 upgrades in the Uncoordinated charging case. In the proposed DSM case, the proposed DRL requires 22 replacements and no upgrades.
Its selections yield the lowest XFR failure, outage, and cost among the compared methods.</p><p>4) Monetary Cost: The cumulative cost comprises the planned and unplanned maintenance costs; minimizing it is the actual objective of the utility company. Our proposed DRL, accompanied by the proposed DSM, is the best-performing combination.</p></div>
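<div xmlns="http://www.tei-c.org/ns/1.0"><p>To illustrate why XFR failures cannot be eliminated yet can be reduced by replacing aged units, the following sketch computes a conditional failure probability under a standard two-parameter Weibull lifetime model. It is only an illustration under stated assumptions: Eq. (4) is not reproduced in this section, so the parameterization, the shape and scale values, and the function names are hypothetical and may differ from the paper's.</p>
<p><![CDATA[
```python
import math


def cumulative_hazard(age, shape, scale):
    """Weibull cumulative hazard H(t) = (t / scale) ** shape."""
    return (age / scale) ** shape


def conditional_failure_prob(age, step, shape, scale):
    """P(failure in (age, age + step] | survived to age).

    Equals 1 - S(age + step) / S(age), with survival S(t) = exp(-H(t)).
    """
    if age < 0 or step <= 0 or shape <= 0 or scale <= 0:
        raise ValueError("age must be >= 0; step, shape, scale must be > 0")
    delta = cumulative_hazard(age + step, shape, scale) - cumulative_hazard(
        age, shape, scale
    )
    # -expm1(-delta) is 1 - exp(-delta), computed accurately for small delta.
    return -math.expm1(-delta)


if __name__ == "__main__":
    # Hypothetical parameters: monthly step, ages and scale in years.
    for age in (0.0, 20.0, 35.0):
        p = conditional_failure_prob(age, step=1 / 12, shape=3.0, scale=40.0)
        print(f"age {age:4.1f} y: monthly failure probability {p:.3e}")
```
]]></p>
<p>With a shape parameter above 1 the hazard grows with age, so an aged XFR is more likely to fail in the next step than a new one, while even a new XFR retains a small nonzero failure probability. This is the intuition behind age-aware replacement reducing, but not removing, failure events.</p></div>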
<div xmlns="http://www.tei-c.org/ns/1.0"><head>E. Key Insights</head><p>&#8226; In MCMC-based RL, the coarse state discretization needed to keep the training time feasible results in significant performance degradation relative to the DRL method.</p><p>&#8226; The rule-based methods are too simple to strike an appropriate balance between XFR age and load in decision-making. Hence, they suffer either many fuse blows (Ranking <ref type="bibr">[32]</ref>) or many XFR failures (Risk score <ref type="bibr">[33]</ref>).</p><p>&#8226; The DRL policy learns the optimal weighting of the age and peak load of the candidate XFRs for selecting the most appropriate XFR for maintenance, as evidenced by the reduction in XFR failures, fuse blows, and subsequent outages. As there are many aged XFRs in the network initially, our policy aggressively replaces the aged and overloaded XFRs with newer ones. These proactive actions reduce the number of XFR failures and fuse blows.</p><p>&#8226; The proposed EV charging technique substantially helps the DRL-based policy minimize the long-term maintenance cost.</p></div><div xmlns="http://www.tei-c.org/ns/1.0"><head>V. CONCLUSION</head><p>This work offers insight and solutions for maintaining the distribution system to accommodate EV charging load. It demonstrates a complete EV adoption strategy for the utility company considering long-term planning for both demand side management (DSM) and utility side management (USM). For DSM, the proposed utility-driven EV charge scheduling based on customer preferences offers a reasonable balance between peak load reduction and customer satisfaction. Consequently, the utility company faces lower maintenance costs due to peak load reduction. The utility compensates the customers using its profit from reduced maintenance costs to keep them interested in participating in the scheduled EV charging program. For USM, our DRL-based XFR maintenance policy chooses the best XFR for replacement or upgrade.
Experiments show that the combination of the proposed DSM and USM methods outperforms the existing optimization techniques by a wide margin in terms of long-term maintenance cost and power outage.</p></div><note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_0"><p>&#8226; The first comprehensive study of the problem of increasing stress on the distribution XFRs due to EV charging. Specifically, a combination of novel DSM and USM</p></note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_1"><p>This article has been accepted for publication in IEEE Transactions on Smart Grid. This is the author's version which has not been fully edited and content may change prior to final publication. Citation information: DOI 10.1109/TSG.2023.3235903 &#169; 2023 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.</p></note>
		</body>
		</text>
</TEI>
