Reinforcement-Learning (RL)-Based Recloser Control for Distribution Cables with Degraded Insulation
Utility providers frequently observe failures in aged cables whose basic insulation level (BIL) has degraded by an unknown amount. One root cause is the transient overvoltage (TOV) associated with the reclosing of circuit breakers. In an effort to address this problem, researchers have proposed a series of controlled switching methods, most of which are deterministic. However, in power systems, especially in distribution networks, switching transients are inherently stochastic. Conventional switching methods do not account for observation uncertainty and noise, and relatively little work has been done on stochastic control mechanisms that treat the control task as a Markov decision process (MDP). Because the overvoltage dynamics and their associated cost functions are difficult to characterize, a productive way forward may involve combining the advantages of off-policy control and value function approximation.
Motivated by switching-transient-related cable failures reported by industry partners, researchers at Arizona State University have developed a recloser control method for aged and degraded cables that uses reinforcement learning (RL). Specifically, this model-free stochastic control method is designed to operate under uncertain and noisy conditions. To capture high-dimensional dynamics patterns, the recloser control problem is formulated by incorporating a temporal sequence reward mechanism into a deep Q-network (DQN). Physical understanding of the problem is embedded into the action probability allocation, resulting in an infeasible-action-space-elimination algorithm. PSCAD™ simulations reveal the impact of load types on cable TOVs, and a post-learning knowledge transfer method is established to reduce the training burden of the RL method across different applications. Learning curves show significantly enhanced performance: the method requires only 200 episodes to reach what its counterpart without infeasible-action elimination achieves in 900 episodes.
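The idea of embedding physical knowledge into the action probability allocation can be illustrated with a minimal Python sketch. This is not the researchers' implementation: the discretization of reclosing instants into four actions, the function names `action_probabilities` and `sample_action`, the example Q-values, and the epsilon-greedy allocation are all assumptions made for illustration. The sketch only shows how actions flagged as infeasible can be given zero probability so that exploration is spent on feasible actions alone.

```python
import random


def action_probabilities(q_values, feasible, epsilon):
    """Epsilon-greedy probabilities restricted to feasible actions.

    q_values: list of Q-value estimates, one per discrete action
              (hypothetically, one per candidate reclosing instant).
    feasible: list of booleans; False marks an action ruled out in advance.
    epsilon:  exploration rate in [0, 1].
    """
    allowed = [i for i, ok in enumerate(feasible) if ok]
    if not allowed:
        raise ValueError("at least one action must be feasible")

    # Infeasible actions keep probability zero.
    probs = [0.0] * len(q_values)

    # Exploration mass is spread uniformly over feasible actions only.
    for i in allowed:
        probs[i] = epsilon / len(allowed)

    # The remaining mass goes to the best feasible action.
    best = max(allowed, key=lambda i: q_values[i])
    probs[best] += 1.0 - epsilon
    return probs


def sample_action(q_values, feasible, epsilon, rng):
    """Draw one action index according to the masked probabilities."""
    probs = action_probabilities(q_values, feasible, epsilon)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical example: action 0 has the highest Q-value but is infeasible.
example_q = [5.0, 1.0, 3.0, 2.0]
example_mask = [False, True, True, True]
example_probs = action_probabilities(example_q, example_mask, epsilon=0.3)
example_action = sample_action(example_q, example_mask, 0.3, random.Random(0))
```

In this sketch, action 0 is never selected even though its Q-value estimate is the largest, because the mask removes it before probabilities are allocated. In a DQN setting the `q_values` list would come from the network output for the current state, and the mask would come from whatever physical rule defines infeasibility.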
• Power systems
• Mitigation of reclosing transient overvoltage (TOV)
• Reduction of cable failure risk due to overvoltage in distribution systems, particularly in underground systems with degraded basic insulation level (BIL)