Reinforcement learning for water system control
Cost optimization at IJmuiden pumping station
Abstract
The production and consumption of electricity need to be balanced at all times. The ever-growing shift towards renewable energy generation makes this an increasingly difficult challenge. Currently, balance is maintained by regulating the supply side. However, there is potential to improve reliability and save costs by shifting the balancing to the demand side, known as demand response. The flexibility of water systems can play a role in this: by exploiting electricity price fluctuations, operating costs can be reduced.
This research investigates the IJmuiden pumping station, which drains water from the Noordzeekanaal-Amsterdam-Rijnkanaal system into the North Sea. The primary focus of the control of this system is ensuring safe water levels, as the canals run through areas of high economic value. The flexibility within the range of safe water levels allows costs to be minimized by selecting favourable moments to consume electricity, which simultaneously contributes to the stability of the electrical grid. This research explores the potential of a Reinforcement Learning controller for such an optimization problem, as the Model Predictive Control methods that are currently widely used have some drawbacks. The research objective is formulated as follows:
To optimize the control of the IJmuiden pumping station using Reinforcement Learning while complying with local water level restrictions, and to compare it to state-of-the-art Model Predictive Control methods in terms of constraint violation, energy costs, and computational speed.
The Reinforcement Learning controller uses a deep Q-learning algorithm that chooses the most cost-efficient control in IJmuiden while respecting the water level restrictions. To do so, the model makes decisions based on electricity prices and the state of the water system at the current time step, together with a forecast covering 48 hours ahead; this data is provided as input to the model.
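The abstract does not specify the network architecture or the action set; the sketch below only illustrates the general idea of greedy action selection with a deep Q-network, assuming PyTorch and hypothetical dimensions (a state vector consisting of the current water level, the current electricity price, and 48 hourly forecast values, with four discrete pump/gate actions).

```python
import torch
import torch.nn as nn


class QNetwork(nn.Module):
    """Feed-forward network mapping a state vector (current water level,
    current electricity price, 48-hour forecast window) to one Q-value
    per discrete control action. Sizes are illustrative assumptions."""

    def __init__(self, state_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)


def select_action(q_net: QNetwork, state: torch.Tensor,
                  feasible: torch.Tensor) -> int:
    """Greedy action choice, restricted to actions that are feasible in
    the current state: infeasible actions are masked out before argmax."""
    with torch.no_grad():
        q_values = q_net(state)
        q_values[~feasible] = float("-inf")
        return int(q_values.argmax().item())


# Example: state = level + current price + 48 hourly price forecasts.
q_net = QNetwork(state_dim=1 + 1 + 48, n_actions=4)
state = torch.randn(50)
feasible = torch.tensor([True, True, False, True])  # e.g. gate unavailable
action = select_action(q_net, state, feasible)
```

During training, such a network would be updated with standard deep Q-learning (experience replay and a target network); only the masked greedy selection relevant to the constraint handling described above is shown here.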
The inputs of the model consist of historical data, meaning that the uncertainties associated with forecasts are not included. The water system that the model interacts with is represented by a linear reservoir model, so the water levels respond dynamically to the actions taken by the model. The set of possible actions is in turn determined by the state of the water system.
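The reservoir equations are not given in the abstract; a minimal sketch of a linear reservoir mass balance and state-dependent action availability, with purely hypothetical parameter values (storage area, level bound, gravity-discharge condition), could look as follows.

```python
def reservoir_step(level_m: float, inflow_m3s: float, pump_m3s: float,
                   gate_m3s: float, area_m2: float = 40e6,
                   dt_s: float = 3600.0) -> float:
    """One time step of a linear reservoir: the water level changes with
    the net volume balance (inflow minus pumped and gated outflow),
    divided by the storage area of the canal system."""
    return level_m + dt_s * (inflow_m3s - pump_m3s - gate_m3s) / area_m2


def feasible_actions(level_m: float, sea_level_m: float,
                     min_level_m: float = -0.55) -> list[str]:
    """Illustrative state-dependent action set: no water is released once
    the lower level bound is reached, and gravity discharge through the
    gate is assumed possible only when the canal level exceeds sea level."""
    actions = ["idle"]
    if level_m > min_level_m:
        actions.append("pump")
        if level_m > sea_level_m:
            actions.append("gate")
    return actions
```

A mask derived from `feasible_actions` would feed into the action selection sketched above, so that the controller only considers actions that are physically possible in the current state.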
The trained model was tested on two years of unseen data (data that was not used during training). Using the same test data, control plans were generated with Model Predictive Control. The Reinforcement Learning model was very successful in ensuring safe water levels, but this came at approximately 50% higher energy costs. The use of the gate was close to optimal, but the pumping schedule did not clearly concentrate power consumption in periods of favourable prices. The trained model was robust, consistently respecting the water level constraints.
The most significant difference from Model Predictive Control was the computation time: the Reinforcement Learning model was able to create a control plan approximately 300 times faster. This opens the door to further development of the model and increased complexity. A more accurate model of the water system could be used to take temporal and spatial effects into account and to represent the six pumps in IJmuiden individually.
There are still many steps to take before such a model can be used for operational control, but the method has potential for such an application. Many aspects of the model can be improved, and adjustments can be made to increase its usability for control operators.