Mean-Field Multi-Agent Reinforcement Learning for Active Wake Control

Abstract

The wake effect, the turbulence created behind a wind turbine as it extracts energy from the wind, negatively impacts the power output of downstream turbines. Active Wake Control can mitigate this effect by rotating some turbines away from the wind. Previous research applied single-agent reinforcement learning to Active Wake Control, showing good results for small-scale layouts that do not scale to larger, practical wind farms.
To that end, this study focuses on the application of mean-field multi-agent reinforcement learning to Active Wake Control under constant wind conditions. This algorithm restricts each agent's computations to a limited set of neighbouring turbines, reducing the complexity of learning (a minimal sketch of this step follows the list below). To build the answer to this research question, I also study:
1. how to model the rewards to solve the lazy-agent problem, leveraging the nature of Active Wake Control;
2. how the view of the agent (the number of neighbours it observes) affects the results;
3. how the approach compares to a single-agent reinforcement learning algorithm, TD3.
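For concreteness, the core of the mean-field approach is to replace the joint action of all other agents with the average action of a turbine's neighbours, so the critic's input size stays fixed as the farm grows. The following is a minimal sketch of that averaging step for a single-row (tunnel) layout; the function name and the index-based neighbourhood are illustrative assumptions, not the thesis code.

```python
import numpy as np

def mean_field_action(yaw_angles: np.ndarray, i: int, k: int) -> float:
    """Mean yaw action of up to k neighbours on each side of turbine i.

    In a tunnel layout, neighbours are the turbines at adjacent indices;
    k is the 'view' of the agent studied in sub-question 2.
    """
    lo, hi = max(0, i - k), min(len(yaw_angles), i + k + 1)
    neighbours = [j for j in range(lo, hi) if j != i]
    return float(np.mean(yaw_angles[neighbours]))

# Example: with k = 1, turbine 2 only averages turbines 1 and 3.
yaw = np.array([10.0, 5.0, 0.0, -5.0, 0.0])
print(mean_field_action(yaw, i=2, k=1))  # (5.0 + -5.0) / 2 = 0.0
```

The shared critic then scores (state, own action, mean neighbour action) rather than the full joint action, which is what limits the computation to a set of neighbouring turbines.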
The experiments were run in the FLORIS wake simulator, with every turbine controlled by the same shared agent, in tunnel layouts at real-life spacings (6-7 rotor diameters), under constant wind conditions; a sketch of this setup is shown below.
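The setup above can be reproduced in outline with the FLORIS Python package. The sketch below, assuming the FLORIS v3 API, builds a five-turbine tunnel at 7 rotor diameters spacing under one fixed wind condition and reads out both farm-level and per-turbine powers; the choice between those two signals is the reward-modelling question behind the lazy-agent problem, since an agent rewarded only on total farm power can free-ride on its neighbours' yawing. The input file name and numeric values are placeholders.

```python
import numpy as np
from floris.tools import FlorisInterface

D = 126.0                          # rotor diameter, m (placeholder value)
n = 5                              # number of turbines in the tunnel
fi = FlorisInterface("gch.yaml")   # a FLORIS input file (assumed name)

# Single-row "tunnel" layout at 7D spacing; one constant wind condition.
fi.reinitialize(
    layout_x=[i * 7 * D for i in range(n)],
    layout_y=[0.0] * n,
    wind_directions=[270.0],
    wind_speeds=[8.0],
)

yaw = np.zeros((1, 1, n))          # one yaw angle per turbine, degrees
fi.calculate_wake(yaw_angles=yaw)

farm_power = fi.get_farm_power()          # global reward: lazy agents possible
turbine_powers = fi.get_turbine_powers()  # per-turbine terms for reward shaping
```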
Results show that, with a proper configuration of rewards and view space, the mean-field algorithm finds near-optimal configurations for Active Wake Control within wind tunnels in a small number of episodes. This is a promising start for applying mean-field multi-agent algorithms to the Active Wake Control problem, and it provides insight into how to model the rewards, which may carry over to the whole class of algorithms.

Files

Plamadeala_thesis.pdf
(PDF, 0.46 MB)