Reinforcement Learning to Optimise levelised Cost Of Energy (RLoptCOE)


Stage 1

Project Lead

MaxSim Ltd

Project Sub-Contractors

University of Edinburgh, CorPower Ocean AB, Mocean Energy Ltd, Caelulum Ltd, Wave Conundrums Consulting, Aquaharmonics Inc., David Pizer

Wave energy converters (WECs) need to attract forces in order to capture power, but they need to avoid forces in order to survive extreme events. This can be considered the 'force contradiction'. The 'risk contradiction' concerns the costs of risk and uncertainty: the increased cost of capital, and the costs that arise from reduced reliability, availability and survivability. Lowering these risks usually involves increased capital costs.

Reinforcement Learning (RL) is a leading machine learning method with the potential to bypass these contradictions. It learns the probabilistic relationship between chosen actions and the device response, or 'state', in effect building a 'map' from actions to states. Desirable states can be assigned 'rewards', and undesirable states can be penalised with negative rewards. RL then seeks the best long-term control strategy by estimating the expected sum of future rewards that follows each action. RL is robust to sensor errors, delay and drift, as well as unidentified non-linear response. Furthermore, the map is built for each device, so it accounts for manufacturing, installation and operational differences. RL would also enable subsystems that have been developed independently to be integrated into a single WEC.

RL can address the force contradiction by deciding when to limit forces and when to maximise capture. It can address the risk contradiction by including the inherent uncertainties in the mapping and decision-making process.
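To illustrate the idea of rewards, penalties and a learned action-to-state map, the following is a minimal sketch of tabular Q-learning, one common RL algorithm. It is not the RLoptCOE controller. Everything in it is an assumption made for illustration: the four discrete sea states, the two actions ('capture' and 'limit'), the reward values, and the random drift of the sea state, which in this toy does not depend on the chosen action.

```python
import random

# Hypothetical toy model (assumption, not from the project):
# sea states 0 (calm) to 3 (extreme), two control actions.
N_STATES = 4
ACTIONS = ("capture", "limit")  # maximise capture vs limit forces


def step(state, action, rng):
    """Return (reward, next_state) for the toy WEC model."""
    if action == "capture":
        # Power grows with sea state, but capturing in the extreme
        # state is penalised to represent damaging loads.
        reward = -10.0 if state == 3 else float(state + 1)
    else:
        # Limiting forces yields no power and no damage penalty.
        reward = 0.0
    # The sea state drifts randomly by at most one level per step.
    drift = rng.choice((-1, 0, 1))
    next_state = min(N_STATES - 1, max(0, state + drift))
    return reward, next_state


def train(episodes=2000, steps=50, alpha=0.05, gamma=0.9,
          epsilon=0.1, seed=0):
    """Learn a Q-table with epsilon-greedy tabular Q-learning."""
    rng = random.Random(seed)
    # q[state][action_index] estimates the expected discounted
    # sum of future rewards after taking that action in that state.
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = rng.randrange(N_STATES)
        for _ in range(steps):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                a = max(range(len(ACTIONS)),
                        key=lambda i: q[state][i])  # exploit
            reward, nxt = step(state, ACTIONS[a], rng)
            # Standard Q-learning update towards the bootstrapped target.
            target = reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
    return q


def policy(q):
    """Return the greedy action name for each sea state."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[s][i])]
            for s in range(N_STATES)]
```

Calling policy(train()) returns one action per sea state. With these toy rewards the learner is expected to choose 'capture' in the three milder states and 'limit' in the extreme one, which is a small-scale picture of deciding when to maximise capture and when to limit forces. A real WEC controller would need a far richer state, continuous actions and a validated device model.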

WEC developer Aquaharmonics recently won the US Wave Energy Prize, and part of their winning formula was the use of control to maximise performance while limiting extreme loads. They want to extend their approach to more realistic conditions, and RL is a promising technique for doing so. The Aquaharmonics device is a well-suited test case for RL: it is physically simple but has enough functional complexity to demonstrate the inherent force and risk contradictions.

MaxSim Ltd presented a poster on their Stage 1 Control Systems project at the 2017 WES Annual Conference. All Stage 1 Control Systems posters are available to download.

Control Systems Stage 1 - Public Report - MaxSim Ltd

Control Systems Stage 1 Public Report for the MaxSim "Cost of Energy Optimised by Reinforcement Learning" project. Includes a description of the technology, scope of work, achievements and recommendations for further work.
