Hydropower is currently the largest source of renewable electricity worldwide and the third-largest source overall, after coal and natural gas, providing 15% of global electricity. However, hydropower generation data are scarce: where available, they mostly exist at national and annual level, and only limited data are available at plant scale. Research on power grid decarbonization, grid expansion, and grid optimization needs more generation data at plant scale. Because hydropower generation varies significantly throughout the year with weather patterns, monthly generation data would be particularly beneficial. To fill these data gaps, two types of model were fitted to predict hydropower generation: a linear mixed-effects regression (LMER) model and a mixed-effects random forest (MERF) model. The monthly models used plant capacity, discharge, and reservoir area as predictor variables; reservoir area was not included in the final yearly models. Both model types were compared to the Hydro Plant Generation Estimation Model. The models were fitted using data from the United States (US) and applied to hydropower plants in the US and the European Union (EU). In the US, the median Kling-Gupta efficiency (KGE) was -0.08 for the monthly LMER model and 0.12 for the monthly MERF model. In the EU, the models were evaluated at an annual time step because of data limitations; there the LMER model scored better (median KGE of -0.16) than the MERF model (-0.68). The prediction errors of the annual US model were comparable to those of the Hydro Plant Generation Estimation Model. Discharge and plant capacity were found to be the most important predictor variables, followed by reservoir area for the monthly model. The models were able to predict at plant scale in data-scarce regions and at a monthly time step, although they can produce large outliers.
The models may therefore be better applied at a larger scale than at the scale of individual plants: the median KGE scores were around zero, indicating that predictions aggregated over multiple hydropower plants (HPPs) are usable.
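To make the reported metric concrete, the following is a minimal sketch of how a per-plant KGE and its median across plants could be computed. It assumes the original 2009 formulation of the Kling-Gupta efficiency (correlation, variability ratio, and bias ratio); the abstract does not state which KGE variant was used. The function names `kge` and `median_kge` and the example values are hypothetical and only serve as illustration.

```python
import math
from statistics import mean, median, pstdev


def kge(observed, simulated):
    """Kling-Gupta efficiency, assuming the 2009 formulation.

    Returns 1.0 for a perfect fit; returns NaN when the score is
    undefined (zero observed mean or zero variance in either series).
    """
    if len(observed) != len(simulated) or len(observed) < 2:
        raise ValueError("series must have equal length of at least 2")

    mu_o, mu_s = mean(observed), mean(simulated)
    sd_o, sd_s = pstdev(observed), pstdev(simulated)
    if mu_o == 0 or sd_o == 0 or sd_s == 0:
        return float("nan")

    # Pearson correlation between observed and simulated generation
    cov = mean((o - mu_o) * (s - mu_s) for o, s in zip(observed, simulated))
    r = cov / (sd_o * sd_s)
    alpha = sd_s / sd_o  # variability ratio
    beta = mu_s / mu_o   # bias ratio

    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def median_kge(series_by_plant):
    """Median KGE over plants, skipping plants with an undefined score."""
    scores = [kge(obs, sim) for obs, sim in series_by_plant.values()]
    defined = [s for s in scores if not math.isnan(s)]
    return median(defined)


# Hypothetical monthly generation (observed, predicted) for two plants
example = {
    "plant_a": ([12.0, 15.0, 30.0, 22.0], [10.0, 16.0, 27.0, 25.0]),
    "plant_b": ([3.0, 4.0, 9.0, 5.0], [6.0, 6.5, 7.0, 6.0]),
}
print(median_kge(example))
```

Scoring each plant separately and then taking the median keeps a single badly predicted plant from dominating the summary, which matters here because the models can produce large outliers.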