Fig. 3 Model schematic diagram
Taking January in winter as an example, the training set for Network 1
consists of the first 8056 samples, comprising the three most
significant historical climatic factors and the historical PV output
data; its testing set is the last 20% of the training set. The
training-set inputs for Network 2 are the testing-set inputs of
Network 1, and its training targets are the differences between
Network 1's predicted PV outputs and the actual PV outputs; its testing
set is the last 16% of the training set, and its outputs are the
advance prediction errors of the PV output to be predicted by
Network 3. The training set for Network 3 is the same as the testing
set of Network 1, and its testing set is the last 16% of that training
set. The final model output is the PV output predicted by Network 3 for
a given day minus the advance prediction error estimated by Network 2,
which improves the prediction accuracy of the model.
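The data flow described above can be illustrated with a short sketch. The array contents, the predict() stub, and the exact slice boundaries are assumptions for illustration only; only the split proportions and the residual-correction idea come from the text.

```python
import numpy as np

# Minimal sketch of the three-network split, assuming X holds the three
# selected climatic factors and y the PV output (synthetic data here).
rng = np.random.default_rng(0)
n = 8056
X, y = rng.random((n, 3)), rng.random(n)

# Network 1: trained on all 8056 samples, tested on the last 20%.
test1 = slice(int(0.8 * n), n)
X1_test, y1_test = X[test1], y[test1]

def predict(net, inputs):
    # Stand-in for a trained ELMAN network's predictions (assumption).
    return rng.random(len(inputs))

y1_pred = predict("net1", X1_test)

# Network 2: inputs are Network 1's test inputs; targets are the
# residuals (predicted minus actual), i.e. the advance prediction errors.
X2_train = X1_test
e2_train = y1_pred - y1_test

# Network 3: trained on Network 1's test set; the last 16% of that
# training set serves as its own test set.
n3 = len(X2_train)
test3 = slice(int(0.84 * n3), n3)

# Final model output: Network 3's PV prediction corrected by
# Network 2's predicted advance error.
y3_pred = predict("net3", X[test1][test3])
e2_pred = predict("net2", X[test1][test3])
y_final = y3_pred - e2_pred
```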
The principle of using NGA-ELMAN to predict photovoltaic (PV) output is
to represent the network's weights and biases as individuals in a
genetic algorithm and to optimize the ELMAN neural network by
continuously searching for and updating the optimal individual through
evolutionary operations and the niche technique, thereby reducing the
probability of falling into a local optimum. The implementation steps
are as follows:
Step 1: Identify the main climatic factors affecting PV output in each
season based on Pearson correlation analysis.
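A minimal sketch of this screening step follows; the factor names and synthetic data are assumptions, and only the ranking by absolute Pearson correlation comes from the text.

```python
import numpy as np

# Sketch of Step 1: rank candidate climatic factors by their absolute
# Pearson correlation with PV output, then keep the strongest ones.
rng = np.random.default_rng(1)
pv = rng.random(500)
factors = {"irradiance": pv + 0.1 * rng.random(500),  # strongly correlated
           "temperature": rng.random(500),
           "humidity": rng.random(500),
           "wind_speed": rng.random(500)}

def pearson(a, b):
    # r = cov(a, b) / (std(a) * std(b))
    return np.corrcoef(a, b)[0, 1]

ranked = sorted(factors, key=lambda k: abs(pearson(factors[k], pv)),
                reverse=True)
print("three most significant factors:", ranked[:3])
```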
Step 2: Normalize the data and divide it into training and testing sets.
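A sketch of this step, assuming min-max normalization to [0, 1] and a simple chronological 80/20 split (the exact normalization scheme is not specified in the text):

```python
import numpy as np

# Sketch of Step 2: column-wise min-max normalization, then a
# chronological train/test split mirroring Network 1's proportions.
def normalize(x):
    return (x - x.min(axis=0)) / (x.max(axis=0) - x.min(axis=0))

data = normalize(np.random.default_rng(2).random((8056, 4)))
split = int(0.8 * len(data))
train, test = data[:split], data[split:]
print(train.shape, test.shape)
```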
Step 3: Determine the structure of the ELMAN neural network and the
parameters of the algorithm.
Step 4: Initialize the population using floating-point encoding, where
the length of each individual is the sum of the total number of weights
and thresholds in the network, given by the formula
$C_{len} = I_{in}H_{hid} + H_{hid} + H_{hid}H_{out} + H_{out}$,
where $I_{in}$ represents the number of neurons in the input layer,
$H_{hid}$ the number of neurons in the hidden layer, and $H_{out}$ the
number of neurons in the output layer.
Step 5: Set the fitness function as the mean absolute error between the
network's predicted values and the actual values. The mathematical
expression is
$F = \varphi\left(\frac{1}{m}\sum_{i=1}^{m}\left|y_i - y_t\right|\right)$,
where $m$ is the number of nodes in the network's output layer; $y_i$
is the predicted output value of the network; $y_t$ is the actual
value; and $\varphi$ represents a threshold function.
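A direct transcription of this step; since the exact form of the threshold function is not given in the text, an identity mapping is assumed as a placeholder.

```python
import numpy as np

# Sketch of Step 5: fitness as the mean absolute error between the
# network's predictions and the actual values.
def phi(error):
    return error  # assumption: stand-in for the paper's threshold function

def fitness(y_pred, y_true):
    m = len(y_true)  # number of nodes in the output layer
    return phi(np.abs(y_pred - y_true).sum() / m)
```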
Step 6: Calculate the fitness of all individuals using the training
data, and employ an elitist retention strategy to keep the top M
individuals with the highest fitness. Perform selection and elimination
on all individuals in the population using the roulette wheel
algorithm, where the probability of individual $i$ being retained for
the next generation is $p_i = F_i / \sum_{j=1}^{N} F_j$, with $F_i$ its
fitness and $N$ the population size. Conduct crossover and mutation
operations on individuals to generate the offspring population, and
calculate the fitness of the offspring individuals. Add the retained M
individuals to the offspring population and adjust the fitness of
individuals using the niche technique. Employ a tournament selection
mechanism to select the top P individuals to form a new population.
Repeat the above steps until the convergence accuracy is satisfied or
the maximum number of iterations is reached, to obtain the optimal
individual.
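For concreteness, a minimal sketch of one generation of this loop follows, assuming fitness is maximized. The crossover and mutation variants, the rates, the niche sharing radius, and the evaluate() stub are illustrative assumptions not specified in the text.

```python
import numpy as np

rng = np.random.default_rng(4)

def evaluate(pop):
    # Stand-in for training-data fitness (higher is better here).
    return 1.0 / (1.0 + np.abs(pop).mean(axis=1))

def one_generation(pop, M=5, P=None, pc=0.8, pm=0.05, radius=1.0):
    P = P or len(pop)
    fit = evaluate(pop)
    elites = pop[np.argsort(fit)[-M:]]            # elitist retention of top M

    # Roulette wheel selection: p_i = F_i / sum_j F_j
    probs = fit / fit.sum()
    parents = pop[rng.choice(len(pop), size=len(pop), p=probs)]

    # Arithmetic crossover and Gaussian mutation (assumed variants).
    children = parents.copy()
    for i in range(0, len(children) - 1, 2):
        if rng.random() < pc:
            a = rng.random()
            children[i], children[i + 1] = (
                a * children[i] + (1 - a) * children[i + 1],
                a * children[i + 1] + (1 - a) * children[i])
    mask = rng.random(children.shape) < pm
    children[mask] += rng.normal(0, 0.1, mask.sum())

    # Add the elites, then apply a niche penalty: of two individuals
    # closer than `radius`, the weaker one's fitness is sharply reduced.
    pool = np.vstack([children, elites])
    f = evaluate(pool)
    for i in range(len(pool)):
        for j in range(i + 1, len(pool)):
            if np.linalg.norm(pool[i] - pool[j]) < radius:
                f[min((i, j), key=lambda k: f[k])] *= 1e-6

    # Tournament selection of P individuals for the new population.
    keep = [max(rng.choice(len(pool), 2, replace=False), key=lambda k: f[k])
            for _ in range(P)]
    return pool[keep]

pop = rng.uniform(-1, 1, size=(50, 67))   # 67 = C_len from Step 4
pop = one_generation(pop)
```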
Case Study Analysis
This paper selects data from the Australian Solar Photovoltaic Research
and Development Center, covering the period from 0:00 on January 1,
2018, to 0:00 on December 30, 2018. The dataset consists of
photovoltaic output and various climatic factors, with a sample point
every 5 minutes throughout the year.
The number of neurons in the hidden layer exerts a significant
influence on network performance, making the selection of an
appropriate number of nodes imperative. This study identifies the
optimal number of nodes using an empirical formula and evaluates the
candidates by the test error of the ELMAN network. The empirical
formula is specified as follows:
$n_{hid} = \sqrt{m + h} + a$
where $m$ is the number of nodes in the input layer; $h$ is the number
of nodes in the output layer; and $a$ is an integer in $[1, 10]$. The
prediction error of ELMAN neural networks with different numbers of
hidden neurons is shown in Figure 4, and the optimal number of
hidden-layer neurons is determined to be $n_{hid} = 11$.
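A sketch of this sizing procedure follows; the input/output node counts and the test_error() stub are illustrative assumptions, and only the candidate range from the empirical formula comes from the text.

```python
import math

# Enumerate the candidate sizes n_hid = sqrt(m + h) + a for integer
# a in [1, 10], then keep the size with the lowest test error.
def test_error(n_hid):
    # Stand-in for retraining the ELMAN network at each size (assumption).
    return abs(n_hid - 11) * 0.01 + 0.05

m, h = 4, 1                                 # input and output nodes (assumed)
candidates = [round(math.sqrt(m + h)) + a for a in range(1, 11)]
best = min(candidates, key=test_error)
print("optimal hidden-layer size:", best)
```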