Fig. 3 Model schematic diagram
Taking January in winter as an example, the training set for Network 1 consists of the first 8056 samples of the three most significant historical climatic factors together with the historical PV output data; its testing set is the last 20% of the training set. The training-set inputs for Network 2 mirror the testing-set inputs of Network 1, and its training targets are the PV outputs predicted by Network 1 minus the actual PV output values; its testing set is the last 16% of its training set, and its outputs are the advance prediction errors of the PV output predicted by Network 3. The training set for Network 3 is the same as the testing set for Network 1, and its testing set is the last 16% of its training set. The final model output is the PV output predicted by Network 3 for a given day minus the advance prediction error from Network 2, which improves the prediction accuracy of the model.
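The following is a minimal sketch of this three-network pipeline, assuming NumPy arrays and using scikit-learn's MLPRegressor as a stand-in for the ELMAN networks (the paper trains NGA-optimised ELMAN networks instead); `fit_net` and `three_network_forecast` are hypothetical names, while the 8056-sample, 20%, and subtraction conventions follow the description above:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor  # stand-in for the ELMAN networks


def fit_net(X, y):
    """Stand-in trainer; the paper uses NGA-optimised ELMAN networks here."""
    return MLPRegressor(hidden_layer_sizes=(11,), max_iter=2000, random_state=0).fit(X, y)


def three_network_forecast(X, y, n_train=8056):
    """X: historical climatic factors (top 3 per season); y: historical PV output."""
    # Network 1: trained on the first n_train samples, factors -> PV output.
    X1, y1 = X[:n_train], y[:n_train]
    net1 = fit_net(X1, y1)

    # Network 1's testing set: the last 20% of its training set.
    k1 = int(0.8 * n_train)
    X1_test, y1_test = X1[k1:], y1[k1:]

    # Network 2: inputs mirror Network 1's test inputs; targets are
    # Network 1's prediction errors (predicted minus actual PV output).
    err = net1.predict(X1_test) - y1_test
    net2 = fit_net(X1_test, err)

    # Network 3: trained on the same samples as Network 1's testing set.
    net3 = fit_net(X1_test, y1_test)

    # Final forecast: Network 3's PV prediction minus Network 2's
    # predicted advance error.
    def forecast(X_new):
        return net3.predict(X_new) - net2.predict(X_new)

    return forecast
```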
The NGA-ELMAN approach to predicting photovoltaic (PV) output represents the network's weights and thresholds as individuals in a genetic algorithm and optimizes the ELMAN neural network by continuously searching for and updating the optimal individual through evolutionary operations and niche technology, thereby reducing the probability of falling into a local optimum. The implementation steps are as follows:
Step 1: Identify the main climatic factors affecting PV output in each season based on Pearson correlation analysis.
Step 2: Normalize the data and divide it into training and testing sets.
Step 3: Determine the structure of the ELMAN neural network and the parameters of the algorithm.
Step 4: Initialize the population using floating-point encoding, where the length of each individual is the sum of the total number of weights and thresholds in the network, given by the formula $C_{\mathrm{len}} = I_{\mathrm{in}} H_{\mathrm{hid}} + H_{\mathrm{hid}} + H_{\mathrm{hid}} H_{\mathrm{out}} + H_{\mathrm{out}}$. Here, $I_{\mathrm{in}}$ represents the number of neurons in the input layer, $H_{\mathrm{hid}}$ the number of neurons in the hidden layer, and $H_{\mathrm{out}}$ the number of neurons in the output layer.
Step 5: Set the fitness function as the mean absolute error between the network's predicted values and the actual values. The mathematical expression is $F = \varphi\left(\frac{1}{m}\sum_{i=1}^{m}\left|y_i - y_t\right|\right)$, where m is the number of nodes in the network's output layer, $y_i$ is the predicted output value of the network, $y_t$ is the actual value, and $\varphi$ represents a threshold function.
Step 6: Calculate the fitness of all individuals on the training data, and use an elitist retention strategy to set aside the M individuals with the best fitness. Perform selection on the population using the roulette wheel algorithm, where the probability of individual i being retained for the next generation is $p_i = f_i \big/ \sum_{j=1}^{N} f_j$, with $f_i$ the fitness of individual i and N the population size. Apply crossover and mutation operations to the selected individuals to generate the offspring population, and calculate the fitness of the offspring. Add the M retained individuals to the offspring population and adjust individual fitness using the niche technique. Use a tournament selection mechanism to select the top P individuals to form the new population. Repeat the above steps until the convergence accuracy is satisfied or the maximum number of iterations is reached, yielding the optimal individual.
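A minimal sketch of Steps 4 through 6 follows, assuming a single-hidden-layer forward pass (the ELMAN context-unit weights are omitted, since only the terms counted in $C_{\mathrm{len}}$ are encoded); the arithmetic-crossover and Gaussian-mutation operators, the niche distance `sigma` and `penalty`, and the rank-based survivor selection standing in for the tournament step are illustrative choices, not the paper's exact operators:

```python
import numpy as np

rng = np.random.default_rng(0)

# Network shape (illustrative; the paper selects H_hid = 11).
I_in, H_hid, H_out = 3, 11, 1
C_len = I_in * H_hid + H_hid + H_hid * H_out + H_out  # individual length (Step 4)


def decode_and_predict(ind, X):
    """Decode a floating-point individual into weights/thresholds and run a forward pass."""
    i = 0
    W1 = ind[i:i + I_in * H_hid].reshape(I_in, H_hid); i += I_in * H_hid
    b1 = ind[i:i + H_hid];                              i += H_hid
    W2 = ind[i:i + H_hid * H_out].reshape(H_hid, H_out); i += H_hid * H_out
    b2 = ind[i:i + H_out]
    return np.tanh(X @ W1 + b1) @ W2 + b2


def fitness(ind, X, y):
    """Step 5: mean absolute error between predicted and actual values."""
    return np.mean(np.abs(decode_and_predict(ind, X).ravel() - y))


def niche_adjust(pop, errs, sigma=1.0, penalty=1e3):
    """Niche technique: penalise the worse of any two individuals closer than sigma."""
    errs = errs.copy()
    for i in range(len(pop)):
        for j in range(i + 1, len(pop)):
            if np.linalg.norm(pop[i] - pop[j]) < sigma:
                errs[i if errs[i] > errs[j] else j] += penalty
    return errs


def nga_optimise(X, y, pop_size=40, M=4, gens=100, pc=0.7, pm=0.05):
    pop = rng.uniform(-1, 1, (pop_size, C_len))
    for _ in range(gens):
        errs = np.array([fitness(ind, X, y) for ind in pop])
        elite = pop[np.argsort(errs)[:M]].copy()   # elitist retention of M best
        f = 1.0 / (errs + 1e-12)                   # lower error -> higher fitness
        p = f / f.sum()                            # roulette probabilities p_i
        parents = pop[rng.choice(pop_size, pop_size, p=p)]
        kids = parents.copy()
        for k in range(0, pop_size - 1, 2):        # arithmetic crossover
            if rng.random() < pc:
                a = rng.random()
                kids[k], kids[k + 1] = (a * parents[k] + (1 - a) * parents[k + 1],
                                        a * parents[k + 1] + (1 - a) * parents[k])
        mask = rng.random(kids.shape) < pm         # Gaussian mutation
        kids[mask] += rng.normal(0, 0.1, mask.sum())
        merged = np.vstack([kids, elite])          # add retained elites
        merrs = niche_adjust(merged, np.array([fitness(ind, X, y) for ind in merged]))
        pop = merged[np.argsort(merrs)[:pop_size]] # keep the top P = pop_size
    errs = np.array([fitness(ind, X, y) for ind in pop])
    return pop[np.argmin(errs)]                    # optimal individual
```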
Case Study Analysis
This paper selects data from the Australian Solar Photovoltaic Research and Development Center, covering the period from 0:00 on January 1, 2018, to 0:00 on December 30, 2018. The dataset consists of PV output and various climatic factors, sampled every 5 minutes throughout the year.
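Step 1 applies Pearson correlation analysis to this dataset season by season. A minimal sketch, assuming a pandas DataFrame with a datetime index, a `pv_output` column (hypothetical name), and one numeric column per climatic factor; the month-to-season mapping shown is illustrative and should follow the paper's own season definitions:

```python
import pandas as pd


def top_factors_per_season(df: pd.DataFrame, k: int = 3) -> dict:
    """Rank climatic factors by |Pearson correlation| with PV output, per season."""
    month_to_season = {12: 'winter', 1: 'winter', 2: 'winter',
                       3: 'spring', 4: 'spring', 5: 'spring',
                       6: 'summer', 7: 'summer', 8: 'summer',
                       9: 'autumn', 10: 'autumn', 11: 'autumn'}
    result = {}
    for season, part in df.groupby(df.index.month.map(month_to_season)):
        corr = part.corr(method='pearson')['pv_output'].drop('pv_output')
        result[season] = corr.abs().sort_values(ascending=False).head(k)
    return result
```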
The number of neurons in the hidden layer has a significant influence on network performance, so an appropriate number of nodes must be selected. This study determines the candidate numbers of nodes from an empirical formula and evaluates them by the test error of the ELMAN network. The empirical formula is
$n_{\mathrm{hid}} = \sqrt{m + h} + a$
where m is the number of nodes in the input layer, h is the number of nodes in the output layer, and a is an integer in [1, 10]. The prediction errors of ELMAN networks with different numbers of hidden neurons are shown in Figure 4, from which the optimal number of hidden neurons is determined to be $n_{\mathrm{hid}} = 11$.
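As a small illustration of this node-selection procedure, assuming m = 3 inputs (the three climatic factors) and h = 1 output, the formula yields candidate sizes 3 through 12; `test_mae` is a hypothetical evaluation helper, not part of the paper:

```python
import numpy as np


def candidate_hidden_sizes(m: int, h: int) -> list:
    """Candidates from the empirical formula n_hid = sqrt(m + h) + a, a in [1, 10]."""
    base = int(round(np.sqrt(m + h)))
    return [base + a for a in range(1, 11)]


# Illustrative selection: train one ELMAN network per candidate size
# (fit_net from the earlier sketch could stand in) and keep the size
# with the lowest test error:
# errors = {n: test_mae(n) for n in candidate_hidden_sizes(m=3, h=1)}
# n_hid = min(errors, key=errors.get)   # the paper obtains n_hid = 11
```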