A window size of 5 s and a response size of 20 s would satisfy the set conditions, but even though the distillation column is very responsive, 5 s might be too short to capture the effects of changes in the control variables. Hence, a window size of 10 s and a response size of 20 s are chosen for the further training and optimization of the models.
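The windowing described above can be sketched as follows. This is a minimal illustration, not the authors' code: the sampling interval `dt` and the placeholder signal are assumptions, and the real feature set would contain the measured control and process variables rather than a single series.

```python
import numpy as np

def make_windows(series, window_s=10, response_s=20, dt=1.0):
    """Build (X, y) pairs: each input is a window_s-second slice of the
    series; each target is the value response_s seconds after the window
    ends. dt is the assumed sampling interval in seconds."""
    w = int(window_s / dt)          # samples per input window
    h = int(response_s / dt)        # forecast horizon in samples
    X, y = [], []
    for start in range(len(series) - w - h + 1):
        X.append(series[start:start + w])
        y.append(series[start + w + h - 1])
    return np.asarray(X), np.asarray(y)

# Placeholder signal at 1 Hz, standing in for the pressure drop measurement.
series = np.arange(100, dtype=float)
X, y = make_windows(series)
print(X.shape, y.shape)
```

With a 10 s window and a 20 s response at 1 Hz, each training sample thus pairs 10 consecutive measurements with the value observed 20 s after the window closes.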
The following ML methods are compared for the pressure drop forecast: linear regression, random forest, extra trees, AdaBoost, and gradient boosting regression. To achieve the best performance, the number of estimators (decision trees) and the maximum depth of the decision trees were optimized for the bagging and boosting regressors in a k-fold cross-validated grid search (k = 5) on the training data. The performance of every model is then compared in Table 4 based on the test data set and the chosen metrics. In addition, AdaBoost and the extra trees regressor are combined via a voting regressor into a further model, as their training times are comparatively low. Because the two algorithms work on different principles, combining them can compensate for the weaknesses of the individual models. Training times were measured on an Intel Core i5-6600K CPU overclocked to 4.5 GHz.
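The model selection procedure described above can be sketched with scikit-learn. The synthetic data, the train/test split, and the parameter grid values are assumptions for illustration; the text does not specify the grids actually searched.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (AdaBoostRegressor, ExtraTreesRegressor,
                              GradientBoostingRegressor, VotingRegressor)
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the windowed pressure-drop features; the real
# training data is not given in the text.
X, y = make_regression(n_samples=500, n_features=10, noise=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 5-fold cross-validated grid search over the number of estimators and the
# tree depth, as described for the bagging/boosting regressors.
grid = GridSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [2, 3]},
    cv=5,
)
grid.fit(X_train, y_train)

# Voting regressor combining AdaBoost and extra trees: predictions of the
# two fitted models are averaged.
voter = VotingRegressor([
    ("ada", AdaBoostRegressor(random_state=0)),
    ("et", ExtraTreesRegressor(random_state=0)),
])
voter.fit(X_train, y_train)
print(grid.best_params_, grid.score(X_test, y_test), voter.score(X_test, y_test))
```

The same `GridSearchCV` pattern applies to the other tree ensembles; only the estimator and the parameter grid change.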
Table 4: Accuracy of the pressure drop forecast for different ML algorithms.