Results:
Over the 207 days from January 21, 2020 to August 14, 2020, 5,248,242
confirmed cases were identified, with 167,110 (3.2%) deaths. For model
building, data from the first 200 days were used, comprising 4,883,646
cases and 160,104 (3.3%) deaths; the remaining 364,596 new confirmed
cases and 7,006 (1.9%) new deaths were used for validation.
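The split described above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, assuming the study window starts on January 21, 2020 and is counted inclusively; the variable names are illustrative and not taken from the study.

```python
from datetime import date

# Assumed study window, counted inclusively (both end dates included).
start = date(2020, 1, 21)
end = date(2020, 8, 14)
total_days = (end - start).days + 1

# The first 200 days are used for model building, the rest for validation.
train_days = 200
validation_days = total_days - train_days

# Case and death counts as reported in the text.
total_cases = 5_248_242
total_deaths = 167_110
validation_cases = 364_596
validation_deaths = 7_006
train_cases = total_cases - validation_cases


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


print(total_days, validation_days)                   # 207 7
print(train_cases)                                   # 4883646
print(percent(total_deaths, total_cases))            # 3.2
print(percent(validation_deaths, validation_cases))  # 1.9
```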
The ADF unit root test indicated a unit root, and visual inspection
showed an upward trend in cases, so a log transformation was applied.
Because the ADF test still showed a unit root for the log-transformed
case numbers, differencing was applied to make the series stationary.
After second-order differencing (d=2), the ADF test rejected the
presence of a unit root (p<0.05). With the series stationary, we
proceeded to model fitting.
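The effect of the log transformation followed by second-order differencing can be illustrated on a synthetic series. The sketch below is written in Python and uses made-up values, not the study data; it does not perform the ADF test itself, and the helper `difference` is a hypothetical name introduced for illustration.

```python
import math


def difference(series, order=1):
    """Apply first differencing `order` times and return the new list."""
    result = list(series)
    for _ in range(order):
        result = [later - earlier for earlier, later in zip(result, result[1:])]
    return result


# Synthetic counts (NOT study data): the log of the series is quadratic in t.
synthetic_cases = [math.exp(3.0 + 0.1 * t + 0.01 * t * t) for t in range(10)]

log_cases = [math.log(c) for c in synthetic_cases]
first_diff = difference(log_cases, order=1)   # still trending upward
second_diff = difference(log_cases, order=2)  # approximately constant

print([round(x, 3) for x in first_diff])
print([round(x, 3) for x in second_diff])
```

In this toy example the first differences still increase over time, while the second differences are flat, which is the behaviour that differencing of order two is meant to achieve.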
The “auto.arima()” function was used to find the best-fitting model,
which had an auto-regressive (AR) component of order five (p=5), a
moving average (MA) component of order one (q=1) and differencing of
order two (d=2), as found previously. The proposed model was therefore
ARIMA(5,2,1). The AR coefficients were -0.3173, -0.0205, -0.1031,
-0.1991 and -0.2060, the MA(1) coefficient was -0.6376, and the model’s
AIC was -643.54.
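To show how the reported coefficients enter a forecast, the following Python sketch computes a single one-step-ahead prediction from an ARIMA(5,2,1) model on log-transformed counts. It assumes the sign convention used by R's arima (MA term added to the AR terms), no constant term, and a last residual of zero unless one is supplied; the function name `one_step_forecast` and the input series are hypothetical. It is not the procedure used in the study, which relied on the “forecast” function.

```python
import math

# Coefficients as reported in the text.
AR_COEFFS = [-0.3173, -0.0205, -0.1031, -0.1991, -0.2060]
MA_COEFF = -0.6376


def one_step_forecast(log_cases, last_residual=0.0):
    """One-step-ahead forecast on the original scale.

    log_cases: log-transformed counts, at least 7 values.
    last_residual: most recent model residual (assumed 0 if unknown).
    """
    if len(log_cases) < 7:
        raise ValueError("need at least 7 observations")
    # Second differences w_t = y_t - 2*y_{t-1} + y_{t-2} of the log series.
    w = [log_cases[i] - 2 * log_cases[i - 1] + log_cases[i - 2]
         for i in range(2, len(log_cases))]
    # ARMA(5,1) prediction for the next second difference.
    w_hat = sum(phi * w[-lag] for lag, phi in enumerate(AR_COEFFS, start=1))
    w_hat += MA_COEFF * last_residual
    # Undo the two differences, then undo the log transformation.
    next_log = w_hat + 2 * log_cases[-1] - log_cases[-2]
    return math.exp(next_log)


# Hypothetical input (NOT study data): steady 10% log-growth per day.
example_logs = [math.log(100) + 0.1 * t for t in range(10)]
print(round(one_step_forecast(example_logs), 2))
```

For a series whose log grows linearly, the second differences are zero, so the sketch simply continues the existing growth rate; the AR and MA terms only matter when the growth rate has been changing.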
We used the fitted ARIMA model to forecast the number of cases from
August 8 to August 14, 2020 using the “forecast” function (Table 1).
Comparing the actual and predicted numbers of cases, the mean
percentage error of the forecasts was 0.09%.
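The mean percentage error can be computed as the average of the signed relative errors. The sketch below, in Python, assumes the common definition MPE = mean((actual - predicted) / actual) × 100; the exact definition used in the study is not stated here, and the example values are hypothetical placeholders rather than the figures from Table 1.

```python
def mean_percentage_error(actual, predicted):
    """Mean of signed relative errors, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [(a - p) / a for a, p in zip(actual, predicted)]
    return 100 * sum(errors) / len(errors)


def mean_absolute_percentage_error(actual, predicted):
    """Mean of absolute relative errors, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(a - p) / a for a, p in zip(actual, predicted)]
    return 100 * sum(errors) / len(errors)


# Hypothetical values (NOT from Table 1).
actual_cases = [100, 200]
predicted_cases = [90, 210]

print(mean_percentage_error(actual_cases, predicted_cases))           # 2.5
print(mean_absolute_percentage_error(actual_cases, predicted_cases))  # 7.5
```

Because positive and negative errors cancel in the mean percentage error, a small value indicates low bias rather than small individual errors; the absolute variant is shown alongside it for comparison.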