Case definition and desired accuracy

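    The user should provide a case definition of what constitutes the minimum expected rise in cases that, if present, should be detected. A case definition can be a rise in cases between two consecutive weeks that exceeds a threshold \(r\) (Eq. 4):
\[\frac{\overline{\mu}_{t:t-7}}{\overline{\mu}_{t:t+7}}\le r\]
with \(0\le r\le1\), where \(\overline{\mu}_{t:t-7}\) and \(\overline{\mu}_{t:t+7}\) are the mean numbers of cases over the week up to and the week following time point \(t\), respectively. Because the earlier week is in the numerator, a rise in cases gives a ratio below 1, and smaller values of \(r\) correspond to a larger required rise.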
    The accuracy of \(EVI\), given the specified case definition, depends on \(m\) and \(c\), which should be selected to achieve a desired accuracy target. Several strategies are available. One is to select the \(m\) and \(c\) values that simultaneously maximize the sensitivity \(\left(Se\right)\) and the specificity \(\left(Sp\right)\) of \(EVI\), that is, maximize the Youden index \(\left(J=Se+Sp-1\right)\)\cite{Fluss_2005}, and thus minimize the overall false results (i.e., both false positive and false negative early warnings). Another is to select \(m\) and \(c\) such that the highest \(Se\) (or \(Sp\)) is achieved while \(Sp\) (or \(Se\)) equals 1 or does not drop below a critical value (e.g., 95\%). More advanced receiver operating characteristic (ROC) curve analysis can also be performed \cite{Zweig_1993}, and critical values can be selected using indices that quantify the relative cost of false positive warnings (i.e., falsely predicting an upcoming epidemic wave) to false negative warnings (i.e., failing to predict an upcoming epidemic wave), such as the misclassification cost term \(\left(MCT\right)\).
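    A minimal Python sketch of the first two strategies follows. It assumes that \(Se\) and \(Sp\) have already been estimated for every candidate \(\left(m,c\right)\) combination and stored in two grids (rows indexed by \(m\), columns by \(c\)). The function names and the grid layout are illustrative assumptions, not part of the method description.

```python
import numpy as np

def select_by_youden(se_grid, sp_grid):
    """Return (row, col, J) of the grid cell with the largest Youden index."""
    se = np.asarray(se_grid, dtype=float)
    sp = np.asarray(sp_grid, dtype=float)
    j = se + sp - 1.0                      # Youden index J = Se + Sp - 1
    row, col = np.unravel_index(np.argmax(j), j.shape)
    return int(row), int(col), float(j[row, col])

def select_with_sp_floor(se_grid, sp_grid, sp_min=0.95):
    """Return (row, col) with the highest Se among cells where Sp >= sp_min.

    Returns None when no cell reaches the required specificity.
    """
    se = np.asarray(se_grid, dtype=float)
    sp = np.asarray(sp_grid, dtype=float)
    # Cells that violate the floor are excluded by giving them -inf.
    masked = np.where(sp >= sp_min, se, -np.inf)
    if not np.isfinite(masked).any():
        return None
    row, col = np.unravel_index(np.argmax(masked), masked.shape)
    return int(row), int(col)

# Hypothetical 2 x 2 grid of accuracy estimates (made-up numbers).
se_grid = [[0.9, 0.6], [0.8, 0.5]]
sp_grid = [[0.5, 0.99], [0.9, 0.96]]
best_youden = select_by_youden(se_grid, sp_grid)
best_floor = select_with_sp_floor(se_grid, sp_grid, sp_min=0.95)
```

    The two criteria need not agree: in this made-up grid the Youden index favors a balanced cell, whereas the specificity floor forces a more conservative choice with lower \(Se\).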

Issuance of an early warning

    Every time a new time point \(t\) is observed, the model uses all the observed cases up to \(t\) to decide whether it should issue an early warning at time point \(t\). The steps are:
  1. Observed cases up to \(t\) are analyzed for all possible values of the window size \(\left(m\in\left[1,m_{\max}\right]\right)\) and threshold \(\left(c\in\left[0,1\right]\right)\).
  2. For each \(\left(m,c\right)\) combination, \(Se_{t_{m,c}}\) and \(Sp_{t_{m,c}}\) are estimated for the predefined case definition (Eq. 4).
  3. The \(m'\) and \(c'\) that give the best combination of \(Se_{t_{m',c'}}\) and \(Sp_{t_{m',c'}}\) are selected.
  4. For \(m'\) and \(c'\), the value of \(Ind_{EVI_{t,t-1}}\) is determined at the most recent time point \(t\) and a decision is made on whether a warning signal is issued.
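
    The four steps above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the reference implementation. The warning rule is passed in as a hypothetical callable `indicator(cases, s, m, c)`. "Best" is read here as the largest Youden index, which is only one of the strategies described earlier. `status` holds the case-definition outcome only for those past time points at which it can already be evaluated.

```python
import numpy as np

def sens_spec(warnings, status):
    """Return (Se, Sp) from boolean warnings and boolean case-definition status."""
    warnings = np.asarray(warnings, dtype=bool)
    status = np.asarray(status, dtype=bool)
    n_pos = status.sum()
    n_neg = (~status).sum()
    se = (warnings & status).sum() / n_pos if n_pos > 0 else np.nan
    sp = (~warnings & ~status).sum() / n_neg if n_neg > 0 else np.nan
    return se, sp

def issue_warning(cases, status, indicator, m_values, c_values):
    """Grid-search (m, c) on past data, then evaluate the indicator at time t.

    `indicator` is a hypothetical user-supplied rule returning True/False.
    Returns (warning_at_t, m_best, c_best, se_best, sp_best).
    """
    status = np.asarray(status, dtype=bool)
    t = len(cases) - 1                      # most recent time point
    best = None
    for m in m_values:                      # step 1: all candidate windows
        if m >= len(status):
            continue                        # not enough evaluable history
        for c in c_values:                  # step 1: all candidate thresholds
            past = range(m, len(status))
            warnings = [indicator(cases, s, m, c) for s in past]
            se, sp = sens_spec(warnings, status[m:])   # step 2
            j = se + sp - 1.0               # assumption: "best" = max Youden
            if np.isnan(j):
                continue
            if best is None or j > best[0]:  # step 3: keep the best pair
                best = (j, m, c, se, sp)
    if best is None:
        raise ValueError("no (m, c) combination could be evaluated")
    _, m_best, c_best, se_best, sp_best = best
    # Step 4: apply the selected pair at the most recent time point.
    warn_now = bool(indicator(cases, t, m_best, c_best))
    return warn_now, m_best, c_best, se_best, sp_best
```

    Passing the warning rule as a callable keeps the sketch independent of how the index itself is computed. Ties in the Youden index are resolved in favor of the first combination encountered, which is an arbitrary choice.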

Accuracy, Positive and Negative Predictive Values

    Further, at each time point \(t\), the probability that a rise in future cases will be observed given that an early warning was issued, and the probability that no such rise will be observed given that no early warning was issued, can be calculated as the positive \(\left(PV_t+\right)\) and negative \(\left(PV_t-\right)\) predictive values, respectively:
\[PV_t+=P(D+\mid T+)=\frac{p_{1:t}Se_{t_{m',c'}}}{p_{1:t}Se_{t_{m',c'}}+\left(1-p_{1:t}\right)\left(1-Sp_{t_{m',c'}}\right)}\]
\[PV_t-=P(D-\mid T-)=\frac{\left(1-p_{1:t}\right)Sp_{t_{m',c'}}}{\left(1-p_{1:t}\right)Sp_{t_{m',c'}}+p_{1:t}\left(1-Se_{t_{m',c'}}\right)}\]
where \(p_{1:t}\) is the proportion of events satisfying the condition of Eq. 4 up to time point \(t\).
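
    As a worked illustration of the two formulas, the following Python sketch computes both predictive values from \(Se\), \(Sp\) and the proportion \(p_{1:t}\). The function name and the example inputs are hypothetical and are not results from the study.

```python
def predictive_values(se, sp, p):
    """Return (PV+, PV-) from sensitivity, specificity and proportion p.

    p is the proportion of time points meeting the case definition.
    A value of NaN is returned when the corresponding denominator is zero.
    """
    pos_den = p * se + (1.0 - p) * (1.0 - sp)     # P(T+)
    neg_den = (1.0 - p) * sp + p * (1.0 - se)     # P(T-)
    pv_pos = p * se / pos_den if pos_den > 0 else float("nan")
    pv_neg = (1.0 - p) * sp / neg_den if neg_den > 0 else float("nan")
    return pv_pos, pv_neg

# Made-up inputs: Se = 0.8, Sp = 0.9, 20% of time points meet the definition.
pv_pos, pv_neg = predictive_values(0.8, 0.9, 0.2)
```

    With these inputs \(PV+=0.16/0.24\approx0.67\) and \(PV-=0.72/0.76\approx0.95\). When rises are uncommon, a negative result is therefore more informative than a positive one, even with high \(Sp\).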
    Once the entire time series has been observed, the overall \(Se\) of \(EVI\) can be estimated as the fraction of the total number of occurrences for which an early warning was issued, given that the case definition (Eq. 4) holds \((P(T+\mid D+))\), divided by the total number of occurrences for which the case definition holds \((P(D+))\). Similarly, the overall \(Sp\) of \(EVI\) is calculated as the fraction of the total number of occurrences for which an early warning was not issued, given that the expected rise in cases was not observed (that is, the case definition is not true) \((P(T-\mid D-))\), divided by the total number of occurrences for which the case definition is not true \(\left(P\left(D-\right)\right)\):