Estimation of basic reproduction number of the Middle East respiratory syndrome coronavirus (MERS-CoV) during the outbreak in South Korea, 2015

Background: In South Korea, an outbreak of Middle East respiratory syndrome (MERS) occurred in 2015. It was the second largest MERS outbreak. As a result of the outbreak in South Korea, 186 infections were reported, and 36 patients died. At least 16,693 people were isolated with suspicious symptoms. This paper estimates the basic reproduction number of the MERS coronavirus (CoV), using data on the spread of MERS in South Korea.
Methods: The basic reproduction number of an epidemic is defined as the average number of secondary cases that an infected subject produces over its infectious period in a susceptible and uninfected population. To estimate the basic reproduction number of the MERS-CoV, we employ data from the 2015 South Korea MERS outbreak and the susceptible-infected-removed (SIR) model, a mathematical model that uses a set of ordinary differential equations (ODEs).
Results: We fit the model to the epidemic data of the South Korea outbreak by minimizing the sum of the squared errors to identify the model parameters. We also derive the basic reproduction number in terms of the parameters of the SIR model. We then determine the basic reproduction number of the MERS-CoV in South Korea in 2015 as 8.0977. It is worth comparing this value with the basic reproduction number of the 2014 Ebola outbreak in West Africa (Guinea, Sierra Leone, and Liberia), which had values of 1.5–2.5.
Conclusions: There was no intervention to control the infection in the early phase of the outbreak; thus, the data used here provide the best conditions to evaluate the epidemic characteristics of MERS, such as the basic reproduction number. An evaluation of the basic reproduction number using epidemic data could be problematic if there are stochastic fluctuations in the early phase of the outbreak, or if the reporting is inaccurate and the data are biased. Such problems are not relevant to this study because the data used here were precisely reported and verified by the Korea Hospital Association.


Background
The Middle East respiratory syndrome (MERS) is caused by a coronavirus (CoV), the MERS-CoV. In Saudi Arabia, the first case of the disease was reported in 2012 [1]. The first case of MERS in the Republic of Korea was identified on 20 May 2015 [2]. A significant outbreak of MERS occurred in South Korea and lasted for almost three months, from May to July 2015.
The 2015 MERS spread in South Korea is the second largest outbreak recorded to date [3]. As a result of the outbreak in South Korea, 186 infections were reported, and 36 patients died. At least 16,693 people were isolated with suspicious symptoms [4]. This paper evaluates the basic reproduction number of MERS-CoV, using data from the 2015 South Korea outbreak.
The basic reproduction number (generally denoted as R_0) of an epidemic is defined as the average number of secondary cases that an infected subject produces over its infectious period in a susceptible and uninfected population [5][6][7][8][9][10][11]. It can be used to estimate the growth rate of an infectious disease at the early stage of the outbreak, when most individuals are susceptible [12]. The basic reproduction number of an epidemic is useful for determining whether an outbreak of the disease will occur [13], and for further analyzing the epidemic properties of the disease [14].
Note that R_0 is referred to as the basic reproductive (or reproduction) number (or ratio). The term basic reproductive (or reproduction) rate is incorrect nomenclature because R_0 is a dimensionless number and does not correspond to any physical quantity that is a rate.
Based on [11], the reasons to estimate the basic reproduction number of an epidemic are summarized as follows. First, R_0 lets us evaluate the relative risk of the corresponding epidemic. In other words, we can compare the infectivity of the epidemic with that of others already familiar to us. Second, the reproduction number can be evaluated multiple times, e.g., before and after an infection control intervention. To this end, we need to distinguish the reproduction number after the control intervention from the basic reproduction number R_0, which is estimated before the intervention. We refer to the reproduction number after the control intervention as the effective reproduction number, denoted by R_eff. We can then compare R_eff with R_0 and evaluate the efficacy of a control measure quantitatively based on R_eff. By doing so, we can eventually determine how to apply control interventions to reduce R_eff to less than one. If R_eff < 1, we conclude that the control intervention works effectively and that the outbreak will eventually be controlled, because the reproduction number has been reduced below the threshold level of 1.
In this paper, to estimate the basic reproduction number of the MERS-CoV, we employ data from the 2015 South Korea MERS outbreak and the susceptible-infected-removed (SIR) model [13,15,16], a mathematical model that uses a set of ordinary differential equations (ODEs). Because the availability of epidemic data is limited, non-structured deterministic models are usually employed to evaluate R_0 [11].
Based on the data reported from the 2015 South Korea outbreak of MERS, we evaluate the basic reproduction number of the virus, MERS-CoV. We fit the model to epidemic data from the South Korea outbreak, and identify the model parameters and the basic reproduction number, R_0. Note that other epidemiological parameters, such as the incubation period and serial interval, have been discussed in [17] for the outbreak.
Preliminary work relating to this paper is presented in [18]. This paper includes an analysis of the derivation of the basic reproduction number, careful screening of the reported data, and a sophisticated approach using the sum of the squared errors to evaluate the basic reproduction number precisely.
A number of papers and books have been dedicated to the study of R_0 for other infectious diseases. A few of them are as follows: see [19] for the R_0 of severe acute respiratory syndrome (SARS), [20] for the R_0 of influenza, [21,22] for the R_0 of Ebola, and [23] for the R_0 of malaria.
Although some studies of the R_0 of the MERS-CoV have been reported, such as [12,24], the MERS outbreak in South Korea is unique [25]. MERS spread almost naturally without any intervention in the early stage, and the Korean government did not respond appropriately [3,25]. The list of medical facilities involved was not even announced to the public. Ironically, this is the ideal condition for fitting a mathematical model to the clinical epidemic data and evaluating the epidemic properties of the MERS-CoV, including the basic reproduction number.
The paper is organized as follows. In the "Methods" section, a mathematical model comprising a set of ordinary differential equations is introduced, and an estimation method for the basic reproduction number is discussed. The "Results" section describes the evaluation result and provides further discussion. The "Conclusions" section concludes the paper by suggesting additional research.

Methods
In this section, we briefly discuss the method employed in this paper, including a definition of the basic reproduction number, and we introduce the SIR model.

Basic reproduction number
The basic reproduction number is defined as the number of secondary cases that one infected primary subject causes on average in an uninfected and totally susceptible population, over the infectious period [8][9][10]. Based on this definition, we obtain a mathematical description of R_0 via the so-called 'survival function' [7,11].
Considering a large population, this description is given by

$$R_0 = \int_0^{\infty} b(a)\, F(a)\, da, \tag{1}$$

where F(a) is the survival function, i.e., the probability that a newly infected patient remains infectious for at least time a, and b(a) is the average number of infected subjects that one infected patient produces per unit time at time a [7,11]. The notation of (1) follows the usage therein. Formula (1) is derived from the definition of R_0 and can thus be used for any mathematical model, not just models given by ODEs. However, it requires explicit expressions for F(·) and b(·), which are functions of time. This paper employs the SIR model described by ODEs, which is introduced in the next section. Because the model does not provide explicit descriptions for F(·) and b(·), we use an alternative expression for R_0, derived from the SIR model.
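As an illustration of how the survival-function description of R_0 can be evaluated, consider a special case that is assumed here purely for illustration and is not a step taken in the paper: an infected subject leaves the infectious class at a constant rate ν and produces new infections at a constant rate b_0 per unit time (b_0 is a hypothetical symbol introduced for this sketch). The survival function is then exponential, and the integral has a closed form:

```latex
\begin{align*}
F(a) &= e^{-\nu a}, \qquad b(a) = b_0, \\
R_0 &= \int_0^{\infty} b(a)\,F(a)\,da
     = b_0 \int_0^{\infty} e^{-\nu a}\,da
     = \frac{b_0}{\nu}.
\end{align*}
```

Under these assumptions, R_0 is the infection rate per unit time multiplied by the mean infectious duration 1/ν.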

SIR model
To investigate the spread dynamics of epidemics, several nonlinear mathematical models have been studied. We employ one of these models, the SIR model, which is described by

$$
\begin{aligned}
\dot{S}(t) &= -\beta S(t) I(t), \\
\dot{I}(t) &= \beta S(t) I(t) - \nu I(t), \\
\dot{R}(t) &= \nu I(t),
\end{aligned}
\tag{2}
$$

where the states S, I, and R correspond to the number of susceptible people, the number of infected, and the number of removed, respectively. Note that state R includes both deceased and recovered patients. For each subject group of S, I, and R, we assume that the properties of the subjects (for example, infectiveness, susceptibility, and so forth) are homogeneous. The parameter β is the disease transmission rate and the parameter ν is the removal rate. Note that both parameters are positive real numbers.
The notation of (2) follows that used in [16]. Table 1 presents descriptions of system parameters and state variables for model (2). Parameters κ and τ are discussed further in the "Results" section. For a further explanation of the SIR model, see [13,15,16].
Any term related to birth and death in the population that is not caused by MERS is not included in the model (2). The dynamics of the disease (e.g., infection or recovery) is assumed to be significantly faster than that of birth and death in the population. Generally, epidemic models such as SIR do not include birth and death because zero net change of the population is assumed. If we model an infectious disease with comparatively slow dynamics (e.g., an endemic disease), we must consider dynamic terms describing birth and death.
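The SIR dynamics described above can be sketched numerically as follows. This is a minimal illustration, assuming the standard mass-action form dS/dt = -βSI, dI/dt = βSI - νI, dR/dt = νI and a hand-written fourth-order Runge-Kutta integrator. The population size, parameter values, and function names are hypothetical and are not taken from the paper.

```python
def sir_rhs(state, beta, nu):
    """Right-hand side of the SIR model: returns (dS/dt, dI/dt, dR/dt)."""
    s, i, r = state
    new_infections = beta * s * i
    removals = nu * i
    return (-new_infections, new_infections - removals, removals)


def rk4_step(state, dt, beta, nu):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * k for b, k in zip(base, slope))

    k1 = sir_rhs(state, beta, nu)
    k2 = sir_rhs(shifted(state, k1, dt / 2), beta, nu)
    k3 = sir_rhs(shifted(state, k2, dt / 2), beta, nu)
    k4 = sir_rhs(shifted(state, k3, dt), beta, nu)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate_sir(s0, i0, r0, beta, nu, days, steps_per_day=100):
    """Return the list of daily states (S, I, R) for day 0..days."""
    state = (float(s0), float(i0), float(r0))
    dt = 1.0 / steps_per_day
    daily = [state]
    for _ in range(days):
        for _ in range(steps_per_day):
            state = rk4_step(state, dt, beta, nu)
        daily.append(state)
    return daily


# Hypothetical illustration values, NOT estimates from the paper.
N = 1000.0
trajectory = simulate_sir(s0=N - 1, i0=1, r0=0, beta=0.5 / N, nu=0.1, days=60)
s_end, i_end, r_end = trajectory[-1]
print(f"day 60: S={s_end:.1f}, I={i_end:.1f}, R={r_end:.1f}")
```

Because the model has no birth or death terms, the sum S + I + R stays constant along the simulated trajectory, which is a convenient sanity check on any implementation.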
The system parameters, which are all rates, are positive and real in the model (2). Because of the definition of R_0, an alternative description for R_0 can be derived from model (2):

$$R_0 = \kappa \tau d,$$

where d is the infection duration, κ is the number of transmittable contacts made by one infected subject per unit time, and τ is the transmissibility of the infectious disease, i.e., the probability of infection per contact between an infected patient and a susceptible individual. Notably, under the assumption of constant rates for the SIR model (2), the removal rate is the reciprocal of the infection duration:

$$\nu = \frac{1}{d}.$$

Results
Table 2 presents the history of the MERS-CoV spread. The data were officially announced by the Ministry of Health and Welfare of Korea. "Infected" represents the cumulative number of infected patients. "Deceased" is the number of patients who died. "Recovered" is the number of individuals who returned to healthy status. All entries in the table are as of the "Date". The number of infected patients includes both deceased and recovered patients. Table 2 shows the MERS-CoV spread data for only the initial phase of the outbreak, i.e., from 20 May 2015 to 12 June 2015.
On 7 June 2015, the South Korean government disclosed to the public the list of all hospitals exposed to MERS-CoV, with the dates and duration of exposure [4]. This is the first intervention of the government to control the spread. Before this date, there was no control action that could affect estimation of the basic reproduction number of MERS-CoV.
Chang BioMed Eng OnLine (2017) 16:79
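The incubation period of MERS-CoV ranges from 2 to 14 days and is 5 days on average [26]. Infections acquired before the disclosure on 7 June 2015 could therefore still appear as new cases about 5 days later. Thus, we use reported data up to 12 June 2015.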
The total population size is denoted by N(t):

$$N(t) = S(t) + I(t) + R(t).$$

Summing the three equations of the SIR model (2) shows that

$$\dot{N}(t) = 0,$$

i.e., the population size is constant, N(t) = N. Then, we can describe the SIR model (2) as a two-dimensional model. Considering the magnitude of the numbers in Table 2, it is assumed that

$$S(t) \approx N. \tag{4}$$

Compared to the total size of the population, the number of outbreak cases is small. The population N of South Korea is known to be over 51 million. If the number of outbreak cases is much smaller than the total size of the population, N does not need to be exact to estimate the system parameters of the SIR model (2) [21,22]. From [16],

$$\beta = \frac{\kappa \tau}{N},$$

where τ is the transmissibility of the infectious disease and κ is the number of transmittable contacts made by one infected subject per unit time. Then, the model whose dimension is reduced by assumption (4) is

$$
\begin{aligned}
\dot{I}(t) &= \kappa \tau I(t) - \nu I(t), \\
\dot{R}(t) &= \nu I(t).
\end{aligned}
\tag{5}
$$

The initial state for the SIR model is provided by the first row (k = 0, i.e., 20 May 2015) of Table 2:

$$I(0) = I_T(0), \qquad R(0) = R_T(0), \tag{6}$$

where I_T(0) and R_T(0) denote the numbers of currently infected and removed patients on that date. Now, we only need to know the parameter pair (κτ, ν) to solve model (5) numerically.
To solve the mathematical model and to evaluate its basic reproduction number, we do not need to know κ and τ individually; the value of their product, κτ, suffices.
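To illustrate why only the pair (κτ, ν) is needed, the sketch below assumes that the reduced model takes the linear form dI/dt = (κτ - ν)I, dR/dt = νI, which is what the SIR equations give when βS is replaced by κτ. A linear system of this kind has a closed-form solution, so no numerical integrator is required for the illustration. The function name, parameter values, and initial state are hypothetical.

```python
import math


def reduced_model(t, kappa_tau, nu, i_init, r_init):
    """Closed-form (I(t), R(t)) of dI/dt = (kappa_tau - nu) I, dR/dt = nu I."""
    growth = kappa_tau - nu
    if abs(growth) < 1e-12:
        # Degenerate case: I stays constant and R grows linearly.
        return i_init, r_init + nu * i_init * t
    i_t = i_init * math.exp(growth * t)
    r_t = r_init + nu * i_init * (math.exp(growth * t) - 1.0) / growth
    return i_t, r_t


# Hypothetical parameter pair and initial state, for illustration only.
kappa_tau, nu = 0.2, 0.05
for day in (0, 7, 14, 21):
    i_t, r_t = reduced_model(day, kappa_tau, nu, i_init=1.0, r_init=0.0)
    print(f"day {day:2d}: I = {i_t:7.2f}, R = {r_t:6.2f}")
```

In this form, the number of infected grows exponentially whenever κτ > ν, which is the regime relevant to the early phase of an outbreak.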
We search for the parameter pair (κτ, ν) such that the response of model (5) agrees with the data in Table 2. To evaluate how closely the system response fits the data in Table 2, we employ a quantitative measure, the sum of the squared errors. Once we obtain the optimal values of the parameters with respect to this measure, we can estimate the basic reproduction number as described in the "Methods" section. We define the measure f_E(·, ·) as

$$f_E(\kappa_i \tau_i, \nu_i) = \sum_{k=0}^{n} \left[ \left( I_T(k) - I_i(k) \right)^2 + \left( R_T(k) - R_i(k) \right)^2 \right], \tag{7}$$

where n = 23 and k corresponds to the date. For example, k = 0 and k = 23 indicate the dates 20 May 2015 and 12 June 2015, respectively. I_i(·) and R_i(·) are the simulation results of model (5) with the initial condition (6) and the corresponding parameters, i.e., the function arguments (κ_i τ_i, ν_i). R_T(k) is the sum of the "Deceased" and "Recovered" values at k in Table 2. I_T(k) is the "Infected" value at k minus R_T(k). Table 3 lists I_T(k) and R_T(k) for each k, based on the data of Table 2.
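The construction of I_T(k) and R_T(k) and the sum-of-squared-errors measure can be sketched as follows. This is a minimal illustration, assuming the measure adds the squared errors of the infected and removed series with equal weight. The data rows and model outputs below are hypothetical and are NOT the entries of Table 2 or Table 3, and the function names are introduced only for this sketch.

```python
def split_table_row(infected_cumulative, deceased, recovered):
    """Return (I_T, R_T) for one date as defined in the text."""
    r_t = deceased + recovered            # removed = deceased + recovered
    i_t = infected_cumulative - r_t       # currently infected
    return i_t, r_t


def sum_squared_errors(i_data, r_data, i_model, r_model):
    """Sum over k of (I_T(k) - I_i(k))^2 + (R_T(k) - R_i(k))^2."""
    if not (len(i_data) == len(r_data) == len(i_model) == len(r_model)):
        raise ValueError("all series must cover the same dates")
    return sum(
        (it - im) ** 2 + (rt - rm) ** 2
        for it, rt, im, rm in zip(i_data, r_data, i_model, r_model)
    )


# Hypothetical rows (cumulative infected, deceased, recovered); NOT Table 2.
rows = [(2, 0, 0), (3, 0, 0), (5, 0, 1), (8, 1, 1)]
i_data, r_data = zip(*(split_table_row(*row) for row in rows))

# Hypothetical model output for the same four dates.
i_model = [2.0, 3.0, 4.5, 6.5]
r_model = [0.0, 0.2, 0.6, 1.5]
error = sum_squared_errors(i_data, r_data, i_model, r_model)
print(error)
```

A smaller value of the measure indicates a parameter pair whose simulated trajectories lie closer to the reported data.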
To compare the quantitative measures for each pair of parameters, we consider the plane, i.e., 2-dimensional space, of the parameters, (κτ, ν). We can obtain a surface in 3-dimensional space by plotting the corresponding measure as the value along the third axis.
We explore the plane for wide ranges of parameters, and one of the results is shown in Fig. 1. This figure can help show the relation between f_E and the system parameters. To effectively capture the characteristics of the measure, the ranges of κτ and ν in the figure are [0.00, 0.26] and [0.00, 0.12], respectively. The function f_E(·, ·) describes the sum of squared errors between the outbreak data of Table 2 and the simulation result of model (5) with the initial condition (6) and the function arguments. Thus, if we find the parameters minimizing the function (7), then these parameters can be considered to correspond to the case of Table 2. Figure 3 shows the data presented in Table 2 and the state trajectories of model (5) with the parameter values obtained from Fig. 2. The top panel of Fig. 3 shows infected patients, and the bottom panel shows deceased or recovered patients. In both panels, the circle marks and the solid line display patient numbers based on Table 2 and the state transition of model (5), respectively.

Table 3 I_T(k) and R_T(k) of function (7), derived from Table 2

Fig. 1 Plot of f_E(κτ, ν) of (7) on the plane of (κτ, ν). f_E(·, ·) is a quantitative measure, the sum of the squared errors, that describes how close the system response of model (5) with the corresponding parameters is to the data in Table 2
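The search over the (κτ, ν) plane can be sketched as a plain grid evaluation of the squared-error measure. This is a minimal sketch under stated assumptions, not the paper's procedure: the reduced model is taken to be dI/dt = (κτ - ν)I, dR/dt = νI, the grid step of 0.002 is hypothetical, and the "data" are synthetic, noise-free values generated from known hypothetical parameters rather than the values in Table 2. Only the parameter ranges [0.00, 0.26] and [0.00, 0.12] come from the text.

```python
import math


def simulate(kappa_tau, nu, i_init, r_init, n_days):
    """Daily closed-form solution of dI/dt = (kappa_tau - nu) I, dR/dt = nu I."""
    growth = kappa_tau - nu
    i_series, r_series = [], []
    for k in range(n_days + 1):
        # Integral of exp(growth * s) over [0, k]; equals k when growth is zero.
        integral = k if growth == 0 else math.expm1(growth * k) / growth
        i_series.append(i_init * math.exp(growth * k))
        r_series.append(r_init + nu * i_init * integral)
    return i_series, r_series


def f_e(kappa_tau, nu, i_data, r_data):
    """Sum of squared errors between the data and the simulated I and R."""
    i_sim, r_sim = simulate(kappa_tau, nu, i_data[0], r_data[0], len(i_data) - 1)
    return sum(
        (it - im) ** 2 + (rt - rm) ** 2
        for it, rt, im, rm in zip(i_data, r_data, i_sim, r_sim)
    )


def grid_search(i_data, r_data, kt_max=0.26, nu_max=0.12, step=0.002):
    """Evaluate f_e on a regular grid and return (best_error, kappa_tau, nu)."""
    best = None
    for a in range(round(kt_max / step) + 1):
        for b in range(round(nu_max / step) + 1):
            candidate = (f_e(a * step, b * step, i_data, r_data), a * step, b * step)
            if best is None or candidate[0] < best[0]:
                best = candidate
    return best


# Synthetic, noise-free "data" from known hypothetical parameters
# (NOT the values in Table 2), so the search should recover them.
true_kt, true_nu = 0.2, 0.03
i_data, r_data = simulate(true_kt, true_nu, i_init=1.0, r_init=0.0, n_days=23)
best_error, best_kt, best_nu = grid_search(i_data, r_data)
print(f"kappa*tau = {best_kt:.3f}, nu = {best_nu:.3f}, f_E = {best_error:.3g}")
```

A grid evaluation of this kind also yields the values needed to draw the error surface over the parameter plane. A finer grid or a local optimizer started from the best grid point would refine the estimate.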
We derive the basic reproduction number for the SIR model (2) as

$$R_0 = \kappa \tau d = \frac{\kappa \tau}{\nu} \tag{8}$$

in the "Methods" section, based on [16]. Equation (8) can alternatively be obtained by applying the next-generation approach [6,8,11,27] to model (5). We determine the R_0 of the MERS-CoV in South Korea in 2015 as 8.0977 (i.e., 0.21153/0.026122). It is worth comparing this value with the R_0 of the 2014 Ebola outbreak, which had values of 1.5-2.5 [21].

Fig. 3 The data in Table 2 and the state trajectories of model (5) with the estimated parameters. In the top panel, the circle marks plot the number of "Infected" patients from Table 2, and the solid line depicts the transition of state I of model (5). In the bottom panel, the circle marks indicate the sum of the numbers of "Deceased" and "Recovered" patients, and the solid line depicts the transition of state R of the model
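The arithmetic behind the reported value can be reproduced in a few lines, using the fitted values 0.21153 and 0.026122 quoted above and the relations R_0 = κτ/ν and d = 1/ν. The last quantity in the sketch is not from the paper: it assumes, purely for illustration, that an intervention scales κτ by a factor (1 - c), so that R_eff = (1 - c) R_0.

```python
# Fitted values quoted in the text for the 2015 South Korea outbreak.
kappa_tau = 0.21153   # product of contact number and transmissibility
nu = 0.026122         # removal rate

r_0 = kappa_tau / nu            # basic reproduction number, R_0 = kappa*tau / nu
duration = 1.0 / nu             # infection duration d = 1 / nu

# ASSUMPTION (not from the paper): an intervention scales kappa*tau by (1 - c),
# so that R_eff = (1 - c) * R_0. R_eff < 1 then requires c > 1 - 1 / R_0.
critical_reduction = 1.0 - 1.0 / r_0

print(f"R_0 = {r_0:.4f}")
print(f"d = {duration:.2f} time units")
print(f"reduction in kappa*tau needed for R_eff < 1: {critical_reduction:.1%}")
```

The computed ratio agrees with the reported 8.0977 up to rounding in the last digit. The time unit of d is that of the data, i.e., days, since k indexes dates.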

Conclusions
In this paper, we evaluated the basic reproduction number of the MERS-CoV outbreak that occurred in 2015 in South Korea, using officially reported data. We employed a mathematical dynamic model, the SIR model. We first fit the response of the SIR model to the epidemic curve data reported from the MERS outbreak. Then, we identified the system parameters of the model to estimate the basic reproduction number.
Because there was no intervention to control the infection in the early phase of the outbreak, the data used here provide the best conditions to evaluate the epidemic characteristics of MERS, such as the basic reproduction number. An evaluation of R_0 using epidemic data could be problematic if there are stochastic fluctuations in the early phase of the outbreak, or if the report is not accurate and there is bias in the data [11]. Such problems are not relevant to this study because the data used here were precisely reported and verified by [4].
We conclude this paper with the following discussion on future work to overcome the limitations of research, derived from assumptions in the paper.

Further research direction
• Behind the SIR model (2), there are several strong assumptions, one of which is a zero latent period, i.e., the incubation period is zero. This implies that a patient becomes infectious immediately after infection. However, MERS has a nonzero incubation period. To address this weak point, in future work we could consider the 4-dimensional SEIR (i.e., susceptible-exposed-infectious-removed) model, which has been employed in [28] to study an Ebola epidemic. The additional state in the SEIR model can help us deal with the latent period.
• In this paper, we considered the epidemic curve data in [4] only from the early stage of the 2015 MERS outbreak in South Korea, when there was no intervention to control the spread. Accordingly, we evaluated R_0 based on these data. In future work, we will also consider the epidemic data in [4] from the later (or closing) stage of the MERS spread in South Korea in 2015, so that we can estimate the effective reproduction number (i.e., R_eff), which is the reproduction number resulting from interventions such as education, quarantine, and the tracing of contacts of infected patients. By doing so, we can evaluate the effectiveness of each control measure on the spread of the infectious disease [29]. Eventually, such evaluation could help us improve public health policy.
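For reference, a common textbook form of the SEIR model mentioned in the first item can be written as below. This is a sketch under assumptions rather than the model of [28]: it reuses the transmission rate β and removal rate ν of the SIR model, omits birth and death terms, and introduces a hypothetical rate σ at which exposed subjects become infectious, so that 1/σ is the mean latent period.

```latex
\begin{align*}
\dot{S}(t) &= -\beta S(t) I(t), \\
\dot{E}(t) &= \beta S(t) I(t) - \sigma E(t), \\
\dot{I}(t) &= \sigma E(t) - \nu I(t), \\
\dot{R}(t) &= \nu I(t).
\end{align*}
```

The extra state E holds subjects who are infected but not yet infectious, which is how the model accommodates a nonzero latent period.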