SIR Model
Josue A Ruiz
University at Albany, The State University of New York
All content following this page was uploaded by Josue A Ruiz on 17 April 2021.
Abstract—In this work, we implement a compartmental SIR model for the ongoing COVID-19 pandemic that takes the birth rate and the natural mortality rate into account. Important results obtained from the model, such as the basic reproduction number R0, indicate how fast the disease would spread in a specific population. This value provides a foundation for the infection function used to predict short- and long-term epidemic behavior, which in turn helps us find when the maximum number of infections per day occurs. These predictions can help optimize control programs and reduce spread among the population, which is one of the most challenging problems.

I. INTRODUCTION

The dynamics of infectious diseases were first studied mathematically in 1760 by Daniel Bernoulli, whose work laid an epidemiological foundation for the spread mechanisms later captured by the SIR model. Epidemiology has contributed to a better understanding of the dynamic behavior of infectious diseases, how they impact the population, and possible short- and long-term spread predictions. Kermack and McKendrick [1] worked on the epidemiological compartmental model in 1927 for the pandemic that occurred in Mumbai in the late nineteenth century, which produced important findings. Since then, mathematical modeling has become an essential tool for studying the factors that influence the transmission and spread of infectious diseases. One of the most important advantages of these mathematical models is that they can be used to optimize control programs that reduce spread among the population and minimize health system overload.

The mathematical interpretation of these epidemiological models often takes the form of a coupled system of nonlinear ordinary differential equations. For various biological reasons, the actual dynamic behavior of an epidemic depends not only on its current state but also on its past history, because recovered individuals retain permanent immunity. In the modeling of infectious diseases, the incidence function is one of the important factors that determine the dynamics of epidemic models. The standard incidence rate is an increasing function of the size of the infected class and was used in the first models in the literature [2]. The dynamics can be determined by the basic reproduction number R0. This number is defined as the number of cases directly generated by one case in a population where all individuals are susceptible to infection [3]. If R0 < 1, the disease will start to vanish; otherwise, it will spread among the population. In this paper, we implement the compartmental SIR epidemiological model with vital dynamics for the ongoing COVID-19 pandemic. The purpose is to find when the peak of infections per day will occur and to interpret the parameters of the model for a particular country or territory. It is important to emphasize that this model is a simplification of the original model stated in 1927 [1].

II. RELATED WORK

Among all the epidemiological models, the SIR model without vital dynamics and with a constant population is one of the most fundamental. The population is divided into three compartments (disease states), each associated with a time-dependent function in the model: susceptible S(t), people who are not infected and not immune; infected I(t), people who have the disease and are infectious; and recovered R(t), people who no longer have the virus and are permanently immune to it. We assume that a person becomes infected only by direct contact with infectious individuals and has permanent immunity after recovery. This model can be represented as a coupled system of ordinary differential equations:

$$
\begin{aligned}
\frac{dS}{dt} &= -\frac{\beta S I}{N} \\
\frac{dI}{dt} &= \frac{\beta S I}{N} - \gamma I \\
\frac{dR}{dt} &= \gamma I
\end{aligned}
\tag{1}
$$

where the parameter β is the contact rate and γ is the recovery rate.

Researchers such as Gupta et al. [4] applied the basic model to the H1N1 outbreak that occurred in Rajasthan, India, in June 2009. They assumed that individuals mix at random and that the population size remains constant over time. They found that reducing the value of R0 decreases the proportion of the total population infected; however, the duration of the outbreak can be prolonged. Another example is the yellow fever outbreak that occurred in Angola in late 2015. Hossain et al. [5] used the same SIR model and solved the initial value problem for the ordinary differential equations numerically with a Runge-Kutta method. The data used for Angola covers a small time window of 60 days; birth and death rates are not considered. Hethcote [6] suggests the SIR model without vital dynamics might be appropriate for describing an epidemic outbreak
during a short time period, whereas the SIR model with vital dynamics would be appropriate over a longer time period. One of the main limitations of SIR without vital dynamics is that it is hard to differentiate between the number of people who have died and the number of people who have survived with permanent immunity, because both fall under the recovered class in the simple model. We propose to apply the SIR model with vital dynamics to the COVID-19 epidemic to estimate infection behavior more precisely.

III. PROBLEM

A. Schema

The following diagram describes, in general form, the process for applying the SIR model. Initially, we read the daily cumulative infected data, then smooth them using the Savitzky-Golay filter. The third box establishes the initial conditions of the system of differential equations. The model uses these conditions and generates partial trajectories of the functions S(t), I(t), R(t). The fourth box uses the Trust-Region Algorithm to fit the best curve of the model to the smoothed data recursively. Finally, we obtain the optimal parameters of the model, which describe the propagation curve of the epidemic.

Fig. 1: High level processing schema

B. Data Collection

The purpose of this work is to apply the SIR model to a real-world problem. To achieve this goal, two datasets providing daily information on ongoing coronavirus infections in the United States and other countries were used: the COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) and The New York Times' data. CSSE at Johns Hopkins University provides the number of people infected each day by country [7]. The New York Times provides similar information but exclusively for the States and Territories of the United States [8]. It is important to emphasize that the infected-population data is a daily cumulative count, whereas the curve I(t) in the model refers to the number of infected on day t.

C. SIR Model

We consider the SIR epidemic model with vital dynamics with an initial total population size N. The transitions between disease states are described as follows:

Fig. 2: SIR with vital dynamics

• Λ is the birth rate: [births/1000]/[time]
• µ is the natural mortality rate: [deaths/1000]/[time]
• β is the contact rate: 1/[time]
• γ is the recovery rate: 1/[time]

The mathematical formulation that describes the disease dynamics is given by the following coupled system of ordinary differential equations, where the initial conditions are non-negative:

$$
\begin{aligned}
\frac{dS}{dt} &= \Lambda - \mu S - \frac{\beta S I}{N} \\
\frac{dI}{dt} &= \frac{\beta S I}{N} - \gamma I - \mu I \\
\frac{dR}{dt} &= \gamma I - \mu R
\end{aligned}
\tag{2}
$$

where S(0) = N(0) − I(0), I(0) ≥ 1, R(0) ≥ 0, and the population over time is N(t) = S(t) + I(t) + R(t).

D. Parameter Setting

As previously discussed, this model has four adjustable parameters, each describing a particular behavior. Some of these can be obtained from preliminary statistical data and the demographics of each population. For example, γ can be easily obtained since it is inversely proportional to the average recovery time of the disease. Statistics suggest that recovery takes between 2 and 5 weeks for mild and severe cases [9]; therefore, we can define the domain for γ to be [1/35 days, 1/14 days]. On the other hand, the birth rate Λ is given by demographic data and is a fixed value for each population. The death rate µ is also fixed in principle, but it is left as a free parameter to be estimated from the data. Both rates are annual; consequently, they must be scaled by the fraction of a year that the integration time in days (the number of data points) represents. The contact rate β is the most difficult parameter to obtain since it cannot be measured directly. In this case, the value of the parameter is left free in (0, 1) and is obtained once the model is fitted to the infected data of the population.

E. Model Fitting

The coupled ordinary differential equations are solved numerically with the help of subroutines provided by Python libraries. The odeint routine uses numerical integration to compute function values for each of the curves S(t), I(t), and R(t) of the model with optimal parameters. The optimal parameters are calculated by adjusting the solution curve I(t) to the smoothed infected data. The adjustment is made using the Trust-Region-Reflective Least Squares Algorithm.
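The fitting procedure described here can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes SciPy's `odeint`, `savgol_filter`, and `least_squares` (with `method="trf"`) as the underlying routines, treats Λ as a fixed inflow of people per day, and assumes the observed series is already the infected curve I(t) rather than a cumulative count. The function names, initial guess, and smoothing window are hypothetical choices.

```python
import numpy as np
from scipy.integrate import odeint
from scipy.optimize import least_squares
from scipy.signal import savgol_filter


def sir_vital(y, t, beta, gamma, mu, lam):
    """Right-hand side of system (2); lam is assumed to be people per day."""
    S, I, R = y
    N = S + I + R
    new_infections = beta * S * I / N
    return [lam - mu * S - new_infections,
            new_infections - gamma * I - mu * I,
            gamma * I - mu * R]


def simulate_infected(params, t, n0, i0, lam):
    """Integrate system (2) and return the infected curve I(t)."""
    beta, gamma, mu = params
    y0 = [n0 - i0, i0, 0.0]  # S(0) = N(0) - I(0), R(0) = 0
    sol = odeint(sir_vital, y0, t, args=(beta, gamma, mu, lam))
    return sol[:, 1]


def fit_sir(t, infected, n0, lam, window=7, polyorder=3):
    """Smooth the data, then fit (beta, gamma, mu) under box constraints."""
    y_hat = savgol_filter(infected, window, polyorder)
    i0 = max(y_hat[0], 1.0)  # keep I(0) >= 1
    per_day = 1.0 / (1000.0 * 365.0)  # annual rate per 1000 -> daily per capita
    lower = [1e-6, 1.0 / 35.0, 1.6 * per_day]
    upper = [1.0, 1.0 / 14.0, 15.4 * per_day]
    x0 = [0.4, 1.0 / 21.0, 8.0 * per_day]  # arbitrary starting point (assumption)

    def residuals(params):
        # Residuals are scaled by n0 so that they are of order one.
        return (simulate_infected(params, t, n0, i0, lam) - y_hat) / n0

    return least_squares(residuals, x0, bounds=(lower, upper),
                         method="trf", x_scale="jac")
```

Calling `fit_sir(t, infected, n0, lam)` returns SciPy's result object, whose `x` attribute holds the fitted (β, γ, µ). Because µ is several orders of magnitude smaller than γ, the infected curve constrains it only weakly, so its fitted value should be read with caution.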
The fit problem is one of constrained nonlinear optimization defined as follows:

$$
\begin{aligned}
\rho_{opt} = \arg\min \; & \frac{1}{\tau} \sum_{t=1}^{\tau} \left( \hat{y}_t - f(t, [\beta, \gamma, \mu, \Lambda]) \right)^2 \\
\text{s.t.} \quad & \beta \in (0, 1) \\
& \gamma \in \left[ \frac{1}{35\ \text{days}}, \frac{1}{14\ \text{days}} \right] \\
& \mu \in \frac{365}{m} \, (\mu_{min}, \mu_{max})
\end{aligned}
\tag{3}
$$

Note that µ is a free parameter and has been defined in a certain interval. The values µmin = 1.6 deaths/1000 per year and µmax = 15.4 deaths/1000 per year span the range of mortality rates in countries around the world [10]. The value ŷt in Equation (3) represents the smoothed data. The data was smoothed using the Savitzky-Golay algorithm, which applies a local polynomial regression to equispaced points yi in one dimension to determine the new value ŷi. This smoothing preserves the characteristics of the initial distribution and helps the model fitting process [11].

Each SIR model has a basic reproduction number R0 whose form depends on the parameters the model has. In this case, the general form of R0 for SIR with vital dynamics can be described in terms of the four adjustment parameters as follows:

$$
R_0 = \frac{\Lambda \beta}{\mu (\mu + \gamma)}. \tag{4}
$$

As part of the results, the values of the average R0 are reported in order to make comparisons between the populations. This variable is monitored daily because when its value is equal to or less than one, the disease stops spreading.

IV. RESULTS

Once the optimal fit between the data and the model is found, an average R0 value can be computed. This value states how many susceptible individuals can be directly infected by one infectious person and was discussed in further detail in the previous section. Below are some results for different countries and territories of the United States.

A high value of R0 means that the average citizen of that population is infecting more people, which delays stopping the rapid spread.

State           Start Date   Peak Date    Peak ∆t (days)   R0
New York        4/3/2020     9/4/2020     36               6.52
California      24/2/2020    12/9/2020    201              32.68
Massachusetts   7/3/2020     22/4/2020    46               7.32
Florida         7/3/2020     7/12/2020    275              47.98
Alabama         14/3/2020    19/8/2020    158              26.31
Texas           24/2/2020    12/12/2020   292              53.08
Arizona         14/3/2020    3/9/2020     173              27.87
Hawaii          16/3/2020    2/4/2020     17               3.00
New Jersey      9/3/2020     15/4/2020    37               7.23
Utah            14/3/2020    19/9/2020    189              34.97

TABLE II: States comparison in the United States (Update: July 9th, 2020). Dates are given as D/M/YYYY.

Fig. 3: New York infections per day
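As a small worked illustration of the quantities reported in this section, the sketch below evaluates Equation (4) and the number of days between an outbreak's start and its peak, the Peak ∆t column of Table II. The function names are hypothetical. All four parameters passed to Equation (4) are assumed to be expressed in the same time unit.

```python
from datetime import date


def basic_reproduction_number(lam, beta, mu, gamma):
    """Equation (4): R0 = lam * beta / (mu * (mu + gamma))."""
    return lam * beta / (mu * (mu + gamma))


def days_to_peak(start, peak):
    """Number of days between the start date and the peak date."""
    return (peak - start).days


# New York row of Table II: start 4 March 2020, peak 9 April 2020.
ny_delta = days_to_peak(date(2020, 3, 4), date(2020, 4, 9))
print(ny_delta)  # 36
```

The New York dates reproduce the Peak ∆t of 36 days listed in Table II. A value of `basic_reproduction_number(...)` at or below one corresponds to the threshold at which, according to the text, the disease stops spreading.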