HAV Exercise EpiInfo (Part 2)
Case study
Outbreak with Hepatitis A virus in
Scandinavian countries 2013
Hepatitis A
Hepatitis A virus (HAV) is a single-stranded, non-enveloped RNA virus belonging to the Picornaviridae
family of viruses (the same family as poliovirus). It can infect only humans and some other primates. The
route of transmission is faecal-oral.
Symptoms of infection range from mild to severe, and can include fever, malaise, loss of appetite,
diarrhea, nausea, abdominal discomfort, dark-colored urine, and jaundice. The disease is rarely
fatal, and there is no chronic state (as with Hepatitis B and C infections). Adults have symptoms of
illness more often than children. The incubation period is long, typically three weeks (normally 14
to 28 days). After infection, lifelong immunity is gained. A vaccine providing good protection is
available.
Infections are generally transmitted via food or water that has become contaminated with HAV or
via person-to-person contact. HAV is among the most frequent causes of foodborne infections
worldwide and can cause huge outbreaks. The disease is typical for areas with poor sanitation and
is endemic in many areas in the developing world, but not in Europe. In Northern Europe, infections
are generally associated with foreign travel and outbreaks are very rare.
Diagnostic testing for HAV is performed at local laboratories in Denmark. The standard test is
serological and the patient material is a blood sample that is examined for antibodies (IgM) against
the virus. Virus typing is carried out only at the reference laboratory at SSI. This laboratory receives
material from a subset of the IgM-positive patients in Denmark. Such samples will undergo
confirmatory identification of virus RNA by PCR and further characterisation by genotyping and
subtyping. This consists of sequencing relevant regions of the viral genome (the VP1 region). If two
persons are not infected from the same source, these sequences will most likely differ.
The epidemiology and the surveillance system for HAV in Norway are not identical, but very
much comparable, to those of Denmark.
Case definition
The following case definition was developed for the outbreak:
• Probable case: A person living in Denmark or Norway with clinical illness associated with
Hepatitis A and positive for HAV IgM antibodies, no travel history outside of Western
European countries or other known HAV risk factors, and symptoms onset on or after 1
October 2012.
• Confirmed case: Probable case typed with HAV genotype 1B and having the outbreak strain
sequence.
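The case definition above can be read as a small decision rule. The following is a minimal sketch only: the field names (for instance `travel_outside_western_europe`) are hypothetical and are not the column names of the course dataset, and the handling of untyped patients is an assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Earliest symptom onset allowed by the case definition.
OUTBREAK_START = date(2012, 10, 1)


@dataclass
class Patient:
    # All field names are hypothetical, chosen for illustration only.
    country: str
    clinical_hepatitis_a: bool
    igm_positive: bool
    travel_outside_western_europe: bool
    other_hav_risk_factor: bool
    onset: date
    genotype: Optional[str] = None        # None if the virus was not typed
    has_outbreak_sequence: bool = False   # True if the sequence matches the outbreak strain


def classify(p: Patient) -> str:
    """Return 'confirmed', 'probable' or 'not a case' for one patient."""
    probable = (
        p.country in {"Denmark", "Norway"}
        and p.clinical_hepatitis_a
        and p.igm_positive
        and not p.travel_outside_western_europe
        and not p.other_hav_risk_factor
        and p.onset >= OUTBREAK_START
    )
    if not probable:
        return "not a case"
    # A confirmed case is a probable case with genotype 1B and the outbreak sequence.
    if p.genotype == "1B" and p.has_outbreak_sequence:
        return "confirmed"
    return "probable"
```

Note that a patient can only be confirmed after first meeting every criterion for a probable case, which mirrors the wording of the definition.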
Part 1: Descriptive analysis of outbreak data – using EpiInfo 7
Throughout this exercise, you should imagine that you are part of the team working on solving
an outbreak of HAV infections in Denmark.
It is March 2013. An outbreak has been recognised and an outbreak team formed. Following international
alerts, it has become clear that Norway is also experiencing an outbreak of HAV, and it is thought that
the source of infection in both countries is the same. However, the source is not known. As of 1
March, 20 Danish cases are counted as part of the outbreak, of which 10 are microbiologically
confirmed. In Norway, there are six cases, of which three are confirmed.
Most of the cases in Denmark and Norway have been interviewed by yourself or your Norwegian
colleagues. The interviews have two aims. Firstly, to establish whether notified patients were likely part of the
outbreak, by assessing time of onset, foreign travel and similar characteristics. Secondly, to generate
hypotheses about the source of the illness. This is done by asking detailed questions about food intake
and behaviour at the likely time of infection. The questions can also help clarify whether cases were infected
directly from another case (and thus were secondary cases), and so on.
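The "likely time of infection" can be estimated by counting back from the date of onset using the incubation period of 14 to 28 days given in the background section. The sketch below assumes exactly that range; the function name and the example onset date are invented for illustration.

```python
from datetime import date, timedelta
from typing import Tuple

# Incubation period as stated in the background section (normally 14 to 28 days).
MIN_INCUBATION_DAYS = 14
MAX_INCUBATION_DAYS = 28


def exposure_window(onset: date) -> Tuple[date, date]:
    """Return the (earliest, latest) likely date of infection for a given onset date."""
    earliest = onset - timedelta(days=MAX_INCUBATION_DAYS)
    latest = onset - timedelta(days=MIN_INCUBATION_DAYS)
    return earliest, latest


# Hypothetical patient with symptom onset on 1 February 2013.
start, end = exposure_window(date(2013, 2, 1))
print(start, end)  # 2013-01-04 2013-01-18
```

Food questions in the interview would then focus on the period between these two dates.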
To collect this information, a questionnaire has been developed. Patients have been interviewed over the
phone using the questionnaire, which was filled in on paper. A data entry form was made
using EpiInfo 7 and the data were entered into the computer.
On the next page you can see a shortened form of the questionnaire that was used. Also shown is a table
(divided into two) with a few rows of the dataset that resulted from the questionnaire.
Task 1.
• Look at the questionnaire and the dataset (next page).
• Make sure you understand what the column headings in the table mean, for
instance “DiseaseOnset” and “NotificationDate”.
• Make sure you understand how the data shown were generated.
Task 2.
• Which types of descriptive analysis would you suggest making?
On the course site at D2L, you will find the full dataset (“Dataset Session 2a”) in Excel format.
We will now do the descriptive analysis using EpiInfo 7. We will use the so-called “Visual
Dashboard” function. However, it is also possible to use the “Classical Analysis” format (so if
there is time, familiarise yourself with both, and use the one you prefer).
Now that you have successfully loaded the dataset, please perform the tasks listed on the
next page using the Visual Dashboard. For most of the tasks, you will simply need to right
click with the mouse and then choose one of the ‘Analysis Gadgets’ – see the figure
above.
Task 3 – Line list
• Make a line list containing basic information (i.e., case no., case, age, sex, date of onset, date of
diagnosis, country of residence, foreign travel).
• Use the line list to count the number of cases. How many are there?
Task 8 – Map
• Using the map function, make a map of where the cases live so that it is easy to see which cases
live in each country. From the main interface page of the program, press ‘Create maps’. Then go to
the ‘Add Data Layer’ menu and choose ‘Case Cluster’. Then open the Excel dataset again.
You are then asked to choose the latitude and longitude variables. The map will then be drawn, i.e.
you can now see the addresses of the cases on the map.
• Zoom in on Denmark. Redo the map stratified by sex, assigning a different color to males and
females. Using the map layers function at the lower end of the map, you can
restrict the cases shown, e.g. to only females in Denmark. Then you can add another layer, e.g. only
males in Denmark. Showing the two layers at the same time in different colors gives you the
distribution of male and female cases in Denmark.
Task 9 – Saving
• Finally, remember to save your output. You can save what you have made on the Visual Dashboard
by clicking ‘Save’ in the top blue bar. This way you can open it again with EpiInfo and continue
working with the data and the output you made. The resulting file will have the suffix .cvs7, which can be
opened in EpiInfo. You can also export the visuals as a webpage (HTML) or to Excel and Word by
right-clicking on the Visual Dashboard and choosing ‘Send output to’ and then e.g. ‘Microsoft Word’.
You can also copy and paste each element from the dashboard into a Word or PowerPoint
document if you prefer.
Part 2: Case-control study– using EpiInfo 7
Throughout this exercise, you should imagine that you are part of the team working on solving
an outbreak of HAV infections in Denmark.
Case-control study:
The cases included in the case-control study were the most recent cases from the line list.
Control persons, representing the background population, were selected from the Danish
population register. They were selected to match cases individually, meaning that they had the
same age (born within two months of the case), had the same sex and lived in the same
municipality. For every case, two controls were interviewed.
For the case-control study, information was collected using telephone interviews by filling in
questionnaires on paper. This was done from March 6 to March 14. In total, 25 cases and 50
matching controls were interviewed. A data entry form was then made and the data entered into
the computer. On the next pages you can see an extract of the questionnaire that was used for
the case-control study (the real questionnaire was longer). The table shows the first rows of the
resulting dataset.
You now wish to make an analysis of the data. We will do that using the computer. We need to
work fast, since people are getting ill. The media knows that you’re conducting the study and the
main newspapers have been calling to hear about the results, and also the food authorities are
eagerly requesting the information in order to take action if needed.
Before we begin the data analysis using the computer, we do a rough calculation for two food
items that we are particularly interested in: edamame beans and home-made
smoothies. We count the number of cases and controls who reported consuming these two
products.
We learn that 2 cases said that they likely ate edamame beans before becoming ill, while 23 said
that they didn’t or most likely didn’t. For the controls, 4 persons said that they ate edamame
beans, while 46 said that they didn’t.
Concerning home-made smoothies, 18 cases said that they likely drank home-made smoothies
before becoming ill, while 7 said that they didn’t or most likely didn’t. For the control persons, 10
said that they drank home-made smoothies and 40 said they didn’t.
Task 3
• Without using the computer, make a 2×2 table for each of these two exposures. Calculate
the crude odds ratios for both and discuss the results.
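Once you have made the hand calculation, it can be checked with a few lines of code. The sketch below uses the usual cross-product formula for a crude odds ratio, OR = (a × d) / (b × c), where a and b are the exposed and unexposed cases and c and d are the exposed and unexposed controls. The counts are those given in the text above; the function name is an illustration only and is not part of EpiInfo.

```python
def odds_ratio(exposed_cases: int, unexposed_cases: int,
               exposed_controls: int, unexposed_controls: int) -> float:
    """Crude odds ratio from a 2x2 table, using the cross-product (a*d)/(b*c)."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# Edamame beans: 2 of 25 cases and 4 of 50 controls exposed.
edamame = odds_ratio(2, 23, 4, 46)

# Home-made smoothies: 18 of 25 cases and 10 of 50 controls exposed.
smoothies = odds_ratio(18, 7, 10, 40)

print(f"Edamame OR:   {edamame:.2f}")    # 1.00
print(f"Smoothies OR: {smoothies:.2f}")  # 10.29
```

This is a crude estimate only: it ignores the matching of cases and controls and gives no confidence interval.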
A dataset ‘Dataset Session 2B’ is supplied in Excel format. You now wish to examine the dataset
and perform the odds ratio calculation for all food items. (Hint: use the function for creating a new variable to
form a ‘Food’ variable, so you don’t have to make a separate cross tabulation for each food item.) First,
you will do the odds ratio calculation without taking the matching of cases and controls into
account. This is formally wrong, but later you will do both a matched analysis and a logistic
regression.
For the present analysis, we will use the functions on the Visual Dashboard. Start by opening the
dataset, as in Part 1. For most of the tasks, you will simply need to right click with the mouse and
then choose one of the ‘Analysis Gadgets’.
A dataset ‘Dataset Session 2B one control only matched’ is supplied in Excel format. Here only
one control is retained, as EpiInfo only allows for one control per case for the paired match
analysis. You now wish to examine the dataset and perform a matched analysis for each
suspected food item. In the dataset, new exposure variables have been created recoding the 1/0
to yes/no. Likewise, a CaCo2 variable has been created by recoding the CaCo to yes/no.
Again, we will use the functions on the Visual Dashboard. Start by opening the dataset, as in the
previous exercise.
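For 1:1 matched pairs, the standard estimate of the odds ratio uses only the discordant pairs: the number of pairs where the case was exposed and the control was not, divided by the number of pairs where the control was exposed and the case was not. The sketch below illustrates that idea with invented pairs. The pair counts are hypothetical and do not come from the course dataset, and the code is not a description of how EpiInfo computes its output.

```python
from typing import Iterable, Optional, Tuple


def matched_odds_ratio(pairs: Iterable[Tuple[bool, bool]]) -> Optional[float]:
    """Matched-pair odds ratio from (case_exposed, control_exposed) pairs.

    Only discordant pairs contribute. Returns None when no pair has an
    exposed control and an unexposed case, since the ratio is then undefined.
    """
    case_only = 0      # case exposed, matched control not exposed
    control_only = 0   # control exposed, matched case not exposed
    for case_exposed, control_exposed in pairs:
        if case_exposed and not control_exposed:
            case_only += 1
        elif control_exposed and not case_exposed:
            control_only += 1
    if control_only == 0:
        return None
    return case_only / control_only


# Hypothetical pairs for one food item (made-up numbers for illustration).
pairs = (
    [(True, False)] * 6     # discordant: only the case was exposed
    + [(False, True)] * 2   # discordant: only the control was exposed
    + [(True, True)] * 3    # concordant: both exposed (ignored)
    + [(False, False)] * 4  # concordant: neither exposed (ignored)
)
print(matched_odds_ratio(pairs))  # 3.0
```

The concordant pairs carry no information about the exposure in a matched design, which is why they drop out of the estimate.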
Go back to the full dataset ‘Dataset Session 2B’. Create a new dependent variable called ‘ill’ by
recoding the CaCo to 1 and 0 values. Perform a matched logistic regression analysis including
multiple independent variables (consider carefully which to include).
Again, we will use the functions on the Visual Dashboard. Start by opening the dataset, as in the
previous exercise.