Basic Biostatistical Concepts in Research
E Omoregie
INTRODUCTION
STATISTICS
The science of collecting, describing, and interpreting data
STATISTICAL METHODS
Include
- clarifying the situation,
- gathering data,
- summarizing data, and
- deriving and communicating meaningful conclusions
about the population (based on the analysis of samples)
Population vs. Sample:
POPULATION
The collection, or set, of ALL individuals, items or events of
interest about which information is sought
SAMPLE
The subset of the population on which we make measurements
AREAS OF BIOSTATISTICS:
DESCRIPTIVE: Methods of summarizing or describing a set of
data. Examples: Tables, graphs, numerical summaries.
Types of Data:
Qualitative – non-numerical
Nominal – names that describe categories with no inherent order. Example:
eye color, gender (male, female)
Ordinal – categories with a meaningful order. Example: class standing,
life stage (juveniles, adults)
Quantitative – numerical
Discrete – the possible values can be counted or fall at fixed
intervals. Example: number of fish, numerical grade (4, 3.5, 3, 2.5, 2, 1.5, 1,
0)
Continuous – the possible values lie on a continuum. Example:
measurements of time, weight and space/distance
SAMPLING TECHNIQUES
Surveys and experiments should use methods to select a sample
that is “representative” of the population of interest
JUDGMENTAL SAMPLING: Elements are selected on the
basis of being “typical”
May be used to get “expert” opinions
Danger of biased results (for example, interviewing only those
households with cars when a route to Wernhill Park is to be mapped
out)
PROBABILITY SAMPLING: Each element has a certain
probability of being selected as part of the sample.
TYPES OF SAMPLING TECHNIQUES
Simple Random Sample (SRS): A sample of size n selected from a
population of size N such that every possible sample of size n has an
equal chance of being selected.
The concept is simple, but it may be costly and difficult
to implement for a large population or geographical
area
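A minimal sketch of drawing a simple random sample, assuming the sampling frame is available as a list. The frame of 100 labelled units, the function name `simple_random_sample` and the seed are hypothetical and used only for illustration.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement.

    random.sample gives every subset of size n the same chance
    of being chosen, which is the defining property of an SRS.
    """
    rng = random.Random(seed)
    return rng.sample(frame, n)


# Hypothetical sampling frame: units labelled 1..100 (N = 100).
sampling_frame = list(range(1, 101))
sample = simple_random_sample(sampling_frame, n=10, seed=42)
print(sample)
```

Fixing the seed only makes the illustration repeatable; in a real survey the draw would not be re-run until a convenient sample appears.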
[Figure: a SAMPLE drawn from the SAMPLING FRAME]
Stratified Random Sample: Divide the sampling frame into groups called
strata, then select units from each stratum using SRS.
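The two steps (split the frame into strata, then take an SRS within each stratum) can be sketched as follows. The frame of 60 animals, the life-stage strata and the equal allocation of 5 units per stratum are all assumptions made for the example, not part of the original notes.

```python
import random


def stratified_sample(frame, stratum_of, n_per_stratum, seed=None):
    """Group units into strata, then draw an SRS from each stratum."""
    rng = random.Random(seed)
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    # One simple random sample per stratum (equal allocation assumed).
    return {name: rng.sample(units, n_per_stratum)
            for name, units in strata.items()}


# Hypothetical frame: (id, life stage) for 60 animals.
frame = [(i, "juvenile" if i % 3 == 0 else "adult") for i in range(1, 61)]
picked = stratified_sample(frame, stratum_of=lambda u: u[1],
                           n_per_stratum=5, seed=1)
for stratum, units in picked.items():
    print(stratum, units)
```

Other allocations, such as sampling each stratum in proportion to its size, follow the same pattern with a different `n` per stratum.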
FINAL NOTE:
In many cases, the sampling schemes used in actual surveys are
combinations of two or more of the basic sampling techniques.
The purpose of doing so is to minimize cost while making sure
that the sample taken is truly “representative” of the population.
PROBABILITY
Why do we need to study probability?
Researcher / Businessman: he/she may be interested in
determining how likely some “critical events” are to occur.
Researcher: he/she may want to know how “reliable”
his/her conclusions are
THE RULES OF THE GAME…
You must have….
How do we assign a probability value to an outcome in dependent events?
Example: tossing a coin!
a) We can repeat the experiment many times and count the number of
times an outcome occurs; we then express that count as a relative frequency
(maximum of 1). These are experimental or empirical probabilities.
For example, a coin was tossed 24,000 times and came up heads 12,012 times.
Hence, the empirical probability of getting a head was
$$P(\text{getting a head}) = \frac{12012}{24000} = 0.5005$$
$$P(\text{any event}) = \frac{\text{Number of outcomes belonging to the event}}{\text{Total number of outcomes}}$$
Experimental probabilities make use of this law:
If the number of times an experiment is repeated is increased,
the ratio of the number of successful occurrences to the number
of trials will tend to approach the theoretical probability of the
outcome of an individual trial
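This law can be illustrated with a small simulation. The sketch below assumes a fair coin (theoretical probability 0.5) and uses a pseudo-random generator in place of real tosses; the function name and the seed are hypothetical.

```python
import random


def relative_frequency_of_heads(n_tosses, seed=None):
    """Simulate n_tosses of a fair coin; return heads / tosses."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses


# The ratio tends to settle near 0.5 as the number of trials grows.
for n in (10, 100, 1_000, 24_000):
    print(n, relative_frequency_of_heads(n, seed=7))
```

With few tosses the ratio can be far from 0.5; with many tosses it is usually close, although any single run may still deviate slightly.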
Probability in independent events:
Independent Events: Two or more events are independent
if and only if the occurrence (or non-occurrence) of one does not affect the
probability of the others.
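One standard consequence, not stated in the notes, is the multiplication rule: for independent events A and B, P(A and B) = P(A) × P(B). The sketch below checks this by enumerating two tosses of a fair coin; the event names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two fair coin tosses: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))


def prob(event):
    """Classical probability: favourable outcomes / total outcomes."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))


def first_is_head(o):
    return o[0] == "H"


def second_is_head(o):
    return o[1] == "H"


p_a = prob(first_is_head)
p_b = prob(second_is_head)
p_ab = prob(lambda o: first_is_head(o) and second_is_head(o))

# For independent events the joint probability equals the product.
print(p_a, p_b, p_ab, p_ab == p_a * p_b)
```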
The normal probability distribution is considered to be the single
most important probability distribution.
σ (the standard deviation) is the distance (deviation) from the
inflection point of the curve to the center (the mean, μ)
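A short numerical check of the inflection-point property, using Python's `statistics.NormalDist`. The values μ = 50 and σ = 5 are arbitrary assumptions; the curvature is estimated with a finite difference rather than derived exactly.

```python
from statistics import NormalDist

mu, sigma = 50.0, 5.0  # hypothetical mean and standard deviation
dist = NormalDist(mu, sigma)


def curvature(x, h=0.01):
    """Finite-difference estimate of the second derivative of the density."""
    return (dist.pdf(x + h) - 2 * dist.pdf(x) + dist.pdf(x - h)) / h ** 2


# The curve bends downward inside mu +/- sigma and upward outside it,
# so the bend changes direction one standard deviation from the mean.
inside = curvature(mu + 0.9 * sigma)
outside = curvature(mu + 1.1 * sigma)

# Share of the distribution within one standard deviation of the mean.
within_one_sigma = dist.cdf(mu + sigma) - dist.cdf(mu - sigma)
print(inside < 0, outside > 0, round(within_one_sigma, 4))
```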
Sample size affects the variability of the sample mean. In addition, increasing
the sample size changes the shape of the distribution of the sample
mean: it becomes narrower and closer to a normal curve.
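The effect of sample size can be sketched by simulation. The example assumes a skewed population (exponential with mean 1 and standard deviation 1), chosen only for illustration; the function name, number of repeats and seed are hypothetical.

```python
import random
from statistics import stdev


def sample_mean_spread(n, repeats=2000, seed=0):
    """Standard deviation of the means of `repeats` samples of size n."""
    rng = random.Random(seed)
    means = []
    for _ in range(repeats):
        draws = [rng.expovariate(1.0) for _ in range(n)]
        means.append(sum(draws) / n)
    return stdev(means)


# The spread of the sample mean shrinks roughly like 1 / sqrt(n).
for n in (4, 25, 100):
    print(n, round(sample_mean_spread(n), 3))
```

Plotting a histogram of the simulated means for each n would also show the shape moving from skewed towards bell-shaped.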
PARAMETRIC vs. NON-PARAMETRIC
Situation                              Parametric                    Nonparametric
                                       (actual measurements used)    (actual measurements not used; ranking)
Two samples: compare mean value        t-test for independent        Wald-Wolfowitz runs test
for some variable of interest          samples                       Mann-Whitney U test
                                                                     Kolmogorov-Smirnov two-sample test
Situation                              Parametric                    Nonparametric
Multiple groups                        Analysis of variance          Kruskal-Wallis analysis of ranks
                                       (ANOVA)                       Median test
Situation                              Parametric                    Nonparametric
Compare two variables measured         t-test for dependent          Sign test
in the same sample                     samples                       Wilcoxon’s matched pairs test
More than two variables measured       Repeated measures ANOVA       Friedman’s two-way analysis of variance
in the same sample                                                   Cochran Q
Relationship between two variables     Correlation coefficient       Spearman R
                                                                     Kendall Tau
                                                                     Coefficient Gamma
Two variables of interest are                                        Chi square
categorical                                                          Phi coefficient
                                                                     Kendall coefficient of concordance
Summary Table of Statistical Tests
Column headings: Level of Measurement; Sample Characteristics (1 Sample;
2 Samples: Independent, Dependent; K Samples, i.e., >2: Independent, Dependent);
Correlation
TESTING OF EXPERIMENTAL HYPOTHESIS
AN EXPERIMENT gives rise to two opposing hypotheses, the Null and
the Alternative hypotheses:
NULL HYPOTHESIS (Ho) is the claim initially believed to be true.
It is the “status quo”, e.g. that two mean weights are the same.
ALTERNATIVE HYPOTHESIS (Ha) is the opposing claim, e.g. that the two
mean weights differ.
• If we find “sufficient evidence” that the two means differ, then we can
reject Ho and claim Ha
• If we don’t find “sufficient evidence” that the two means differ, then we
have only failed to reject Ho. It DOESN’T mean the weights are the
same.
                     Ho is TRUE                   Ho is FALSE
                     “mean weights are same”      “mean weights differ”
Reject Ho            TYPE 1 Error (α)             Correct Decision (1-β)
Fail to reject Ho    Correct Decision (1-α)       TYPE 2 Error (β)
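The four cells of this table can be illustrated by simulation. The sketch below swaps in a two-sample z-test with a known standard deviation (not the t-test listed earlier) so that it needs only the standard library. The group size, effect size, seed and function names are assumptions made for the example.

```python
import random
from statistics import NormalDist


def z_test_rejects(x, y, sigma, alpha=0.05):
    """Two-sided two-sample z-test with known sigma; True if Ho is rejected."""
    se = sigma * (1 / len(x) + 1 / len(y)) ** 0.5
    z = (sum(x) / len(x) - sum(y) / len(y)) / se
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical


def rejection_rate(true_difference, trials=2000, n=20, sigma=1.0,
                   alpha=0.05, seed=3):
    """Fraction of simulated experiments in which Ho is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        x = [rng.gauss(0.0, sigma) for _ in range(n)]
        y = [rng.gauss(true_difference, sigma) for _ in range(n)]
        rejections += z_test_rejects(x, y, sigma, alpha)
    return rejections / trials


# Ho true (no difference): rejections are Type 1 errors, rate near alpha.
type1_rate = rejection_rate(true_difference=0.0)
# Ho false (difference of 1 sigma): rejection rate estimates 1 - beta.
power = rejection_rate(true_difference=1.0)
print(type1_rate, power)
```

Lowering α reduces Type 1 errors but, for a fixed sample size, increases β; a larger sample is the usual way to reduce both.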
STANDARD DEVIATION AND VARIANCE
Variance
The variance and the closely-related standard deviation are measures
of how spread out a distribution is. In other words, they are measures
of variability.
Population Variance: $$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$
Sample Variance: $$s^2 = \frac{\sum (X - M)^2}{N - 1}$$
where μ is the population mean, N is the number of scores, M is the mean of the sample and X
is the variable
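A small worked example using Python's `statistics` module. It assumes the usual definitions: the sum of squared deviations from the mean divided by N for a population, and by N − 1 for a sample. The four scores are made up for illustration.

```python
from statistics import mean, pstdev, pvariance, stdev, variance

scores = [1, 2, 4, 5]  # hypothetical scores, N = 4

m = mean(scores)                                  # mean = 3
squared_deviations = sum((x - m) ** 2 for x in scores)  # 4 + 1 + 1 + 4 = 10

# By hand, following the definitions assumed above.
population_variance = squared_deviations / len(scores)       # 10 / 4
sample_variance = squared_deviations / (len(scores) - 1)     # 10 / 3

# The library functions use the same two divisors.
print(population_variance, pvariance(scores))
print(sample_variance, variance(scores))
# The standard deviation is the square root of the variance.
print(pstdev(scores), stdev(scores))
```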