Probability Principles
Prepared by:
Dr. Sampson Twumasi-Ankrah
Basic Concepts of Probability
Introduction
Probability forms the basis of inferential statistics. We can
think of the probability of an outcome as the likelihood of
observing that outcome. If something has a high likelihood of
happening, it has a high probability (close to 1). If something
has a small chance of happening, it has a low probability (close
to 0). If something occurs that has a low probability, we
investigate to find out "what's up".
Probability
Is a measure of the likelihood of a random phenomenon or
chance behavior. Probability describes the long-term proportion
with which a certain outcome will occur in situations with
short-term uncertainty.
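The long-term proportion can be illustrated with a small simulation. This is a minimal sketch, assuming a fair coin (success probability 0.5) and a fixed random seed; the function name running_proportion and its parameters are illustrative choices, not part of the original slides.

```python
import random

def running_proportion(n_trials, p_success=0.5, seed=0):
    """Simulate n_trials independent trials; return the proportion of successes."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are reproducible
    successes = sum(rng.random() < p_success for _ in range(n_trials))
    return successes / n_trials

# Short runs fluctuate; longer runs settle near the assumed probability.
for n in (10, 100, 10_000, 100_000):
    print(n, running_proportion(n))
```

Any single short run can land far from 0.5, while the proportion over many trials tends to stay close to it.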
Basic Concepts of Probability Cont’d
Illustration
Consider an experiment in which only one of two possible
outcomes can occur. For example, the result of treatment with
an antibiotic is that an infection is either cured or not cured
within 5 days.
Definitions Cont’d
b. Trial
Each repetition of an experiment is called a trial. That is, a
trial is a single performance of an experiment.
c. Outcome
The possible result of each trial of an experiment is called an
outcome. When each outcome of an experiment has the same
chance of occurring as the others, the outcomes are said to be
equally likely. For example, the toss of a coin and the roll of a
die yield the possible outcomes in the sets {H, T} and
{1, 2, 3, 4, 5, 6}, and a football match yields
{win (W), loss (L), draw (D)}.
Sample Space
Sample space is the collection of all possible outcomes of a
probability experiment. We use the notation S for the sample
space. Each element or outcome of the experiment is called a
sample point.
Example of Sample Spaces
S = {HH, HT, TH, TT}
Definitions Cont’d
Event:
An event is a collection of one or more outcomes from an
experiment. That is, it is a subset of a sample space. It is
denoted by a capital letter. For example, we may have:
The event of observing exactly one head (H) in three tosses of a
coin, A = {HTT, THT, TTH}
Consider a newly married couple planning to have three
children. The event of the family having exactly two girls is:
D = {BGG, GBG, GGB}
Tree Diagram:
The tree diagram represents pictorially the outcomes of a random
experiment. An outcome that is a sequence of trials is
represented by a path through the tree, and its probability is
the product of the probabilities along that path. For example,
Definitions Cont’d
Consider a couple planning to have three children, assuming each
child born is equally likely to be a boy (B) or girl (G).
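The paths of such a tree can be listed programmatically. The sketch below walks the tree recursively and multiplies the branch probabilities along each path; it assumes the same branch probabilities at every level (births independent of one another), and the names tree_paths and branches are illustrative, not from the original slides.

```python
from fractions import Fraction

def tree_paths(branches, depth, path="", prob=Fraction(1)):
    """Yield (path, probability) for every root-to-leaf path of the tree.

    branches maps each branch label to its probability at every level.
    """
    if depth == 0:
        yield path, prob
        return
    for label, p in branches.items():
        # Multiply along the path: each extra branch scales the probability.
        yield from tree_paths(branches, depth - 1, path + label, prob * p)

# Assumption taken from the text: boy and girl are equally likely at each birth.
branches = {"B": Fraction(1, 2), "G": Fraction(1, 2)}
paths = list(tree_paths(branches, depth=3))
for path, prob in paths:
    print(path, prob)
```

Each of the eight paths, from BBB to GGG, carries probability 1/8, and the path probabilities sum to 1.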
Determination of Probability of an Event
Probability of an Event
a. The Classical Definition
This is based on the assumption that the outcomes of an
experiment are equally likely. If an experiment can lead to n
mutually exclusive and equally likely outcomes, of which n(A)
belong to the event A, then the probability of the event A is
defined by
P(A) = n(A) / n(S), where n(S) = n.
Solution:
The sample space for this experiment is
S = {BBB, BBG, BGB, BGG, GBB, GBG, GGB, GGG}
Let A be the event of the couple having exactly two girls.
Then, A = {BGG, GBG, GGB}
P(A) = n(A) / n(S) = 3/8
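The count can be checked by enumerating the sample space. This is a minimal sketch of the classical definition, valid only under the equally-likely assumption; the helper names classical_probability and exactly_two_girls are illustrative.

```python
from fractions import Fraction
from itertools import product

def classical_probability(event, sample_space):
    """P(A) = n(A) / n(S); valid only when all outcomes are equally likely."""
    favourable = [s for s in sample_space if event(s)]
    return Fraction(len(favourable), len(sample_space))

# Sample space for three births: every sequence of B and G of length 3.
S = ["".join(seq) for seq in product("BG", repeat=3)]

def exactly_two_girls(outcome):
    return outcome.count("G") == 2

p_A = classical_probability(exactly_two_girls, S)
print(len(S), p_A)  # 8 3/8
```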
Probability of Compound Events
Probability of Events
Independent Events:
Two or more events are said to be independent if the
probability of occurrence of one is not influenced by the
occurrence or non-occurrence of the other(s). Mathematically,
two events A and B are said to be independent if and only if
P(A ∩ B) = P(A) · P(B). In general, for any events A and B with
P(A) > 0, P(A ∩ B) = P(A) · P(B|A); when P(B|A) ≠ P(B), the
events are said to be dependent.
Conditional Probability:
Let A and B be two events in the sample space S with
P(B) > 0. The probability that event A occurs given that
event B has already occurred, denoted P(A|B), is called the
conditional probability of A given B. It is defined as
P(A|B) = P(A ∩ B) / P(B), P(B) > 0.
In particular, if S is a finite equiprobable space, then
P(A ∩ B) = n(A ∩ B) / n(S) and P(B) = n(B) / n(S), so that
P(A|B) = n(A ∩ B) / n(B).
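On a finite equiprobable space both definitions reduce to counting. The sketch below reuses the three-children sample space; the event B (first child is a girl) is an illustrative choice that does not appear in the slides, and the helper names prob and conditional are assumptions made for this example.

```python
from fractions import Fraction
from itertools import product

S = {"".join(seq) for seq in product("BG", repeat=3)}
A = {s for s in S if s.count("G") == 2}  # exactly two girls
B = {s for s in S if s[0] == "G"}        # first child is a girl (illustrative event)

def prob(event):
    """P(event) = n(event) / n(S) on the equiprobable space S."""
    return Fraction(len(event), len(S))

def conditional(a, b):
    """P(a | b) = n(a ∩ b) / n(b); requires P(b) > 0."""
    if not b:
        raise ValueError("P(b) must be positive")
    return Fraction(len(a & b), len(b))

p_A_given_B = conditional(A, B)
independent = prob(A & B) == prob(A) * prob(B)
print(p_A_given_B, independent)  # 1/2 False
```

Here P(A ∩ B) = 1/4 while P(A) · P(B) = 3/16, so these two events are not independent.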
Probability of Events
Exhaustive Events:
Two or more events defined on the same sample space are said
to be exhaustive if their union is equal to the sample space S.
If they are also mutually exclusive, they partition the sample
space.
E.g., events A1, A2, A3 ⊆ S are exhaustive if A1 ∪ A2 ∪ A3 = S.
Definition (partition of sample space):
A1, A2, A3, ···, An form a partition of the sample space S if
the following hold:
1. Ai ≠ ∅ for all i = 1, 2, 3, ···, n
2. Ai ∩ Aj = ∅ for all i ≠ j, i, j = 1, 2, 3, ···, n
3. A1 ∪ A2 ∪ ··· ∪ An = S
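The three conditions can be tested directly on finite sets. This is a minimal sketch; the function name is_partition and the die-based sets are illustrative and not taken from the slides.

```python
def is_partition(parts, sample_space):
    """Check the three partition conditions for a list of finite sets."""
    parts = [set(p) for p in parts]
    sample_space = set(sample_space)
    non_empty = all(parts)  # condition 1: an empty set is falsy
    disjoint = all(         # condition 2: every pair has an empty intersection
        parts[i].isdisjoint(parts[j])
        for i in range(len(parts))
        for j in range(i + 1, len(parts))
    )
    exhaustive = set().union(*parts) == sample_space  # condition 3
    return non_empty and disjoint and exhaustive

S = {1, 2, 3, 4, 5, 6}
print(is_partition([{1, 2}, {3, 4}, {5, 6}], S))   # True
print(is_partition([{1, 2, 3}, {3, 4, 5, 6}], S))  # False: the sets share 3
print(is_partition([{1, 2}, {3, 4}], S))           # False: 5 and 6 are not covered
```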
The Multiplication Rule for P (A ∩ B)
The definition of conditional probability yields the following
result, obtained by multiplying both sides of the conditional
probability equation by P(B).
P(A|B) = P(A ∩ B) / P(B)
P(A|B) · P(B) = [P(A ∩ B) / P(B)] · P(B)
P(A|B) · P(B) = P(A ∩ B)
This rule is important because it is often the case that P (A ∩ B)
is desired, whereas both P(B) and P (A|B) can be specified from
the problem description.
The Law of Total Probability
Let A1, ···, Ak be mutually exclusive and exhaustive events.
Then for any other event B,
P(B) = P(B|A1)P(A1) + ··· + P(B|Ak)P(Ak)
Bayes’ Rule
The power of Bayes’ rule is that in many situations where we
want to compute P (A|B) it turns out that it is difficult to do so
directly, yet we might have direct information about P (B|A).
Bayes' rule enables us to compute P(A|B) in terms of P(B|A):
P(A|B) = P(A ∩ B) / P(B) = P(B|A)P(A) / P(B)
Bayes' Theorem
Let A and A^c constitute a partition of the sample space S
with P(A) > 0 and P(A^c) > 0. Then for any event B in S
such that P(B) > 0,
P(A|B) = P(B|A)P(A) / [P(B|A)P(A) + P(B|A^c)P(A^c)]
Example
A paint-store chain produces and sells latex and semigloss
paint. Based on long-range sales, the probability that a
customer will purchase latex paint is 0.75. Of those who
purchase latex paint, 60% also purchase rollers. But only 30%
of semigloss paint buyers purchase rollers. A randomly selected
buyer purchases a roller and a can of paint. What is the
probability that the paint is latex?
Solution
L = {The customer purchases latex paint.}, P(L) = 0.75
S = {The customer purchases semigloss paint.}, P(S) = 0.25
R = {The customer purchases a roller.}
P (R|L) = 0.6; P (R|S) = 0.3
P(L|R) = P(L ∩ R) / P(R)
= P(R|L)P(L) / P(R)
= (0.6 × 0.75) / [(0.6 × 0.75) + (0.3 × 0.25)]
≈ 0.857
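The same arithmetic can be written as two small functions, one for the denominator P(R) (law of total probability) and one for the posterior. This is a sketch using the numbers from the example; the function and variable names are illustrative assumptions.

```python
def total_probability(priors, likelihoods):
    """P(B) = sum of P(B | A_i) * P(A_i) over a partition A_1, ..., A_k."""
    return sum(likelihoods[a] * priors[a] for a in priors)

def bayes_posterior(target, priors, likelihoods):
    """P(target | B) = P(B | target) * P(target) / P(B)."""
    return likelihoods[target] * priors[target] / total_probability(priors, likelihoods)

priors = {"latex": 0.75, "semigloss": 0.25}      # P(L), P(S)
roller_given = {"latex": 0.6, "semigloss": 0.3}  # P(R|L), P(R|S)

p_roller = total_probability(priors, roller_given)
p_latex_given_roller = bayes_posterior("latex", priors, roller_given)
print(round(p_roller, 3), round(p_latex_given_roller, 3))  # 0.525 0.857
```

Keeping the partition in a dictionary means the same functions work unchanged if a third paint type is added.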
Axioms of Probability
Given an experiment and a sample space, S , the objective of
probability is to assign to each event A a number P(A), called
the probability of the event A, which will give a precise measure
of the chance that A will occur. To ensure that the probability
assignments will be consistent with our intuitive notions of
probability, all assignments should satisfy the following axioms
(basic properties) of probability:
A.1: For every event A, 0 ≤ P (A) ≤ 1
A.2: P(S) = 1
A.3: If A and B are mutually exclusive events, i.e. A ∩ B = ∅, then
P(A ∪ B) = P(A) + P(B)
A.4: If A1, A2, ···, An is a sequence of n mutually exclusive
events, then
P(A1 ∪ A2 ∪ ··· ∪ An) = P(A1) + P(A2) + ··· + P(An)
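For a finite sample space the axioms can be checked mechanically. The sketch below assumes a fair six-sided die as the probability assignment; the helper names check_assignment and event_probability are illustrative and not part of the original slides.

```python
import math

def check_assignment(probs):
    """True if an outcome -> probability mapping is consistent with A.1 and A.2."""
    in_range = all(0.0 <= p <= 1.0 for p in probs.values())
    sums_to_one = math.isclose(sum(probs.values()), 1.0)
    return in_range and sums_to_one

def event_probability(event, probs):
    """Add the probabilities of the outcomes in the event (additivity, A.3/A.4)."""
    return sum(probs[outcome] for outcome in event)

die = {face: 1 / 6 for face in range(1, 7)}  # assumed fair die
even, odd = {2, 4, 6}, {1, 3, 5}             # mutually exclusive events

print(check_assignment(die))
# A.3: for disjoint events, P(A ∪ B) equals P(A) + P(B) up to rounding.
print(math.isclose(event_probability(even | odd, die),
                   event_probability(even, die) + event_probability(odd, die)))
```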
Theorems
P(A′) = 1 − P(A), where A′ denotes the complement of A
Bayes’ Theorem, Screening Tests, Sensitivity,
Specificity, and Predictive Value Positive and Negative:
There are two states regarding the disease and two states
regarding the result of the screening test:
Definitions
There are two false results:
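1. A false positive result:
This result happens when a test indicates a positive status
when the true status is negative. Its probability is:
P(T|D̄) = P(positive result | absence of the disease)
2. A false negative result:
This result happens when a test indicates a negative status
when the true status is positive. Its probability is:
P(T̄|D) = P(negative result | presence of the disease)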
The Sensitivity:
The sensitivity of a test is the probability of a positive test
result given the presence of the disease. P (T |D) = P( positive
result of the test | presence of the disease )
Definitions Cont’d
The specificity:
The specificity of a test is the probability of a negative test
result given the absence of the disease. P (T̄ |D̄) = P( negative
result of the test | absence of the disease)
                          Disease
Test Result     Present (D)     Absent (D̄)      Total
Positive (T)    a               b               a + b = n(T)
Negative (T̄)    c               d               c + d = n(T̄)
Total           a + c = n(D)    b + d = n(D̄)    n
For example, a is the number of subjects who have the disease and
whose screening test result was positive.
From the Sensitivity and Specificity Table
From this table, we may compute the following conditional
probabilities:
1. The probability of a false positive result:
P(T|D̄) = n(T ∩ D̄) / n(D̄) = b / (b + d)
2. The probability of a false negative result:
P(T̄|D) = n(T̄ ∩ D) / n(D) = c / (a + c)
3. The sensitivity of the screening test:
P(T|D) = n(T ∩ D) / n(D) = a / (a + c)
4. The specificity of the screening test:
P(T̄|D̄) = n(T̄ ∩ D̄) / n(D̄) = d / (b + d)
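The four ratios translate directly into code. This is a minimal sketch in which a, b, c and d are the cell counts of the table; the function name screening_rates is an assumption, and the counts passed in are those of the Alzheimer's example given later in these slides.

```python
def screening_rates(a, b, c, d):
    """Conditional probabilities from the 2x2 screening table.

    a: test positive, disease present    b: test positive, disease absent
    c: test negative, disease present    d: test negative, disease absent
    """
    return {
        "false_positive": b / (b + d),  # P(T | D-bar)
        "false_negative": c / (a + c),  # P(T-bar | D)
        "sensitivity": a / (a + c),     # P(T | D)
        "specificity": d / (b + d),     # P(T-bar | D-bar)
    }

rates = screening_rates(a=436, b=5, c=14, d=495)
for name, value in rates.items():
    print(f"{name}: {value:.4f}")
```

Sensitivity and the false negative probability share the denominator n(D) and sum to 1; the same holds for specificity and the false positive probability with n(D̄).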
Definitions of the Predictive Value Positive and
Predictive Value Negative of a Screening Test:
Calculating the Predictive Value Positive and Predictive
Value Negative:
P(D|T) = P(T ∩ D) / P(T)
Calculating the Predictive Value Positive and Predictive
Value Negative Cont’d:
But we know that:
P(T) = P(T ∩ D) + P(T ∩ D̄)
P(T ∩ D) = P(T|D)P(D) (multiplication rule)
P(T ∩ D̄) = P(T|D̄)P(D̄) (multiplication rule)
P(T) = P(T|D)P(D) + P(T|D̄)P(D̄)
Therefore the predictive value positive is
P(D|T) = P(T|D)P(D) / [P(T|D)P(D) + P(T|D̄)P(D̄)] (1)
Similarly, the predictive value negative is
P(D̄|T̄) = P(T̄|D̄)P(D̄) / [P(T̄|D̄)P(D̄) + P(T̄|D)P(D)] (2)
NOTE:
P(T̄|D̄) = specificity
P(T̄|D) = 1 − P(T|D) = 1 − sensitivity
Example:
A medical research team wished to evaluate a proposed screening
test for Alzheimer’s disease. The test was given to a random sample
of 450 patients with Alzheimer’s disease and an independent
random sample of 500 patients without symptoms of the disease.
The two samples were drawn from populations of subjects who
were 65 years of age or older. The results are as follows:
                     Alzheimer’s Disease
Test Result     Present (D)     Absent (D̄)      Total
Positive (T)    436             5               441
Negative (T̄)    14              495             509
Total           450             500             950
1. The sensitivity of the test:
P(T|D) = n(T ∩ D) / n(D) = 436 / 450 = 0.9689
2. The specificity of the test:
P(T̄|D̄) = n(T̄ ∩ D̄) / n(D̄) = 495 / 500 = 0.99
3. The probability of the disease in the general population,
P(D): The rate of disease in the relevant general
population, P(D), cannot be computed from the sample
data given in the table. However, it is given that the
percentage of patients with Alzheimer’s disease is 11.3%
out of all subjects who were 65 years of age or older.
Therefore P(D) can be computed to be:
P(D) = 11.3% / 100% = 0.113
The predictive value positive of the test:
We wish to estimate the probability that a subject who is positive
on the test has Alzheimer’s disease. We use Bayes’ formula of
Equation (1):
P(D|T) = P(T|D)P(D) / [P(T|D)P(D) + P(T|D̄)P(D̄)] (3)
From the tabulated data, we compute:
P(T|D) = 436 / 450 = 0.9689
P(T|D̄) = n(T ∩ D̄) / n(D̄) = 5 / 500 = 0.01
Substituting these results into Equation (1), we get:
P(D|T) = (0.9689)P(D) / [(0.9689)P(D) + (0.01)P(D̄)]
= (0.9689)(0.113) / [(0.9689)(0.113) + (0.01)(1 − 0.113)]
= 0.93
As we see, in this case, the predictive value positive of the test is
very high.
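The calculation can be reproduced in a few lines. This is a sketch using the sensitivity, specificity and prevalence from the example; the function name predictive_value_positive and its parameter names are illustrative assumptions.

```python
def predictive_value_positive(sensitivity, specificity, prevalence):
    """P(D | T) by Bayes' theorem."""
    true_pos = sensitivity * prevalence               # P(T|D) P(D)
    false_pos = (1 - specificity) * (1 - prevalence)  # P(T|D-bar) P(D-bar)
    return true_pos / (true_pos + false_pos)

ppv = predictive_value_positive(
    sensitivity=436 / 450,  # from the table
    specificity=495 / 500,  # from the table
    prevalence=0.113,       # given rate of disease, P(D)
)
print(round(ppv, 2))  # 0.93
```

Because the prevalence enters the formula directly, re-running the function with a different prevalence gives a different predictive value even though the test itself is unchanged.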
The predictive value negative of the test:
We wish to estimate the probability that a subject who is
negative on the test does not have Alzheimer’s disease. We use
Bayes’ formula of Equation (2):
P(D̄|T̄) = P(T̄|D̄)P(D̄) / [P(T̄|D̄)P(D̄) + P(T̄|D)P(D)] (4)
The predictive value negative of the test:
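As a rough numerical check of this quantity, the sketch below applies Bayes' theorem with the sensitivity (436/450), specificity (495/500) and prevalence (0.113) from the example. The function name predictive_value_negative is an illustrative choice, not part of the original slides.

```python
def predictive_value_negative(sensitivity, specificity, prevalence):
    """P(D-bar | T-bar) by Bayes' theorem."""
    true_neg = specificity * (1 - prevalence)   # P(T-bar|D-bar) P(D-bar)
    false_neg = (1 - sensitivity) * prevalence  # P(T-bar|D) P(D)
    return true_neg / (true_neg + false_neg)

npv = predictive_value_negative(
    sensitivity=436 / 450,
    specificity=495 / 500,
    prevalence=0.113,
)
print(round(npv, 3))  # 0.996
```

With these inputs the predictive value negative is about 0.996, so a negative result makes the disease very unlikely in this population.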