Testing The Difference
Chapter Outline
• Testing the difference between Two Means & Proportions
• Testing the difference between Two Variances.
• Testing the difference between Two Means: Small independent
& dependent samples.
• Hypothesis Testing With Categorical Data.
Setting Up the Hypothesis:
For Difference between Two Means
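Test statistic for the difference between two proportions:
$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p}\,\bar{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
where $\bar{p} = \frac{X_1 + X_2}{n_1 + n_2}$ or $\bar{p} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$, and $\bar{q} = 1 - \bar{p}$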
Example # 1
A researcher hypothesizes that the average number
of sports that colleges offer for males is greater than the
average number of sports that colleges offer for females.
The results are shown below. At α = 0.10, is there
enough evidence to support the claim?
Males: X̄1 = 8.6 σ1 = 3.3 n1 = 50
Females: X̄2 = 7.9 σ2 = 3.3 n2 = 50
H0: μ1 − μ2 ≤ 0
H1: μ1 − μ2 > 0 (claim)
α = 0.10
Test statistic is
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{(8.6 - 7.9) - 0}{\sqrt{\frac{3.3^2}{50} + \frac{3.3^2}{50}}} = 1.06$$
The critical region is
CR: z > z0.10 = 1.28
Conclusion:
Accept H0, because the calculated z = 1.06 lies in the acceptance region (1.06 < 1.28). There is not enough evidence to support the claim.
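The arithmetic of Example # 1 can be checked with a short script. This is a minimal sketch, assuming the population standard deviations are known; the function name `two_mean_z` is illustrative and not part of the slides, and the critical value 1.28 is copied from the example rather than computed.

```python
import math


def two_mean_z(xbar1, xbar2, sigma1, sigma2, n1, n2, delta0=0.0):
    """z statistic for H0: mu1 - mu2 = delta0 with known population SDs."""
    standard_error = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return (xbar1 - xbar2 - delta0) / standard_error


# Example # 1: males (sample 1) versus females (sample 2)
z = two_mean_z(8.6, 7.9, 3.3, 3.3, 50, 50)
z_crit = 1.28            # z_0.10, taken from the slide
reject_h0 = z > z_crit   # right-tailed test

print(round(z, 2), reject_h0)
```

Because z is about 1.06 and does not exceed 1.28, the script reaches the same decision as the slide.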
Example # 2
The same physical fitness test was given to a group
of 100 scouts and to a group of 144 guides. The
maximum score was 30. The guides obtained a
mean score of 26.81 and the scouts obtained a
mean score of 27.53. If the fitness scores are normally
distributed with a common population standard
deviation of 3.48, test at 5% level of significance
whether the guides did not do as well as the scouts
in the fitness test.
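H0: μ1 − μ2 ≤ 0 (1 = scouts, 2 = guides)
H1: μ1 − μ2 > 0 (claim)
α = 0.05
Test statistic is
$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sigma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{(27.53 - 26.81) - 0}{3.48\sqrt{\frac{1}{100} + \frac{1}{144}}} = 1.589$$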
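H0: p1 − p2 = 0
H1: p1 − p2 ≠ 0 (claim)
α = 0.05
$\bar{p} = \frac{43 + 58}{100 + 100} = 0.505$ and $\bar{q} = 1 - 0.505 = 0.495$
Test statistic is
$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p}\,\bar{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{(0.43 - 0.58) - 0}{\sqrt{(0.505)(0.495)\left(\frac{1}{100} + \frac{1}{100}\right)}} = -2.12$$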
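The pooled two-proportion calculation above can be reproduced in a few lines. This is a sketch under the assumption that the counts are 43 and 58 successes out of 100 each, as the worked figures indicate; the function name `two_proportion_z` is hypothetical.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for H0: p1 - p2 = 0."""
    p_hat1 = x1 / n1
    p_hat2 = x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)   # pooled proportion
    q_bar = 1 - p_bar
    standard_error = math.sqrt(p_bar * q_bar * (1 / n1 + 1 / n2))
    return (p_hat1 - p_hat2) / standard_error


z = two_proportion_z(43, 100, 58, 100)
print(round(z, 2))
```

The same function can be applied to the counts in Question # 11 (31 and 24 out of 100), which gives the z = 1.11 quoted there.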
Question # 5
A potential buyer of light bulbs bought 50 bulbs of
each of 2 brands. Upon testing the bulbs, he
found that brand A had a mean life of 1282 hours
when σ1 = 80 hours, whereas brand B had a mean
life of 1208 hours when σ2 = 94 hours. Can the
buyer be quite certain that the two brands differ
in quality? Use α = 0.05. Ans: z = 4.24, CR: ± 1.96
Question # 6
An examination was given to two classes of 40
and 50 students, respectively. In the first class,
mean grade was 74 with standard deviation of 8,
while in the second class the mean grade was 78
with a standard deviation of 7. Is there a
significant difference between the mean grades at the
5% level of significance? Ans. Z = −2.48
Question # 7
In a sample of 200 men, 130 said they used seat
belts. In a sample of 300 women, 63 said they
used seat belts. Test the claim that men are
more safety-conscious than women, at α = 0.01.
Ans. Z = 10.7
Question # 8
A study found a difference in the proportion of
Question # 9
A telephone service representative believes
that the proportion of customers completely
satisfied with their local telephone service is
[Table: Utility Satisfaction, South 50%, West 43%]
Question # 11
In a population that has COVID-19, samples of
100 males and 100 females are taken. It is found
that 31 males and 24 females test positive for
COVID-19. Can we conclude at the 0.01 level of
significance that the proportion of men who have
COVID-19 is greater than the proportion of
women? Ans: z = 1.11
Question # 12
The two samples A and B detailed below were taken from normal
populations of standard deviation 0.8. Test whether the difference of means
is significant. A: X̄1 = 12.800, n1 = 7; B: X̄2 = 13.675, n2 = 8. Use α = 0.05.
Ans: z = −2.11, CR: ± 1.96
3. Testing Hypothesis About Difference Between Two Variances.
(General Procedure).
(i) (a) H0: σ1² = σ2² and H1: σ1² ≠ σ2² (Two Tailed Test) (Equality of variances)
(b) H0: σ1² ≤ σ2² and H1: σ1² > σ2² (One Tailed Test)
(c) H0: σ1² ≥ σ2² and H1: σ1² < σ2² (One Tailed Test)
(ii) Choose the level of significance α.
(iii) The Test Statistic is
$$F = \frac{s_1^2}{s_2^2} \quad \text{where } s_1^2 \ge s_2^2$$
with v1 = n1 − 1 and v2 = n2 − 1 d.f.
(iv) The critical region is
(a) H1: σ1² ≠ σ2², CR: F > F_{α/2}(v1, v2)
(b) H1: σ1² > σ2², CR: F > F_α(v1, v2)
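The variance-ratio statistic in step (iii) is simple to compute by hand, but a short sketch makes the "larger variance in the numerator" convention explicit. The function name `variance_ratio_f` is illustrative, and the critical values are copied from the answers to Questions # 14 and # 15 rather than computed, so this sketch does not need a statistics library.

```python
def variance_ratio_f(var_a, n_a, var_b, n_b):
    """Return (F, df_numerator, df_denominator) with the larger variance on top."""
    if var_a >= var_b:
        return var_a / var_b, n_a - 1, n_b - 1
    return var_b / var_a, n_b - 1, n_a - 1


# Question # 14: variances 103 and 73, 20 students per group
f14, v1_14, v2_14 = variance_ratio_f(103, 20, 73, 20)
reject_14 = f14 > 2.23   # critical value quoted in the answer

# Question # 15: standard deviations 6.2 (n = 25) and 2.7 (n = 21)
f15, v1_15, v2_15 = variance_ratio_f(6.2**2, 25, 2.7**2, 21)
reject_15 = f15 > 2.86   # critical value quoted in the answer

print(round(f14, 2), reject_14, round(f15, 2), reject_15)
```

Note that in a one-tailed test the direction of the claim still matters: if the sample said to be more variable turns out to have the smaller variance, the claim is unsupported whatever the ratio is.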
Question # 14
An instructor claims that when a composition
course is taught in conjunction with a word-processing
course, the variance in the final grades will be larger than
when the composition course is taught without the word-
processing component. Two groups are randomly selected.
The variance of the exams of the group that also had word-
processing instruction is 103, and the variance of
the exams of the students who did not have the word-
processing component is 73. Each sample consists of 20
students. At α = 0.05, can the instructor’s claim be
supported? Ans: F = 1.41, CR: 2.23
Question # 15
A researcher claims that the variation of blood pressure of
overweight individuals is greater than the variation of
blood pressure of normal-weight individuals. The
standard deviation of the pressures of 25 overweight
people was found to be 6.2 mm Hg, and the standard
deviation of the pressures of 21 normal-weight people
was 2.7 mm Hg. At α = 0.01, can the researcher conclude
that the blood pressures of overweight individuals are
more variable than those of individuals who are of normal
weight? Ans: F = 5.27, CR: 2.86
Question # 16
A researcher wishes to test the variation in the
number of pounds lost by men who follow two popular
liquid diets. Ten men follow diet A for four months, and the
standard deviation of the weight loss is 6.3 pounds. Twelve
men follow diet B for four months, and the standard
deviation of the weight loss is 4.8 pounds. At α = 0.05, can
the researcher substantiate the claim that the variation in
pounds lost following diet A is greater than the variation in
pounds lost following diet B? Ans: F = 1.72, CR: 2.90
4. Testing the Difference Between Two Means: small independent samples.
(General Procedure).
H0: σ1² = σ2² and H1: σ1² ≠ σ2²
Choose the level of significance α.
The Test Statistic is
$$F = \frac{s_1^2}{s_2^2} \quad \text{where } s_1^2 \ge s_2^2$$
Question # 18
The average income of 10 families
who reside in a large metropolitan
East Coast city is $26800, with a
standard deviation of $600. The
average income of 8 families who
reside in a rural area of the
Midwest is $25400, with a standard
deviation of $450. At α = 0.05, can
it be concluded that the families
who live in the cities have a higher
income than those who live in the
rural areas? Ans: F = 1.78, t = 5.47
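The answer to Question # 18 can be reproduced with the standard pooled-variance t statistic. The slides do not show that formula in this extract, so its use here is an assumption, supported by the fact that it gives the quoted t = 5.47. The function name `pooled_t` is hypothetical.

```python
import math


def pooled_t(xbar1, s1, n1, xbar2, s2, n2):
    """Pooled-variance t statistic for H0: mu1 - mu2 = 0 (equal variances assumed)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (xbar1 - xbar2) / standard_error, df


# Question # 18: city families (sample 1) versus rural families (sample 2)
f_stat = 600**2 / 450**2                       # preliminary F test on the variances
t, df = pooled_t(26800, 600, 10, 25400, 450, 8)

print(round(f_stat, 2), round(t, 2), df)
```

The preliminary F value of about 1.78 is what justifies pooling the variances in the first place; when the F test rejects equality of variances (as in Question # 20), a different t statistic is needed.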
Question # 19
Two samples are randomly selected from two
classes of students who have been taught by
different methods. An examination is given, and the
results are shown as follows.
          Sample Size   Means      Variances
Class I   n1 = 8        x̄1 = 95    s1² = 47
Class II  n2 = 10       x̄2 = 97    s2² = 30
Test the hypothesis that the two different methods of
teaching are equally effective at α = 0.01.
Ans: F = 1.57, F0.005(7, 9) = 6.88, t = −0.6889, CR: |t| ≥ t0.005(16) = 2.921
Question # 20
A sample of 15 teachers from Rhode Island has an
average salary of $35270, with a standard deviation
of $3256. A sample of 29 teachers from New York
has an average salary of $29512, with a standard
deviation of $1432. Is there a significant difference
in teachers’ salaries between the two states? Use
α = 0.02.
Ans: F = 5.17, F0.01(14, 28) = 2.90, t = 6.53, t0.01(14) = ±2.624
5. Testing Hypothesis about Two Means with Paired Observations.
(General Procedure). (Dependent Samples.)
There are many situations in which samples are not independent. This happens when the
observations are found in pairs, as the two observations of a pair are related to each
other. Pairs occur either naturally or by design. Natural pairing occurs whenever a
measurement is taken on the same unit or individual at two different times, as in
before-and-after studies. Observations are also paired to eliminate effects in which
there is no interest.
(i) H0: μd = 0 (or μ1 − μ2 = 0)
(a) H1: μd ≠ 0 (Two Tailed Test)
(b) H1: μd > 0 (One Tailed Test)
(c) H1: μd < 0 (One Tailed Test)
(ii) Choose the level of significance α.
(iii) The Test Statistic is
$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}}$$
(iv) The critical region is
(a) H1: μd ≠ 0, CR: |t| > t_{α/2}(n − 1)
(b) H1: μd > 0, CR: t > t_α(n − 1)
(c) H1: μd < 0, CR: t < −t_α(n − 1)
(v) The calculation of the test statistic.
(vi) Conclusion:
Reject H0 if t lies in the critical region, otherwise accept it.
Example # 7
The following data give a paired yield of two
varieties of wheat. Each pair was planted in
a different locality. Test the hypothesis that
the mean yields are equal. Use α = 0.05.
Variety I    45 32 58 57 60 38 47 51 42 38
Variety II   47 34 60 59 63 44 49 53 46 41
Solution
H0: μd = 0
H1: μd ≠ 0
α = 0.05
The Test Statistic is
$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}}$$
Computation
x     y     d = x − y   d²
45    47    −2          4
32    34    −2          4
58    60    −2          4
57    59    −2          4
60    63    −3          9
38    44    −6          36
47    49    −2          4
51    53    −2          4
42    46    −4          16
38    41    −3          9
∑           −28         94
$$\bar{d} = \frac{\sum d}{n} = \frac{-28}{10} = -2.8$$
$$s_d = \sqrt{\frac{1}{n-1}\left[\sum d^2 - \frac{\left(\sum d\right)^2}{n}\right]} = \sqrt{\frac{1}{9}\left[94 - \frac{(-28)^2}{10}\right]} = 1.32$$
$$t = \frac{-2.8 - 0}{1.32/\sqrt{10}} = -6.71$$
Critical region is |t| > t0.025(9) = 2.262
Conclusion:
Since our calculated value t = −6.71 falls in the region of
rejection, we accept H1 and conclude that the mean yields are
not equal.
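The paired computation in Example # 7 can be sketched as follows. The function name `paired_t` is illustrative. The script keeps full precision for s_d, so it gives t ≈ −6.73, while the slide rounds s_d to 1.32 first and reports −6.71; the decision is the same either way.

```python
import math

variety_1 = [45, 32, 58, 57, 60, 38, 47, 51, 42, 38]
variety_2 = [47, 34, 60, 59, 63, 44, 49, 53, 46, 41]


def paired_t(x, y, mu_d=0.0):
    """Return (d_bar, s_d, t) for paired observations x and y."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    d_bar = sum(d) / n
    # Shortcut formula used on the slide: [sum(d^2) - (sum d)^2 / n] / (n - 1)
    s_d = math.sqrt((sum(v * v for v in d) - sum(d) ** 2 / n) / (n - 1))
    t = (d_bar - mu_d) / (s_d / math.sqrt(n))
    return d_bar, s_d, t


d_bar, s_d, t = paired_t(variety_1, variety_2)
t_crit = 2.262                 # t_0.025(9), taken from the slide
reject_h0 = abs(t) > t_crit    # two-tailed test

print(round(d_bar, 2), round(s_d, 2), round(t, 2), reject_h0)
```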
Question # 21
The weight of four persons before they stopped
smoking and 5 weeks after they stopped
smoking are as follows:
Person 1 2 3 4
Before 148 176 153 116
After 154 176 151 121
Question # 22
Ten young recruits were put through a strenuous physical
training program by the Army. Their weights were
recorded before and after the training with the following
results.
Weight before 125 195 160 171 140 201 170 176 195 139
Weight after 136 201 158 184 145 195 175 190 190 145
Question # 24
An experiment was performed with seven hop plants. One half of
each plant was pollinated, and the other half was not pollinated.
The yield of the seed of each hop plant is tabulated as follows:
Plant Number 1 2 3 4 5 6 7
Pollinated 0.78 0.76 0.43 0.92 0.86 0.59 0.68
Non-Pollinated 0.21 0.12 0.32 0.29 0.30 0.20 0.14
Determine at the 5% level whether the pollinated half of the plant
gives a higher yield in seed than the non-pollinated half.
Ans: H0: μd ≤ 0 (the pollinated half does not give a higher mean yield than the non-pollinated half),
H1: μd > 0 (claim: the pollinated half gives a higher mean yield of seed than the non-pollinated half),
d̄ = 0.491, s_d = 0.1868, t = 6.96, CR: t > 1.943.
Conclusion: Accept H1, and conclude that there is enough evidence to support the claim that the pollinated half
gives a higher mean yield of seed.
Question # 25
A physical education director claims that by taking a special vitamin, a
weight lifter can increase his strength. Eight athletes are selected
and given a test of strength, using the standard bench press. After
two weeks of regular training, supplemented with the vitamin, they
were tested again. Test the effectiveness of the vitamin regimen at
α = 0.05. Each value in the data that follow represents the
maximum number of pounds the athlete can bench press. Assume
that the variable is approximately normally distributed.
Athlete 1 2 3 4 5 6 7 8
Before 210 230 182 205 262 253 219 216
After 219 236 179 204 270 250 222 216
Ans: For the vitamin to be effective, the “before” weights must be significantly less than the “after” weights;
hence, the mean of the differences (before − after) must be less than zero: H1: μd < 0 (claim), H0: μd ≥ 0,
d̄ = −2.375, s_d = 4.84, t = −1.388, CR: t < −1.895.
Conclusion: Accept H0, and conclude that there is not enough evidence to support the claim that the vitamin
increases the strength of weight lifters.
PRACTICE
( Basic Skills & Concepts )
• A doctor wants to determine if the life expectancy of people in Africa is less
than the life expectancy of people in Asia. The data obtained are shown in the
table below. Use α = 0.05.
      Africa   Asia
X̄     55.3     65.2
σ     8.1      9.3
n     53       42
1. What is the null hypothesis?
2. Calculate the critical value.
3. What is the test value?
4. Determine the 95% C.I. of the difference in means.
Answers: (1) H0: μ1 ≥ μ2 (2) −1.65 (3) −5.45 (4) −13.46 < μ1 − μ2 < −6.34
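The test value and the confidence interval in this practice problem share the same standard error, which a short sketch makes visible. It assumes the row 8.1 and 9.3 holds population standard deviations (the row label did not survive in this extract) and uses the usual 1.96 multiplier for a 95% interval; the function name `z_and_ci` is hypothetical.

```python
import math


def z_and_ci(xbar1, sd1, n1, xbar2, sd2, n2, z_half_alpha=1.96):
    """Return the z statistic and the confidence interval for mu1 - mu2."""
    diff = xbar1 - xbar2
    standard_error = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    margin = z_half_alpha * standard_error
    return diff / standard_error, (diff - margin, diff + margin)


# Africa is sample 1, Asia is sample 2
z, (lower, upper) = z_and_ci(55.3, 8.1, 53, 65.2, 9.3, 42)

print(round(z, 2), round(lower, 2), round(upper, 2))
```

The interval lies entirely below zero, which agrees with the test value falling far into the left-tail critical region.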
PRACTICE
Complete the following statements.
1. The value of F cannot be ______________.
2. To determine whether two sample variances are equal, a researcher can use a
____________.
3. When the subjects are paired or matched in some way, samples are considered to
be __________.
4. When finding the F test value, the smaller of the variances is placed in the
__________.
Answers: 1. negative 2. F test 3. dependent 4. denominator
Chi-Square Test
The chi-square can be used for tests concerning frequency distribution
(goodness-of-fit), such as “If a sample of buyers is given a choice of
automobile colors, will each color be selected with the same frequency?”
1. χ² Test for Goodness of Fit
Example # 8
A market analyst wished to see whether
consumers have any preference among five
flavors of a new fruit soda. A sample of 100
people provided the following data:
Cherry Strawberry Orange Lime Grape
32 28 16 14 10
At α = 0.05, test the claim that there is no
preference in the selection of fruit soda
flavors.
H0: Consumers show no preference for flavors. (claim)
H1: Consumers show preference.
α = 0.05
Expected frequency for each flavor under H0: fe = 100/5 = 20
Test statistic:
$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
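The goodness-of-fit statistic for Example # 8 can be worked through with a few lines of code. This is a sketch, with `chi_square_gof` as an illustrative name; the critical value 9.488 is the standard table value of χ² at α = 0.05 with 4 degrees of freedom and is supplied here as an assumption, since the worked solution is cut off in this extract.

```python
def chi_square_gof(observed, expected):
    """Sum of (f_o - f_e)^2 / f_e over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Cherry, Strawberry, Orange, Lime, Grape
observed = [32, 28, 16, 14, 10]
# No preference means every flavor is expected equally often: 100 / 5 = 20
expected = [sum(observed) / len(observed)] * len(observed)

chi2 = chi_square_gof(observed, expected)
df = len(observed) - 1
chi2_crit = 9.488            # table value for alpha = 0.05, 4 d.f. (assumed)
reject_h0 = chi2 > chi2_crit

print(chi2, df, reject_h0)
```

The same function reproduces the answer to Question # 31: `chi_square_gof([14, 18, 32, 20, 16], [20] * 5)` gives 10.0.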
Question # 27
The dean of students of a University wishes to test the
claim that the distribution of students is as follows; 40%
business (BU), 25% computer science (CS), 15% science
(SC), 10% social science (SS), 5% liberal arts (LA), and
5% general studies (GS). Last semester, the program
enrollment was distributed as shown below. At α = 0.10,
is the distribution of students the same as hypothesized?
Major    BU  CS  SC  SS  LA  GS
Number   72  53  32  20  16  7
Ans: χ² = 5.613, CR: 9.24
Contingency Table: A table that consists of two or more rows and two
or more columns, into which n observations are classified according to two
different criteria (or variables) is commonly called a contingency table.
Contingency tables provide a useful method of comparing two variables. They are
widely used in marketing.
Test statistic:
$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
5. Set up the contingency table. Determine the CR, which depends on α and the number
of degrees of freedom, (r − 1)(c − 1).
CR: χ² > χ²_{α,(r − 1)(c − 1)}
6. Compare Test Statistic with Table Value and Make Decision.
Example # 9: A Survey was conducted to determine
whether there is a relationship between architectural
style (Split level or Ranch) and geographical location
(Urban or Rural). The survey data are given below:
               House Location
House Style    Urban   Rural
Split Level    63      49
Ranch          15      33
H0: The 2 categorical variables (Architectural Style and Location) are independent.
H1: The 2 categorical variables are related.
α = 0.01
Test Statistic:
$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
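Computation of expected frequencies:
[Table: House Style (rows) by House Location (columns: Urban, Rural, Total); cell values not recovered]

                   Mathematical ability
General ability    V.Good   Good   Fair   Poor
V.Good             20       30     20     2
Good               14       125    85     12
Fair               3        140    165    125
Poor               3        37     68     151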
H0: The 2 categorical variables (Mathematical and general ability) are independent.
H1: The 2 categorical variables are related.
α = 0.01
Test Statistic:
$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
Computation of expected frequencies:
[Table of expected frequencies: General ability (rows) by Mathematical ability (columns: V.Good, Good, Fair, Poor, Total); cell values not recovered]
χ² = 233.143
Decision: Reject H0 at α = 0.01
Conclusion: There is evidence that mathematical and general ability are not
independent.
When the χ² test value is significant, and there is a relationship between the variables, the
strength of this relationship is measured by using the contingency coefficient (Pearson’s
coefficient of mean square contingency).
The formula for the contingency coefficient is
$$C = \sqrt{\frac{\chi^2}{n + \chi^2}}$$
The contingency coefficient will always be less than one.
$$C = \sqrt{\frac{233.143}{1000 + 233.143}} = 0.4348$$
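The contingency coefficient is a one-line calculation once χ² and n are known. The sketch below reproduces the value above and also shows the upper bound √((k − 1)/k), with k the smaller of the number of rows and columns. That bound is not stated in this extract; it is included as an assumption because it matches the maximum values 0.82 and 0.816 quoted in the answers to Questions # 29 and # 32. Both function names are illustrative.

```python
import math


def contingency_coefficient(chi2, n):
    """Pearson's coefficient of mean square contingency."""
    return math.sqrt(chi2 / (n + chi2))


def max_contingency_coefficient(rows, cols):
    """Assumed upper bound sqrt((k - 1) / k), with k = min(rows, cols)."""
    k = min(rows, cols)
    return math.sqrt((k - 1) / k)


c = contingency_coefficient(233.143, 1000)   # ability example above
c_max = max_contingency_coefficient(4, 3)    # a 4 x 3 table, as in Question # 32

print(round(c, 4), round(c_max, 3))
```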
3. χ² Test for Homogeneity of Proportions
Example # 11
A researcher selected a sample of 50 seniors from
each of three area colleges (150 in total) and asked each senior,
“Do you drive to college in a car owned by either you
or your parents?” The data are given below.
       College 1   College 2   College 3
Yes    18          22          16
No     32          28          34
At α = 0.05, test the claim that the proportion of students who drive their own
or their parents’ cars is the same at all three colleges.
H0: p1 = p2 = p3
H1: At least one proportion is different from the others.
α = 0.05
Test Statistic is:
$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 1.596$$
Observed (expected) frequencies:
        College 1     College 2     College 3     Total
Yes     18 (18.67)    22 (18.67)    16 (18.67)    56
No      32 (31.33)    28 (31.33)    34 (31.33)    94
Total   50            50            50            150
CR: χ² > χ²_{0.05,(2 − 1)(3 − 1)} = χ²_{0.05,2} = 5.991
Conclusion: Accept H0 and conclude that the proportion of students who drive their own or their parents’ cars is the same at all three colleges.
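The expected frequencies and the χ² value in Example # 11 follow mechanically from the row and column totals, as this sketch shows. The function name `chi_square_table` is hypothetical, and the critical value 5.991 is copied from the example rather than computed.

```python
def chi_square_table(table):
    """Return (chi2, df, expected) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    expected = []
    for i, row in enumerate(table):
        expected_row = []
        for j, observed in enumerate(row):
            # Expected count = (row total x column total) / grand total
            fe = row_totals[i] * col_totals[j] / n
            expected_row.append(fe)
            chi2 += (observed - fe) ** 2 / fe
        expected.append(expected_row)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, expected


# Rows: Yes, No. Columns: College 1, College 2, College 3.
chi2, df, expected = chi_square_table([[18, 22, 16], [32, 28, 34]])
reject_h0 = chi2 > 5.991   # chi-square critical value from the example

print(round(chi2, 3), df, [round(fe, 2) for fe in expected[0]], reject_h0)
```

The same routine serves the test of independence, since the two tests differ in how the data are sampled and how the hypotheses are worded, not in the arithmetic.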
Question # 28
A study is being conducted to determine
whether there is a relationship between
jogging and blood pressure. A random
sample of 210 subjects is selected, and
they are classified as shown in the table.
At α = 0.05, test the claim that jogging and
blood pressure are not related.
                  Blood pressure
Jogging status    Low   Moderate   High
Joggers           34    57         21
Non joggers       15    63         20
Ans: χ² = 6.799, CR: 5.991
Question # 29
A researcher wishes to see whether the age of an individual
is related to coffee consumption. A sample of 152 people is
selected, and they are classified as shown in the table. At
α = 0.01, is there a relationship between coffee consumption
and age? Calculate the coefficient of mean square
contingency. What is the maximum value for C?
             Coffee consumption
Age          Low   Moderate   High
21 – 30      18    16         12
31 – 40      9     15         27
41 – 50      5     12         10
51 & over    13    9          6
Ans: χ² = 15.824, CR: 16.812, Coefficient of mean square contingency = 0.3071, Maximum C = 0.82
Question # 30
According to a recent survey, 64% of American
females between the ages of 16 and 20 cannot
pass a basic fitness test. A physical education
instructor wishes to determine if the
percentages of such students in different
universities in her district are the same. She
administers a basic fitness test to 120 students
in each of four universities. The results are
shown below. At α = 0.05, test the claim that
the proportions who pass the test are equal.
          Southside   West End   East Hills   Jefferson
Passed    49          38         46           34
Failed    71          82         74           86
Ans: χ² = 5.317, CR: 7.815
Question # 31
The grades in a statistical examination were as
follows. Test the hypothesis, at the 0.01 level of
significance, that the distribution of grades is
uniform.
Grade   A+   A    B    C    D
f       14   18   32   20   16
Ans: χ² = 10, CR: 13.277
Question # 32
Discuss the association between the two criteria of
classification, i.e. degree and hobbies. If the null
hypothesis is rejected, calculate the Pearson’s
coefficient of mean square contingency. What could
be its maximum value for this contingency table?
Use α = 0.05.
             Degree
Hobby        Marketing   TQM   HRM
Gardening    24          83    17
Craftwork    11          62    28
Reading      32          121   34
Cooking      10          26    44
Ans: χ² = 54.06, CR: 12.59, Coefficient of mean square contingency = 0.315, maximum value of C = 0.816, so range 0 < C ≤ 0.816
Question # 33
An investigation into colour-blindness and sex of a
person gave the following results.
          Colour Blindness
Sex       Colour Blind   Not Colour Blind
Male      36             964
Female    19             981
Is there evidence, at the 5% level of significance, for
the association between sex and colour blindness?
Ans: χ² = 5.4, CR: 3.84
CRITICAL THINKING PROBLEM