Lesson2 - Measures of Tendency


Probability and Statistics for Engineers

STAT 301
Chapter 1: Lesson 2
Presentation of Data
Central Tendency: Mode, Median, Mean
Dispersion: Variance, Standard Deviation
Presentation of Data
• After the data have been collected, the main tasks a statistician must accomplish are the organization and presentation of the data.
• The organization must be done in a meaningful way, and the presentation should be such that an interested reader of the study can understand the data distribution.
1. Frequency Table
• The researcher organizes the raw data by using a frequency distribution.
• The frequency is the number of values in a specific class of data.
• The frequency of a data value is the number of times it occurs. A frequency table shows the frequency of each data value. If the data are divided into intervals, the table shows the frequency of each interval.
Example 1: Making a Frequency Table

❖ n: the total of the frequencies.
❖ The intervals must be of equal width.
❖ Use for qualitative and discrete data.
❖ The table should cover all values and categories.
2. Histogram
• A histogram is a bar graph used to display the frequency of data
divided into equal intervals. The bars must be of equal width and
should touch, but not overlap.
• Histogram: A graph in which the classes are marked on the
horizontal axis and the class frequencies on the vertical axis. The
class frequencies are represented by the heights of the bars and
the bars are drawn adjacent to each other.
Example 1: Making a Histogram
Use the frequency table below to make a histogram.

Enrollment in Western Civilization Classes
Number Enrolled    Frequency
1 – 10             1
11 – 20            4
21 – 30            5
31 – 40            2

Step 1 Use the scale and interval from the frequency table.
Step 2 Draw a bar for the number of classes in each interval. All bars should be the same width. The bars should touch, but not overlap.
Step 3 Title the graph and label the horizontal and vertical scales.
Example 2
Make a histogram for the number of days of
Maria’s last 15 vacations.
4, 8, 6, 7, 5, 4, 10, 6, 7, 14, 12, 8, 10, 15, 12
Step 1 Use the scale and interval from the frequency table.

Number of Vacation Days
Interval    Frequency
4 – 6       5
7 – 9       4
10 – 12     4
13 – 15     2
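
The tally behind this frequency table can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_table` and its parameters `start` and `width` are made up for this sketch, and it assumes whole-number data so that each interval runs from `low` to `low + width - 1` inclusive.

```python
def frequency_table(values, start, width):
    """Count how many values fall in each equal-width interval."""
    top = max(values)
    table = []
    low = start
    while low <= top:
        high = low + width - 1  # inclusive upper limit (whole-number data)
        count = sum(1 for v in values if low <= v <= high)
        table.append(((low, high), count))
        low += width
    return table


vacation_days = [4, 8, 6, 7, 5, 4, 10, 6, 7, 14, 12, 8, 10, 15, 12]
table = frequency_table(vacation_days, start=4, width=3)

# Print the table with a rough text histogram next to each interval.
for (low, high), count in table:
    print(f"{low:>2} - {high:<2}  {count}  {'#' * count}")
```

The frequencies should add up to n, the number of observations, which is a quick check that the intervals cover every value.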
Example 2 Continued
Step 2 Draw a bar for the number of vacations in each interval.
Step 3 Title the graph and label the horizontal and vertical scales.
3. Bar chart and frequency polygon
Bar chart
❖ The scores/categories are placed along the x-axis and the frequencies on the y-axis.
❖ When the data are discrete and the frequencies refer to individual values, we use a bar chart.
❖ The bars do not touch (unlike a histogram).
❖ The scores need not be ordered.
❖ The heights correspond to the number of times each score occurs.
Example
The following table represents the distribution of students according to their faculties in one university:

Faculty      Students
Science      150
Medicine     100
Arts         250
Education    300
Economics    200
Total        1000
Frequency polygon
❖ The scores/categories are placed along the x-axis and the frequencies on the y-axis.
❖ A frequency polygon consists of line segments connecting the points formed by the class midpoints and the class frequencies.
❖ A frequency polygon is similar to a histogram, except that line segments joining these points are used instead of bars.
Example
Draw a frequency polygon for the following data.
To draw the polygon, we first need to compute the class midpoints.

Compare the Frequency Polygon to the Histogram
To turn a histogram into a frequency polygon, draw line segments connecting the top centers of adjacent bars.
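
Computing the class midpoints can be sketched as follows. Since the data table for this example did not survive, the sketch reuses the vacation-days frequency table from the histogram example as a stand-in; that choice is an assumption, not the slide's data.

```python
# Class limits and frequencies borrowed from the vacation-days table
# (a stand-in for this example's missing data).
intervals = [(4, 6), (7, 9), (10, 12), (13, 15)]
frequencies = [5, 4, 4, 2]

# The midpoint of a class is the average of its lower and upper limits.
midpoints = [(low + high) / 2 for low, high in intervals]

# Each (midpoint, frequency) pair is one vertex of the frequency polygon.
polygon_points = list(zip(midpoints, frequencies))
print(polygon_points)
```

Joining these points in order with straight line segments gives the polygon.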
Pie chart
A pie chart (or circle graph) is a circular chart divided into sectors that illustrate numerical proportions. The circle is divided into sections according to the percentage of frequencies in each category of the distribution.
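
As a rough sketch of how the sectors are sized, the snippet below converts the faculty counts from the bar chart example into percentages and central angles. The variable names are illustrative only.

```python
students = {
    "Science": 150,
    "Medicine": 100,
    "Arts": 250,
    "Education": 300,
    "Economics": 200,
}
total = sum(students.values())

# Each sector's share of the circle matches its share of the total frequency.
percentages = {faculty: count * 100 / total for faculty, count in students.items()}
angles = {faculty: count * 360 / total for faculty, count in students.items()}

for faculty in students:
    print(f"{faculty:<10} {percentages[faculty]:5.1f}%  {angles[faculty]:6.1f} degrees")
```

The percentages should sum to 100 and the angles to 360 degrees.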
4. Stem and Leaf Plots
• A simple graph for quantitative data.
• Uses the actual numerical values of each data point.
– Divide each measurement into two parts: the stem and the leaf.
– List the stems in a column, with a vertical line to their right.
– For each measurement, record the leaf portion in the same row as its matching stem.
– Order the leaves from lowest to highest in each stem.
– The range is the difference between the greatest and the least value.
• To write 42 in a stem-and-leaf plot, write each digit in a separate column: the stem 4 and the leaf 2.
Example
The prices ($) of 18 brands of walking shoes:
90 70 70 70 75 70 65 68 60
74 70 95 75 70 68 65 40 65

Leaves in order of appearance:
4 | 0
5 |
6 | 5 8 0 8 5 5
7 | 0 0 0 5 0 4 0 5 0
8 |
9 | 0 5

Reordered:
4 | 0
5 |
6 | 0 5 5 5 8 8
7 | 0 0 0 0 0 0 4 5 5
8 |
9 | 0 5

n = 18
Leaf unit = 1
Stem unit = 10
Least value = 40
Greatest value = 95
Range = 95 - 40 = 55
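
The steps for building the plot can be sketched in Python. The function name `stem_and_leaf` is hypothetical, and the sketch assumes non-negative whole numbers with a leaf unit of 1.

```python
from collections import defaultdict


def stem_and_leaf(values, stem_unit=10):
    """Return {stem: ordered leaves}, including stems that have no leaves."""
    leaves = defaultdict(list)
    for v in sorted(values):  # sorting first keeps each row's leaves ordered
        leaves[v // stem_unit].append(v % stem_unit)
    stems = range(min(leaves), max(leaves) + 1)
    return {stem: leaves.get(stem, []) for stem in stems}


prices = [90, 70, 70, 70, 75, 70, 65, 68, 60,
          74, 70, 95, 75, 70, 68, 65, 40, 65]
plot = stem_and_leaf(prices)

for stem, row in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in row)}")

print("n =", len(prices))
print("Range =", max(prices) - min(prices))
```

Empty stems (5 and 8 here) are kept on purpose so that gaps in the data stay visible.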
Example: Creating Stem-and-Leaf Plots
Use the data in the table to make a stem-and-leaf plot.

Test Scores
75 86 83 91 94
88 84 99 79 86

What is the least value?
What is the greatest value?
n = ?
Leaf unit?
Stem unit?
Range?
Exercise
Test Scores
72 88 64 79 61
84 83 76 74 67
Use the data in the table to make a
stem-and-leaf plot.
Find the least value, greatest value,
range of the data.
Presentation of Data

1. Qualitative or categorical data


a. Pie charts
b. Bar charts

2. Quantitative data
a. Pie and bar charts
b. Stem and leaf
Central Tendency
The data (observations) often tend to be concentrated around the center of the data.
Three measures of central tendency (also called measures of location) are commonly used in statistical analysis: the mode, the median, and the mean.
These measures are considered representative (or typical) values of the data.
Arithmetic Mean or Average
• The mean of a set of measurements is the sum of the measurements divided by the total number of measurements:

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$

where n = number of measurements.

The Sample Mean:
If the list is a statistical population, then the mean of that population is called the population mean, denoted by µ.
If the list is a statistical sample, we call the resulting statistic a sample mean, denoted by $\bar{X}$.
Example
• The set: 2, 9, 11, 5, 6

$$\bar{X} = \frac{2 + 9 + 11 + 5 + 6}{5} = \frac{33}{5} = 6.6$$

If we were able to enumerate the whole population, the population mean would be called μ.
Arithmetic Mean or Average
• Finding the mean:
If X = {3, 5, 10, 4, 3}, then
$\bar{X}$ = (3 + 5 + 10 + 4 + 3) / 5
= 25 / 5
= 5
Median
• The median of a set of measurements is
the middle measurement when the
measurements are ranked from smallest
to largest.
• The position of the median is
.5(n + 1)
once the measurements have been
ordered.
Example
• The set: 2, 4, 9, 8, 6, 5, 3 (n = 7)
• Sort: 2, 3, 4, 5, 6, 8, 9
• Position: .5(n + 1) = .5(7 + 1) = 4th
Median = 4th measurement in the sorted list = 5

• The set: 2, 4, 9, 8, 6, 5 (n = 6)
• Sort: 2, 4, 5, 6, 8, 9
• Position: .5(n + 1) = .5(6 + 1) = 3.5th
Median = (5 + 6)/2 = 5.5, the average of the 3rd and 4th measurements
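
The position rule .5(n + 1) can be sketched directly in code. The function name `median_by_position` is made up for this illustration; it assumes a non-empty list of numbers.

```python
def median_by_position(measurements):
    """Median found with the .5(n + 1) position rule on the sorted data."""
    ordered = sorted(measurements)
    n = len(ordered)
    position = 0.5 * (n + 1)   # 1-based position in the sorted list
    lower = int(position)      # whole part of the position
    if position == lower:      # odd n: the position is a whole number
        return ordered[lower - 1]
    # even n: average the two measurements on either side of the position
    return (ordered[lower - 1] + ordered[lower]) / 2


print(median_by_position([2, 4, 9, 8, 6, 5, 3]))  # n = 7
print(median_by_position([2, 4, 9, 8, 6, 5]))     # n = 6
```

The subtraction of 1 converts the 1-based position used on the slide to Python's 0-based list indexing.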
Mode
• The mode is the measurement which occurs
most frequently.
• The set: 2, 4, 9, 8, 8, 5, 3
– The mode is 8, which occurs twice
• The set: 2, 2, 9, 8, 8, 5, 3
– There are two modes: 8 and 2 (bimodal)
• The set: 2, 4, 9, 8, 5, 3
– There is no mode (each value is unique).
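
A small sketch of the three cases above (one mode, two modes, no mode). The helper `modes` is a hypothetical name; it follows the slide's convention that a set in which every value is unique has no mode, and it assumes a non-empty list.

```python
from collections import Counter


def modes(measurements):
    """Return the most frequent value(s), or [] if every value is unique."""
    counts = Counter(measurements)
    highest = max(counts.values())
    if highest == 1:
        return []  # each value occurs once, so there is no mode
    return sorted(value for value, c in counts.items() if c == highest)


print(modes([2, 4, 9, 8, 8, 5, 3]))  # one mode
print(modes([2, 2, 9, 8, 8, 5, 3]))  # bimodal
print(modes([2, 4, 9, 8, 5, 3]))     # no mode
```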
Example
The number of quarts of milk purchased by
25 households:
0 0 1 1 1 1 1 2 2 2 2 2 2 2 2
2 3 3 3 3 3 4 4 4 5
• Mean?

• Median?

• Mode?
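
One way to check answers to this example is Python's standard `statistics` module; the snippet below is a sketch for verification, not the method the slide intends for hand calculation.

```python
import statistics

quarts = [0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2,
          2, 3, 3, 3, 3, 3, 4, 4, 4, 5]

mean_quarts = statistics.mean(quarts)      # sum of the values divided by n
median_quarts = statistics.median(quarts)  # middle value of the sorted data
mode_quarts = statistics.mode(quarts)      # most frequent value

print("n =", len(quarts))
print("Mean =", mean_quarts)
print("Median =", median_quarts)
print("Mode =", mode_quarts)
```

With n = 25, the median sits at position .5(25 + 1) = 13 in the sorted list.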
Exercise
• Find the median, mode, and mean of each data set:
❖ 4, 5, 6, 6, 7, 8, 9, 10, 12
❖ 5, 6, 6, 7, 8, 9, 10, 12
❖ 4, 5, 8, 7
Exercise
• For what value of X will 8 and X have the same sample mean as 27 and 5?
Solution:
First, find the mean of 27 and 5:

$$\frac{27 + 5}{2} = 16$$

Now, find the X value, knowing that the sample mean of X and 8 must be 16:

$$\frac{X + 8}{2} = 16$$

Cross multiply and solve: 32 = X + 8, so X = 24.
Exercise
• On his first 5 Stat. tests, Omer received the following marks: 72, 86, 92, 63, and 77. What test mark must Omer earn on his sixth test so that his average for all six tests will be 80?
• Solution
Set up an equation to represent the situation:

$$\frac{72 + 86 + 92 + 63 + 77 + X}{6} = 80$$

390 + X = 480, so X = 90.

Omer must get a 90 on the sixth test.
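
Both "missing value" exercises rearrange the mean formula: if the target mean of n values is known, the missing value equals n times the target mean minus the sum of the known values. A minimal sketch, with the hypothetical helper name `required_value`:

```python
def required_value(known_values, target_mean):
    """Value that must be added so the mean of all values equals target_mean."""
    n_total = len(known_values) + 1          # known values plus the missing one
    return target_mean * n_total - sum(known_values)


# Omer's sixth test mark for an average of 80.
omer_mark = required_value([72, 86, 92, 63, 77], target_mean=80)
print(omer_mark)

# X such that 8 and X have the same mean as 27 and 5.
x_value = required_value([8], target_mean=(27 + 5) / 2)
print(x_value)
```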


Measures of Dispersion
The variation or dispersion in a set of data refers to how spread out the observations are from each other.
The variation is small when the observations are close together. There is no variation if the observations are all the same.
Measures of dispersion are important for describing the spread of the data, or its variation around a central value; they express quantitatively the degree of variation or dispersion of the values.
There are various methods that can be used to measure the dispersion of a data set, each with its own set of advantages and disadvantages.
The Range
The range is the difference between the largest and smallest sample values.
If $X_1, X_2, \ldots, X_n$ are the values of observations in a sample, then the range is given by:

$$\text{Range} = \max(X_1, \ldots, X_n) - \min(X_1, \ldots, X_n)$$
The Range (Example):
Find the range of (12, 24, 19, 20, 7).
Solution:
Range = 24 - 7 = 17

The range is one of the simplest measures of variability to calculate.
It depends only on the extreme values and provides no information about how the remaining data are distributed.
Mean Absolute Deviation (M.A.D.)
The key concept for describing normal distributions and making predictions from them is called deviation from the mean.
We could just calculate the average distance between each observation and the mean.
• We must take the absolute value of each distance; otherwise the positive and negative deviations would cancel out to zero!
Formula:

$$\text{M.A.D.} = \frac{\sum_{i=1}^{N} |X_i - \bar{X}|}{N}$$
Mean Deviation: An Example
Data: X = {6, 10, 5, 4, 9, 8}, $\bar{X}$ = 42 / 6 = 7

1. Compute $\bar{X}$ (the average).
2. Compute $\bar{X} - X_i$ and take the absolute value to get the absolute deviations.
3. Sum the absolute deviations.
4. Divide the sum of the absolute deviations by N.

X̄ – Xi     Abs. Dev.
7 – 6       1
7 – 10      3
7 – 5       2
7 – 4       3
7 – 9       2
7 – 8       1
Total:      12

M.A.D. = 12 / 6 = 2
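
The four steps can be sketched in Python as follows; the function name `mean_absolute_deviation` is illustrative only.

```python
def mean_absolute_deviation(values):
    """Average absolute distance between each observation and the mean."""
    n = len(values)
    mean = sum(values) / n                             # step 1
    abs_deviations = [abs(mean - x) for x in values]   # step 2
    total = sum(abs_deviations)                        # step 3
    return total / n                                   # step 4


data = [6, 10, 5, 4, 9, 8]
mad = mean_absolute_deviation(data)
print(mad)

# Without the absolute value, the deviations cancel out.
print(sum(7 - x for x in data))
```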
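The Population Variance:
If $X_1, X_2, \ldots, X_N$ are the population values, then the population variance is:

$$\sigma^2 = \frac{(X_1 - \mu)^2 + (X_2 - \mu)^2 + \cdots + (X_N - \mu)^2}{N}$$

Using summation form:

$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$

where $\mu$ is the population mean.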


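The Sample Variance:
If $X_1, X_2, \ldots, X_n$ are the sample values, then the sample variance is:

$$S^2 = \frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \cdots + (X_n - \bar{X})^2}{n - 1}$$

Using summation form:

$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$

where $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$ is the sample mean.

Note:
(n − 1) is called the degrees of freedom (df) associated with the sample variance $S^2$.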
The Sample Standard Deviation:
The standard deviation is another measure of variation. It is the square root of the variance, i.e., it is:

$$S = \sqrt{S^2} = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
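Example 1:
Compute the sample variance and standard deviation of the following observations (ages in years): 10, 21, 33, 53, 54.

Solution

$$\bar{X} = \frac{10 + 21 + 33 + 53 + 54}{5} = \frac{171}{5} = 34.2 \text{ (years)}$$

$$S^2 = \frac{(10 - 34.2)^2 + (21 - 34.2)^2 + (33 - 34.2)^2 + (53 - 34.2)^2 + (54 - 34.2)^2}{5 - 1} = \frac{1506.8}{4} = 376.7 \text{ (years)}^2$$

The sample standard deviation is:

$$S = \sqrt{376.7} \approx 19.41 \text{ (years)}$$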


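The Sample Variance (another formula):

Another formula for calculating $S^2$:

$$S^2 = \frac{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}{n - 1}$$

(It is simpler and, for hand calculation, often more accurate.)

For the previous example (ages 10, 21, 33, 53, 54):

$X_i$      10    21    33     53     54     Sum = 171
$X_i^2$    100   441   1089   2809   2916   Sum = 7355

$$S^2 = \frac{7355 - 5(34.2)^2}{5 - 1} = \frac{1506.8}{4} = 376.7$$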
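
As a sketch, the definition formula (sum of squared deviations over n − 1) and the shortcut formula (sum of squares minus n times the squared mean, over n − 1) can be compared on the ages data. The function names are hypothetical, and the two results are compared with a tolerance because floating-point arithmetic may differ in the last digits.

```python
import math


def sample_variance_definition(values):
    """S^2 as the sum of squared deviations from the mean over (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_variance_shortcut(values):
    """S^2 as (sum of X_i^2 - n * mean^2) over (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return (sum(x * x for x in values) - n * mean ** 2) / (n - 1)


ages = [10, 21, 33, 53, 54]
s2_definition = sample_variance_definition(ages)
s2_shortcut = sample_variance_shortcut(ages)
s = math.sqrt(s2_definition)

print(round(s2_definition, 4), round(s2_shortcut, 4), round(s, 2))
```

Both functions should agree up to rounding; the standard deviation is the square root of either result.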


Calculate the Sample Variance
Use the Definition Formula:

$X_i$    $X_i - \bar{X}$    $(X_i - \bar{X})^2$
5        -4                 16
12       3                  9
6        -3                 9
8        -1                 1
14       5                  25
Sum 45   0                  60

Here $\bar{X} = 45/5 = 9$, so $S^2 = 60/(5 - 1) = 15$.
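Exercise
• Compute the range, sample variance, and standard deviation of the following observations: 5, 12, 6, 8, 14

Using the other formula for $S^2$:

$X_i$    $X_i^2$
5        25
12       144
6        36
8        64
14       196
Sum 45   465

$$S^2 = \frac{465 - 5(9)^2}{5 - 1} = \frac{60}{4} = 15, \qquad S = \sqrt{15} \approx 3.87, \qquad \text{Range} = 14 - 5 = 9$$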
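
The hand calculation for this exercise can be cross-checked with Python's standard `statistics` module, whose `variance` and `stdev` functions use the n − 1 divisor of the sample formulas. This is a verification sketch only.

```python
import statistics

observations = [5, 12, 6, 8, 14]

data_range = max(observations) - min(observations)
sum_x = sum(observations)
sum_x_squared = sum(x * x for x in observations)
s2 = statistics.variance(observations)   # sample variance (n - 1 divisor)
s = statistics.stdev(observations)       # square root of the sample variance

print("Range =", data_range)
print("Sum of X =", sum_x, " Sum of X^2 =", sum_x_squared)
print("S^2 =", s2)
print("S =", round(s, 2))
```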
Interquartile Range
(The range of the middle 50% of scores)

IQR = Q3 – Q1

What are Q3 and Q1?

Q1 is the lower quartile, or 25th percentile.
Q3 is the upper quartile, or 75th percentile.

Example 1
1, 3, 5, 6, 7, 8, 8
Median = 6
Q1 = 3 (middle of lower half), Q3 = 8 (middle of top half)
IQR = Q3 - Q1 = 8 - 3 = 5

Example 2
2, 3, 6, 6, 7, 8
Median = 6
Q1 = 3 (middle of lower half), Q3 = 7 (middle of top half)
IQR = Q3 - Q1 = 7 - 3 = 4

Example 3
2, 3, 5, 6, 7, 9, 9, 10
Median = 6.5
Q1 = 4 (middle of lower half), Q3 = 9 (middle of top half)
IQR = Q3 - Q1 = 9 - 4 = 5
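
The "middle of the lower half, middle of the top half" rule used in these examples can be sketched as below. The function name `quartiles` is hypothetical, and the sketch assumes at least two observations. Other quartile conventions exist and can give slightly different values, so this follows the slides' rule rather than any particular software default.

```python
from statistics import median


def quartiles(data):
    """Return (Q1, median, Q3) using the median-of-halves rule."""
    ordered = sorted(data)
    half = len(ordered) // 2
    lower = ordered[:half]    # values below the median position
    upper = ordered[-half:]   # values above it (middle value excluded if n is odd)
    return median(lower), median(ordered), median(upper)


for sample in ([1, 3, 5, 6, 7, 8, 8],
               [2, 3, 6, 6, 7, 8],
               [2, 3, 5, 6, 7, 9, 9, 10]):
    q1, q2, q3 = quartiles(sample)
    print(f"Q1 = {q1}, median = {q2}, Q3 = {q3}, IQR = {q3 - q1}")
```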
Inter-quartile Range and Dot Plots

[Dot plot on a scale from 0 to 8, with Q1, the median, and Q3 marked]

IQR = Q3 – Q1 = 5 – 2 = 3
Drawing a Box Plot.
Example 1: Draw a box plot for the data below.

4, 4, 5, 6, 8, 8, 8, 9, 9, 9, 10, 12

Lower quartile (Q1) = 5½
Median (Q2) = 8
Upper quartile (Q3) = 9

[Box plot drawn on a scale from 4 to 12]
Drawing a Box Plot.
Example 2: Draw a box plot for the data below.

3, 4, 4, 6, 8, 8, 8, 9, 10, 10, 15

Lower quartile (Q1) = 4
Median (Q2) = 8
Upper quartile (Q3) = 10

[Box plot drawn on a scale from 3 to 15]
Drawing a Box Plot.
Question: Stuart recorded the heights in cm of boys in his class as shown below. Draw a box plot for this data.

137, 148, 155, 158, 165, 166, 166, 171, 171, 173, 175, 180, 184, 186, 186

Lower quartile (QL) = 158
Median (Q2) = 171
Upper quartile (QU) = 180

[Box plot drawn on a scale from 130 cm to 190 cm]
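
A box plot is drawn from five numbers: the least value, the lower quartile, the median, the upper quartile, and the greatest value. The sketch below computes them for the heights data using the same median-of-halves rule as the examples; the function name `five_number_summary` is made up for this illustration and assumes at least two observations.

```python
from statistics import median


def five_number_summary(data):
    """Return (least, lower quartile, median, upper quartile, greatest)."""
    ordered = sorted(data)
    half = len(ordered) // 2
    lower_quartile = median(ordered[:half])    # middle of the lower half
    upper_quartile = median(ordered[-half:])   # middle of the upper half
    return ordered[0], lower_quartile, median(ordered), upper_quartile, ordered[-1]


heights = [137, 148, 155, 158, 165, 166, 166, 171,
           171, 173, 175, 180, 184, 186, 186]
summary = five_number_summary(heights)
print(summary)
```

The box spans the lower to the upper quartile with a line at the median, and the whiskers extend to the least and greatest values.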


Exercises