Statistics in Education - Made Simple
Introduction
We are living in an information age, which is invariably bound up with the notion of
counting and measurement. It is no exaggeration to say that counting and measurement
are close to our daily lives, and to the field of education as well. In fact, one can
easily establish that counting and measurement have been with us ever since
the human race stepped towards civilization. It is the sheer importance and range of applications of
counting that led to the emergence of the discipline of ‘STATISTICS’.
The term ‘statistics’ seems to have been derived from the Latin word ‘status’, the
Italian word ‘statista’ or the German word ‘statistik’, each of which means ‘Political State’.
Statistics was born as the ‘Science of Kings’. It had its origin in the needs of the ruling chiefs
in the olden days for collecting data on vital matters such as population, manpower, and
wealth in the form of land, buildings and other assets, with a view to framing their military
and fiscal policies.
Statistics – Definition
“Statistics is the science which deals with collection, classification and tabulation of numeric
facts as a basis for explanation, description and comparison of phenomena” - Lovitt
Discrete Variable: Variables which can assume only distinct or particular values
are called discrete variables. Their values are exact and countable, and are not normally fractions.
Continuous Variable: Variables which can take any numerical value within a range are known as
continuous variables.
Series: A series, as used statistically, may be defined as things or attributes of things
arranged according to some logical order.
Discrete Series: Any series represented by a discrete variable is called a discrete series.
Continuous Series: Any series represented by a continuous variable is called a continuous series.
Raw Data: A mass of statistical data in its original form is called raw data or ungrouped data.
Class: A class is a defined range of magnitudes. Eg. 0 – 10, 10 – 20 etc.
Open-end Class: A lowest class lacking a lower limit, or a highest class lacking an upper
limit, is called an open-end class.
Eg. Below 5 -------- Open-end class
5 – 10
10 – 15
15 – 20
20 and above ------ Open-end class
Inclusive type classes or working classes: Classes of the form 1 – 5, 6 – 10, 11 – 15, ----
are called inclusive type classes. Here both limits (lower limit and upper limit) are included in the
class itself.
Exclusive type classes or actual classes: Classes in which the upper limit is not included in the class.
0 – 10, 10 – 20, 20 – 30, ---- etc. are exclusive type classes.
Prepared by KSK 8
Class limit: The class limits are the lowest and highest values of the variable that can be
included in that class.
Class boundaries: The class limits of the exclusive type classes or actual classes are
called actual limits or class boundaries.
Mid points of the class or class marks: The mid point of a class is the average of the
upper and lower limit of the class.
Class interval: The class interval or class width is the difference between the upper limit
and lower limit of the class.
Conversion of inclusive type classes into exclusive type classes:
• Note the difference between the upper limit of one inclusive class and the lower limit of the next.
• Divide the difference by 2.
• Subtract that value from the lower limit and add the same value to the upper limit.
• Do the same for all classes.
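The conversion steps above can be sketched in a few lines of Python. This is only an illustration: the function name `inclusive_to_exclusive` is hypothetical, and the sketch assumes the gap between consecutive classes is the same throughout.

```python
def inclusive_to_exclusive(classes):
    """Convert inclusive classes such as (1, 5), (6, 10) into exclusive ones."""
    # Difference between one upper limit and the next lower limit
    # (assumed to be the same for every pair of neighbouring classes).
    gap = classes[1][0] - classes[0][1]
    half = gap / 2
    # Subtract half the gap from each lower limit, add it to each upper limit.
    return [(lower - half, upper + half) for lower, upper in classes]


inclusive = [(1, 5), (6, 10), (11, 15)]
exclusive = inclusive_to_exclusive(inclusive)
print(exclusive)  # [(0.5, 5.5), (5.5, 10.5), (10.5, 15.5)]
```

The resulting limits 0.5 – 5.5, 5.5 – 10.5, 10.5 – 15.5 are the class boundaries of the inclusive classes 1 – 5, 6 – 10, 11 – 15.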
Frequency: The number of times a given value appears among the observations is its frequency.
Class frequency: The number of values falling in each class is called the
class frequency.
Total frequency: The sum of all the frequencies is known as the total frequency.
Steps
The following data give the number of children per family in each of 25 families. Construct a
Discrete Frequency Distribution: 1, 4, 3, 2, 1, 2, 0, 2, 1, 2, 3, 2, 1, 0, 2, 3, 0, 3, 2, 1, 2, 2, 1, 4, 2
Number of Children    Tally Marks    No. of Families
0                     III            3
1                     IIII I         6
2                     IIII IIII      10
3                     IIII           4
4                     II             2
Total                                25
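As a quick check of the tally above, the same discrete frequency distribution can be counted with Python's standard `collections.Counter`. This is a minimal sketch; the variable names are illustrative only.

```python
from collections import Counter

# Number of children per family, as listed in the example above.
children = [1, 4, 3, 2, 1, 2, 0, 2, 1, 2, 3, 2, 1,
            0, 2, 3, 0, 3, 2, 1, 2, 2, 1, 4, 2]

frequency = Counter(children)  # maps each value to the number of times it occurs

for value in sorted(frequency):
    print(value, frequency[value])
print("Total", sum(frequency.values()))
```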
Continuous (Grouped) Frequency Distribution is a table in which the data are grouped into
different classes and the number of observations falling in each class is noted.
Eg. Construct a Continuous frequency distribution for the following set of observations
70 45 33 64 50 25 65 75 30 20
55 60 65 58 52 36 45 42 35 40
51 47 39 61 53 59 49 41 20 55
42 53 78 65 45 49 64 52 48 46
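One possible way to group these 40 observations is sketched below. The class width of 10 and the starting limit of 20 are assumptions made for illustration (the notes do not state which classes to use); the classes are exclusive, so a score of 30 is counted in 30 – 40.

```python
observations = [
    70, 45, 33, 64, 50, 25, 65, 75, 30, 20,
    55, 60, 65, 58, 52, 36, 45, 42, 35, 40,
    51, 47, 39, 61, 53, 59, 49, 41, 20, 55,
    42, 53, 78, 65, 45, 49, 64, 52, 48, 46,
]

width = 10  # assumed class width
start = 20  # assumed lower limit of the first class (the smallest score)

counts = {}
for score in observations:
    # Lower limit of the exclusive class that contains this score.
    lower = start + ((score - start) // width) * width
    counts[lower] = counts.get(lower, 0) + 1

for lower in sorted(counts):
    print(f"{lower} - {lower + width}", counts[lower])
print("Total", sum(counts.values()))
```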
Less than Cumulative Frequency Distribution is a table which gives the number of
observations falling below the upper limit of a class.
Eg. Construct Less than Cumulative Frequency Distribution
Class      Frequency    <CF
0 – 5      4            4
5 – 10     7            11  (4 + 7)
10 – 15    12           23  (4 + 7 + 12)
15 – 20    5            28  (4 + 7 + 12 + 5)
20 – 25    2            30  (4 + 7 + 12 + 5 + 2)
Greater than Cumulative Frequency Distribution is a table which gives the number of observations
lying above the lower limit of a class.
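Both kinds of cumulative frequency can be computed as running totals. The sketch below uses the class frequencies 4, 7, 12, 5, 2 from the example and the standard-library function `itertools.accumulate`; the variable names are illustrative.

```python
from itertools import accumulate

classes = ["0-5", "5-10", "10-15", "15-20", "20-25"]
freqs = [4, 7, 12, 5, 2]

# Less than CF: running total from the first class downwards.
less_than_cf = list(accumulate(freqs))

# Greater than CF: running total from the last class upwards.
greater_than_cf = list(accumulate(reversed(freqs)))[::-1]

for name, f, lcf, gcf in zip(classes, freqs, less_than_cf, greater_than_cf):
    print(name, f, lcf, gcf)
```

The last less than CF and the first greater than CF both equal the total frequency (30 here), which is a convenient check on the table.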
Apart from tabulation, data can also be presented through diagrams and graphs.
Graphs and Diagrams are visual aids for the presentation of data. They are among the most convincing
and appealing methods by which statistical data can be presented.
1. Histogram
2. Frequency Polygon
3. Frequency Curve
4. Cumulative Frequency Curve (Ogive)
5. Pie Diagram (Sector Diagram)
6. Bar Diagram
1. Histogram
• The class intervals are taken along the horizontal axis (X – axis) and the respective class
frequencies along the vertical axis (Y – axis), using a suitable scale on each axis.
• For each class a rectangle is drawn with the width of the class as its base and the
class frequency as its height.
• The total area of the histogram will be proportional or equal to the total frequency of
the distribution.
Class      Frequency
0 – 10     4
10 – 20    10
20 – 30    21
30 – 40    9
40 – 50    4
50 – 60    2
Total      50
[Figure: histogram of the above distribution; X axis: classes (0 to 60); Y axis: frequency]
2. Frequency Polygon
• To draw a Frequency Polygon from a Histogram, join the mid-points of the tops of the
rectangles of the Histogram using straight lines.
• A Frequency Polygon can also be drawn by joining the consecutive points plotted by taking
the mid-points of the classes on the X-axis and the corresponding frequencies on the Y-axis.
• The end points are extended at each end to meet the X-axis.
Class Frequency
0 – 10 4
10 – 20 10
20 – 30 21
30 – 40 9
40 – 50 4
50 – 60 2
Total 50
[Figure: frequency polygon of the above distribution drawn by three methods (First Method, Second Method, Third Method); X axis: classes, 1 cm = 10 units; Y axis: frequency, 1 cm = 5 units]
3. Frequency Curve
• It is a graphical representation of a continuous frequency distribution.
• To draw a Frequency Curve from a Histogram, join the mid-points of the tops of the
rectangles of the Histogram with a smooth freehand curve.
• A Frequency Curve can also be drawn by joining, with a smooth curve, the consecutive points plotted by taking
the mid-points of the classes on the X-axis and the corresponding frequencies on the Y-axis.
• The end points are extended at each end to meet the X-axis.
• The total area under the Frequency Curve is equal to or proportional to (numerically)
the total frequency of the given distribution.
Class Frequency
0 – 10 4
10 – 20 10
20 – 30 21
30 – 40 9
40 – 50 4
50 – 60 2
Total 50
[Figure: frequency curve of the above distribution drawn by three methods (First Method, Second Method, Third Method); X axis: classes; Y axis: frequency]
4. Cumulative Frequency Curve (Ogive)
• An Ogive is the graph of a Cumulative Frequency distribution.
• Less than Cumulative Frequency Curve is drawn by joining smoothly the points obtained by plotting the
upper limits of the actual classes against their less than cumulative frequencies.
Class      Frequency    <CF
0 – 10     5            5
10 – 20    12           17
20 – 30    28           45
30 – 40    40           85
40 – 50    21           106
50 – 60    10           116
60 – 70    4            120
[Figure: less than ogive; X axis: upper limit of classes, 1 cm = 10 units; Y axis: cumulative frequency, 1 cm = 20 units]
• Greater than Cumulative Frequency Curve is drawn by joining smoothly the points obtained by plotting the
lower limits of the actual classes against their greater than cumulative frequencies.
Class      Frequency    >CF
0 – 10     5            120
10 – 20    12           115
20 – 30    28           103
30 – 40    40           75
40 – 50    21           35
50 – 60    10           14
60 – 70    4            4
[Figure: greater than ogive; X axis: lower limit of classes, 1 cm = 10 units; Y axis: cumulative frequency, 1 cm = 20 units]
5. Pie Diagram
• A pie diagram consists of a circle whose area is proportional to the magnitude of the variable
it represents.
• The component parts of the variable are represented by means of sectors of the circle.
• The area of each sector is proportional to the frequency of the corresponding component part of the
variable.
• If A1 and A2 are the total magnitudes of two variables to be represented by
pie diagrams, draw two circles whose radii r1 and r2 satisfy $r_1 : r_2 = \sqrt{A_1} : \sqrt{A_2}$, so that the areas of the circles are proportional to A1 and A2.
Draw Pie Diagram for the following data
Limitations of Diagrams and Graphs
• It is difficult to show minor differences with their help.
• Diagrams can be used only to show a limited amount of information.
• Diagrams show only approximate values.
• A diagram is subjective in character; its interpretation varies from person to person.
• Diagrams and graphs can be misused very easily.
• Diagrams and graphs are not a substitute for the original data.
MEASURES OF CENTRAL TENDENCY
• When we collect data from a sample, the majority of scores in the collected
data tend to lie close to the average. This phenomenon is called
‘central tendency’.
• The value of the point around which the scores tend to cluster is called a ‘Measure of Central
Tendency’.
• A measure of central tendency may be defined as a single measure representing all the
scores of the given data.
Commonly used Measures of Central Tendency are
1. Mean
2. Median
3. Mode
1. MEAN (ARITHMETIC MEAN)
Case – I: Ungrouped Data (Discrete data)
If x1, x2, x3, ..., xN are N observations, then
A.M ($\bar{X}$) = Sum of the observations / Total No. of observations
$\bar{X} = \frac{x_1 + x_2 + x_3 + \dots + x_N}{N} = \frac{\sum x}{N}$
x – Observations (Scores)
N – Total number of observations
Eg. Calculate A.M of the observations: 12, 18, 14, 15, 16
$\bar{X} = \frac{\sum x}{N} = \frac{12 + 18 + 14 + 15 + 16}{5} = \frac{75}{5} = 15$
Case – II: Ungrouped Frequency Distribution (Discrete Frequency Distribution)
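If x1, x2, x3, ..., xn are the observations and f1, f2, f3, ..., fn are their respective frequencies, then A.M is given by
$\bar{X} = \frac{f_1 x_1 + f_2 x_2 + \dots + f_n x_n}{f_1 + f_2 + \dots + f_n} = \frac{\sum fx}{N}$
x – Observations (Scores)
f – Frequency
N – Total frequency
Eg. Calculate A.M of the following data
[Data table and working not recoverable from the source]
Answer
$\bar{X}$ = 25.4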
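The two formulas for the arithmetic mean can be sketched as follows. The function names are hypothetical; the first call uses the five observations from Case I, and the second reuses the children-per-family frequency table from earlier in these notes (values 0 to 4 with frequencies 3, 6, 10, 4, 2).

```python
def mean_ungrouped(scores):
    """Case I: sum of the observations divided by their number."""
    return sum(scores) / len(scores)


def mean_frequency(values, freqs):
    """Case II: sum of f*x divided by the total frequency N."""
    total = sum(freqs)
    return sum(f * x for x, f in zip(values, freqs)) / total


case1 = mean_ungrouped([12, 18, 14, 15, 16])
case2 = mean_frequency([0, 1, 2, 3, 4], [3, 6, 10, 4, 2])
print(case1)  # 15.0
print(case2)  # 46 / 25 = 1.84
```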
Arithmetic Mean – Merits
• It is rigidly defined
• AM is easy to understand
• Simple to calculate
• Based on all observations
• It is capable of further algebraic treatment.
Arithmetic Mean – Demerits
• AM is affected by extreme values
• AM may lead to wrong conclusions if the figures from which it is computed are not known.
• AM can’t be calculated for a distribution having open end classes.
2.MEDIAN
• Median is defined as the middle most observation when the observations are arranged
in ascending or descending order of magnitude.
CALCULATION OF MEDIAN
1. Discrete Data & Discrete Frequency Distribution
Let N be the total number of observations.
Case I: N is odd
Median = $\left(\frac{N+1}{2}\right)$th observation when the data are arranged in ascending or descending
order of magnitude
Eg.1 Here N = 9, then Median = $\left(\frac{9+1}{2}\right)$th observation = 5th observation
= 12
Case II: N is even
Median = Average of the $\left(\frac{N}{2}\right)$th observation and the $\left(\frac{N}{2}+1\right)$th observation when the data are arranged in
ascending or descending order of magnitude.
Eg.2 Calculate Median: 30, 26, 42, 28, 35, 20, 32, 50
Data in ascending order of magnitude: 20, 26, 28, 30, 32, 35, 42, 50
Here N = 8
Median = Average of the 4th and 5th observations
= $\frac{30 + 32}{2}$
= 31
Eg.3 Calculate Median
Observation    Frequency
5              3
6              8
7              12
8              10
9              8
Total          41
Here N = 41
Median = $\left(\frac{N+1}{2}\right)$th observation = $\left(\frac{41+1}{2}\right)$th observation
= 21st observation = 7
(The cumulative frequencies are 3, 11, 23, 33, 41, so the 21st observation falls at the score 7.)
2. Grouped (Continuous) Frequency Distribution
Median = $l_m + \left(\frac{\frac{N}{2} - cf_m}{f_m}\right) \times c$
$l_m$ – Actual lower limit of the Median Class
(Median Class – Class in which the $\left(\frac{N}{2}\right)$th observation falls)
N – Total Frequency
$cf_m$ – Cumulative frequency up to the Median Class (that is, of the class just before it)
$f_m$ – Frequency of the Median Class
c – Class interval
Eg. Calculate Median
Answer:
Class      Frequency    <CF
0 – 5      5            5
5 – 10     10           15
10 – 15    15           30    (Median Class)
15 – 20    12           42
20 – 25    8            50
N = 50
Here $l_m$ = 10, N = 50, $cf_m$ = 15, $f_m$ = 15, c = 5
Median = $l_m + \left(\frac{\frac{N}{2} - cf_m}{f_m}\right) \times c$
= $10 + \left(\frac{25 - 15}{15}\right) \times 5$
= 10 + 3.33
= 13.33
Graphical determination of Median
I Method
Steps:
• Draw Less than or Greater than Ogive.
• Locate N/2 on the Y – Axis
• At N/2 draw a perpendicular to the Y – Axis and extend it to meet the Ogive
• From that point of intersection draw a perpendicular to the X – Axis
• The point at which the perpendicular meets the X – Axis will be the Median.
[Figure: ogive with N/2 marked on the Y axis and the Median marked on the X axis]
II Method
Steps:
• Draw Less than and Greater than Ogive simultaneously
• Draw a perpendicular from the point of intersection to the X – Axis
• The point at which the perpendicular meets the X – Axis will be the Median.
[Figure: less than and greater than ogives intersecting above the Median]
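The median calculations for discrete and grouped data can be sketched as below. The function names are hypothetical, and the grouped version assumes exclusive classes of equal width given by their lower limits. The data are those of Eg.2, Eg.3 and the grouped example in these notes.

```python
def median_discrete(scores):
    """Middle observation (N odd) or average of the two middle ones (N even)."""
    ordered = sorted(scores)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]  # the ((N + 1) / 2)th observation
    return (ordered[middle - 1] + ordered[middle]) / 2


def median_grouped(lower_limits, freqs, width):
    """l_m + ((N/2 - cf_m) / f_m) * c for equal-width exclusive classes."""
    n = sum(freqs)
    cumulative = 0  # cumulative frequency before the current class
    for lower, f in zip(lower_limits, freqs):
        if cumulative + f >= n / 2:  # this is the median class
            return lower + (n / 2 - cumulative) / f * width
        cumulative += f


eg2 = median_discrete([30, 26, 42, 28, 35, 20, 32, 50])
# Expand the frequency table of Eg.3 into its 41 individual observations.
eg3 = median_discrete([5] * 3 + [6] * 8 + [7] * 12 + [8] * 10 + [9] * 8)
grouped = median_grouped([0, 5, 10, 15, 20], [5, 10, 15, 12, 8], 5)
print(eg2, eg3, round(grouped, 2))
```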
Median – Merits
• It is rigidly defined
• It is easy to understand
• Simple to calculate
• It can be located by mere inspection
• It is not affected by extreme values
• It can be calculated for a distribution having open end classes
• It can be determined graphically.
Median – Demerits
• It is not based on all observations
• Median is a non-algebraic measure and hence not suitable for further algebraic treatment
• It can’t be used for computing other statistical measures such as Standard Deviation, Coefficient of
correlation etc.
• When there are wide variations between the values of different scores, a Median may not be
representative of the distribution.
3.MODE
• Mode is the value of the variable which occurs most frequently.
• In certain cases an exact Mode may not exist, or there may be two or three
Modes in a distribution.
• When there are Two Modes we call it Bi-Modal Distribution
• If there are Three Modes, we call it Tri-Modal Distribution.
Calculation of Mode
1. Discrete Distribution
• In a large distribution that is almost Normal, Mode can be calculated by using the
relation
Mean – Mode = 3(Mean – Median)
which gives
Mode = 3 Median – 2 Mean
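A small sketch of both ideas follows: finding the most frequent value(s) directly, and estimating the mode from the relation Mode = 3 Median – 2 Mean. The function names are hypothetical, the first data set is the children-per-family example from earlier in these notes, and the mean and median passed to `empirical_mode` are made-up illustrative values.

```python
from collections import Counter


def modes(scores):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(scores)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


def empirical_mode(mean, median):
    """Mode = 3 Median - 2 Mean (for a large, nearly normal distribution)."""
    return 3 * median - 2 * mean


children = [1, 4, 3, 2, 1, 2, 0, 2, 1, 2, 3, 2, 1,
            0, 2, 3, 0, 3, 2, 1, 2, 2, 1, 4, 2]
print(modes(children))          # a single mode
print(modes([1, 1, 2, 2, 3]))   # two modes: a bi-modal distribution
print(empirical_mode(20, 21))   # illustrative mean = 20, median = 21
```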
MEASURES OF DISPERSION (MEASURES OF VARIABILITY)
• Measures of central tendency need not give an exact picture of the distribution.
• If we compare two groups merely on the basis of the average, there is a possibility of
being misled into an incorrect judgment.
Eg: Consider the marks of two groups
2, 8, 20, 28, 42 ------------------ Group 1
18, 19, 20, 21, 22 ------------------ Group 2
Here when we calculate the Mean for both groups, we get Mean = 20.
But when we examine the scores, we find that Group 1 is a Heterogeneous
Group and Group 2 is a Homogeneous Group.
• The statistical measures used to determine the nature and extent of dispersion of the
scores are known as Measures of Dispersion or Measures of Variability.
• Measures of Dispersion measure the spread of the observations around the central value
of the distribution.
Commonly used Measures of Dispersion
1. Range
2. Quartile Deviation
3. Mean Deviation
4. Standard Deviation
1.RANGE
Range is the difference between the highest and lowest scores in a Distribution.
Range (R) = H – L
H – Highest Value
L – Lowest Value
Eg: Find Range: 53, 51, 70, 45, 60, 62, 40, 53, 71, 55
Range (R) = H – L
= 71 – 40
= 31
Eg:
Observation    Frequency
5              3
6              8
7              12
8              10
9              8
Total          41
Range (R) = H – L
= 9 – 5
= 4
In a continuous distribution, Range is the difference between
the upper limit of the highest class and the lower limit of the lowest class.
Eg:
Class      f
10 – 20    12
20 – 30    20
30 – 40    10
40 – 50    5
Range (R) = H – L
= 50 – 10
= 40
2. QUARTILE DEVIATION
• The quartile deviation is half the difference between the upper and lower quartiles in a
distribution.
• It is a measure of the spread through the middle half of a distribution.
• It can be useful because it is not influenced by extremely high or extremely low
scores.
• Quartile: One of the four divisions of observations which have been grouped into four
equal-sized sets based on their statistical rank.
• Lower Quartile (first quartile) Q1: first point of division of observations which have
been grouped into four equal-sized sets based on their statistical rank.
• Upper Quartile (Third quartile) Q3: Third point of division of observations which
have been grouped into four equal-sized sets based on their statistical rank.
• Second Quartile Q2: Second point of division of observations which have been
grouped into four equal-sized sets based on their statistical rank.
• Second Quartile is called Median
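Quartile Deviation (Q) = $\frac{Q_3 - Q_1}{2}$
Q1 – Lower (First) Quartile
Q3 – Upper (Third) Quartile
1. Discrete Data:
Eg: Find Quartile deviation: 2, 13, 17, 20, 25, 28, 30, 33, 37, 40, 41
Answer
2   13   17   20   25   28   30   33   37   40   41
Here N = 11, so Q1 = 3rd observation = 17, Q2 (Median) = 6th observation = 28 and Q3 = 9th observation = 37.
Quartile Deviation (Q) = $\frac{37 - 17}{2}$
= 10
2. Continuous Distribution
$Q_1 = l_1 + \left(\frac{\frac{N}{4} - cf_1}{f_1}\right) \times c$
$Q_3 = l_3 + \left(\frac{\frac{3N}{4} - cf_3}{f_3}\right) \times c$
$l_1$ – Actual lower limit of the Q1 Class
(Q1 Class – Class in which the $\left(\frac{N}{4}\right)$th observation falls)
N – Total Frequency
$cf_1$ – Cumulative frequency up to the Q1 Class
$f_1$ – Frequency of the Q1 Class
c – Class interval
$l_3$, $cf_3$, $f_3$ – the corresponding values for the Q3 Class (the class in which the $\left(\frac{3N}{4}\right)$th observation falls)
Eg:
Class      Frequency    <CF
30 – 35    10           10
35 – 40    16           26    (Q1 Class)
40 – 45    18           44
45 – 50    27           71
50 – 55    18           89    (Q3 Class)
55 – 60    8            97
60 – 65    3            100
Answer
For Q1: $l_1$ = 35, N = 100, $cf_1$ = 10, $f_1$ = 16, c = 5
$Q_1 = 35 + \left(\frac{25 - 10}{16}\right) \times 5$
= 39.69
For Q3: $l_3$ = 50, N = 100, $cf_3$ = 71, $f_3$ = 18, c = 5
$Q_3 = 50 + \left(\frac{75 - 71}{18}\right) \times 5$
= 51.11
Quartile Deviation (Q) = $\frac{51.11 - 39.69}{2}$
= 5.71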
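The quartile formulas for a continuous distribution can be checked with a short sketch. The helper name `grouped_quantile` is hypothetical; it assumes equal-width exclusive classes given by their lower limits and applies the same interpolation as the formulas for Q1 and Q3.

```python
def grouped_quantile(lower_limits, freqs, width, fraction):
    """l + ((fraction*N - cf) / f) * c for the class containing fraction*N."""
    n = sum(freqs)
    target = fraction * n
    cumulative = 0  # cumulative frequency before the current class
    for lower, f in zip(lower_limits, freqs):
        if cumulative + f >= target:  # class in which the target falls
            return lower + (target - cumulative) / f * width
        cumulative += f


lower_limits = [30, 35, 40, 45, 50, 55, 60]
freqs = [10, 16, 18, 27, 18, 8, 3]

q1 = grouped_quantile(lower_limits, freqs, 5, 0.25)
q3 = grouped_quantile(lower_limits, freqs, 5, 0.75)
quartile_deviation = (q3 - q1) / 2
print(round(q1, 2), round(q3, 2), round(quartile_deviation, 2))
```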
3. MEAN DEVIATION (AVERAGE DEVIATION)
• Mean Deviation is the average of the deviations of the scores taken from the Mean.
• It may be calculated by taking the deviation of each of the scores from the mean and
finding the average of these deviations.
• Deviations may be negative or positive, so take the absolute value of each deviation.
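Discrete Data
Mean Deviation = $\frac{\sum |x - \bar{X}|}{N}$
x – Scores
$\bar{X}$ – Arithmetic Mean
N – Total Number of scores
Discrete Distribution
Mean Deviation = $\frac{\sum f|x - \bar{X}|}{N}$
x – Scores
$\bar{X}$ – Arithmetic Mean
f – Frequency
N – Total frequency
Continuous Distribution
Mean Deviation = $\frac{\sum f|x - \bar{X}|}{N}$
x – Mid-value
$\bar{X}$ – Arithmetic Mean
f – Frequency
N – Total frequency
Eg. [Data and working not recoverable from the source] Mean Deviation = 4.38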
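As an illustration of the discrete-data formula, the sketch below computes the mean deviation of the two groups of marks used earlier in these notes (both have Mean = 20). The function name is hypothetical.

```python
def mean_deviation(scores):
    """Average of the absolute deviations of the scores from their mean."""
    mean = sum(scores) / len(scores)
    return sum(abs(x - mean) for x in scores) / len(scores)


group1 = [2, 8, 20, 28, 42]
group2 = [18, 19, 20, 21, 22]
md1 = mean_deviation(group1)
md2 = mean_deviation(group2)
print(md1, md2)
```

Although the means are equal, the mean deviation of Group 1 (12) is ten times that of Group 2 (1.2), which reflects how much more scattered its scores are.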
4. STANDARD DEVIATION
• Standard Deviation is the square root of the average of the squares of the
deviations of the scores taken from the mean. SD is denoted by the symbol σ
(sigma).
• The Arithmetic Mean (Average) of the squares of deviations is known as Variance.
• Standard Deviation is the square root of the Variance.
Calculation of Standard Deviation – Steps
1. Find the Arithmetic Mean of the given data.
2. Find the deviations from Arithmetic Mean of scores.
3. Find the average of squares of deviations taken from the Mean.
4. Find the square root of the average of squares of deviations.
Discrete Data
Standard Deviation = $\sqrt{\frac{\sum (x - \bar{X})^2}{N}}$
x – Scores
$\bar{X}$ – Arithmetic Mean
N – Total Number of scores
Discrete Distribution
Standard Deviation = $\sqrt{\frac{\sum f(x - \bar{X})^2}{N}}$
x – Scores
$\bar{X}$ – Arithmetic Mean
f – Frequency
N – Total frequency
Continuous Distribution
Standard Deviation = $\sqrt{\frac{\sum f(x - \bar{X})^2}{N}}$
x – Mid-value
$\bar{X}$ – Arithmetic Mean
f – Frequency
N – Total frequency
1. Discrete Series
Find Standard Deviation
Score    Frequency
22       5
27       10
32       25
37       30
42       20
47       10
N = 100
Answer
Score (x)    Frequency (f)    fx      d = (x – $\bar{X}$)    (x – $\bar{X}$)²    f(x – $\bar{X}$)²
22           5                110     -14                    196                 980
27           10               270     -9                     81                  810
32           25               800     -4                     16                  400
37           30               1110    1                      1                   30
42           20               840     6                      36                  720
47           10               470     11                     121                 1210
N = 100, ∑fx = 3600, ∑fd² = 4150
$\bar{X} = \frac{\sum fx}{N} = \frac{3600}{100} = 36$
Standard Deviation = $\sqrt{\frac{\sum f(x - \bar{X})^2}{N}} = \sqrt{\frac{4150}{100}}$ = 6.44
3. Continuous Frequency Distribution (Grouped Distribution)
Calculate Standard Deviation
Score      Frequency
20 – 24    5
25 – 29    10
30 – 34    25
35 – 39    30
40 – 44    20
45 – 49    10
N = 100
Answer
Score      x     Frequency (f)    fx      (x – $\bar{X}$)    (x – $\bar{X}$)²    f(x – $\bar{X}$)²
20 – 24    22    5                110     -14                196                 980
25 – 29    27    10               270     -9                 81                  810
30 – 34    32    25               800     -4                 16                  400
35 – 39    37    30               1110    1                  1                   30
40 – 44    42    20               840     6                  36                  720
45 – 49    47    10               470     11                 121                 1210
N = 100, ∑fx = 3600, ∑fd² = 4150
$\bar{X} = \frac{3600}{100} = 36$
Standard Deviation = $\sqrt{\frac{\sum f(x - \bar{X})^2}{N}}$
= $\sqrt{\frac{4150}{100}}$
= 6.44
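The four steps for the standard deviation can be sketched for a frequency distribution as follows. The function name is hypothetical; the values are the scores (or class mid-points) and frequencies from the two worked examples above.

```python
import math


def standard_deviation(values, freqs):
    """Return (mean, SD) of a frequency distribution, dividing by N."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n               # step 1
    squared = [f * (x - mean) ** 2 for x, f in zip(values, freqs)]     # step 2
    variance = sum(squared) / n                                        # step 3
    return mean, math.sqrt(variance)                                   # step 4


midpoints = [22, 27, 32, 37, 42, 47]
freqs = [5, 10, 25, 30, 20, 10]
mean, sd = standard_deviation(midpoints, freqs)
print(mean, round(sd, 2))
```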
For a large distribution, Short-cut method (Assumed Mean Method) can be used to calculate
Standard Deviation
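SD = $c \times \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2}$, where $d = \frac{x - A}{c}$
c – Class interval
f – Frequency
x – Mid-point
N – Total Frequency
A – Assumed Mean
d – Deviations
Eg. [Data and working not recoverable from the source] SD = 5 × ... = 11.31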
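A sketch of the short-cut (assumed mean) method is given below, applied to the grouped example worked earlier by the direct method so that the two results can be compared. The function name is hypothetical, and the assumed mean A = 37 is an arbitrary choice made for illustration; any class mid-point gives the same SD.

```python
import math


def sd_assumed_mean(midpoints, freqs, assumed_mean, width):
    """SD = c * sqrt(sum(f*d^2)/N - (sum(f*d)/N)^2), with d = (x - A) / c."""
    n = sum(freqs)
    d = [(x - assumed_mean) / width for x in midpoints]
    sum_fd = sum(f * di for f, di in zip(freqs, d))
    sum_fd2 = sum(f * di ** 2 for f, di in zip(freqs, d))
    return width * math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2)


midpoints = [22, 27, 32, 37, 42, 47]
freqs = [5, 10, 25, 30, 20, 10]
sd_shortcut = sd_assumed_mean(midpoints, freqs, assumed_mean=37, width=5)
print(round(sd_shortcut, 2))  # same value as the direct method, 6.44
```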
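CORRELATION
Positive correlation: When the first variable increases or decreases and the other variable
increases or decreases along with it, the two variables are said to be in Positive correlation.
Eg: Intelligence and Achievement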
Negative correlation: When the first variable increases or decreases and the other variable
decreases or increases respectively, the two variables are said
to be in Negative correlation.
Eg: Time spent on practice and number of typing errors
Zero correlation: If there is no relationship between two variables, the
two variables are said to be in Zero correlation.
Eg: Body weight and Intelligence
COEFFICIENT OF CORRELATION
• The ratio indicating the degree of relationship between two related variables is called the
coefficient of correlation.
• For a perfect positive correlation, the Coefficient of Correlation is +1 and for a perfect
negative correlation, the Coefficient of Correlation is -1.
• Perfect positive or negative correlation is possible only in Physical Science.
• In a Social Science like Education, the correlation between two variables will lie within the
limits +1 and -1.
• Positive correlation varies from 0 to +1 and negative correlation varies from 0 to -1.
• Zero correlation indicates that there is no consistent relationship between two variables.
Use of Coefficient of Correlation
• It helps to determine the validity of a test.
• It helps to determine the reliability of a test.
• It can be used to ascertain the degree of the objectivity of a test.
• It can answer the validity of arguments for or against a statement.
• It indicates the nature of the relationship between two variables.
• It predicts the value of one variable given the value of another related variable.
• It helps to ascertain the traits and capacities of pupils.
Calculation of Correlation Coefficient
• There are two important techniques for calculating the Correlation coefficient
1. Rank Correlation
2. Product Moment Correlation
Rank Correlation
• Spearman was the first to measure the extent of correlation between two sets
of scores by the method of Rank Difference.
$\rho = 1 - \frac{6 \sum D^2}{N(N^2 - 1)}$
D – Difference between the ranks of the paired scores
N – Number of pairs of scores
Eg. [Data and working not recoverable from the source] Here the correlation is found
to be Positive and High
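The rank difference method can be sketched as below. Since the worked example from the notes is not available, the two sets of scores here are hypothetical, and the helper `ranks` assumes there are no tied scores (ties need average ranks, which this sketch does not handle).

```python
def ranks(scores):
    """Rank 1 for the highest score; assumes no two scores are equal."""
    ordered = sorted(scores, reverse=True)
    return [ordered.index(s) + 1 for s in scores]


def rank_correlation(x, y):
    """Spearman's rho = 1 - 6 * sum(D^2) / (N * (N^2 - 1))."""
    n = len(x)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical scores of five pupils on two tests.
test_a = [10, 20, 30, 40, 50]
test_b = [12, 25, 22, 45, 60]
rho = rank_correlation(test_a, test_b)
print(round(rho, 2))  # 0.9
```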
Product Moment Correlation
• Karl Pearson devised the formula for the calculation of the Product Moment Correlation
coefficient.
Eg. [Formula, data and working not recoverable from the source; SD of h1 and SD of h2 are the standard deviations of the two sets of heights] r = 0.6
Positive correlation between Height of Father and Height of Son
Short-cut Method
Product Moment Correlation coefficient
$r = \frac{N\sum xy - \sum x \sum y}{\sqrt{\left[N\sum x^2 - (\sum x)^2\right]\left[N\sum y^2 - (\sum y)^2\right]}}$
x, y: first set of scores and the second set of scores
N: Number of scores in a set
Eg. Find Product Moment Correlation coefficient
Students    Mark Test1 (x)    Mark Test2 (y)
A           8                 9
B           6                 7
C           4                 3
D           7                 6
E           3                 5
F           6                 6
G           5                 5
H           4                 5
I           5                 4
J           6                 5
Answer
Students    x     y     x²    y²    xy
A           8     9     64    81    72
B           6     7     36    49    42
C           4     3     16    9     12
D           7     6     49    36    42
E           3     5     9     25    15
F           6     6     36    36    36
G           5     5     25    25    25
H           4     5     16    25    20
I           5     4     25    16    20
J           6     5     36    25    30
∑x = 54, ∑y = 55, ∑x² = 312, ∑y² = 327, ∑xy = 314
$r = \frac{10 \times 314 - 54 \times 55}{\sqrt{\left[10 \times 312 - 54^2\right]\left[10 \times 327 - 55^2\right]}} = \frac{170}{\sqrt{204 \times 245}}$ = 0.76
Correlation is Positive and High
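The short-cut formula can be checked against the marks of the ten students in the example. This is a minimal sketch; the function name `product_moment_r` is illustrative.

```python
import math

test1 = [8, 6, 4, 7, 3, 6, 5, 4, 5, 6]  # x: marks in Test1 for students A to J
test2 = [9, 7, 3, 6, 5, 6, 5, 5, 4, 5]  # y: marks in Test2 for students A to J


def product_moment_r(x, y):
    """Pearson's r computed from raw scores (short-cut formula)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(v * v for v in x)
    sum_y2 = sum(v * v for v in y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = product_moment_r(test1, test2)
print(round(r, 2))  # 0.76
```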
Normal Probability Curve
• In a Normal Distribution, when the scores are arranged in order of magnitude,
those at the centre will have the maximum frequency.
• The frequencies gradually decrease towards the right and left of the score
at the centre. Because of this property, the curve representing a normal distribution
is symmetrical on either side of its central axis. Hence it is ‘bell-shaped’.
• These special features of the Normal Distribution are seen in the dispersion of scores
of natural phenomena such as intelligence, height, weight etc. in a population.
• This characteristic of the Normal Distribution is found to hold to a great extent for
the achievement scores of a well conducted examination, if the number taking the
examination is sufficiently large.
• Hence the properties of the Normal Distribution and the Normal Distribution curve are of great
importance in the study of groups and their characteristics with respect to given variables.
Properties of Normal Probability Curve
• It is symmetrical. If a perpendicular is drawn from the peak to the X-axis, it will divide the
whole area under the curve into two equal parts.
• The majority of scores tend to cluster around the centre. On either side of
the central axis the frequencies of scores go on reducing, being least at the two
ends.
• All the three Measures of Central Tendency, viz. Mean, Median and Mode, of a normal curve
coincide, that is, they are all equal.
• The first and third quartiles are equidistant from the median.
• The ordinate at the mean is the highest. The heights of the other ordinates at various sigma
distances from the mean are in fixed relationship with the height of the mean ordinate.
• The curve gradually comes nearer and nearer to the base line, but it never meets the base
line. For practical purposes, the curve may be taken to end at the points -3σ and +3σ
from the mean, because this region covers almost 100% of the cases.
• If the total area enclosed by the normal probability curve is represented by N, the total
number of cases in the group considered, we can find the area between any two points
with the help of mathematical formulae.
• The most important relationship in the Normal Probability Curve is the area relationship. In
a normal distribution 34.13% of the cases lie between M and a score at a distance
of 1σ from M. Thus 68.26% of the cases are included between M ± 1σ, and 99.73% or almost all the
cases are included between M ± 3σ.
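The area relationship can be verified numerically with the error function `math.erf` from Python's standard library. The function name `area_within` is hypothetical; the identity used, area within k sigma = erf(k / √2), is a standard result for the normal curve and is not taken from these notes.

```python
import math


def area_within(k):
    """Fraction of cases lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(k, round(100 * area_within(k), 2))
```

The values come out close to 68.27%, 95.45% and 99.73%; the figure 68.26% quoted above is simply twice the tabled value 34.13%.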
Skewness
• If the distribution is not perfectly normal or symmetrical, or the frequencies on either side
are not even, then the frequency curve deviates from normality. Such curves are said to be
skewed.
• The lack of symmetry due to an extended tail in a particular direction is known as Skewness.
• In a skewed distribution the Mean, Median and Mode will not be the same.
• There are two types of Skewness: Negative Skewness and Positive Skewness.
[Figure: negatively skewed curve and positively skewed curve]
• If the tail extends to the left (negative direction of the curve), the distribution
is said to be Negatively Skewed.
• If the tail extends to the right (positive direction of the curve), the distribution
is said to be Positively Skewed.
• The distance between the Mean and Median will indicate the extent of skewness.
• In a negatively skewed curve the Mean lies to the left of the Median.
• In a positively skewed curve the Mean lies to the right of the Median.
• The degree of Skewness of a frequency distribution may be calculated using the
formula
$Sk = \frac{3(\text{Mean} - \text{Median})}{\text{SD}}$, when the Mean, Median and Standard Deviation are given.
• When the percentiles are available, the following formula is used to find the
skewness: $Sk = \frac{P_{90} + P_{10}}{2} - P_{50}$ (Here P90 is the 90th Percentile, P50 is the 50th Percentile, that is the Median, and P10 is the 10th
Percentile)
• For a Normal curve the skewness is Zero.
Kurtosis
• Kurtosis refers to the peakedness or flatness of the curve of a frequency distribution compared
to the Normal curve.
• The curve of a frequency distribution which is more peaked than the normal curve is said
to be Leptokurtic.
• If the peak is found to be flatter than that of a normal curve, the curve is said to be Platykurtic.
• The curve of a normal distribution is said to be Mesokurtic.
$Ku = \frac{Q}{P_{90} - P_{10}}$ (Q – Quartile Deviation)
Standard Scores
• The Mean is the most representative score for commenting on the
position of other given scores.
• The distance from the mean is usually expressed in terms of the Standard
Deviation of the scores of the distribution concerned.
• Scores that indicate how many standard deviations a given score lies away from the mean of a
distribution are known as standard scores.
• Commonly used standard scores are the Z score and the T score.
Z Score
• Z score indicates how many standard deviations away from the mean, and in which
direction, a given raw score of a distribution lies.
• $Z = \frac{X - \bar{X}}{\sigma}$
where X – Raw score
$\bar{X}$ – Mean
σ – Standard Deviation
Example A                  Example B
X = 76                     X = 67
$\bar{X}$ = 82             $\bar{X}$ = 62
σ = 4                      σ = 5
z = (76 – 82) / 4          z = (67 – 62) / 5
= -1.50                    = +1.00
T Score
• T score has been devised to avoid the confusion resulting from negative z scores
(scores below the mean) and also to eliminate decimal values.
• To find the T score, multiply the z score by 10 and add 50: T = 50 + 10z
• T scores are always rounded to the nearest whole number.
• For example, in Example A, T = 50 + 10(-1.50) = 50 + (-15.0) = 35
In Example B, T = 50 + 10(1.00) = 50 + 10 = 60
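Examples A and B can be reproduced with a short sketch. The function names are hypothetical. Note that Python's built-in `round` rounds exact halves to the nearest even number, which may differ from the hand-rounding convention when a T score ends in .5.

```python
def z_score(raw, mean, sd):
    """Number of standard deviations the raw score lies from the mean."""
    return (raw - mean) / sd


def t_score(z):
    """T = 50 + 10z, rounded to the nearest whole number."""
    return round(50 + 10 * z)


z_a = z_score(76, 82, 4)  # Example A
z_b = z_score(67, 62, 5)  # Example B
print(z_a, t_score(z_a))  # -1.5 35
print(z_b, t_score(z_b))  # 1.0 60
```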