Lecture 5: Chapter 5 Statistical Analysis of Data Yes The "S" Word

Lecture 5: Chapter 5
Statistical Analysis of Data

…yes the “S” word
What is a Statistic?
[Figure: several samples drawn from one population]
Parameter: a value that describes a population
Statistic: a value that describes a sample

PSYCH: always using samples!
Descriptive & Inferential Statistics
Descriptive Statistics: describing data
• Organize
• Summarize
• Simplify
• Presentation of data
Inferential Statistics: making predictions
• Generalize from samples to populations
• Hypothesis testing
• Relationships among variables

Descriptive Statistics
3 Types
1. Frequency Distributions: # of Ss that fall in a particular category
2. Graphical Representations: Graphs & Tables
3. Summary Stats: describe data in just one number

1. Frequency Distributions
# of Ss that fall in a particular category
How many males and how many females are in our class?
             Males          Females        Total
Frequency    ?              ?
(%)          ?/tot x 100    ?/tot x 100
Scale of measurement? Nominal
# of Ss that fall in a particular category
Categorize on the basis of more than one variable at the same time:
CROSS-TABULATION
                            Total
Democrats      24     1     25
Republican     19     6     25
Total          43     7     50
How many brothers & sisters do you have?
# of bros & sis Frequency

7 ?
6 ?
5 ?
4 ?
3 ?
2 ?
1 ?
0 ?
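A frequency distribution like the one above can be tallied in a few lines. This is only a sketch: the `siblings` list is made-up data standing in for the class answers, and `frequency_table` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical answers to "How many brothers & sisters do you have?"
siblings = [0, 1, 1, 2, 2, 2, 3, 1, 0, 4, 2, 1]

def frequency_table(values):
    """Return (value, frequency, percent) rows, highest value first."""
    counts = Counter(values)
    total = len(values)
    return [(v, counts[v], counts[v] / total * 100)
            for v in sorted(counts, reverse=True)]

# Print the table in the same layout as the slide: value, frequency, percent.
for value, freq, pct in frequency_table(siblings):
    print(f"{value:>2}  {freq:>2}  {pct:5.1f}%")
```

The percent column uses the same rule as the slide: frequency / total x 100.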
Graphs & Tables
Bar graph: categorical variables (nominal data)
Histogram: quantitative variables (ratio data)
Polygon: line graph

Graphs & Tables
How many brothers & sisters do you have?

Let's plot the class data: HISTOGRAM
# of bros & sis Frequency

7 ?
6 ?
5 ?
4 ?
3 ?
2 ?
1 ?
0 ?
[Figure: jagged vs. smooth frequency distributions. Source: Altman, D. G. et al. BMJ 1995;310:298]
Central Limit Theorem: the larger the sample size, the more closely the distribution of sample means approximates the normal distribution
or
The means of samples taken at random from any distribution will tend to form a normal curve
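The theorem can be checked with a small simulation. The sketch below is an illustration under assumptions not in the slides: the population is exponential (strongly skewed), and skewness is used as a rough measure of how far the distribution of sample means is from the symmetric normal curve.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

def sample_means(n, draws=2000):
    """Means of `draws` random samples of size n from a skewed population."""
    return [statistics.mean(random.expovariate(1.0) for _ in range(n))
            for _ in range(draws)]

def skewness(values):
    """Moment-based skewness; close to 0 for a symmetric distribution."""
    m = statistics.mean(values)
    sd = statistics.pstdev(values)
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)

skew_small = skewness(sample_means(2))    # small samples: still skewed
skew_large = skewness(sample_means(50))   # larger samples: closer to symmetric
print(f"skewness of sample means, n=2:  {skew_small:.2f}")
print(f"skewness of sample means, n=50: {skew_large:.2f}")
```

The skewness of the sample means should shrink toward zero as n grows, even though the individual scores come from a lopsided population.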
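Normal Distribution:
half the scores above the mean…half below (symmetrical)
68% of scores fall within 1 SD of the mean and 95% within 2 SD (13.5% on each side between 1 and 2 SD), leaving 2.5% in each tail
Examples: IQ, body temperature, shoe sizes, diameters of trees, weight, height, etc.
Two tail (non-directional): 5% region of rejection of the null hypothesis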
Summary Statistics
describe data in just 2 numbers
Measures of central tendency
• typical average score
Measures of variability
• typical average variation
Measures of Central Tendency
• Quantitative data:
– Mode – the most frequently occurring observation
– Median – the middle value in the data (50% of scores above, 50% below)
– Mean – arithmetic average
• Qualitative data:
– Mode – always appropriate
– Mean – never appropriate
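As a quick illustration of the three measures, here is a sketch using Python's standard `statistics` module. Both data sets are invented for the example.

```python
import statistics

# Hypothetical quantitative data (note the one extreme score, 20)
scores = [2, 3, 3, 4, 5, 5, 5, 7, 20]

mode = statistics.mode(scores)      # most frequently occurring observation
median = statistics.median(scores)  # middle value of the sorted data
mean = statistics.mean(scores)      # sum of observations / number of observations
print(mode, median, mean)           # prints: 5 5 6

# Hypothetical qualitative data: only the mode makes sense here
eye_color = ["brown", "blue", "brown", "green", "brown"]
modal_color = statistics.mode(eye_color)
print(modal_color)                  # prints: brown
```

The extreme score pulls the mean above the median, which is one reason to report more than one measure.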
Mean
• The most common and most useful average
• Mean = sum of all observations / number of all observations
• Observations can be added in any order.
Notation
• Sample vs population
• Sample mean = X̄
• Population mean = μ
• Summation sign = Σ
• Sample size = n
• Population size = N
Special Property of the Mean
Balance Point
• The sum of all observations expressed as positive and negative deviations from the mean always equals zero!
– The mean is the single point of equilibrium (balance) in a data set
• The mean is affected by all values in the data set
– If you change a single value, the mean changes.
SEE FOR YOURSELF! Let's do the math.
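The balance-point property is easy to verify by hand or in code. A minimal sketch, using a made-up set of five scores:

```python
import statistics

scores = [2, 4, 6, 8, 10]                 # hypothetical data
mean = statistics.mean(scores)            # 30 / 5 = 6

# Express every observation as a deviation from the mean
deviations = [x - mean for x in scores]   # [-4, -2, 0, 2, 4]
total_deviation = sum(deviations)
print(total_deviation)                    # prints: 0

# Change a single value and the mean changes with it
changed = scores[:-1] + [25]
changed_mean = statistics.mean(changed)   # 45 / 5 = 9
print(changed_mean)                       # prints: 9
```

The negative deviations (-4, -2) exactly cancel the positive ones (2, 4), which is what "point of equilibrium" means.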

Summary Statistics
describe data in just 2 numbers
Measures of central tendency
• typical average score
Measures of variability
• typical average variation
1. Range: distance from the lowest to the highest (uses 2 data points)
2. Variance: (uses all data points)
3. Standard Deviation
4. Standard Error of the Mean

Measures of Variability
2. Variance (uses all data points): the average of the squared distances of each score from the mean (squared deviations from the mean)
Notation for variance: s²
3. Standard Deviation = SD = √s²
4. Standard Error of the Mean = SEM = SD/√n
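The four measures of variability can be computed step by step. This sketch assumes the usual sample formula, with n - 1 in the denominator of the variance; the slides do not say which denominator is intended, and the scores are invented.

```python
import math
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]          # hypothetical sample, mean = 5
n = len(scores)

data_range = max(scores) - min(scores)      # 1. range: uses only 2 data points
mean = statistics.mean(scores)

# 2. variance: squared deviations from the mean, averaged over n - 1 (assumption)
squared_devs = [(x - mean) ** 2 for x in scores]
variance = sum(squared_devs) / (n - 1)

sd = math.sqrt(variance)                    # 3. standard deviation = sqrt(s^2)
sem = sd / math.sqrt(n)                     # 4. standard error of the mean = SD / sqrt(n)

print(f"range={data_range}, s^2={variance:.3f}, SD={sd:.3f}, SEM={sem:.3f}")
```

`statistics.variance(scores)` and `statistics.stdev(scores)` return the same sample values; the manual version is shown only to mirror the definitions on the slide.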

Inferential Statistics
[Figure: several samples drawn from one population]
Draw inferences about the larger group
Sampling Error: variability among samples due to chance
Do samples differ from the population because of true differences, or just because of sampling error?
Probability…
"Error" is a misleading term: it is not a mistake
Probability
• Numerical indication of how likely it is that a given event will occur (general definition)
"hum…what's the probability it will rain?"
• Statistical probability: the odds that what we observed in the sample did not occur because of error (random and/or systematic)
"hum…what's the probability that my results are not just due to chance?"
• In other words, the probability associated with a statistic is the level of confidence we have that the sample group that we measured actually represents the total population
Are our inferences valid? The best we can do is to calculate the probability about inferences.
Inferential Statistics: uses sample data
to evaluate the credibility of a hypothesis
about a population
NULL Hypothesis:
NULL (nullus, Latin): "not any", i.e. no differences between means
H0: μ1 = μ2
Always testing the null hypothesis ("H-naught")

Hypothesis: scientific or alternative hypothesis
Predicts that there are differences between the groups
H1: μ1 ≠ μ2
Hypothesis
A statement about what findings are expected
null hypothesis
"the two groups will not differ"
alternative hypothesis
"group A will do better than group B" (directional)
"group A and B will not perform the same" (non-directional)
Inferential Statistics
When making comparisons between 2 sample means there are 2 possibilities:
Null hypothesis is true: do not reject the null hypothesis
Null hypothesis is false: reject the null hypothesis
Possible Outcomes in Hypothesis Testing (Decision)
            Null is True          Null is False
Accept      Correct Decision      Type II Error
Reject      Type I Error          Correct Decision
Type I Error: rejecting a true null hypothesis
Type II Error: accepting a false null hypothesis
Hypothesis Testing - Decision
Decision: right or wrong?
But we can know the probability of being right or wrong
Can specify and control the probability of making a Type I or Type II error
Try to keep it small…

ALPHA
The probability of making a Type I error. It depends on the criterion you use to accept or reject the null hypothesis, i.e. the significance level (the smaller you make alpha, the less likely you are to commit a Type I error). With alpha = 0.05 there are 5 chances in 100 that the difference observed was really due to sampling error: when the null is true, a Type I error will occur 5% of the time.
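The "5% of the time" claim can be illustrated with a simulation in which the null hypothesis is true by construction. This is a sketch under simplifying assumptions that are not in the slides: both groups come from the same normal population with a known SD, and a two-tailed z test stands in for the t-test so that only the standard library is needed.

```python
import random
from statistics import NormalDist

random.seed(1)  # fixed seed so the sketch is repeatable
ALPHA = 0.05
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA / 2)  # two-tailed cutoff, about 1.96

def rejects_null(n=30, sigma=1.0):
    """Draw two samples from the SAME population and test their mean difference."""
    a = [random.gauss(0, sigma) for _ in range(n)]
    b = [random.gauss(0, sigma) for _ in range(n)]
    standard_error = sigma * (2 / n) ** 0.5
    z = (sum(a) / n - sum(b) / n) / standard_error
    return abs(z) > Z_CRIT  # True means we (wrongly) rejected a true null

trials = 4000
type_i_rate = sum(rejects_null() for _ in range(trials)) / trials
print(f"Type I error rate over {trials} experiments: {type_i_rate:.3f}")
```

Every rejection here is a Type I error, because any difference between the two groups is sampling error only. The observed rate should land near alpha.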
Hypothesis Testing: Alpha (α)
            Null is True          Null is False
Accept      Correct Decision      Type II Error
Reject      Type I Error          Correct Decision
Type I Error: the difference observed is really just sampling error
Alpha = the probability of a Type I error

When we do statistical analysis…
if the p value is greater than 0.05 (alpha, the significance level) WE ACCEPT THE NULL HYPOTHESIS
if the p value is equal to or less than 0.05 we REJECT THE NULL (there is a difference between means)

Two Tail: 2.5% + 2.5% (non-directional)
One Tail: 5% (directional)
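To see what splitting the 5% does to the cutoff, the sketch below looks up the critical values on a standard normal curve. Using the normal (z) curve rather than a t distribution is a simplifying assumption for illustration.

```python
from statistics import NormalDist

ALPHA = 0.05
std_normal = NormalDist()  # mean 0, SD 1

# Two tail (non-directional): 2.5% in each tail
two_tail_cutoff = std_normal.inv_cdf(1 - ALPHA / 2)

# One tail (directional): the whole 5% in one tail
one_tail_cutoff = std_normal.inv_cdf(1 - ALPHA)

print(f"two-tailed: reject if |z| > {two_tail_cutoff:.3f}")
print(f"one-tailed: reject if  z  > {one_tail_cutoff:.3f}")
```

The one-tailed cutoff is lower, so a directional test rejects the null more easily, but only for differences in the predicted direction.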
BETA
The probability of making a Type II error, which occurs when we fail to reject the null when we should have
Hypothesis Testing: Beta (β)
            Null is True          Null is False
Accept      Correct Decision      Type II Error
Reject      Type I Error          Correct Decision
Type II Error: the difference observed is real, but we failed to reject the null
POWER: ability to reduce Type II error
(1 - Beta) – Power Analysis
The power to find an effect if an effect is present
Ways to increase power:
1. Increase our n
2. Decrease variability
3. More precise measurements
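The effect of n and variability on power can be shown with a small calculation. This is a sketch under assumptions not in the slides: a two-tailed z test comparing two equal-sized groups with a known, common SD. The function name `power_two_sample_z` and all the numbers are hypothetical.

```python
from statistics import NormalDist

def power_two_sample_z(diff, sigma, n, alpha=0.05):
    """Power (1 - beta) of a two-tailed z test for two groups of size n each.

    diff  : true difference between the population means
    sigma : common population SD (assumed known)
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # True difference expressed in standard-error units
    shift = diff / (sigma * (2 / n) ** 0.5)
    # Probability that the test statistic falls in either rejection region
    return std_normal.cdf(-z_crit + shift) + std_normal.cdf(-z_crit - shift)

baseline = power_two_sample_z(diff=5, sigma=10, n=20)
bigger_n = power_two_sample_z(diff=5, sigma=10, n=80)       # 1. increase n
less_noise = power_two_sample_z(diff=5, sigma=5, n=20)      # 2./3. less variability
print(f"baseline={baseline:.2f}, larger n={bigger_n:.2f}, lower SD={less_noise:.2f}")
```

When the true difference is zero the function returns alpha, since every rejection is then a Type I error.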
Effect Size: measure of the size of the difference between means attributed to the treatment
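One common effect size measure is Cohen's d, the difference between means in pooled-SD units. The slides do not name a specific measure, so treating d as the effect size here is an assumption, and the two groups are invented data.

```python
import statistics

def cohens_d(group_a, group_b):
    """Difference between two group means divided by the pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)   # sample variance (n - 1)
    var_b = statistics.variance(group_b)
    pooled_sd = (((n_a - 1) * var_a + (n_b - 1) * var_b)
                 / (n_a + n_b - 2)) ** 0.5
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

treatment = [6, 7, 8, 9, 10]   # hypothetical scores, mean 8
control = [4, 5, 6, 7, 8]      # hypothetical scores, mean 6
d = cohens_d(treatment, control)
print(f"Cohen's d = {d:.2f}")
```

Unlike a p value, d does not grow just because n grows, which is why it speaks to practical rather than statistical significance.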
Inferential statistics
Significance testing:
Practical vs statistical significance

Used for Testing for Mean Differences
T-test: when experiments include only 2 groups
a. Independent
b. Correlated
i. Within-subjects
ii. Matched
Based on the t statistic (critical values), which depends on df & alpha level
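For the independent-groups case, the t statistic and its df can be computed by hand. This is a sketch assuming the pooled-variance (equal variances) form of the test; the data are invented, and the critical value 2.306 is taken from a standard t table for df = 8, alpha = 0.05, two-tailed.

```python
import statistics

def independent_t(group_a, group_b):
    """Return (t, df) for an independent-groups t-test with pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / (n_a + n_b - 2)
    se_diff = (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5  # SE of the mean difference
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / se_diff
    return t, n_a + n_b - 2

treatment = [6, 7, 8, 9, 10]   # hypothetical scores
control = [4, 5, 6, 7, 8]      # hypothetical scores
t, df = independent_t(treatment, control)

T_CRIT = 2.306  # table value for df = 8, alpha = 0.05, two-tailed
reject_null = abs(t) > T_CRIT
print(f"t({df}) = {t:.2f}, reject null: {reject_null}")
```

With these numbers t is 2.0 on 8 df, which falls short of the critical value, so the null is not rejected even though the means differ by 2 points.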
Used for Testing for Mean Differences
Analysis of Variance (ANOVA): used when comparing more than 2 groups
1. Between Subjects
2. Within Subjects – repeated measures
Based on the F statistic (critical values), which depends on df & alpha level
Only one IV = one-way ANOVA
More than one IV = factorial ANOVA (IVs = factors)
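The F statistic for a one-way, between-subjects ANOVA is the ratio of between-group to within-group variance. A minimal sketch with three invented groups; `one_way_anova_f` is a hypothetical helper, not a library function.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way between-subjects ANOVA."""
    all_scores = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_scores)

    # Variability of the group means around the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2
                     for g in groups)
    # Variability of the scores around their own group mean
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]   # hypothetical data, one IV with 3 levels
f, df_b, df_w = one_way_anova_f(groups)
print(f"F({df_b}, {df_w}) = {f:.1f}")
```

As with the t-test, the computed F is then compared with the critical value for these df and the chosen alpha level.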
Meta-Analysis:
Allows for statistical averaging of results from independent studies of the same phenomenon
