Threats To Internal Validity
Regression toward the mean effects are especially likely to occur among
well-meaning investigators, who want to give a treatment that they believe
is very beneficial to the group that appears to need it the most (the
top-scoring group is usually left alone). When the scores of the worst group
improve after the intervention (and the top group scores a little lower on
the readministration), misguided investigators are even more convinced
that they have found a good treatment (instead of a methodological
artifact). How to avoid this threat to internal validity? Either avoid extreme
groups, or, if you do use them, randomly assign their members to treatment
conditions, INCLUDING A CONTROL GROUP.
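The artifact is easy to demonstrate with simulated data (all numbers below are assumptions for illustration, not from the text): select the worst-scoring group on a noisy pretest, give them no treatment at all, and their mean still "improves" on retest.

```python
# Sketch (simulated data): regression toward the mean when an extreme
# group is selected on a noisy pretest.
import random

random.seed(1)

N = 10_000
true_ability = [random.gauss(50, 10) for _ in range(N)]
pretest  = [t + random.gauss(0, 10) for t in true_ability]   # score = ability + noise
posttest = [t + random.gauss(0, 10) for t in true_ability]   # fresh noise, NO treatment

# Select the "worst" 10% on the pretest -- the group that seems to need help most.
cutoff = sorted(pretest)[N // 10]
low = [i for i in range(N) if pretest[i] <= cutoff]

pre_mean  = sum(pretest[i]  for i in low) / len(low)
post_mean = sum(posttest[i] for i in low) / len(low)

# With no treatment at all, the extreme group's mean moves back toward 50.
print(f"pretest mean of bottom decile:  {pre_mean:.1f}")
print(f"posttest mean of bottom decile: {post_mean:.1f}")
```

The "improvement" is pure selection: the bottom decile was partly chosen for bad luck on the pretest, and that luck does not repeat on the retest.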
Testing. Just taking a pretest can sensitize people, and many people
improve their performance with practice. Almost every classroom teacher
knows that part of a student's performance on assessment tests depends
on familiarity with the format. Solution? A Solomon Four-Group
Design, wherein half the subjects do not receive a pretest, is a good way to
control inferences in this case.
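The logic of the Solomon design can be sketched with hypothetical posttest means (the numbers below are assumptions for illustration): because half the subjects skip the pretest, the practice/sensitization effect of merely taking the pretest can be separated from the treatment effect.

```python
# Sketch (hypothetical numbers): the Solomon four-group design.
# Suppose the baseline mean is 50, the treatment adds 5 points, and
# merely taking the pretest adds 3 points of practice/sensitization.
posttest_means = {
    ("pretest", "treatment"):    58,  # 50 + 5 treatment + 3 testing
    ("pretest", "control"):      53,  # 50 + 3 testing
    ("no_pretest", "treatment"): 55,  # 50 + 5 treatment
    ("no_pretest", "control"):   50,  # 50 baseline
}

# Treatment effect uncontaminated by testing: compare the two unpretested groups.
treatment_effect = (posttest_means[("no_pretest", "treatment")]
                    - posttest_means[("no_pretest", "control")])

# Testing effect alone: compare pretested vs. unpretested controls.
testing_effect = (posttest_means[("pretest", "control")]
                  - posttest_means[("no_pretest", "control")])

print("treatment effect, free of testing:", treatment_effect)
print("testing (pretest) effect:", testing_effect)
```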
Most people and groups (who allow you to study them at all) try to
cooperate with researchers. But some try to discover the purpose of the
intervention and thwart it, or "wreck the study." Social Reactance effects
refer to boomerang effects in which individuals or groups "fake bad," or
deliberately deviate from study procedures. This happens more among
college students, and others who suspect that their autonomy is being
threatened.
ON REACTIVITY AND INTERNAL VALIDITY. If demand effects are specific to
a particular situation, reactivity problems may also influence generalizing,
or external validity (this is how your Wiersma book treats the term).
However, I think reactivity introduces an alternative causal explanation for
our results: they occurred, not because of the intervention or treatment,
but because people were so self-conscious that they changed their
behavior. This is internal validity. Reactivity may also statistically interact
with the experimental manipulation. For example, if the treatment somehow
affects self-esteem (say you are told that the stories you tell about the TAT
pictures indicate your leadership ability), reactivity may be a greater
internal validity problem.
When the medical and pharmacy professions test a new medicine, they
don't just use a "sugar pill" placebo:
- Subjects in the study do not know if they are taking a new medication, an
old medication, or a sugar pill.
- The individuals who pass out the medication and assess the subjects'
health and behavior also do not know whether the person is taking a new
medication, an old medication, or a sugar pill.
Thus both those involved as subjects and those involved with collecting
data are "blind": blind to the purposes of the study, the condition that
subjects are in, and the results expected.
This means that:
- You may need to deceive subjects about the true purpose of the
study (if you were told the purpose of the study was to measure
leadership qualities in sports, might you try to "shape up"?).
- Avoid collecting your own data; don't act as your own experimenter
or interviewer. Trade off with another student, or apply for a small
university or external grant to hire someone.
- Don't tell interviewers or experimenters the true purpose of the study,
and (if possible) don't tell them which subject is in which condition.
You might give each person a "generic overview" of the study ("this
study is about which movies children like").
Almost no one who collects data "likes deception" but without at least a
little, you may introduce reactivity and bias into your study. Do the
minimum (I prefer "omission" rather than deliberate lies) and be sure to
debrief subjects after their participation in the study is completed. This
means that you tell them the true purpose of the study and any
manipulations pertinent to their role in it. Debriefing is ethically mandatory,
and is especially important if your manipulation involved lies about the
student's performance ("no, you really didn't score in the 5th percentile on
that test, all feedback was bogus") or any other aspect of the "real world."
Susan Carol Losh September 21 2001
height, weight, socioeconomic status, and ethnic origin are common, depending on the
focus of the study.
Show the ability of athletic subjects as current or personal-best performance, preferably
expressed as a percent of the world record. For endurance athletes a direct or indirect
estimate of maximum oxygen consumption helps characterize ability in a manner that is
largely independent of the sport.
Dependent and Independent Variables
Usually you have a good idea of the question you want to answer. That question defines
the main variables to measure. For example, if you are interested in enhancing sprint
performance, your dependent variable (or outcome variable) is automatically some
measure of sprint performance. Cast around for the way to measure this dependent
variable with as much precision as possible.
Next, identify all the things that could affect the dependent variable. These things are the
independent variables: training, sex, the treatment in an experimental study, and so on.
For a descriptive study with a wide focus (a "fishing expedition"), your main interest is
estimating the effect of everything that is likely to affect the dependent variable, so you
include as many independent variables as resources allow. For the large sample sizes that
you should use in a descriptive study, including these variables does not lead to
substantial loss of precision in the effect statistics, but beware: the more effects you look
for, the more likely the true value of at least one of them lies outside its confidence
interval (a problem I call cumulative Type 0 error). For a descriptive study with a
narrower focus (e.g., the relationship between training and performance), you still
measure variables likely to be associated with the outcome variable (e.g., age-group, sex,
competitive status), because either you restrict the sample to a particular subgroup
defined by these variables (e.g., veteran male elite athletes) or you include the variables
in the analysis.
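The arithmetic behind this warning about cumulative Type 0 error is straightforward: if each confidence interval covers its true value 95% of the time, and the intervals are treated as independent, the chance that at least one of k intervals misses grows quickly with k.

```python
# Sketch: the chance that at least one of k independent 95% confidence
# intervals fails to cover its true value.
def prob_at_least_one_miss(k: int, coverage: float = 0.95) -> float:
    """P(at least one of k independent CIs misses its true value)."""
    return 1 - coverage ** k

for k in (1, 5, 10, 20):
    print(f"{k:2d} effects -> P(at least one CI misses) = {prob_at_least_one_miss(k):.2f}")
```

With 10 effects the chance of at least one miss is already about 40%, which is why a wide-focus "fishing expedition" should interpret any single surprising interval cautiously.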
For an experimental study, the main independent variable is the one indicating when the
dependent variable is measured (e.g., before, during, and after the treatment). If there is a
control group (as in controlled trials) or control treatment (as in crossovers), the identity
of the group or treatment is another essential independent variable (e.g., Drug A, Drug B,
placebo in a controlled trial; drug-first and placebo-first in a crossover). These variables
obviously have an effect on the dependent variable, so you automatically include them in
any analysis.
Variables such as sex, age, diet, training status, and variables from blood or exercise tests
can also affect the outcome in an experiment. For example, the response of males to the
treatment might be different from that of females. Such variables account for individual
differences in the response to the treatment, so it's important to take them into account.
As for descriptive studies, either you restrict the study to one sex, one age, and so on, or
you sample both sexes, various ages, and so on, then analyze the data with these variables
included as covariates. I favor the latter approach, because it widens the applicability of
your findings, but once again there is the problem of cumulative Type 0 error for the
effect of these covariates. An additional problem with small sample sizes is loss of
precision of the estimate of the effect, if you include more than two or three of these
variables in the analysis.
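The covariate approach can be sketched with simulated data (the variable names and effect sizes below are assumptions for illustration): instead of restricting the sample to one sex, include a sex indicator in the linear model alongside the treatment indicator.

```python
# Sketch (simulated data): adjusting a treatment effect for a covariate
# (sex) by including it in a linear model rather than restricting the sample.
import numpy as np

rng = np.random.default_rng(0)
n = 200
treat = rng.integers(0, 2, n)   # 0 = control, 1 = treatment
sex = rng.integers(0, 2, n)     # 0 = female, 1 = male

# Assumed true model: outcome = 10 + 4*treat + 2*sex + noise
y = 10 + 4 * treat + 2 * sex + rng.normal(0, 1, n)

# Design matrix: intercept, treatment indicator, sex covariate.
X = np.column_stack([np.ones(n), treat, sex])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

print(f"estimated treatment effect (sex-adjusted): {coef[1]:.2f}")
print(f"estimated sex effect: {coef[2]:.2f}")
```

Each additional covariate costs a little precision, which is the trade-off the text describes for small samples.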
Mechanism Variables
With experiments, the main challenge is to determine the magnitude and confidence
intervals of the treatment effect. But sometimes you want to know the mechanism of the
treatment--how the treatment works or doesn't work. To address this issue, try to find one
or more variables that might connect the treatment to the outcome variable, and measure
these at the same times as the dependent variable. For example, you might want to
determine whether a particular training method enhanced strength by increasing muscle
mass, so you might measure limb girths at the same time as the strength tests. When you
analyze the data, look for associations between change in limb girth and change in
strength. Keep in mind that errors of measurement will tend to obscure the true
association.
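The suggested analysis amounts to correlating change scores across subjects; a minimal sketch with simulated data (the effect sizes and error magnitudes are assumptions) also shows how measurement error attenuates the observed correlation.

```python
# Sketch (simulated data): correlate the change in a mechanism variable
# (limb girth) with the change in the dependent variable (strength).
import numpy as np

rng = np.random.default_rng(42)
n = 100
true_gain = rng.normal(5, 2, n)   # assumed individual responses to training

# Both change scores reflect the true gain, plus measurement error,
# which pulls the observed correlation below its error-free value.
d_girth = 0.3 * true_gain + rng.normal(0, 0.5, n)
d_strength = 2.0 * true_gain + rng.normal(0, 2.0, n)

r = np.corrcoef(d_girth, d_strength)[0, 1]
print(f"correlation of change scores: r = {r:.2f}")
```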
This kind of approach to mechanisms is effectively a descriptive study on the difference
scores of the variables, so it can provide only suggestive evidence for or against a
particular mechanism. To understand this point, think about the example of the limb
girths and strength: an increase in muscle size does not necessarily cause an increase in
strength--other changes that you haven't measured might have done that. To really nail a
mechanism, you have to devise another experiment aimed at changing the putative
mechanism variable while you control everything else. But that's another research
project. Meanwhile, it is sensible to use your current experiment to find suggestive
evidence of a mechanism, provided it doesn't entail too much extra work or expense. And
if it's research for a PhD, you are expected to measure one or more mechanism variables
and discuss intelligently what the data mean.
Finally, a useful application for mechanism variables: they can define the magnitude of
placebo effects in unblinded experiments. In such experiments, there is always a doubt
that any treatment effect can be partly or wholly a placebo effect. But if you find a
correlation between the change in the dependent variable and change in an objective
mechanism variable--one that cannot be affected by the psychological state of the
subject--then you can say for sure that the treatment effect is not all placebo. And the
stronger the correlation, the smaller the placebo effect. The method works only if there
are individual differences in the response to the treatment, because you can't get a
correlation if every subject has the same change in the dependent variable. (Keep in mind
that some apparent variability in the response between subjects is likely to be random
error in the dependent variable, rather than true individual differences in the response to
the treatment.)
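The requirement for individual differences can be sketched by simulating the two cases (all numbers are assumptions for illustration): when responses vary across subjects, the objective mechanism variable tracks the outcome; when everyone shifts by the same amount, as in a uniform placebo effect, no correlation is possible.

```python
# Sketch (simulated data): why the correlation method needs individual
# differences in the response to the treatment.
import numpy as np

rng = np.random.default_rng(7)
n = 100

# Scenario A: real treatment with individual differences in response.
response = rng.normal(10, 4, n)                       # varies across subjects
d_outcome_a = response + rng.normal(0, 2, n)
d_mechanism_a = 0.5 * response + rng.normal(0, 1, n)  # objective marker tracks response

# Scenario B: uniform placebo effect -- the same shift for everyone.
d_outcome_b = 10 + rng.normal(0, 2, n)                # no individual differences
d_mechanism_b = rng.normal(0, 1, n)                   # marker unrelated to outcome

r_a = np.corrcoef(d_mechanism_a, d_outcome_a)[0, 1]
r_b = np.corrcoef(d_mechanism_b, d_outcome_b)[0, 1]
print(f"with individual differences: r = {r_a:.2f}")
print(f"uniform placebo effect:      r = {r_b:.2f}")
```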
Surprisingly, the objective variable can be almost anything, provided the subject is
unaware of any change in it. In our example of strength training, limb girth is not a good
variable to exclude a placebo effect: subjects may have noticed their muscles get bigger,
so they may have expected to do better in a strength test. In fact, any noticeable changes
could inspire a placebo effect, so any objective variables that correlate with the noticeable
change won't be useful to exclude a placebo effect. Think about it. But if the subjects
noticed nothing other than a change in strength, and you found an association between
change in blood lipids, say, and change in strength, then the change in strength cannot all
be a placebo effect. Unless, of course, changes in blood lipids are related to susceptibility
to suggestion...unlikely, don't you think?
[Outline fragments: Risks to Validity; significance tests (t test, Mann-Whitney); types of reliability (intra-rater, inter-rater, intra-session, inter-session).]