MODULE 3-Essential Measuring, Validation of Instruments and Suggestions in Writing Test.
MODULE
ASSESSMENT OF LEARNING 2
ACADEMIC YEAR 2020-2021
Prepared by:
JASON P. RICAFORTE
Instructor
JASON P. RICAFORTE, EdD Assessment of Learning 2, 2020-2021
[email protected] Facebook Acct: https://www.facebook.com/jason.ricaforte.7/
Contact Numbers: 09173187246
Cover designed by: Mr. Medel Valencia
Lesson Objectives:
At the end of the module, the learners will be able to:
1. explain the essential characteristics of good measuring instruments;
2. discuss the methods of determining reliability;
3. analyze the stages in the development and validation of assessment instruments; and
4. apply the general and specific suggestions in writing a test.
The concepts of reliability (consistency) and validity (accuracy) are interdependent.
An assessment that has low reliability will also tend to have low validity because it gives
poor consistency and poor accuracy. An assessment can be reliable without being valid, but an
assessment cannot be valid without being reliable. An assessment produces similar or consistent
results if it measures the same “thing” each time and, at the same time, reflects the learning goals.
Therefore, in assessment, validity and reliability are of equal importance.
What is Validity?
VALIDITY
Validity refers to the degree to which a test measures what it intends to measure. It is the
usefulness of the test for a given measure. A valid test is always reliable.
According to Linn & Miller (2005), there are four types of validity that must be
considered in assessment, which are content, construct, criterion and consequential
validity.
Content validity refers to the extent to which the knowledge or skills are measured in a test.
Construct validity refers to how well the test measures the knowledge or skills.
Criterion validity refers to the extent to which a test can be used to predict future success or
achievement.
Consequential validity is concerned with the meaning and implications of the test. Test results
can bring negative consequences, such as discouragement or reduced motivation in students, when
the assessment format focuses on “teaching to the test”.
1.2.2. Concurrent Validity – Validity established by correlating the test with
another measure which is obtained at the same time.
Example:
Relate the reading test results with the pupils’ average grades in reading given by the
teacher.
1.2.3. Predictive Validity – Validity established by correlating the test with another
measure which can foretell later success in school, in one’s job, or in life.
Example:
The entrance examination scores of a freshman class, from a test administered at the
beginning of the school year, are correlated with the average grades at the end of the school
year.
1.3. Logical and Psychological Validity – Validity is established through subjective
analysis of the test by experts in the field. This is usually done if the test cannot be
statistically measured.
Example: Artistic works.
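The concurrent and predictive validity examples above both come down to correlating two sets of scores. The following is a minimal sketch of that computation using the Pearson product-moment correlation. The score lists and the names `pearson_r`, `entrance_scores` and `final_grades` are hypothetical and used only for illustration; they do not come from the module.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of deviations from the means
    dev_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_xx = sum((a - mean_x) ** 2 for a in x)
    dev_yy = sum((b - mean_y) ** 2 for b in y)
    if dev_xx == 0 or dev_yy == 0:
        raise ValueError("scores in each list must vary")
    return dev_xy / sqrt(dev_xx * dev_yy)


# Hypothetical data: entrance examination scores and end-of-year average
# grades for the same eight students, listed in the same order.
entrance_scores = [55, 62, 70, 48, 81, 90, 66, 74]
final_grades = [78, 80, 85, 75, 88, 93, 82, 84]

validity_coefficient = pearson_r(entrance_scores, final_grades)
print(round(validity_coefficient, 2))
```

A coefficient close to +1 would suggest that the entrance test ranks students in much the same way as their later grades do. For concurrent validity, the second list would hold a measure obtained at about the same time as the test, such as the teacher's reading grades.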
What is Reliability?
Reliability refers to consistency and accuracy of test results, the degree to which two or more
forms of the test will yield the same results under uniform conditions. Increasing the length of
the test may raise the reliability of the test. Clear and concise directions would also increase the
reliability of the test.
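A common way to estimate how test length affects reliability is the Spearman-Brown formula. The sketch below assumes that the added items are comparable in quality and content to the existing ones, which real tests may not satisfy. The function name `spearman_brown` and the sample values are illustrative only.

```python
def spearman_brown(reliability, length_factor):
    """Estimate reliability after changing test length.

    reliability   -- reliability coefficient of the current test
    length_factor -- new length divided by current length (2 means doubled)
    """
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical case: a test with reliability 0.60 is doubled in length.
print(round(spearman_brown(0.60, 2), 2))  # 0.75
```

A length factor of 1 leaves the reliability unchanged, and a factor below 1 shows the expected drop when a test is shortened.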
It explains the stability and consistency of results (scores) when measurement is repeated over time or
under varying conditions. Reliability pertains to scores rather than to people. It can apply to a measurement
of skill or knowledge. For example, when I ask the same question at two different times and my
student provides the same or consistent answers, this shows that my assessment is
reliable. High reliability is obtained when repeated measurement shows a high level of
consistency. Consistency can be shown across time, tasks and markers. It means the assessment
results are the same regardless of the day or time of the assessment, the other assessment
tasks used, and who the assessors or markers are. However, no assessment is completely
reliable. There is always some random variation or error in an assessment. Factors that
affect reliability include bias of the assessor and lack of standardization of the tests.
Methods of Determining Reliability
1. Test-Retest
   Type of reliability measure: Measure of Stability
   Procedure: Give a test twice to the same group, with any time interval between tests from several minutes to several years.
   Statistical measure: Pearson r
2. Equivalent Forms
   Type of reliability measure: Measure of Equivalence
   Procedure: Give parallel forms of tests to the same group with a close time interval between forms.
   Statistical measure: Pearson r
3. Test-Retest with Equivalent Forms
   Type of reliability measure: Measure of Stability and Equivalence
   Procedure: Give parallel forms of tests to the same group with an increased time interval between forms.
   Statistical measure: Pearson r
4. Split Half
   Type of reliability measure: Measure of Internal Consistency
   Procedure: Give a test once. Score equivalent halves of the test.
   Statistical measure: Pearson r and Spearman-Brown Formula
5. Kuder-Richardson
   Type of reliability measure: Measure of Internal Consistency
   Procedure: Give the test once, then correlate the proportion/percentage of the students passing and not passing a given item.
   Statistical measure: Kuder-Richardson Formula 20 and 21
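The internal consistency methods listed above can be sketched in code. The example below assumes items are scored 1 (correct) or 0 (incorrect), splits the test into odd-numbered and even-numbered items for the split-half method, and uses the population variance of total scores in both Kuder-Richardson formulas. The response matrix and all function names are hypothetical and serve only as an illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    dev_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_xx = sum((a - mean_x) ** 2 for a in x)
    dev_yy = sum((b - mean_y) ** 2 for b in y)
    return dev_xy / sqrt(dev_xx * dev_yy)


def split_half_reliability(responses):
    """Correlate odd-item and even-item half scores, then apply Spearman-Brown."""
    odd_half = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_half, even_half)
    return 2 * r_half / (1 + r_half)


def total_score_stats(responses):
    """Return number of items, mean and population variance of total scores."""
    totals = [sum(row) for row in responses]
    mean = sum(totals) / len(totals)
    variance = sum((t - mean) ** 2 for t in totals) / len(totals)
    return len(responses[0]), mean, variance


def kr20(responses):
    """Kuder-Richardson Formula 20, using each item's proportion passing (p)."""
    k, _, variance = total_score_stats(responses)
    n = len(responses)
    sum_pq = 0.0
    for item in range(k):
        p = sum(row[item] for row in responses) / n  # proportion passing
        sum_pq += p * (1 - p)                        # q = 1 - p, not passing
    return (k / (k - 1)) * (1 - sum_pq / variance)


def kr21(responses):
    """Kuder-Richardson Formula 21, which assumes items of equal difficulty."""
    k, mean, variance = total_score_stats(responses)
    return (k / (k - 1)) * (1 - mean * (k - mean) / (k * variance))


# Hypothetical data: 6 students (rows) by 6 items (columns).
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

print(round(split_half_reliability(responses), 2))
print(round(kr20(responses), 2))
print(round(kr21(responses), 2))
```

Formula 21 needs only the mean and variance of the total scores, so it is quicker to compute by hand, but it gives a value no higher than Formula 20 when item difficulties differ.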
What is Objectivity?
Objectivity is the degree to which no personal judgment, opinion, or bias will affect the
scoring of the test. This can be secured by wording the items in the test in such a way that
only one answer is possible.
* A test should be such that different teachers can similarly score the test and arrive at the
1.5. Proper Mechanical Make-up of the Test: The print in the test must be clear enough
1.6. Utility: The content of the test must serve definite needs in the pupils’ activities.
1.7. Economy: The making of the test must not be too costly especially when funds are
1.8. Comparability: The test must be so constructed that the results can be compared to
1.9. Ease in Scoring: Objectivity of the test, clear directions in scoring, and adequate key
make scoring of the test easy. Use scoring procedures appropriate to your method and
purpose. The easier the procedure, the more reliable the assessment is.
1.10. Ease of Interpretation: Interpretation is easier if there is a plan on how to use the
1.11. Teacher Familiarity with the Method: The teacher should know the strengths
1.12. Time Required: Time includes construction and use of the instrument and the
Specific Suggestions
A. Supply Types of Tests
1. Word the item/s so that the required answer is both brief and specific.
2. Do not take statements directly from textbooks as a basis for short answer items.
3. A direct question is generally more desirable than an incomplete statement.
4. If the item is to be expressed in numerical units, indicate the type of answer wanted.
5. Blanks for answers should be equal in length.
6. Answers should be written before the item number for easy checking.
7. When completion items are to be used, do not have too many blanks. Blanks should
be at the center or at the end of the sentences and not at the beginning.
B. Selective Type of Tests
1. Alternative-Response
a. Avoid broad statements.
b. Avoid trivial statements.
c. Avoid the use of negative statements especially double negatives.
d. Avoid long and complex sentences.
e. Avoid including two ideas in one statement unless cause-effect relationships
are being measured.
f. If opinion is used, attribute it to some source unless the ability to identify
opinion is being specifically measured.
g. The number of true statements and false statements should be approximately
equal.
h. Start with a false statement, since it is a common observation that the first
statement in this type of test is always positive.
2. Matching Type
a. Use only homogeneous material in a single matching exercise.
b. Include an unequal number of responses and premises, and instruct the pupils
that responses may be used once, more than once or not at all.
c. Keep the list of items to be matched brief, and place the shorter responses at
the right.
d. Arrange the list of responses in logical order.
e. Indicate in the directions the basis for matching the responses and premises.
f. Place all the items for one matching exercise on the same page.
3. Multiple-Choice
a. The stem of the item should be meaningful by itself and should present a
definite problem.
b. The stem should be free from irrelevant material.
c. Use a negatively stated stem only when significant learning outcomes require
it.
d. Highlight negative words in the stem for emphasis.
e. All the alternatives should be grammatically consistent with the stem of the
item.
f. An item should have only one correct or clearly best answer.
g. Items used to measure understanding should contain novelty, but beware of
too much.
h. All distracters should be plausible.
i. Verbal associations between the stem and the correct answer should be
avoided.
j. The relative length of the alternatives should not provide a clue to the answer.
k. The alternatives should be arranged logically.
l. The correct answer should appear in each position an approximately
equal number of times, but in random order.
m. Use of special alternatives such as “none of the above” or “all of the above”
should be done sparingly.
n. Do not use multiple choice items when other types are more appropriate.
o. Always have the stem and alternatives on the same page.
p. Break any of these rules when you have a good reason for doing so.
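The guideline on the position of the correct answer in multiple-choice items can be applied with a small script when assembling a test. The following is a minimal sketch, assuming four options labeled A to D. The function name `balanced_answer_key` and its parameters are hypothetical.

```python
import random
from collections import Counter


def balanced_answer_key(num_items, options="ABCD", seed=None):
    """Return correct-answer positions used about equally often, in random order."""
    rng = random.Random(seed)
    base, extra = divmod(num_items, len(options))
    # Each option is used `base` times ...
    key = [option for option in options for _ in range(base)]
    # ... and the leftover items go to randomly chosen, distinct options.
    key += rng.sample(list(options), extra)
    rng.shuffle(key)
    return key


# Hypothetical 20-item test: each letter is the correct answer 5 times.
key = balanced_answer_key(20, seed=1)
print(key)
print(Counter(key))
```

The teacher would then arrange the alternatives of each item so that the correct answer falls in the listed position, provided this does not conflict with arranging the alternatives logically.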
C. Essay Type of Tests
a. Restrict the use of essay questions to those learning outcomes that cannot be
satisfactorily measured by objective items.
b. Formulate questions that will call for the behavior specified in the learning outcomes.
c. Avoid the use of optional questions.
d. Indicate the approximate time limit or the number of points for each question.
_____ 2. Validity established through subjective analysis of the test by experts in the field
_____ 3. It refers to the degree to which a test measures what it intends to measure. It is the
usefulness of the test for a given measure. A valid test is always reliable.
_____ 4. The degree to which a test can be used by the teachers and administrators without
unnecessary waste of time, money and effort.
_____ 5. The factor which requires that the content of the test serve definite needs in the pupils’
activities.
_____ 6. It refers to the extent a test can be used to predict future success
or achievement.
_____ 7. It refers to consistency and accuracy of test results, the degree to which two or
more forms of the test will yield the same results under uniform conditions.
_____ 8. The degree to which no personal judgment, opinion, or bias will affect the
scoring of the test.
_____ 9. The method of determining reliability that is a measure of stability.
_____ 10. The proponents who state that there are four types of validity that must be
considered in assessment, which are content, construct, criterion and
consequential validity.
_____ 1. Which of the following is the first step in constructing test items?
A. prepare test item in advance
B. define instructional objectives clearly
C. prepare a table of specifications or blueprint
D. prepare more items than are actually needed
____ 4. How many items are needed for the preliminary draft of a 100-item test?
A. 10 B. 200 C. 125-150 D. 110
____ 5. A sixth grade teacher gave a Reading Test twice to the same class. In the first
administration, the bright students got high scores, while in the second administration the
mediocre students got high scores. Which of the following was lacking in the Reading Test?
A. Coherence B. Comprehensiveness C. Objectivity D. Reliability
____ 6. Which of the following characteristics of a good test is first given consideration?
A. Validity B. Reliability C. Administrability D. Usability
_____ 7. In a certain school, the majority of the pupils who got very high scores in the entrance test
got very low grade point averages at the end of the school year. What type of validity does the
entrance test lack?
A. Concurrent B. Curricular C. Predictive D. Construct
_____ 8. Which type of validity indicates the extent to which the test covers the content of the program
learned?
A. Predictive B. Construct C. Concurrent D. Curricular
_____ 9. When the papers of a group of students who took the test were corrected, correctors
had to use their personal judgments as to whether the students’ answers to many items could be
considered correct. What characteristic of the test had been overlooked?
A. Usability B. Objectivity C. Validity D. Reliability
_____ 10. It is simply defined as the numerical index of an individual’s actual performance in a
test. It is expressed in terms of time taken to complete the task or the amount done successfully
in a given time. Which of the following words is being described?
A. Percentile B. Decile C. Range D. Score
Reflections: It’s time for you to share. What have you learned about the lesson?
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
SLRC Review Center, LET Reviewer Materials for Measurement and Evaluation.