MODULE 3 – Essentials: Measuring Instruments, Validation of Instruments, and Suggestions in Writing a Test


MODULE
ASSESSMENT OF LEARNING 2
ACADEMIC YEAR 2020-2021

Prepared by:
JASON P. RICAFORTE, EdD
Instructor
Email: [email protected]
Facebook: https://www.facebook.com/jason.ricaforte.7/
Contact Number: 09173187246
Cover designed by: Mr. Medel Valencia

MODULES FOR ASSESSMENT OF LEARNING 2

Credits: 3 units lecture (3 hours/week)

Pre-Requisite: 3rd year

Lesson Title: Essentials: Measuring Instruments, Validation of Instruments, and Suggestions in Writing a Test

Lesson Objective:
At the end of the module, the learners will be able to:
1. explain the Essential Characteristics of Good Measuring Instruments;
2. discuss the methods of determining reliability;
3. analyze the stages in the development and validation of an assessment instrument; and
4. apply the general and specific suggestions in writing a test.

Lectures and Annotations:

The concepts of reliability (consistency) and validity (accuracy) are dependent on each other. An assessment with low reliability will also tend to have low validity, because inconsistent results cannot be accurate. An assessment can be reliable without being valid, but an assessment cannot be valid without being reliable. An assessment produces similar or consistent results when it measures the same "thing"; it is valid when those results also reflect the learning goals. Therefore, in assessment, validity and reliability are of equal importance.

According to William (1993), the combination of the two concepts, dependability (reliability + validity = dependability), is the "best" practice in assessment. However, it is impossible to have both perfectly high reliability and perfectly high validity at the same time, so it is necessary to balance priorities. In my opinion, performance assessments, like traditional examinations and tests, must be at least content-valid without compromising reliability. For an assessment to be valid, reliability is a prior requirement. However, increasing one tends to decrease the other. For example, if assessment validity is increased by including outcomes such as higher-level thinking skills, then reliability is likely to decrease, because these outcomes are not easily assessed.

Essential Characteristics of Good Measuring Instruments

What is Validity?
Validity refers to the degree to which a test measures what it intends to measure. It is the usefulness of the test for a given purpose. A valid test is always reliable.

Validity refers to how well the assessment measures what it is supposed to assess; it refers to the accuracy of an assessment. For example, if I want students to demonstrate employability when they graduate, my assessments need to emphasize the acquisition of employability skills such as communication and interpersonal skills, problem-solving skills, organizational skills, and teamwork. There are many different types of validity.

According to Linn & Miller (2005), there are four types of validity that must be
considered in assessment: content, construct, criterion, and consequential validity.

Content validity refers to the extent to which the intended knowledge or skills are measured in a test.

Construct validity refers to how well the test measures the underlying construct (the knowledge or skills) it is intended to measure.

Criterion validity refers to the extent to which a test can be used to predict future success or achievement.

Consequential validity is concerned with the meaning and implications of the test. Test results can bring negative consequences, such as discouragement or reduced motivation in students, when the assessment format focuses on "teaching to the test."

Types of Validity and Their Purposes


1.1. Rational Validity – This depends upon professional judgment alone: the judgment of competent teachers, usually three or more experts in the field.

1.1.1. Content / Curricular Validity – Validity established by comparing the content of the test with a particular type of curriculum, textbook, course of study, or outline.
Example:
A teacher made a test in biology. Her test has curricular validity if the content is biology and not history or geography.

1.1.2. Concept / Construct Validity – Validity established by analyzing the activities and processes that correspond to a particular concept.
Example:
Analysis of the scientific method, of critical thinking, and of efficient writing skill.

1.2. Statistical / Empirical / Criterion-Related Validity – Validity established by correlating the results of the test with an outside valid criterion.
1.2.1. Congruent Validity – Validity established when a test is correlated with an existing measure that has a similar function.
Example:
A group intelligence test is valid if it correlates reasonably with another intelligence test of known high validity, such as the Otis intelligence test.

1.2.2. Concurrent Validity – Validity established by correlating the test with some other measure obtained at the same time.
Example:
Relate the reading test results with the pupils' average grades in reading given by the teacher.

1.2.3. Predictive Validity – Validity established by correlating the test with another measure that can foretell later success in school, in one's job, or in life. (A computational sketch follows this outline.)
Example:
The entrance examination scores of a freshman class tested at the beginning of the school year are correlated with their average grades at the end of the school year.
1.3. Logical and Psychological Validity – Validity established through subjective analysis of the test by experts in the field. This is usually done when the test cannot be statistically measured.
Example: Artistic works.
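To make the criterion-related validity examples above concrete, the sketch below correlates hypothetical entrance-examination scores with end-of-year grade averages using Pearson r, the usual statistic for criterion-related validity. The data are invented for illustration; statistics.correlation requires Python 3.10 or later.

Example (Python):

from statistics import correlation  # Pearson r; available in Python 3.10+

# Hypothetical freshman class: entrance-exam scores at the start of the
# school year, paired with the same students' average grades at year's end.
entrance_scores = [78, 85, 92, 70, 88, 95, 74]
final_averages = [80, 84, 90, 75, 85, 93, 78]

r = correlation(entrance_scores, final_averages)
print(f"predictive validity coefficient r = {r:.2f}")
# A high positive r suggests the entrance test foretells later achievement;
# an r near zero would indicate the test lacks predictive validity.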

Factors Influencing the Validity of an Assessment Instrument

1. Appropriateness of the test
2. Directions
3. Reading vocabulary and sentence structures
4. Difficulty of items
5. Construction of test items
6. Length of the test
7. Arrangement of test items
8. Patterns of answers

What is Reliability?
Reliability refers to the consistency and accuracy of test results: the degree to which two or more forms of the test will yield the same results under uniform conditions. Increasing the length of the test may raise its reliability; clear and concise directions also increase it.

Reliability explains the stability and consistency of results (scores) when measurement is repeated over time or under varying conditions. Reliability pertains to scores rather than to people; it can describe a measurement of skill or of knowledge. For example, when I ask the same question at two different times and my student is able to provide the same or consistent answers, this shows that my assessment is reliable. High reliability is obtained when repeated measurement shows a high level of consistency. Consistency can be shown across time, tasks, and markers: the assessment results are the same regardless of the day or time of the assessment, the other assessment tasks, and who the assessors or markers are. However, no assessment is completely reliable; there is always some random variation or error. Factors that affect reliability include assessor bias and lack of standardization of the tests.
Methods of Determining Reliability

1. Test-Retest (Measure of Stability) – Give a test twice to the same group, with any time interval between tests from several minutes to several years. Statistical measure: Pearson r.
2. Equivalent Forms (Measure of Equivalence) – Give parallel forms of the test to the same group with a close time interval between forms. Statistical measure: Pearson r.
3. Test-Retest with Equivalent Forms (Measure of Stability and Equivalence) – Give parallel forms of the test to the same group with an increased time interval between forms. Statistical measure: Pearson r.
4. Split-Half (Measure of Internal Consistency) – Give a test once, then score equivalent halves of the test. Statistical measures: Pearson r and the Spearman-Brown Formula.
5. Kuder-Richardson (Measure of Internal Consistency) – Give the test once, then correlate the proportion/percentage of the students passing and not passing a given item. Statistical measures: Kuder-Richardson Formulas 20 and 21.
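The sketch below, using small invented score sets, shows one way to compute the statistics named in the table: Pearson r for test-retest, the Spearman-Brown formula for stepping a split-half correlation up to full-test length, and Kuder-Richardson Formula 20 for right/wrong-scored items.

Example (Python):

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman_brown(r_half):
    """Full-test reliability estimated from a half-test correlation."""
    return 2 * r_half / (1 + r_half)

def kr20(item_matrix):
    """Kuder-Richardson Formula 20; rows = students, columns = 0/1 items."""
    k = len(item_matrix[0])
    totals = [sum(row) for row in item_matrix]
    n = len(totals)
    mean = sum(totals) / n
    var = sum((t - mean) ** 2 for t in totals) / (n - 1)
    pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_matrix) / n  # proportion passing item j
        pq += p * (1 - p)
    return (k / (k - 1)) * (1 - pq / var)

# Test-retest: the same test given twice to the same group (invented scores).
print(round(pearson_r([10, 14, 9, 17, 12], [11, 15, 8, 18, 13]), 2))

# Split-half: correlate odd-item and even-item half scores, then step up.
half_r = pearson_r([5, 7, 4, 9, 6], [5, 7, 5, 8, 6])
print(round(spearman_brown(half_r), 2))

# KR-20 on five students' right/wrong (1/0) responses to four items.
responses = [[1, 1, 1, 1], [1, 1, 1, 0], [1, 0, 0, 0], [0, 0, 0, 0], [1, 1, 0, 0]]
print(round(kr20(responses), 2))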

What is Objectivity?
Objectivity is the degree to which no personal judgment, opinion, or bias affects the scoring of the test. It can be secured by wording the items in the test in such a way that only one answer is possible.
* A test should be such that different teachers can score it in the same way and arrive at the same scores. In other words, the more objective the test is, the greater its reliability.

What is Usability / Practicability?


The degree to which a test can be used by teachers and administrators without
unnecessary waste of time, money, and effort.

It involves the following factors:


1.4. Ease in Administration: The giving and taking of the test is facilitated by clear and

specific directions provided to both the examiner and examinees.

1.5. Proper Mechanical Make-up of the Test: The print in the test must be clear enough

and appropriate to the grade level for which it is used.

1.6. Utility: The content of the test must serve definite needs in the pupils’ activities.

1.7. Economy: The making of the test must not be too costly, especially when funds for testing are limited.

1.8. Comparability: The test must be so constructed that the results can be compared to

other test results.

1.9. Ease in Scoring: Objectivity of the test, clear directions in scoring, and an adequate key make scoring of the test easy. Use scoring procedures appropriate to your method and purpose; the easier the procedure, the more reliable the assessment.

1.10. Ease of Interpretation: Interpretation is easier if there is a plan on how to use the results prior to assessment.

1.11. Teacher Familiarity with the Method: The teacher should know the strengths

and weaknesses of the method and how to use them.

1.12. Time Required: Time includes the construction and use of the instrument and the interpretation of results. Other things being equal, it is desirable to use the shortest assessment possible that provides valid and reliable results.

Stages in the Development and Validation of an Assessment Instrument

Phase I – Planning Stage
1. Specify the objectives/skills and content areas to be measured.
2. Prepare the Table of Specifications.
3. Decide on the item format – short answer form, multiple choice, etc.

Phase II – Test Construction / Item Writing Stage
1. Writing of test items based on the table of specifications.
2. Consultation with experts – subject teacher / test expert – for validation (content) and editing.

Phase III – Test Administration / Try-out Stage
1. First Trial Run – using 50 to 100 students
2. Scoring
3. First Item Analysis – determine difficulty and discrimination indices (a computational sketch follows this outline)
4. First Option Analysis
5. Revision of the test items – based on the results of the item analysis
6. Second Trial Run / Field Testing
7. Second Item Analysis
8. Second Option Analysis
9. Writing the final form of the test

Phase IV – Evaluation Stage
1. Administration of the final form of the test
2. Establish test validity
3. Estimate test reliability
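The First Item Analysis step determines a difficulty index (the proportion of examinees answering correctly) and a discrimination index (how well the item separates high and low scorers). The sketch below, with invented responses and the common upper-27% / lower-27% grouping, is one simple way to compute both for a single item.

Example (Python):

def item_analysis(item_scores, total_scores, fraction=0.27):
    """item_scores: 1/0 (right/wrong) per student for one item;
    total_scores: each student's total test score."""
    n = len(item_scores)
    # Difficulty index: proportion of all examinees answering correctly.
    difficulty = sum(item_scores) / n
    # Rank students by total score and take the upper and lower groups.
    ranked = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    g = max(1, round(n * fraction))
    upper = sum(item_scores[i] for i in ranked[:g]) / g
    lower = sum(item_scores[i] for i in ranked[-g:]) / g
    # Discrimination index: upper-group minus lower-group proportion correct.
    return difficulty, upper - lower

# Ten students' responses to one item, with their total test scores (invented).
item = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]
totals = [48, 45, 30, 44, 28, 41, 22, 25, 39, 31]
p, d = item_analysis(item, totals)
print(f"difficulty p = {p:.2f}, discrimination D = {d:.2f}")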

General Suggestions in Writing a Test


1. Use the test specifications as a guide to item writing.
2. Construct more test items than needed.
3. Write the test items well in advance of the testing date.
4. Write the test items so that the task to be performed is clearly defined.
5. Write each test item at the appropriate reading level.
6. Write each test item so that it does not give away the answer to other test items.
7. Write test items whose answers would be agreed upon by experts.
8. Write test items at the proper level of difficulty.
9. Whenever a test is revised, recheck its relevance.

Specific Suggestions
A. Supply Types of Tests
1. Word the items so that the required answer is both brief and specific.
2. Do not take statements directly from textbooks as a basis for short-answer items.
3. A direct question is generally more desirable than an incomplete statement.
4. If the item is to be expressed in numerical units, indicate the type of answer wanted.
5. Blanks for answers should be equal in length.
6. Answers should be written before the item number for easy checking.
7. When completion items are used, do not have too many blanks. Blanks should be at the center or at the end of the sentence, not at the beginning.
B. Selective Type of Tests
1. Alternative – Response
a. Avoid broad statements.
b. Avoid trivial statements.
c. Avoid the use of negative statements especially double negatives.
d. Avoid long and complex sentences.
e. Avoid including two ideas in one statement unless cause-effect relationships are being measured.
f. If an opinion is used, attribute it to some source unless the ability to identify opinion is being specifically measured.
g. The number of true statements and false statements should be approximately equal.
h. Start with a false statement, since it is a common observation that the first statement in this type of test is always true.
2. Matching Type
a. Use only homogenous material in a single matching exercise.
b. Include an unequal number of responses and premises, and instruct the pupils
that responses may be used once, more than once or not at all.
c. Keep the list of items to be matched brief, and place the shorter responses at
the right.
d. Arrange the list of responses in logical order.
e. Indicate in the directions the basis for matching the responses and premises.
f. Place all the items for one matching exercise on the same page.
3. Multiple- Choice
a. The stem of the item should be meaningful by itself and should present a
definite problem.
b. The stem should be free from irrelevant material.
c. Use a negatively stated stem only when significant learning outcomes require
it.
d. Highlight negative words in the stem for emphasis.
e. All the alternatives should be grammatically consistent with the stem of the
item.
f. An item should have only one correct or clearly best answer.
g. Items used to measure understanding should contain novelty, but beware of
too much.
h. All distracters should be plausible.
i. Verbal associations between the stem and the correct answer should be
avoided.
j. The relative length of the alternatives should not provide a clue to the answer.
k. The alternatives should be arranged logically.
l. The correct answer should appear in each alternative position an approximately equal number of times, but in random order (a small sketch follows this list).
m. Use special alternatives such as "none of the above" or "all of the above" sparingly.
n. Do not use multiple choice items when other types are more appropriate.
o. Always have the stem and alternatives on the same page.
p. Break any of these rules when you have a good reason for doing so.
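Suggestion (l) above can be carried out mechanically: build an answer key in which each option letter appears an approximately equal number of times, then shuffle it. The sketch below is one hypothetical way to do this; the function name and parameters are illustrative, not prescribed.

Example (Python):

import random

def balanced_key(n_items, options="ABCD", seed=None):
    """Answer key where each letter appears about equally often, in random order."""
    rng = random.Random(seed)
    key = [options[i % len(options)] for i in range(n_items)]  # A, B, C, D, A, ...
    rng.shuffle(key)  # randomize the positions while keeping the balance
    return key

print("".join(balanced_key(20, seed=1)))
# For 20 items with four options, each of A-D appears exactly five times.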
C. Essay Type of Tests
a. Restrict the use of essay questions to those learning outcomes that cannot be satisfactorily measured by objective items.
b. Formulate questions that will call for the behavior specified in the learning outcomes.
c. Avoid the use of optional questions.
d. Indicate the approximate time limit or the number of points for each question.

Suggestions in Writing Non-Test or Attitudinal Instruments


1. Avoid statements that refer to the past rather than to the present.
2. Avoid statements that are factual or capable of being interpreted as factual.
3. Avoid statements that may be interpreted in more than one way.
4. Avoid statements that are irrelevant to the psychological object under consideration.
5. Avoid statements that are likely to be endorsed by almost everyone or by almost none.
6. Select statements that are believed to cover the entire range of the affective scale of interest.
7. Keep the language of the statements simple, clear and direct.
8. Statements should be short, rarely exceeding 20 words.
9. Each statement should contain only one complete thought.
10. Statements containing universals such as all, always, none and never often introduce
ambiguity and should be avoided.
11. Words such as only, just, merely, and others of similar nature should be used with care
and moderation in writing statements.

Activity 1. IDENTIFICATION.
Directions: Read each sentence carefully. Identify the term being described from the box below. Write your answer on the space provided before each number.

A. Criterion Validity
B. Reliability
C. Logical and Psychological Validity
D. Objectivity
E. Rational Validity
F. Validity
G. Usability/Practicability
H. Utility
I. Test-Retest
J. Linn & Miller

_____ 1. This depends upon professional judgment alone: the judgment of competent teachers, usually three or more experts in the field.
_____ 2. Validity established through subjective analysis of the test by experts in the field.
_____ 3. It refers to the degree to which a test measures what it intends to measure; it is the usefulness of the test for a given purpose. A valid test is always reliable.
_____ 4. The degree to which a test can be used by teachers and administrators without unnecessary waste of time, money, and effort.
_____ 5. The factor requiring that the content of the test serve definite needs in the pupils' activities.
_____ 6. It refers to the extent to which a test can be used to predict future success or achievement.
_____ 7. It refers to the consistency and accuracy of test results: the degree to which two or more forms of the test will yield the same results under uniform conditions.
_____ 8. The degree to which no personal judgment, opinion, or bias will affect the scoring of the test.
_____ 9. The method of determining reliability that provides a measure of stability.
_____ 10. The proponents who state that there are four types of validity that must be considered in assessment: content, construct, criterion, and consequential validity.

Academic Activity:
Matching Type: Match the descriptions in Column A with the terms in Column B. Write your answer on the space provided before each number.

Column A
_____ 1. Give the test once, then correlate the proportion/percentage of students passing and not passing a given item (measure of internal consistency).
_____ 2. Measure of stability.
_____ 3. Measure of equivalence.
_____ 4. Measure of both stability and equivalence.
_____ 5. Give a test once; score equivalent halves of the test.

Column B
A. Equivalent Forms
B. Split Half
C. Kuder-Richardson
D. Test-Retest
E. Test-Retest with Equivalent Forms

Life Activity: Share your thoughts.

Explain the stages in the development and validation of an assessment instrument.
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________

Assessment
MULTIPLE CHOICE
Directions: Read the questions carefully. Choose the letter of the correct answer. Write your
answer on the space provided before each number.

_____ 1. Which of the following is the first step in constructing test items?
A. prepare test item in advance
B. define instructional objectives clearly
C. prepare a table of specifications or blueprint
D. prepare more items than are actually needed

____ 2. In test standardization, when is a table of specification needed?


A. During the tryout of the test.
B. When norms have to be developed.
C. In planning and writing the test items.
D. When test results have to be interpreted.

____ 3. Which of the following statements is INCORRECT?


A. Write test items more than what is needed.
B. Write test items in advance of the testing date
C. Write test items that are appropriate to the level of the examinees.
D. Write test items whose answers could be interpreted differently by the experts.

____ 4. In the preliminary draft of a 100-item test, how many items should be prepared?
A. 10 B. 200 C. 125-150 D. 110

____ 5. A sixth grade teacher gave a Reading Test twice to the same class. In the first
administration, the bright students got high scores, while in the second administration the
mediocre students got high scores. Which of the following was lacking in the Reading Test?
A. Coherence B. Comprehensiveness C. Objectivity D. Reliability

____ 6. Which of the following characteristics of a good test is first given consideration?
A. Validity B. Reliability C. Administrability D. Usability

_____ 7. In a certain school, the majority of the pupils who got very high scores in the entrance test got very low grade point averages at the end of the school year. What type of validity does the entrance test lack?
A. Concurrent B. Curricular C. Predictive D. Construct

_____ 8. Which type of validity indicates the extent the test covers the content of the program
learned?
A. Predictive B. Construct C. Concurrent D. Curricular

_____ 9. When the papers of a group of students who took the test were corrected, the correctors had to use their personal judgment as to whether the students' answers to many items could be considered correct. What characteristic of the test had been overlooked?
A. Usability B. Objectivity C. Validity D. Reliability

_____ 10. It is simply defined as the numerical index of an individual's actual performance in a test. It is expressed in terms of the time taken to complete the task or the amount done successfully in a given time. Which of the following terms is being described?
A. Percentile B. Decile C. Range D. Score

Reflections: It’s time for you to share. What have you learned about the lesson?
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________

References:

Asaad, Abubakar S. Measurement and Evaluation: Concepts and Principles.

Cajigal, Ronan M., & Mantuano, Maria Leflor D. (2014). Assessment of Learning 2. Adriana Publishing Co., Inc.

De Guzman-Santos, Rosita. Advanced Methods in Educational Assessment and Evaluation (Assessment of Learning 2).

Linn, R. L., & Miller, M. D. (2005). Measurement and assessment in teaching (9th ed.). New Jersey: Pearson Education.

Nicol, D. J., & Macfarlane-Dick, D. (2006). Formative assessment and self-regulated learning: A model and seven principles of good feedback practice. Studies in Higher Education, 31(2), 199-218.

Popham, W. J. (2002). Modern Educational Measurement: Practical Guidelines for Educational Leaders. Needham, MA: Allyn and Bacon.

Rust, C., Price, M., & O'Donovan, B. (2003). Improving students' learning by developing their understanding of assessment criteria and processes. Assessment & Evaluation in Higher Education, 28(2), 147-164.

Sadler, R. (1989). Formative assessment and the design of instructional systems. Instructional Science, 18, 119-144.

SLRC Review Center. LET Reviewer Materials for Measurement and Evaluation.