Educational Measurement & Evaluation
Prepared by:
Muhammad Naseer Khan
Subject Specialist English
E&SE Dept. Govt of AJ&K
WhatsApp: +923229310761
Part I
1. Item analysis focuses on finding out:
A. Facility index
B. Discrimination Power
C. Effectiveness of distractors
D. All of the above
Answer is: D
Item analysis is the act of analyzing student responses to individual exam
questions with the intention of evaluating exam quality. It is an important
tool to uphold test effectiveness and fairness. Item analysis is likely
something educators do both consciously and unconsciously on a regular
basis.
The facility index of a test item is the percentage of a group of testees
that chooses the correct response. What is it in a particular item which
determines whether it will be easy or difficult?
Discriminatory power measures the degree to which a test score varies
with the level of the measured trait, and thus reflects the effectiveness of
a test in detecting differences between participants on the respective
traits, or in discriminating between high achievers and low achievers.
An item distractor, also known as a foil or a trap, is an incorrect option for
a selected-response item on an assessment.
Distractor effectiveness: when distractors are obviously incorrect rather
than plausibly disguised, they become ineffective at assessing student
knowledge. An effective distractor attracts more test takers with a lower
overall score than test takers with a higher overall score.
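As a rough sketch of the indices above, the snippet below computes a facility index and tallies how often each option was chosen; the item key, option labels, and responses are invented for illustration.

```python
# Hypothetical data: 10 testees answering one multiple-choice item keyed "C".

def facility_index(responses, key):
    """Percentage of the group choosing the correct response."""
    return 100.0 * sum(1 for r in responses if r == key) / len(responses)

def distractor_counts(responses, options):
    """How often each option (key and distractors) was chosen."""
    return {opt: responses.count(opt) for opt in options}

responses = ["A", "C", "C", "B", "C", "D", "C", "C", "A", "C"]

print(facility_index(responses, "C"))         # 60.0 -> inside the 30-70 % band
print(distractor_counts(responses, "ABCD"))   # an option chosen by nobody is a weak distractor
```

An option that attracts no testees at all contributes nothing to the item and is a candidate for revision.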
2. Facility index of an item determines?
A. Ease or difficulty
B. Discrimination power
C. Objectivity
D. Reliability
Answer is = A
Answer is = B
4. A test item is acceptable when its facility index/difficulty level ranges
from?
A. 30-70 %
B. 70 %
C. 30%
D. None
Answer is =A
5. A test item is very easy when the value of its facility index/difficulty
level is higher than?
A. 30-70 %
B. 70 %
C. 30%
D. None
Answer is =B
6. A test item is very difficult when the value of its facility index/difficulty
level is less than?
A. 30-70 %
B. 70 %
C. 30%
D. None
Answer is =C
Answer is = A
Understanding Item discrimination
Item discrimination is the difference between the percentage correct for
these two groups. The maximum item discrimination difference
is 100 percent. This would occur if all those in the upper group answered
correctly and all those in the lower group answered incorrectly.
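The definition above can be sketched directly: subtract the lower group's proportion correct from the upper group's. The group sizes and counts below are made up.

```python
# A minimal sketch of the item discrimination index (upper-minus-lower method).

def discrimination_index(upper_correct, upper_n, lower_correct, lower_n):
    """D = p_upper - p_lower; ranges from -1 to +1."""
    return upper_correct / upper_n - lower_correct / lower_n

# All of the upper group right and all of the lower group wrong gives the
# maximum discrimination of 1 (a 100 % difference).
print(discrimination_index(20, 20, 0, 20))             # 1.0
# 15/20 upper vs 9/20 lower gives D = 0.30, the usual acceptance threshold.
print(round(discrimination_index(15, 20, 9, 20), 2))   # 0.3
```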
8. Test item discriminates 100% when its value for discrimination is?
A. 0.30 – 1
B. 1
C. 0.30
D. None
Answer is = B
9. A test item cannot discriminate between low achievers and high
achievers when its discrimination value is lower than?
A. 0.30 – 1
B. 1
C. 0.30
D. None
Answer is = C
10. The quality of test that measures “what it claims to measure” is?
A. Validity
B. Differentiability
C. Objectivity
D. Reliability
Answer is = A
Objectivity:
Objectivity is an important characteristic of a good test. It affects both
validity and reliability of test scores. Objectivity of a measuring
instrument means the degree to which different persons scoring the
answer script arrive at the same result. C.V. Good (1973) defines
objectivity in testing as “the extent to which the instrument is free from
personal error (personal bias), that is, subjectivity on the part of the
scorer”.
Reliability:
The dictionary meaning of reliability is consistency, dependability or trustworthiness.
So in measurement reliability is the consistency with which a test yields
the same result in measuring whatever it does measure. A test score is
called reliable when we have reason to believe that the score is stable
and trustworthy.
a. practice effects.
b. alternate forms.
c. order effects.
d. parallel forms.
Answer is: A
There are four main types of reliability. Each can be estimated by
comparing different sets of results produced by the same method.
Reliability
Test-retest reliability
Test-retest reliability measures the consistency of results when you
repeat the same test on the same sample at a different point in time. You
use it when you are measuring something that you expect to stay
constant in your sample.
A test of colour blindness for trainee pilot applicants should have high
test-retest reliability, because colour blindness is a trait that does not
change over time.
Interrater reliability
Interrater reliability (also called interobserver reliability) measures the
degree of agreement between different people observing or assessing
the same thing. You use it when data is collected by researchers assigning
ratings, scores or categories to one or more variables.
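A small sketch of inter-rater agreement for two raters scoring the same set of items: raw percent agreement alongside Cohen's kappa, which discounts the agreement expected by chance. The ratings are invented.

```python
from collections import Counter

def percent_agreement(r1, r2):
    """Proportion of items on which the two raters gave the same rating."""
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)

def cohens_kappa(r1, r2):
    """Chance-corrected agreement: (p_o - p_e) / (1 - p_e)."""
    n = len(r1)
    po = percent_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    pe = sum(c1[k] * c2[k] for k in c1) / n ** 2   # agreement expected by chance
    return (po - pe) / (1 - pe)

rater1 = ["pass", "pass", "fail", "pass", "fail", "pass"]
rater2 = ["pass", "fail", "fail", "pass", "fail", "pass"]

print(round(percent_agreement(rater1, rater2), 2))   # 0.83
print(round(cohens_kappa(rater1, rater2), 2))        # 0.67
```

Kappa is lower than raw agreement because some matching ratings would occur even if both raters guessed.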
Parallel forms reliability
Parallel forms reliability measures the correlation between two
equivalent versions of a test. You use it when you have two different
assessment tools or sets of questions designed to measure the same
thing.
Internal consistency
Internal consistency assesses the correlation between multiple items in a
test that are intended to measure the same construct.
Split-half reliability: You randomly split a set of measures into two sets.
After testing the entire set on the respondents, you calculate the
correlation between the two sets of responses.
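The split-half procedure just described can be sketched as: split the items into odd and even halves, correlate the half scores (Pearson r), then apply the Spearman-Brown correction to estimate full-length reliability. The 0/1 item scores below are made up.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def split_half_reliability(item_scores):
    """item_scores: one row of 0/1 item scores per examinee."""
    odd = [sum(row[0::2]) for row in item_scores]    # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]   # items 2, 4, 6, ...
    r_half = pearson(odd, even)
    return 2 * r_half / (1 + r_half)                 # Spearman-Brown correction

scores = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 0],
]
print(round(split_half_reliability(scores), 2))      # 0.94
```

The Spearman-Brown step is needed because the raw correlation reflects the reliability of only half a test, and shorter tests are less reliable.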
12. When a test developer gives the same test to the same group of test
takers on two different occasions, he/she can measure
a. internal consistency.
b. test-retest reliability.
c. split-half reliability.
d. validity.
Answer is: B
13. As a rule, adding more questions to a test that measures the same
trait or attribute _________ the test’s reliability.
a. can decrease
b. can increase
c. does not affect
d. lowers
Answer is: B
14. The greatest danger when using alternate/parallel forms is that the:
Answer is: C
Equivalent (parallel) forms Two or more forms of a test covering the
same content whose item difficulty levels are similar.
15. What is the amount of consistency among scorers’ judgements
called?
A. Internal reliability.
B. Interrater reliability.
C. Test-retest reliability.
D. Intrascorer reliability.
Answer is: B
Intra-rater reliability is the degree of agreement among repeated
administrations of a diagnostic test performed by a single rater.
16. Which of the following is NOT one of the four types of reliability?
A. Inter-rater
B. Test-retest
C. Predictive
D. Parallel forms
Answer is: C
17. In a study of children’s social behavior, what type of reliability would be
considered important?
A. Inter-rater
B. Test-retest
C. Predictive
D. Parallel forms
Answer is: A
18. Measurement reliability refers to the:
A. consistency of the scores.
B. dependency of the scores.
C. comprehensiveness of the scores.
D. accuracy of the scores.
Answer is: A
19. If a measure is consistent over multiple occasions, it has:
A. Inter-rater reliability
B. Test-retest reliability
C. Predictive validity
D. Parallel forms reliability
Answer is: B
20. The validity of a measure refers to the:
A. consistency of the measurement.
B. accuracy with which it measures the construct.
C. particular type of construct specification.
D. comprehensiveness with which it measures the construct.
Answer is: B
The four types of validity
Validity tells you how accurately a method measures something. If a
method measures what it claims to measure, and the results closely
correspond to real-world values, then it can be considered valid. There
are four main types of validity:
Construct validity: Does the test measure the concept that it’s
intended to measure?
Content validity: Is the test fully representative of what it aims to
measure?
Face validity: Does the content of the test appear to be suitable to
its aims?
Criterion validity: Do the results correspond to a different test of the
same thing?
Answer is: A
External validity is the validity of applying the conclusions of a scientific
study outside the context of that study. In other words, it is the extent to
which the results of a study can be generalized to and across other
situations, people, stimuli, and times.
Internal validity is the extent to which a piece of evidence supports a
claim about cause and effect, within the context of a particular study. It is
one of the most important properties of scientific studies, and is an
important concept in reasoning about evidence more generally.
21. Alternate-form reliability is also known as:
A. Split-half reliability
B. Test-retest reliability
C. Parallel forms reliability
D. Convergent reliability
Answer is: C
Answer is = B
23. If the scoring of the test is not affected by any factor, the quality of
the test is called?
A. Validity
B. Differentiability
C. Objectivity
D. Reliability
Answer is = C
24. The quality of a test to give the same scores when administered on
different occasions is?
A. Validity
B. Differentiability
C. Objectivity
D. Reliability
Answer is = D
25. If the sample of questions in the test is sufficiently large, the quality
of the test is?
A. Adequacy
B. Differentiability
C. Objectivity
D. Reliability
Answer is = A
Part II
Answer is = A
2. The split-half method is used as a test of
A. Stability
B. Internal reliability
C. Inter-observer consistency
D. External validity
Answer: B
Answer is: A
Answer is: B
Answer is: A
Answer is = B
Test:
A test or quiz is used to examine someone's knowledge of something to
determine what he or she knows or has learned. Testing measures the
level of skill or knowledge that has been reached.
Measurement:
Educational measurement is the science and practice of obtaining
information about characteristics of students, such as their knowledge,
skills, abilities, and interests. Measurement is the process of assigning
numbers to events based on an established set of rules.
1. Diagnostic Testing
This testing is used to “diagnose” what a student knows and does not
know. Diagnostic testing typically happens at the start of a new phase of
education, like when students will start learning a new unit. The test covers
topics students will be taught in the upcoming lessons.
Teachers use diagnostic testing information to guide what and how they
teach. For example, they will plan to spend more time on the skills that
students struggled with most on the diagnostic test. If students did
particularly well on a given section, on the other hand, they may cover that
content more quickly in class. Students are not expected to have mastered
all the information in a diagnostic test.
Diagnostic testing can be a helpful tool for parents. The feedback children
receive on these tests lets parents know what kind of content will be
covered in class and helps them anticipate which skills or areas their
children may have trouble with.
2. Formative Assessment
3. Summative Assessment
Summative assessment is aimed at assessing the extent to which the
most important outcomes at the end of the instruction have been
reached. But it measures more: the effectiveness of learning, reactions
to the instruction, and the benefits on a long-term basis. The long-term
benefits can be determined by following students who attend your
course, or test. You are able to see whether and how they use the
learned knowledge, skills and attitudes.
4. Benchmark Testing
This testing is used to check whether students have mastered a unit of
content. Benchmark testing is given during or after classroom work on a
section of material, and covers either part or all of the content that has
been taught up to that time. The assessments are designed to let teachers
know whether students have understood the material that’s been covered.
6. Norm-referenced assessment
This compares a student’s performance against an average norm. This
could be the average national norm for the subject History, for example.
Another example is when the teacher compares the average grade of his
or her students against the average grade of the entire school.
7. Ipsative assessment
8. Self Assessment
Self-assessment is defined as 'the involvement of learners in making
judgements about their achievements and the outcomes of
their learning' and is a valuable approach to supporting student learning.
9. Confirmative assessment
When your instruction has been implemented in your classroom, it’s still
necessary to take assessment. Your goal with confirmative assessments is
to find out if the instruction is still a success after a year, for example, and
if the way you're teaching is still on point. You could say that a
confirmative assessment is an extensive form of a summative
assessment.
Exams
Portfolios
Final projects
Standardized tests
Summative assessments
Norm-referenced assessments
Criterion-referenced assessments
11. Assessment for learning
Answer is = C
Answer is = B
Answer is = B
Answer is = A
Answer is = B
Answer is = A
Answer is = A
Answer is = D
Answer is = C
Answer is = D
Answer is = A
Answer is = D
Answer is = B
Answer is = C
21. Objective type questions have an advantage over essay type questions
because they:
A. Are easy to prepare
B. Are easy to solve
C. Are easy to mark
D. None
Answer is = C
22. In multiple choice items the stem of the items should be?
A. Large
B. Small
C. Meaningful
D. None
Answer is = C
Answer is = A
An anecdotal record is a detailed descriptive narrative recorded after a
specific behavior or interaction occurs. Anecdotal records inform
teachers as they plan learning experiences, provide information to
families, and give insights into identifying possible developmental delays.
Answer is = B
1. The median of 7, 6, 4, 8, 2, 5, 11 is
A. 6
B. 12
C. 11
D. 4
Answer A
Median is a statistical measure that determines the middle value of a
dataset listed in ascending order (i.e., from smallest to largest value).
How to Find the Median?
The median can be easily found. In some cases, it does not require any
calculations at all. The general steps of finding the median include:
1. Arrange the data in ascending order (from the lowest to the largest
value).
2. Determine whether there is an even or an odd number of values in
the dataset.
3. If the dataset contains an odd number of values, the median is the
central value that splits the dataset into halves.
4. If the dataset contains an even number of values, find the two
central values that split the dataset into halves, then calculate the
mean of those two values. That mean is the median of the dataset.
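The steps above can be sketched in code; the first call uses the dataset from question 1.

```python
def median(values):
    """Middle value of a dataset, after sorting in ascending order."""
    s = sorted(values)                    # step 1: arrange in ascending order
    n = len(s)
    mid = n // 2
    if n % 2 == 1:                        # odd count: one central value
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2      # even count: mean of the two central values

print(median([7, 6, 4, 8, 2, 5, 11]))    # 6
print(median([7, 6, 4, 8, 2, 5]))        # 5.5
```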
Answer is: C
Answer is: D
4. The average of all observations in a set of data is known as
A. median
B. range
C. mean
D. mode
Answer is: C
Answer is = A
Answer is = D
Answer is = D
Answer is = D
Answer is = A
Answer is = D
Answer is = A
Answer is = A
Kuder-Richardson Formula 20, or KR-20, is a measure of reliability for a test
with binary variables (i.e. answers that are right or wrong). Reliability
refers to how consistent the results from the test are.
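A minimal KR-20 sketch on invented 0/1 scores: KR-20 = k/(k-1) x (1 - sum(p*q)/variance), where p is each item's proportion correct, q = 1 - p, and the variance is the population variance of total scores.

```python
def kr20(item_scores):
    """KR-20 reliability for right/wrong (0/1) item data."""
    k = len(item_scores[0])                             # number of items
    n = len(item_scores)                                # number of examinees
    totals = [sum(row) for row in item_scores]
    mean_t = sum(totals) / n
    var_t = sum((t - mean_t) ** 2 for t in totals) / n  # population variance
    pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_scores) / n      # proportion correct on item j
        pq += p * (1 - p)
    return k / (k - 1) * (1 - pq / var_t)

scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(scores), 2))   # 0.8
```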
Answer is = A
A table of specification is a chart that gives a graphic representation of
the content of a course or curriculum and its educational
outcomes/objectives, mapped to the levels of Bloom's taxonomy with
their weightage. Methods of instruction and the assessment plan are
added keeping in mind the content, learning outcomes, weightage, and
time spent on instruction.
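As an illustration of how such a chart might be built, the sketch below crosses hypothetical content-area weightages (time spent) with Bloom-level weightages to allocate 40 items; all names and numbers are invented.

```python
# Hypothetical weightages; each group of weights sums to 1.0.
content = {"Measurement": 0.30, "Reliability": 0.40, "Validity": 0.30}
bloom = {"Knowledge": 0.50, "Comprehension": 0.30, "Application": 0.20}
total_items = 40

# Each cell of the table gets its proportional share of the total item count.
table = {
    area: {level: round(total_items * wa * wb) for level, wb in bloom.items()}
    for area, wa in content.items()
}

for area, row in table.items():
    print(area, row)
```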
14. A “table of specification” helps in?
A. Test development
B. Test Construction
C. Test Administration
D. Test Scoring
Answer is = A
Answer is = D
Answer is = D
Answer is = B
18. The item in the column for which a match is sought is?
A. Premise
B. Response
C. Distractor
D. None
Answer is = A
Answer is = B
Answer is = C
Answer is = A
22. The incorrect options in M.C.Q are?
A. Answer
B. Premise
C. Response
D. Distractor
Answer is = D
23. The type of essay item in which contents are limited is?
A. Restricted Response Questions
B. Extended Response Questions
C. Matching items
D. M.C.Q items
Answer is = A
Answer is = B
Answer is = A
26. Which one is not a type of test by purpose?
A. Standardized Test
B. Essay Type Test
C. Criterion Referenced Test
D. Norm referenced test
Answer is = B
Answer is = C
Answer is =A
Answer is: A
30. What is an interview called when there is more than one interviewee?
A. Group interview
B. Panel interview
C. Structural interview
D. Focused interview
Answer is: A
31. The planned interview is:
A. Group interview
B. Panel interview
C. Structural or structured interview
D. Focused interview
Answer is: C
Answer is: D
33. Which type of test tends to have lower reliability?
A. True false
B. Completion
C. Matching
D. Essay
Answer is: D
34. Most of the tests used in our schools are:
A. Intelligence tests
B. Achievement tests
C. Aptitude tests
D. Personality tests
Answer is: B
Answer is: D
Part IV
Answer is: A
Answer is: B
Answer is: A
5. The most comprehensive term used in the process of educational
testing is called:
A. Test
B. Interview
C. Measurement
D. Evaluation
Answer is: D
6. Process of quantifying given traits, achievement or performance of
someone is called:
A. Test
B. Measurement
C. Assessment
D. Evaluation
Answer is: B
7. A collection of procedures used to collect information about students’
learning progress is called:
A. Measurement
B. Assessment
C. Evaluation
D. All of the above
Answer is: B
8. The process of collection, synthesis, and interpretation of information
to aid the teacher in decision making is called:
A. Test
B. Measurement
C. Assessment
D. Evaluation
Answer is: D
9. Test items in which examinees are required to select one out of two
options in response to a statement are called:
A. Multiple choices
B. Matching items
C. Alternate response items
D. Restricted response items
Answer is: C
ALTERNATIVE-RESPONSE TEST (true/false test) - Consists of declarative
statements that the student is asked to mark true or false, right or wrong,
correct or incorrect, yes or no, fact or opinion, agree or disagree or the
like.
10. A brief written response is required in:
A. Short answer type items
B. Restricted response items
C. Extended response items
D. Completion type items
Answer is: A
11. Topics of limited scope are assessed by:
A. Short answer type items
B. Restricted response items
C. Extended response items
D. Completion type items
Answer is: B
12. Tests developed by a team of experts are termed as:
A. Teacher made tests
B. Standardized tests
C. Board tests
D. Published tests
Answer is: B
13. A standardized achievement test has definite unique features,
including:
A. A fixed set of items
B. Specific directions for administration and scoring the test
C. Answer keys
D. All of the above
Answer is: D
14. High technical quality is assured in:
A. Teacher made tests
B. Standardized tests
C. Achievement tests
D. Published tests
Answer is: B
15. The test designed to measure the number of items an individual can
attempt correctly in a given time is referred to as which type of test?
A. Power
B. Supply
C. Achievement
D. Speed
Answer is: D
16. An aptitude test measures:
A. Overall mental ability
B. Attained ability
C. Present attainment
D. Potential Ability
Answer is: D
Potential Ability: the maximum level to which a person's current ability
can ever rise, and therefore how good they can possibly become.
17. Quality testing in education is only possible by using:
A. Achievement test
B. Intelligence test
C. Aptitude test
D. Standardized achievement test
Answer is: D
18. A test designed to know the students’ position in a group is called:
A. Criterion referenced
B. Norm referenced
C. Achievement
D. Aptitude
Answer is: B
19. The score of a student in a paper is:
A. Test
B. Measurement
C. Evaluation
D. All
Answer is: B
20. A test answers the question:
A. How much
B. How many
C. How well
D. All of the above
Answer is: C
21. Measurement answers the question:
A. How much
B. How many
C. How well
D. All of the above
Answer is: A
22. Which of the following is not a formal assessment?
A. Assignment
B. Paper
C. Quiz
D. Discussion
Answer is: D
23. Which of the following is not an informal assessment?
A. Assignment
B. Observation
C. Discussion
D. All of the above
Answer is: A
24. Prerequisite skills needed by students to succeed in a unit or course
are evaluated by:
A. Placement assessment
B. Formative assessment
C. Diagnostic evaluation
D. Summative evaluation
Answer is: C
25. Grades in assessment are:
A. Provide data for parents on their children’s progress
B. Certify promotional status and graduation
C. Serve as an incentive to do school lesson
D. All of these
Answer is: D
26. Your principal has asked that you create a chart showing that all of
the quiz items are related to the Sunshine State Standards. What type of
validity is he investigating?
A. Content
B. Construct
C. Criterion
D. All of the above
Answer is: A
Construct validity is "the degree to which a test measures what it claims,
or purports, to be measuring."
Content validity assesses whether a test is representative of all aspects of
the construct. To produce valid results, the content of a test, survey or
measurement method must cover all relevant parts of the subject it aims
to measure.
Criterion validity is an estimate of the extent to which a measure agrees
with a gold standard (i.e., an external criterion of the phenomenon being
measured). The major problem in criterion validity testing, for
questionnaire-based measures, is the general lack of gold standards.
27. A teacher created two forms of the final exam so that students sitting
next to each other could not look at their neighbor's test. What sort of
reliability evidence might she gather to make sure they are equal
assessments?
A. Alternate form reliability
B. Internal consistency
C. Interrater reliability
D. Test re test reliability
Answer is: A
28. Ms. Smith asked Mr. Jones to review the questions on her social
studies quiz. What type of measure was she worried about?
A. Alternate form reliability
B. Internal consistency
C. Construct Reliability
D. Content validity
Answer is: D
29. What is the primary purpose of assessments?
A. Inform parents of student progress
B. Provide feedback to help students succeed
C. Allow schools to compare progress
D. Enable teachers to test strategies
Answer is: B
30. Are all assessments tests?
A. Yes
B. No
Answer is: B
31. Which type of assessment has students conducting research and
experiments in the field?
A. Authentic assessment
B. Summative assessment
C. Formative assessment
D. Performance Assessment
Answer is: A
Authentic assessment is the idea of using creative learning experiences to
test students' skills and knowledge in realistic situations. Authentic
assessment measures students' success in a way that's relevant to the
skills required of them once they've finished your course or degree
program.
32. Which assessment requires students to demonstrate their knowledge
through performing specific tasks?
A. Authentic assessment
B. Summative assessment
C. Formative assessment
D. Performance Assessment
Answer is: D
33. Which assessment is it when you apply what you have just learned?
A. Authentic assessment
B. Summative assessment
C. Formative assessment
D. Performance Assessment
Answer is: A
34. A process to identify students’ learning styles and learning difficulties
in order to enhance their learning is called
A. Assessment
B. Evaluation
C. Measurement
D. Test
Answer is: A
35. Use of unfamiliar vocabulary or sophisticated terms is a good tip for
making a good test question.
A. True
B. False
C. Can’t say
D. None of the above
Answer is: B
36. Predict how well a student is likely to do in a certain school subject
A. Diagnostic tests
B. Prognostic tests
C. Norm referenced tests
D. Criterion referenced tests
Answer is: B
Part V
Answer is: D
Verbal reasoning is the ability to understand and logically work through
concepts and problems expressed in words. Verbal reasoning tests tell
employers how well a candidate can extract and work with meaning,
information and implications from text.
2. The tests which use pictures or symbols are termed as:
A. Performance tests
B. Ability tests
C. Non verbal tests
D. Verbal tests
Answer is: C
Non-verbal tests are also called diagrammatic or abstract reasoning tests.
Non-verbal reasoning involves the ability to understand and analyze visual
information and to solve problems using visual reasoning.
Which of the following are projective techniques or personality
assessment techniques?
A. Indirect open ended questioning
B. Semi structured interview
C. Both of the above
D. None of the above
Answer is: C
Which of the following are the tools of psychological assessment?
A. Portfolio
B. Case history
C. Behavioral observations
D. All of the above
Answer is: D
Overt and Covert behaviour tests are included in the category of:
A. Aptitude tests
B. Achievement tests
C. School Tests
D. Psychological tests
Answer is: D
Overt and Covert Behaviour:
Psychologists often classify behaviors into two categories: overt and
covert. Overt behaviors are those which are directly observable, such as
talking, running, scratching or blinking. Covert behaviors are those which
go on inside the skin. They include such private events as thinking and
imagining.
Following is the type of a Personality Test:
A. Structured
B. Projective
C. Measured
D. Both A and B
Answer is: D
Answer is: A
Answer is: D
The degree to which test items correlate with each other is called:
A. Parallel or alternate form reliability
B. Inter-scorer consistency
C. Split Half method
D. Inter-item Consistency
Answer is: D
Answer is: A
Answer is: A
The greater the number of reliable test items, the higher the reliability
will be:
A. True
B. False
Answer is: A
Answer is: D
Items for which equally able persons from different cultural groups have
different probabilities of success is called:
A. Item difficulty
B. Item differentiability
C. Item validity
D. Item Bias
Answer is: D
Binet Scales and Wechsler Scales are the type of test tools for the
measurement of:
A. Intelligence
B. Performance
C. Skills
D. Knowledge
Answer is: A
The Stanford–Binet Intelligence Scale is now in its fifth edition (SB5) and
was released in 2003. It is a cognitive ability and intelligence test that is
used to diagnose developmental or intellectual deficiencies in young
children. The test originated in France and was later revised in the United
States.
The Binet Scales
Alfred Binet was directed by the French government to develop a test for
identifying schoolchildren with intellectual disabilities who needed special
instruction; the resulting 1905 scale is considered the first intelligence test.
Increase
Decrease
Both A&B
None of these
Answer is: A
Answer is: A
A. Mean
B. Mode
C. Range
D. Quartiles
Answer is: C
A. Test developing
B. Test administration
C. Test scoring
D. Test reporting
Answer is: A
Answer is: D
Answer is: D
A good distractor is one which:
Answer is: B
Answer is: D
A. Learning
B. Effort
C. Achievement
D. Knowledge
Answer is: C
A. Test item
B. Scores
C. Interpretation
D. Performance
Answer is: B
Answer is: C
In selected response items, students choose a response provided by the
teacher or test developer, rather than construct one in their own words
or by their own actions. Selected response items do not require that
students recall information, but only that they recognize the
correct answer.
Answer is: A
Answer is: C
Answer is: B
Answer is: C
A good item must have ___________ distractors.
A. High appealing
B. Low appealing
C. No appealing
D. All of the above
Answer is: A
Which of the following is the quality of a good stem in an item?