Testing and item development have their own specialized terminology. This section introduces some of those terms, aiming to be accurate without being intimidating. Terms are clustered by concept rather than alphabetically.
- Attribute: One thing we need to remember is that we do not measure or test a person or object. We are seeking to measure or evaluate the attributes of that person or object. Those attributes may be easily observable to our five senses and therefore easy to measure, or they may be much more abstract and correspondingly more difficult to measure.
- Construct: A specific skill or body of knowledge to be evaluated that has been formulated (constructed) and defined. A construct is somewhat abstract. As such, it differs from a concrete, obvious attribute of a person or object, such as height, weight, or speed, which can be measured objectively, accurately, and with highly consistent results. We will use the terms construct, concept, and learning objective somewhat interchangeably.
- Learning Objective (LO): A specific goal of knowledge or skill. For example, “The student will be able to count successfully by tens from 10 to 100.”
- Construct-Irrelevant Variance (CIV): Variation in scores caused by factors unrelated to the concepts (constructs) being measured. It occurs when an item or test does not accurately measure what it is intended to measure because of errors in design or administration. For example, a student misinterprets an unclear item stem and selects the wrong response. CIV is one of the biggest threats to a test’s validity.
- Construct-irrelevant mistakes, concept-irrelevant errors, and similar synonymous phrases: Errors students make during testing that are not directly related to the subject matter being tested. These include misunderstanding directions, carelessness, mislabeling answers, improperly marking answer sheets, and other procedural errors. Note: unlike CIV (the previous entry), this term is not officially recognized, but it describes another category of testing error.
- Constructed-response / Selected-response: Test item formats require a student either to come up with an answer or to select one from a list. Formats such as short-answer (fill-in-the-blank) and essay questions are known as constructed-response formats. Formats that require the student to choose the correct or best answer(s) are known as selected-response formats. Examples of these are multiple choice, true/false, and matching.
- Item stem: The part of an item that states the problem and asks for a response.
- Item response (or option): One of the possible answers that a student must select from. The options typically include the correct or best answer and one or more incorrect responses, known as distractors.
- Norm referenced: A test that grades primarily by comparing a student’s performance with that of others in the student’s peer group. Scores are typically reported as percentile scores, grade equivalents, or stanines.
- Criterion referenced: A test that grades a student’s performance by comparing it with specific defined criteria, usually a list of concepts the student is expected to master. Scores are typically reported as percentage correct, pass/fail, or letter grades.
- Scoring Rubric: A set of guidelines for scoring items, especially constructed-response items, clearly stating what to look for and how much credit to assign to each aspect of the answer. For example, a student may be asked to create a pie chart for a provided set of data. A clear scoring rubric will define how the item’s total of 10 points is divided: how many points are awarded for correctly calculated percentages, how many for accurate piece sizes, how many for clear, accurate, and complete labeling, how many for neatness, and so on.
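
To make the scoring rubric entry more concrete, here is a minimal Python sketch of how the pie chart rubric might be written down and applied. The split of the 10 points across criteria (4, 3, 2, 1) is an assumption chosen for illustration, as are the names `Criterion`, `PIE_CHART_RUBRIC`, `total_points`, and `score_response`; the glossary itself only says that the item is worth 10 points in total.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One aspect of the answer and the most credit it can earn."""
    name: str
    max_points: int


# Hypothetical allocation: the source only states a total of 10 points.
PIE_CHART_RUBRIC = [
    Criterion("correct percentages", 4),
    Criterion("accurate piece sizes", 3),
    Criterion("clear, accurate, complete labeling", 2),
    Criterion("neatness", 1),
]


def total_points(rubric):
    """Return the maximum score available under the rubric."""
    return sum(criterion.max_points for criterion in rubric)


def score_response(rubric, awarded):
    """Add up awarded points, rejecting values outside a criterion's range.

    `awarded` maps criterion names to the points the scorer gave;
    a criterion missing from the mapping earns 0 points.
    """
    score = 0
    for criterion in rubric:
        points = awarded.get(criterion.name, 0)
        if not 0 <= points <= criterion.max_points:
            raise ValueError(
                f"{criterion.name}: {points} is outside 0..{criterion.max_points}"
            )
        score += points
    return score


if __name__ == "__main__":
    awarded = {
        "correct percentages": 4,
        "accurate piece sizes": 2,
        "neatness": 1,
    }
    print(score_response(PIE_CHART_RUBRIC, awarded), "of", total_points(PIE_CHART_RUBRIC))
```

Writing the rubric as explicit criteria with fixed maximums means two scorers start from the same definition of full credit, and an out-of-range score is caught rather than silently added to the total.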