Scoring and Scaling Educational Tests

Michael J. Kolen; Ye Tong; Robert L. Brennan

doi:10.1007/978-0-387-98138-3_3

Source

Statistics for Social and Behavioral Sciences > Statistical Models for Test Equating, Scaling, and Linking > Research Questions and Data Collection Designs > 43-58

Abstract

The numbers that are associated with examinee performance on educational or psychological tests are defined through the process of scaling. This process produces a score scale, and the scores that are reported to examinees are referred to as scale scores. Kolen (2006) used the term primary score scale, which is the focus of this chapter, for the scale that underlies the psychometric properties of a test.

A key component in the process of developing a score scale is the raw score for an examinee on a test, which is a function of the item scores for that examinee. Raw scores can be as simple as a sum of the item scores or be so complicated that they depend on the entire pattern of item responses.

Raw scores are transformed to scale scores to make scores more meaningful to test users. For example, raw scores might be transformed to scale scores so that they have predefined distributional properties for a particular group of examinees, referred to as a norm group. Normative information might be incorporated by constructing scale scores to be approximately normally distributed with a mean of 50 and a standard deviation of 10 for a national population of examinees. In addition, procedures can be used for incorporating content and score precision information into score scales.
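
As an illustration only, the following sketch shows one common way such a transformation could be carried out: summed raw scores are converted to midpoint percentile ranks in the norm group, and those ranks are mapped through the inverse normal distribution onto a scale with a mean of 50 and a standard deviation of 10. The function names, the use of midpoint percentile ranks, and the rounding to integers are assumptions made for this example; they are not taken from the chapter, which may use different procedures.

```python
from statistics import NormalDist


def raw_scores(item_scores):
    """Summed raw score per examinee (one list of item scores per examinee)."""
    return [sum(row) for row in item_scores]


def normalized_scale_scores(raw, mean=50.0, sd=10.0):
    """Build a raw-to-scale conversion table from a norm group's raw scores.

    Assumption for this sketch: the percentile rank of a raw score is the
    proportion of the norm group below it plus half the proportion at it.
    """
    n = len(raw)
    table = {}
    for x in sorted(set(raw)):
        below = sum(1 for r in raw if r < x)
        at = sum(1 for r in raw if r == x)
        p = (below + 0.5 * at) / n       # always strictly between 0 and 1
        z = NormalDist().inv_cdf(p)      # normal deviate with that rank
        table[x] = round(mean + sd * z)  # place on the mean-50, SD-10 scale
    return table


# Hypothetical norm group: nine examinees, four dichotomously scored items.
norm_group_items = [
    [0, 0, 0, 0],
    [1, 0, 0, 0], [0, 1, 0, 0],
    [1, 1, 0, 0], [1, 0, 1, 0], [0, 1, 1, 0],
    [1, 1, 1, 0], [1, 1, 0, 1],
    [1, 1, 1, 1],
]
norm_raw = raw_scores(norm_group_items)
conversion = normalized_scale_scores(norm_raw)
```

The resulting table assigns one scale score to each raw score observed in the norm group. An operational scale would also need rules for raw scores that do not occur in the norm group and, typically, some smoothing of the raw score distribution, neither of which is attempted here.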