p.5,7 Reliability and Validity Assessment by Edward G. Carmines and Richard A. Zeller presents an elementary
and exceptionally lucid introduction to issues in measurement theory... RELIABILITY AND VALIDITY ASSESSMENT is merely the
first step toward understanding the complex issues of measurement in theoretical and applied research settings... The Carmines
and Zeller paper provides an excellent basis for understanding some of the more complex issues in measurement theory. - John
L. Sullivan, Series Editor
p.10 measurement is most usefully viewed as the "process of linking abstract concepts to empirical indicants"
(Zeller and Carmines, forthcoming) [JLJ - from Zeller and Carmines, Measurement in the Social Sciences: The Link Between
Theory and Data, 1980, p. 2], as a process involving an "explicit, organized plan for classifying (and often quantifying)
the particular sense data at hand - the indicants - in terms of the general concept in the researcher's mind" (Riley, 1963:
23).
p.10 From an empirical standpoint, the focus is on the observable response... Theoretically,
interest lies in the underlying unobservable (and directly unmeasurable) concept that is represented
by the response... Measurement focuses on the crucial relationship between the empirically grounded indicator(s) - that is,
the observable response - and the underlying unobservable concept(s)
p.12 But an indicator must be more than reliable if it is to provide an accurate representation
of some abstract concept. It must also be valid... An indicator of some abstract concept is valid to the extent that
it measures what it purports to measure... Indeed, strictly speaking, one does not assess the validity of an indicator
but rather the use to which it is being put.
p.17 "One validates, not a test, but an interpretation of data arising from
a specified procedure" (Cronbach, 1971: 447)... one validates not the measuring instrument itself but the measuring
instrument in relation to the purpose for which it is being used.
p.17 Nunnally has given a useful definition of criterion-related validity. Criterion-related validity, he
notes, "is at issue when the purpose is to use an instrument to estimate some important form of behavior that is external
to the measuring instrument itself, the latter being referred to as the criterion" (1978: 87). For example, one "validates"
a written driver's test by showing that it accurately predicts how well some group of persons can operate an automobile...
The operational indicator of the degree of correspondence between the test and the criterion is usually estimated
by the size of their correlation.
p.17-18 for some well-defined group of subjects, one correlates performance on the test with performance
on the criterion variable... Obviously the test will not be useful unless it correlates significantly with the criterion;
and similarly, the higher the correlation, the more valid is this test for this particular criterion.
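A minimal sketch of this correlation step in Python. The written-test scores, the road-test ratings, and the group of eight subjects are invented for illustration; only the use of a correlation coefficient as the operational indicator of criterion-related validity comes from the text.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Hypothetical data: written driver's-test scores and examiner ratings of
# actual on-road driving (the criterion) for the same group of subjects.
test_scores  = [62, 75, 81, 90, 55, 70, 88, 67]
road_ratings = [3.1, 3.8, 4.0, 4.6, 2.7, 3.5, 4.4, 3.2]

r = pearson_r(test_scores, road_ratings)
print(f"criterion-related validity coefficient r = {r:.2f}")
# The closer r is to 1.0, the more valid the written test is
# for this particular criterion (on-road driving performance).
```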
p.18 "if it were found that accuracy in horseshoe pitching correlated highly with success in college, horseshoe
pitching would be a valid measure for predicting success in college" (Nunnally, 1978: 88)." [JLJ - Nunnally implies that whatever
metric we find that correlates highly with a promising position, such as piece mobility, has potential for use in
the evaluation function of a computer chess program. If our metric correlates highly we can be more severe in our pruning.]
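Continuing the chess aside, one way this "validation" idea might be sketched in code: correlate a cheap metric (here, piece mobility) with an expensive criterion (a deep-search score) over a sample of positions, and let the strength of that correlation decide how severely to prune. It reuses pearson_r from the sketch above; the sample numbers, thresholds, and centipawn margins are all invented and only illustrate the idea, not any actual engine.

```python
def choose_pruning_margin(metric_scores, criterion_scores):
    """The more valid the cheap metric is for the deep-search criterion,
    the more severely we allow ourselves to prune on the metric alone."""
    r = pearson_r(metric_scores, criterion_scores)  # helper from the sketch above
    if r > 0.9:
        return 50    # highly valid proxy: prune aggressively
    if r > 0.7:
        return 150   # moderately valid: prune with caution
    return 400       # weak proxy: trust the search, not the metric
                     # (margins in centipawns, illustrative values only)

# Hypothetical sample: mobility counts and deep-search scores (in centipawns)
# measured on the same eight positions.
mobility    = [28, 35, 22, 40, 31, 18, 37, 25]
deep_scores = [15, 60, -20, 95, 40, -55, 70, 5]
print("pruning margin:", choose_pruning_margin(mobility, deep_scores))
```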
p.18 concurrent validity is assessed by correlating a measure and the criterion at the same point
in time... Predictive validity, on the other hand, concerns a future criterion which is correlated with the
relevant measure. Tests used for selection purposes in different occupations are, by nature, concerned with predictive
validity. Thus, a test used to screen applicants for police work could be validated by correlating their test scores with
future performance in fulfilling the duties and responsibilities associated with police work.
p.19 As we have seen, the logic underlying criterion validity is quite simple and straightforward. It has
been used mainly in psychology and education for analyzing the validity of certain types of tests and selection procedures.
It should be used in any situation or area of scientific inquiry in which it makes sense to correlate scores obtained on a
given test with performance on a particular criterion or set of relevant criteria.
p.20 Fundamentally, content validity depends on the extent to which an empirical measurement reflects a
specific domain of content... obtaining a content-valid measure of any phenomenon involves a number of interrelated steps.
First, the researcher must be able to specify the full domain of content that is relevant to the particular measurement situation...
Second, one must sample... from this collection since it would be impractical to include all [domain content] in a single
test.
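Those two steps lend themselves to a short sketch: enumerate the domain, then draw a manageable random sample of items for one test form. The placeholder item names, the domain size of 2,000, and the 40-item sample are assumptions made up for illustration.

```python
import random

# Step 1: specify the full domain of content relevant to the measurement
# situation (placeholder items standing in for, e.g., every word a spelling
# test could legitimately draw on).
full_domain = [f"item_{i:04d}" for i in range(1, 2001)]

# Step 2: sample from that collection, since including all of it in a
# single test would be impractical.
random.seed(0)  # fixed seed so the draw is reproducible
test_items = random.sample(full_domain, k=40)

print(f"domain size: {len(full_domain)}, items on this test form: {len(test_items)}")
```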
p.22-23 As Cronbach and Meehl observe, "Construct validity must be investigated whenever no criterion or
universe of content is accepted as entirely adequate to define the quality to be measured" (1955: 282). Construct validity
is woven into the theoretical fabric of the social sciences, and is thus central to the measurement of abstract theoretical
concepts.
p.23 Construct validation involves three distinct steps. First, the theoretical relationship between the
concepts themselves must be specified. Second, the empirical relationship between the measures of the concepts must be examined.
Finally, the empirical evidence must be interpreted in terms of how it clarifies the construct validity of the particular
measure.
It should be clear that the process of construct validation is, by necessity, theory-laden. Indeed,
strictly speaking, it is impossible to "validate" a measure of a concept in this sense unless there exists a theoretical
network that surrounds the concept. For without this network, it is impossible to generate theoretical predictions
which, in turn, lead directly to empirical tests involving measures of the concept.
p.27 The social scientist can assess the construct validity of an empirical measurement if the measure
can be placed in theoretical context. Thus, construct validation focuses on the extent to which a measure
performs in accordance with theoretical expectations. Specifically, if the performance of the measure is consistent
with theoretically derived expectations, then it is concluded that the measure is construct valid.
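The three steps of construct validation can likewise be sketched schematically, again reusing pearson_r from the first sketch. The theory (a predicted positive association between two concepts), the respondents' scores, and the 0.3 cut-off are all assumptions for illustration, not anything prescribed by the text.

```python
# Step 1: specify the theoretical relationship between the concepts --
# assume the theory predicts a positive association between concepts A and B.
predicted_sign = +1

# Step 2: examine the empirical relationship between the *measures* of the
# concepts (invented scores for the same set of respondents).
measure_a = [4, 7, 2, 9, 5, 6, 3, 8]
measure_b = [11, 18, 9, 22, 14, 15, 10, 20]
r = pearson_r(measure_a, measure_b)  # helper from the first sketch

# Step 3: interpret the evidence -- performance consistent with the
# theoretically derived expectation supports construct validity
# (the 0.3 threshold is an arbitrary illustrative cut-off).
consistent = (r > 0.3) if predicted_sign > 0 else (r < -0.3)
print(f"r = {r:.2f}; consistent with theoretical expectation: {consistent}")
```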