For over fifty years, hundreds of colleges and universities throughout the country have evaluated the effectiveness of their courses and instructors by means of student ratings [2, 11, 16, 31, 59]. Despite the predominance of this approach to instructional evaluation, there is considerable confusion over the construction of rating instruments. Many of the instruments currently used by college administrators and instructors lack reliability and validity. This stems in part from the fact that, in many cases, the persons who assume or are charged with the task of evaluation have no training in psychometrics. Regardless of one's expertise in that area, however, the research on the topic offers little assistance. Studies of rating instruments in the educational and psychological literature are legion and methodologically diverse. What emerges from a survey of the literature is a set of issues and problems rather than a clear set of guidelines for instrument construction. These issues include the following: How should the domain of instructional characteristics be defined? What method of scaling is most appropriate? Is a graphic scale more effective than a numerical scale? Are seven-point scales more reliable than five-point scales? Should a neutral position be used? How should the items be generated? Are both item analysis and factor analysis necessary? The intent of this paper is to synthesize the fragmented research on student ratings in order to provide (1) a clarification of the issues and (2) definitive guidelines for constructing the instruments. The issues that will be examined are those that must be confronted by anyone who attempts to
Ronald A. Berk studied this question.