Braun Tibor, Schubert András (eds.): Peer Review in Scientific Research: Selected Papers from the Literature on the Topic (MTAK Informatics and Science Analysis Series 7, 1993)
ANGELO S. DENISI, W. ALAN RANDOLPH and ALLYN G. BLENCOE: Potential Problems with Peer Ratings
on 7-point scales for each category. Individual categories were combined, according to Bales' suggestion, to form two general dimensions of interaction: task and socioemotional behavior. Ratings for these general dimensions were computed by taking the average of each observer's ratings of all 12 categories for both tasks. These were taken to be the total set of observations (24) for that rater. This was done so that any reliability coefficients computed would be based on a reasonable number of observations. A separate reliability coefficient was then computed for each group by correlating the 24 ratings made by the two observers assigned to that group. Across the 34 groups, the reliability coefficients ranged from .81 to .95, with an average of .92.

2. Satisfaction. Group members were asked to assess their satisfaction with the participation and contribution of the other members, with the group solution, and their overall satisfaction with group members. Each item was rated on a 7-point scale, with higher ratings indicating greater satisfaction, and the responses to the four items were averaged to form a single measure of satisfaction. The internal consistency of this measure (coefficient alpha) was .84 and .87 for the two administrations, respectively.

3. Group Cohesiveness. Subjects completed a 4-item cohesiveness scale similar to that used by Terborg, Castore, and DeNinno (1976). Each item was rated on a 7-point scale, and the items were averaged to form a single measure of cohesiveness. The internal consistency of the scale (coefficient alpha) was .85 for both administrations.

4. Perceived Performance. Subjects rated their perception of the group's performance on a single 7-point scale, with higher ratings indicating more effective performance.

5. Peer Ratings.
Subjects rated the overall task performance of each of the other group members, by name, using a single 7-point rating scale (1 = very low; 7 = very high). The ratings given by a subject to peers were averaged separately for each task, and these average ratings served as a dependent measure. Ratings of group interaction were obtained from the two independent observers; all other measures came from the subjects themselves. Objective task performance was the actual dollar value of the cargo collected by the group on its route, expressed in points.

Conditions

Groups were randomly assigned to either positive (17 groups, n = 10) or negative (17 groups, n = 73) peer rating feedback conditions. After subjects completed peer ratings for the first task, an experimenter collected them and left the room, telling the subjects that their ratings would be averaged and returned to them. While the experimenter was absent, subjects were told to complete the other parts of the questionnaire (i.e., the satisfaction, cohesiveness, and perceived performance items) and were instructed not to discuss the peer ratings or the questionnaires. They were told that