Braun Tibor, Schubert András (eds.): Szakértői bírálat (peer review) a tudományos kutatásban [Peer review in scientific research]: Selected studies from the literature on the subject (MTAK Informatics and Science Analysis Series 7, 1993)
DOMENIC V. CICCHETTI: The Reliability of Peer Review for Manuscript and Grant Submissions: A Cross-Disciplinary Investigation
dictated by the reviews; (8) whether revised manuscripts are sent for further review to the same or different reviewers; and (9) whether authors have the right to challenge reviews or request re-reviews. These policy decisions are typically made in an ad hoc fashion, and editors have little guidance in establishing the policy practices that constitute the peer review process. Whereas research reviewed by Cicchetti addresses some policy practices, too little research has been done summarizing existing policy options or testing their effectiveness.

The design of review surveys: Multidimensional components and externally anchored scales. Several studies reviewed by Cicchetti considered ratings on specific multiple criteria in addition to overall ratings, and better agreement was found for these (e.g., attention to relevant literature and research design). Some disagreement among reviewers may arise from the way they weight the different components in determining their overall recommendation, or it may be limited to one particular aspect of the manuscript. Surprisingly little effort has been made, however, to determine the factorial structure of responses, or whether more reliable composites could be obtained by averaging the different subscales. Marsh and Ball (1989) developed a 21-item reviewer survey based on a content analysis of written critiques. Factor analysis of responses to this survey clearly identified four factors affecting the outcome of reviews: research methods, relevance to readers, writing style and presentation clarity, and significance or importance. Multitrait-multimethod analyses of agreement among multiple raters of the same manuscripts provided modest support for convergent validity and for the distinctiveness of the rating components, but they also indicated a substantial "halo effect" in the ratings by a given reviewer.
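The question of whether averaging subscales could yield more reliable composites can be sketched with the Spearman-Brown formula, which predicts the reliability of a composite formed from k parallel subscales. This is a minimal illustration, not an analysis from the commentary; the illustrative reliability value is assumed.

```python
def spearman_brown(r: float, k: int) -> float:
    """Predicted reliability of a composite score formed by averaging
    k parallel subscales, each with single-subscale reliability r."""
    return k * r / (1 + (k - 1) * r)

# A hypothetical subscale of modest reliability, averaged over four
# subscales (the number of factors Marsh and Ball's analysis identified):
print(spearman_brown(0.40, 4))  # composite reliability rises to ~0.73
```

The formula assumes the subscales' errors are independent, which is exactly what a halo effect undercuts: when one reviewer's ratings of the 21 items are correlated with one another, the predicted gain from averaging overstates the actual gain.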
It is interesting that halo effects associated with the overall recommendations were much smaller than those associated with responses to the 21 rating items (even though one of these was also an overall rating item). The explanation seemed to be that the response categories in the overall recommendation were much better anchored to concrete behaviors (e.g., accept as is or with slight revisions, reject outright) than the 9-point rating scale for the 21 items. Consequently, single-reviewer reliabilities based on various combinations of the 21 rating items were no higher than for the overall recommendation. The results suggest the potential usefulness of multidimensional rating scales, but they also point out the importance of having well-anchored response scales that minimize halo effects and response biases idiosyncratic to how each reviewer interprets the response scale. In discussing attempts to improve reviewer reliability, Marsh and Ball (1989) also noted that such proposals must be evaluated in terms of their likely impact on validity. For example, there are relatively objective characteristics on which reviewers could agree that are unrelated to manuscript quality. In addition, specific strategies may affect reliability and validity differentially. For example, editors are likely to send the same manuscript to reviewers having different perspectives, and this strategy may lower reliability but increase validity.

The process of peer review: Unanswered questions

Linda D. Nelson
Department of Psychiatry, Medical Center, University of California, Irvine, Orange, CA 92668

Peer review is essentially a classification system that involves both process (i.e., the activity that led to a decision) and outcome (i.e., the decision itself). Although Cicchetti states from the outset that a major objective of his study was to analyze the peer review process, outcome appears to be the focus of his target article.
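On the outcome side, agreement between two referees on categorical decisions such as accept, resubmit, or reject is commonly indexed with a chance-corrected statistic such as Cohen's kappa. The target article reports such agreement rates without giving formulas, so the following is a hedged sketch of the standard computation, with made-up illustrative data.

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters on the same items.

    ratings_a, ratings_b: equal-length sequences of categorical
    decisions, e.g. 'accept', 'resubmit', 'reject'.
    """
    assert len(ratings_a) == len(ratings_b) and ratings_a
    n = len(ratings_a)
    # Observed proportion of agreement.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's marginal base rates.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Two hypothetical referees agreeing on 8 of 10 submissions:
ref_1 = ['accept'] * 5 + ['reject'] * 5
ref_2 = ['accept'] * 4 + ['reject'] * 5 + ['accept']
print(cohens_kappa(ref_1, ref_2))  # 0.6: raw agreement 0.8, chance 0.5
```

Because both referees here use each category half the time, 50% agreement would be expected by chance alone, which is why kappa (0.6) is well below the raw agreement rate (0.8).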
Dependent variables (e.g., accept, reject, resubmit) were carefully examined in the context of his own and others' work, with the results clearly displayed in tabular form (e.g., rates of interrater agreement). His recommendations regarding the appropriate statistics for evaluating and determining standards of reliability added new and potentially useful information to the study of peer review. His conclusions regarding possible interactions between the nature of the discipline (e.g., general vs. diffuse) and acceptance rates were interesting and highlighted bases for differential outcomes in levels of agreement. Focusing part of his discussion on peer review as it relates to major funding sources (e.g., agreement on grant proposals by type of study and area of discipline) offered new interpretations regarding outcome to this important phenomenon. In short, the author is to be commended for his efforts in updating and expanding our understanding of peer review as it applies to manuscripts and grant proposals.

Rather than replacing outcome as a topic per se, I would have liked some additional discussion of the process involved in peer review. This point is important to me as a psychologist and researcher because it concerns the interplay between what a person thinks and what a person does. The author cites Kuhn (1962) as someone who stresses the importance of peer evaluation on scientific activity. It is noteworthy that Kuhn's (1962) remarks were used to support the importance of considering the relationship between process and outcome in psychotherapy 16 years later (Orlinsky & Howard 1978). To ignore the influence of process on outcome in peer review is to miss an important link between what (or how) individuals think and what leads them to certain conclusions.
Although Cicchetti touches on this in his discussion of reviewer bias, I am not certain from his presentation of supportive literature whether bias is actually an operative factor affecting outcome: One experiment used to support the role of author affiliation status in review outcome was soundly criticized (see commentaries on Peters & Ceci, 1982); another (Mahoney 1977) relied on a "qualitative analysis" of reviewers' remarks. Cicchetti further implies that peer review can engage referee/reviewer variables so powerful that evaluative criteria (e.g., adequacy of methodology) become secondary factors in decision making. He states: "On the basis of the best controlled studies of the peer review process to date, we are forced to conclude that referees do at times apply subjective criteria [that] cannot be described as 'fair,' 'careful,' 'tactful,' or 'constructive,' despite the fact that such traits are widely accepted as desirable characteristics of referees." The author then cites, as an example of this phenomenon, the increased likelihood of reviewers accepting manuscripts based on the type of results (e.g., positive) instead of a manuscript's overall worth (e.g., adequacy of methodology). The notion that unfair, subjective criteria may be imposed by journal "gatekeepers" is a provocative one. Even more unsettling is the contention by Mahoney (1977) that this phenomenon represents "confirmatory bias," wherein reviewers deem acceptable manuscripts that coincide with their beliefs and reject those that do not. Does this mean that papers tend to get published on the basis of statistical significance? Furthermore, are we to assume that reviewers tend to agree with any alternative hypothesis set forth in a paper, such that failure to reject null hypotheses is viewed as inconsistent with their beliefs?
Considering direction of results as an operative variable in peer review is further clouded by the fact that the main experiment used to justify the notion involved a direct manipulation of the very variable in question (Mahoney 1977). An investigator's choice of an independent variable, and Cicchetti's conclusions regarding its impact on the review process, represent subjective determinations, in this instance about which reviewer variables within the complex process of peer review are the prepotent predictors of outcome. As a journal reviewer, I would venture to say that, unfortunately, papers with