Braun Tibor, Schubert András (eds.): Szakértői bírálat (peer review) a tudományos kutatásban: Válogatott tanulmányok a téma szakirodalmából [Peer review in scientific research: Selected papers from the literature on the subject] (MTAK Informatics and Science Analysis Series 7, 1993)

DOMENIC V. CICCHETTI: The Reliability of Peer Review for Manuscript and Grant Submissions: A Cross-Disciplinary Investigation

Table 6. The fate of 112 Journal of Abnormal Psychology manuscripts receiving three reviews (1973-1977)

Reviewer recommendation        Number of      Editorial decision     Percentage
                               manuscripts    Accept     Reject      accepted
3 "Accept"                           5            5          0          100.0
2 "Accept," 1 "Reject"              33           24          9           72.7
1 "Accept," 2 "Reject"              65           13         52           20.0
3 "Reject"                           9            0          9            0.0
Total                              112           42         70           37.5

Note. "Accept" = "Accept/As Is" or "Accept/Revise"; "Reject" = "Resubmit" or "Reject."

directive of securing more expert and even-handed reviews and better serving our readers by attracting a larger share of the important papers in this area. To date, the experiment has been working smoothly, and we sense an improvement in relations with the community of particle theorists. This is reflected in increased submissions and publication in this area: Submissions for 1985 and 1986 were about 35% above the 1984 level, and the numbers of published papers were about 45% above the 1984 level (reflecting also a moderately increased acceptance rate).

4. Improving the reliability and validity of peer review

4.1. Rationale, unifying concepts. The thoughtful, thought-provoking, and conceptually sound ideas of Kraemer seem especially valuable at this point in the exposition. Her very special talent for accurately relating pure mathematical reasoning to the flawed world of clinical reality is demonstrated once again, and the field of peer review will be the richer for it. Hers is a well-reasoned commentary on the subtle interplay between issues of reliability and validity. I share her view on the importance of improving reliability but not at the expense of validity and, conversely, of improving validity without compromising reliability; I agree that both can be accomplished. Kraemer's most valuable contribution is a theoretically and empirically sound framework in which more specific ideas for improving the quality of peer review can be better classified, integrated, and examined critically.

Take Kraemer's first comment about editors' basic need to select reviewers with varying degrees of expertise to achieve a balanced and comprehensive review. Her conclusion that "maximal validity" is achieved when errors are independent and the editor uses as many reliable reviewers as possible is correct. Moreover, if one evaluates this proposition in a cross-disciplinary or cross-specialty sense, its meaning can be further elucidated. For example, in the general subspecialty fields of sociology, psychology, medicine, and physics, it would be essential to select reviewers for their area of expertise (e.g., content specialist, biostatistician, biochemist) to increase the validity of the review. However, this careful selection of reviewers (weeding out the biased and nondiscriminating) should also increase significantly the reliability of the peer review process, at least for the overall reviewer recommendation. (See also the commentaries of Kraemer and Lock.)

Now if one considers the same issue for specific specialties of sociology, psychology, medicine, or physics, the community of peers may be so well defined that one could select reviewers randomly. This might obviate the nonrandom and unwitting selection of potentially biased reviewers. In summary, a selection process that would be a disaster for a general area may be just what is required in a more specific subspecialty area.
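To make the statistical rationale behind Kraemer's point concrete, the sketch below (not part of the original text) applies the standard Spearman-Brown prophecy formula for the reliability of an average of k parallel ratings with independent errors; the single-reviewer reliability of 0.3 is an assumed, illustrative value, not a figure from the article.

```python
# Illustrative sketch only: Spearman-Brown reliability of a pooled judgment.
# r1 = 0.3 is an assumed single-reviewer reliability, not data from the text.

def pooled_reliability(r1: float, k: int) -> float:
    """Reliability of the mean of k independent, equally reliable ratings."""
    return k * r1 / (1 + (k - 1) * r1)

if __name__ == "__main__":
    r1 = 0.3
    for k in (1, 2, 3, 5, 10):
        print(f"{k:2d} reviewer(s): pooled reliability = {pooled_reliability(r1, k):.2f}")
```

Under this assumption, the dependability of the pooled judgment roughly doubles in going from one reviewer to three or five, which is one way of reading the advice to use as many reliable, independent reviewers as practicable.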
A second unifying matter that Kraemer suggests may not be aimed at improving the reliability or the validity of peer review is the delicate and sensitive problem of striking a meaningful balance between committing Type I (alpha) errors (accepting flawed submissions) and Type II (beta) errors (rejecting nonflawed submissions). Kraemer and I agree that Type II error needs to be reduced, but not at the expense of magnifying Type I error. Again, if Kraemer's broad framework is extended to specific as well as more general areas of inquiry, her ideas dovetail nicely with those of Cole, whose theoretically meaty commentary uses the Type I-Type II distinction to explain differences in acceptance and rejection rates for the social and natural sciences.

Cole argues that behavioral scientists prefer to make Type II errors, whereas natural scientists prefer to make Type I errors. There is certainly evidence for this hypothesis. Thus, the data presented in section 3 show quite clearly that when editors of major general journals in psychology are faced with a split review, they overwhelmingly opt for rejection. Moreover, Lock (1985) shows that the same phenomenon is at work for editors of general medical journals. The much higher rejection rates for general areas in physics itself, however, seem to indicate that the phenomenon is even broader than Cole suggests. Perhaps the idea needs to be amended: Editors in general areas across and within fields of inquiry desire to avoid Type I errors, whereas their specialized counterparts try to avoid Type II errors. Consistent with this view, Roediger, a former editor of a psychology specialty journal, recommends the "when in doubt, accept" philosophy for editorial decisions on manuscripts receiving split reviews.

There appears to be a concerted effort on the part of general-focus scientists to set criteria for deciding just how to achieve the best balance between Type I and Type II errors. It is especially important to accomplish this because it is a well-known biostatistical fact that one can avoid a Type II error by simply increasing sufficiently the
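As a check on the split-review pattern just cited, the following sketch (not from the original article) recomputes the editorial acceptance percentages in Table 6 from its raw counts; only the counts come from the table, while the layout and variable names are mine.

```python
# Re-deriving the Table 6 percentages (Journal of Abnormal Psychology, 1973-1977)
# from the raw accepted/rejected counts reported for each review pattern.

table6 = {
    '3 "Accept"':             (5, 0),    # (accepted, rejected)
    '2 "Accept," 1 "Reject"': (24, 9),
    '1 "Accept," 2 "Reject"': (13, 52),
    '3 "Reject"':             (0, 9),
}

grand_accepted = grand_total = 0
for pattern, (accepted, rejected) in table6.items():
    n = accepted + rejected
    grand_accepted += accepted
    grand_total += n
    print(f"{pattern:24s} n = {n:3d}   accepted = {100 * accepted / n:5.1f}%")

print(f"{'Total':24s} n = {grand_total:3d}   accepted = {100 * grand_accepted / grand_total:5.1f}%")
```

The split-review rows make the tendency described here visible: one dissenting "Accept" against two "Reject" recommendations leaves only a 20% chance of acceptance, whereas a single "Reject" against two "Accept" recommendations still permits acceptance about 73% of the time.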
