Braun Tibor, Schubert András (eds.): Expert Review (Peer Review) in Scientific Research: Selected Studies from the Literature on the Topic (The MTAK Informatics and Science Analysis Series 7, 1993)

DOMENIC V. CICCHETTI: The Reliability of Peer Review for Manuscript and Grant Submissions: A Cross-Disciplinary Investigation

mental studies of peer review of grant proposals do not appear to have been undertaken, there are some less direct data that bear on the subject. Mitroff and Chubin (1979) describe a report by Hensler (1976) that notes that both NSF reviewers and applicants feel that, all things being equal, applicants have a better chance of being funded if they are affiliated with a better known institution, are well established and well known, or are submitting a "mainstream" rather than a more innovative proposal. In a more comprehensive survey, however, Cole and Cole (1981) report little effect, if any, on NSF funding associated with the following: previous publication record, institutional affiliation, or the applicant's age. The lack of a substantial relation between track record and the probability of being funded is described by Cole and Cole (1981, p. 2) as "surprising, since one of the stated evaluation criteria used by the NSF in evaluating proposals is the ability of the scientists to conduct the research proposed." What Cole and Cole find to be the major determining factor in whether or not a given NSF grant is funded is the score (perceived merit) given to the grant by the reviewers. In commenting negatively on this phenomenon, one peer-review expert describes an alternative system of peer review "that involves not a promise in an essay (i.e., proposal), but uses a track record of performance in research" (Roy 1985, p. 73; see also Chubin, 1982, in support of this general strategy). Other factors contribute to the unreliability of the peer-review process in a much more subtle or enigmatic manner (e.g., Cicchetti 1982; Smigel & Ross 1970).

6.3. "Enigmatic" issues and their influence on the reliability of peer review. In examining the content of referee comments and their relation to specific recommendations to the editor, Smigel and Ross (1970) identified two types of problem cases.
In one, the referees agreed on either acceptance, resubmission, or rejection, but for entirely different and sometimes even conflicting reasons. If the editor were to focus solely on final reviewer recommendations (i.e., ignore the content of the reviews), then the conclusion to accept, require revision and resubmission, or reject would at times be based on illusory reliability. The reverse phenomenon, an even more subtle one, occurs when referees are basically in agreement about the content of their reviews, but differ considerably in their recommendations to the editor. Specifically, one referee may opt for acceptance because he believes his criticisms are minor ones. The second referee, citing the same criticisms, feels they are major, and hence opts for rejection. On which referee does the editor rely? Understandably, no one has yet been able to resolve such difficult problems. As a result, we are left with the apparent paradox of instances in which conscientious and well-qualified reviewers and editors will offer essentially the same evaluation of a given manuscript, while drawing very different conclusions about its publishability. Evidence suggests that this same phenomenon faces program directors in the peer review of grant proposals. One NSF program director noted that some of his reviewers never rate a grant proposal as "excellent," no matter how meritorious they perceive it to be. Directors learn not to "downgrade" an applicant on this basis, since one reviewer's rating of "excellent" for a given proposal may have the same meaning as another reviewer's "very good" (i.e., see Cole & Cole 1981).

7. Improving the reliability of peer review

7.1. Rationale. Somewhat paradoxically, disagreement among reviewers can sometimes serve a useful purpose.
Thus, one referee may detect a flaw in reasoning that a second referee has failed to uncover (e.g., Bailar & Patterson, 1985, in the context of journal peer reviews; Cole & Cole, 1981, in the context of NSF peer reviews; Harnad, 1979; 1983, in the context of "creative" disagreement in open peer commentary). But whereas a valid case can be made for the potential informativeness of this kind of reviewer "unreliability," it is not really inconsistent with a concurrent desire to strengthen both the reliability and the validity of the peer-review process, as espoused, for example, by Harnad (1985). Yet, even adopting this desideratum, Mahoney (1977; 1985) warns that one should not seek to improve reliability in peer review at the enormous expense of increasing the extent of referee bias or prejudice. Thus, training referees to agree by simply sharing the same biases or prejudices against various types of scientific documents would be quite "counterprogressive" (Mahoney 1985, p. 2). We would strongly agree. How, then, should this important issue be dealt with?

7.2. The role of multiple reviewers. To improve the reliability of peer review, a minimum of three independent referees has been recommended (e.g., Glenn 1976; Newman 1966). The procedure is already used by Behavioral and Brain Sciences (BBS), which sends a given manuscript to anywhere from five to eight reviewers (sometimes even more), explicitly chosen to represent the manuscript's specialty, as well as other specialties on which it impinges, and to include investigators likely to be favorable, critical, and neutral. Moreover, BBS's decision to accept or reject hardly amounts to a "majority vote," referees' recommendations being weighted by their backgrounds, alignments and, above all, their reasons (Harnad 1983; 1985).
There are several arguments for consulting more than two referees: (1) The number of manuscripts that receive split reviews (therefore usually requiring a third review anyway) can be quite substantial: about 25% of manuscript submissions to the Journal of Abnormal Psychology over a six-year period (Cicchetti & Eron 1979 and additional unpublished data). (2) Existing pools of referees are large enough to make this option viable for behavioral science, medicine, and the physical sciences (e.g., see Lindsey 1978, p. 107). (3) Concerning issues of validity, the likelihood that an important feature of an article (or grant proposal, e.g., detection of a fatal design flaw) will be missed decreases as the number of independent reviews increases. (4) Consistent with argument (3), it is a well-known statistical fact that the reliability of ratings does increase as the number of raters is increased (Hargens & Herting 1990b; Nunnally 1978).

7.3. Using author anonymity or blind review. The main argument in favor of blind review for journal submissions
