Braun Tibor, Schubert András (eds.): Szakértői bírálat (peer review) a tudományos kutatásban: Válogatott tanulmányok a téma szakirodalmából (A MTAK Informatikai és Tudományelemzési Sorozata 7., 1993)
DOMENIC V. CICCHETTI: The Reliability of Peer Review for Manuscript and Grant Submissions: A Cross-Disciplinary Investigation
but only 70% and 69% for general physics and particles and fields, respectively. The nonparametric Jonckheere (1970) test of trend (Leach 1979) showed a highly significant trend, producing a Z value of 21.41 (p < .00001). This is interesting in its own right because it is consistent with the known higher manuscript-rejection rates for more general disciplines compared to more specific ones, the latter being thought of as "more experimentally and observationally oriented, with an emphasis on rigour of observation and analysis" (Zuckerman & Merton 1971, p. 77).

What further implications do such data have? It seemed plausible that even within the Physical Review journal, as the subfields become more and more general, there should be progressively less dependence on the deliberations of a single reviewer for any given manuscript. Would the pattern of acceptance rates across the four subfields covary with the tendency to rely on more than a single reviewer? The data in Table 3 indicate just that. Thus, the fit is quite remarkable, with the rank ordering between acceptance rates and the use of more than one reviewer proceeding about as one might predict, this despite the fact that the acceptance rates are based on 1986 data and the variation in number of reviewers per manuscript is based on 1987 data. The trend for variation among the decreasing percentage of manuscripts using a single reviewer, subfield by subfield, is also statistically significant (Jonckheere 1970, Z = 6.87, p < .00001).

Since manuscripts requiring more than one reviewer tend to be those that are problematic, these data indicate that even within the same physics journal the single initial referee system is not uniformly applied, but, rather, varies as a function of the subfield, with more general subfields having higher rejection rates and also requiring more reviewers before manuscripts are finally accepted for publication.
We would predict that if the editors of Physical Review were willing to undertake a reviewer reliability study of manuscripts submitted in the four subfields, one would find appreciably higher levels of agreement for nuclear physics and condensed matter than for particles and fields and general physics. These recent findings are also of great theoretical importance, since they allow one's reasoning to come "full circle" to the conclusion that Merton's normative model is not even wholly appropriate for the physical sciences. Another way of putting this is that physics itself appears to share many of the same problems facing the general journals in both behavioral science and medicine.

There are other data deriving from physics that are consistent with those just presented. Qualitative statements made about manuscripts submitted to the Physical Review Letters also suggest that some of the problems about the applicability of Merton's (1973) normative model may not be unique to medical and behavioral science. According to the editors of Physical Review Letters: "The referees, representative of the readers, are severe judges of the papers. Only about 45% of the 2,300 papers submitted each year are accepted for publication" (Adair & Trigg 1979, p. 475). The editors continue in their Statement of Policy for the journal:

For the majority of the papers the comments of the two referees are sufficiently equivocal so that the editor cannot decide, with confidence, on the disposition of the paper. Further information is sought from the authors, from further communication with the original referees, from other referees, and/or from the Divisional Associate Editors. The editors initiate an average of five written communications per paper to referees, authors, and Associate Editors to gather the information which allows them to come to a conclusion concerning the disposition of the paper. Even then, for most papers, accepted or rejected, the evidence is not

Table 3.
The parallel relationship between acceptance rates for manuscripts submitted to "Physical Review" and the use of one or more reviewers

A. 1986 Data (N = 5264 Total Manuscripts [MS])

Subfield                 No. MS Received   No. MS Accepted   % Accepted
C. nuclear physics                   540               440          81%
B. condensed matter                 2281              1786          78%
A. general physics                  1325               931          70%
D. particles & fields               1118               775          69%
Across all Subfields                5264              3932          75%

B. 1987 Data (N = 933 Accepted MS)

Subfield                 No. MS With 1 Reviewer   No. MS With 2+ Reviewers   Total   % MS With 1 Reviewer
C. nuclear physics                           79                         12      91                    87%
B. condensed matter                         347                         93     440                    79%
A. general physics                          168                         53     221                    76%
D. particles & fields                       122                         59     181                    67%
Across all Subfields                        716                        217     933                    77%
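The percentages in Table 3, and the claim that the two rank orderings agree, can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, assuming the counts as printed in the table; the names `TABLE_3A`, `TABLE_3B`, `percent`, and `ranking` are illustrative and not part of the original study. It recomputes the rates and compares the subfield orderings only; it does not attempt to reproduce the Jonckheere trend tests reported in the text.

```python
# Counts transcribed from Table 3; all names here are illustrative assumptions.

# Panel A (1986): subfield -> (manuscripts received, manuscripts accepted)
TABLE_3A = {
    "nuclear physics": (540, 440),
    "condensed matter": (2281, 1786),
    "general physics": (1325, 931),
    "particles & fields": (1118, 775),
}

# Panel B (1987): subfield -> (accepted MS with 1 reviewer, accepted MS with 2+ reviewers)
TABLE_3B = {
    "nuclear physics": (79, 12),
    "condensed matter": (347, 93),
    "general physics": (168, 53),
    "particles & fields": (122, 59),
}


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def ranking(rates):
    """Return subfield names ordered from the highest rate to the lowest."""
    return sorted(rates, key=rates.get, reverse=True)


# Acceptance rate per subfield (accepted / received).
acceptance = {
    name: percent(accepted, received)
    for name, (received, accepted) in TABLE_3A.items()
}

# Share of accepted manuscripts handled by a single reviewer.
single_reviewer = {
    name: percent(one, one + more)
    for name, (one, more) in TABLE_3B.items()
}

for name in ranking(acceptance):
    print(f"{name:20s} accepted {acceptance[name]:5.1f}%  "
          f"single reviewer {single_reviewer[name]:5.1f}%")

# True when both measures order the four subfields identically.
print("Same rank ordering:", ranking(acceptance) == ranking(single_reviewer))
```

Rounded to whole numbers, the recomputed rates match the percentages printed in both panels, and the two measures order the four subfields identically, which is the "parallel relationship" the table title refers to.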