Braun Tibor, Schubert András (eds.): Peer Review in Scientific Research: Selected Studies from the Literature of the Field (MTAK Informatics and Science Analysis Series 7, 1993)

DOMENIC V. CICCHETTI: The Reliability of Peer Review for Manuscript and Grant Submissions: A Cross-Disciplinary Investigation

ANOVA statistical tables. Here, M = the total number of ratings summed across subjects.

6. For a more comprehensive treatment of appropriate and inappropriate reliability statistics, the interested reader is referred to Cicchetti (1988), Cicchetti & Feinstein (1990), and Feinstein & Cicchetti (1990).

7. These data derive from the Physical Review and Physical Review Letters; Annual Report 1986 (published in January 1987). We wish to express our deepest appreciation to Dr. Peter D. Adams, Deputy Editor-in-Chief, American Physical Society, for making available this information as well as other related material on the peer-review process for the Physical Review.

8. It has been noted by several investigators in the field of peer-review research that the extent to which editors use referee recommendations is an important and often neglected variable (e.g., Bailar & Patterson 1985; Patterson & Bailar 1985; Chubin, in a 1982 peer-reviewer comment). We agree, and we have data bearing on this issue for the 1,313 different manuscripts that were evaluated by at least 2 reviewers during the period from 1973 to 1978 and were ultimately accepted or rejected by the same editor of the Journal of Abnormal Psychology. This journal uses a reviewer-summary recommendation format in which the submission can be rated as one of the following: accept (as is); accept subject to revision; revise and resubmit for further consideration; or reject outright.
The joint reviewer recommendations compared to the final editorial decisions were as follows (with numbers and/or percentages of manuscripts following each referee or editorial judgment): Accept-accept (17), all 17 (100%) accepted by the editor; Accept/revise-accept (86), 77 (89.5%) accepted and 9 (10.5%) rejected; Accept/revise-accept/revise (100), 81 (81%) accepted, 19 (19%) rejected; Accept-resubmit (61), 38 (62.3%) accepted, 23 (37.7%) rejected; Accept/revise-resubmit (134), 69 (51.5%) accepted, 65 (48.5%) rejected; Resubmit-resubmit (49), 15 (30.6%) accepted, 34 (69.4%) rejected; Accept-reject (96), 20 (20.8%) accepted, 76 (79.2%) rejected; Accept/revise-reject (219), 40 (18.3%) accepted, 179 (81.7%) rejected; Resubmit-reject (200), 11 (5.5%) accepted, 189 (94.5%) rejected; Reject-reject (351), 2 (0.6%) accepted and 349 (99.4%) rejected. The total number of accepted articles was 370 (28.2%); the number rejected was 943 (71.8%). These data are consistent with data for both the American Sociological Review and the Physical Review, which indicate that "referees' recommendations are the major factor determining the editors' dispositions" (Hargens 1988, p. 146). Consistent with such findings, Bakanic et al. (1987) reported a correlation of .81 between referees' mean overall recommendations and final editorial decisions for manuscripts submitted to the American Sociological Review.

9. It should be noted that BBS is systematically gathering and analyzing data on the relationship between levels of reviewer anonymity and the favorability and usefulness of the referee report, as indicated both by the referee's recommendations and the author's subjective ratings.

10. The authors extend appreciation to staff personnel at NSF for making available this information, which can also be obtained, in more detail, by requesting appropriate NSF publications.

11. It should be noted that the various funding agencies use different priority rating systems.
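The joint-recommendation tallies in note 8 can be cross-checked arithmetically; the following is a minimal illustrative Python sketch (not part of the original article) using the per-category counts reported there:

```python
# Cross-check of the acceptance tallies in note 8
# (Journal of Abnormal Psychology, 1973-1978; counts from the text).
data = {
    # joint reviewer recommendation: (accepted, rejected)
    "accept-accept": (17, 0),
    "accept/revise-accept": (77, 9),
    "accept/revise-accept/revise": (81, 19),
    "accept-resubmit": (38, 23),
    "accept/revise-resubmit": (69, 65),
    "resubmit-resubmit": (15, 34),
    "accept-reject": (20, 76),
    "accept/revise-reject": (40, 179),
    "resubmit-reject": (11, 189),
    "reject-reject": (2, 349),
}

accepted = sum(a for a, _ in data.values())
rejected = sum(r for _, r in data.values())
total = accepted + rejected

print(f"accepted: {accepted} ({100 * accepted / total:.1f}%)")
print(f"rejected: {rejected} ({100 * rejected / total:.1f}%)")
# Reproduces the totals in note 8: 370 accepted (28.2%) and
# 943 rejected (71.8%) of the 1,313 manuscripts.
```

Summing the ten categories confirms the totals stated in the note, so the category counts and the overall acceptance rate are internally consistent.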
For example, the Veterans Administration uses a 10-to-50 rating scale, which is the reverse of the NSF scoring system. Thus, lower scores represent higher quality grant proposals, and vice versa. In contrast to both these rating systems, reviewers for grants submitted to the American Heart Association used a 10-category, ordinally scaled grading system in which high priority (for funding) was defined as 1-3, intermediate priority as 4-7, and low priority as 8-10 (i.e., see Wiener et al. 1977). Concerning these varying scales of measurement, the empirical work of Cicchetti et al. (1985) indicates that no appreciable increment in interrater reliability is achieved by increasing the size of a rating scale beyond seven ordinal or quasi-dimensional points or categories. These data serve to validate and generalize the implications of the results of an earlier investigation that demonstrated that analogue rating scales (0 to 100 points) were no more reliable than three-category ordinal scales (here the Present State Examination [PSE]) in assessing extent of psychiatric symptomatology (Remington et al. 1979).

12. It is of interest that Professor Jared Diamond describes many difficulties that composers have encountered in their attempts to have their works published or supported by grants. These and other parallels between struggling musicians and scientists were presented in the March 21, 1985, issue of Nature, in commemoration of Bach's 300th birthday.

Open Peer Commentary

Commentary submitted by the qualified professional readership of this journal will be considered for publication in a later issue as Continuing Commentary on this article. Integrative overviews and syntheses are especially encouraged.

Peer review: An unflattering picture

Kenneth M. Adams
Department of Psychiatry, University of Michigan, Ann Arbor, MI 48109-0704
Electronic mail: gdvr@umich.cc.umich.edu

Cicchetti brings a masterly touch to an enduring problem in modern science.
The problem of peer review calls to mind the old saying about the weather: "Everyone complains about it but no one does anything about it." One may find this more amusing later in one's career than earlier. Many researchers will regard Cicchetti's results with initial discouragement, if not outright embarrassment. Critics will be quick to point out the very real limits that low reliability places upon validity. Given these data, one must further suspect the peer-review process in science of being flawed. Yet several points deserve careful consideration. First, the entire process of reviewing manuscripts or grants is one of considering new information. Decisions concerning the disposition of these entities are simple in their result, but often complex in their structure. Cicchetti has done well in trying to capture disposition as the most solid data point. In the case of a manuscript, a research report often contains components reflecting various aspects of the research enterprise: (1) command of existing knowledge, (2) skill at formulating the research design, (3) expertise in analysis of the executed study, and (4) wisdom in guiding the reader in how to understand the place of the study in our knowledge. Predictably, the ability of investigators in each of these areas will not be uniform. Equally predictable will be the varying degree to which reviewers may have special skills and capabilities to judge the success of the project in these components. Disagreement by colleagues about papers
