Tibor Braun, András Schubert (eds.): Peer Review in Scientific Research: Selected Papers from the Literature of the Field (MTAK Informatics and Science Analysis Series 7, 1993)
DOMENIC V. CICCHETTI: The Reliability of Peer Review for Manuscript and Grant Submissions: A Cross-Disciplinary Investigation
may represent a healthy state of affairs in which the author and reviewer are pursuing their own kind of academic freedom. A similar situation exists with respect to the novelty of information in grants and their multiple components. (As an aside, it seems that many reviewers have forgotten that grant proposals are just that - proposals.)

Second, it is interesting that all sciences are roughly in the same range of reliability when it comes to reviews. This gives some indirect support to my first point about why novel information is judged imperfectly by experts in a number of realms. One problem with manuscript publication decisions is that reviewers expect to be asked to vote on the publication of a manuscript. I would add to Cicchetti's suggestions a plea that journals stop routinely asking reviewers whether a paper should be accepted. This recommendation on a reviewer's part may not be at all helpful for either the editor or the author of the manuscript. Many journals do appeal to reviewers to avoid making a recommendation on acceptance or rejection in their review comments intended for the author. It is a plea that often goes unheeded. Journal editors are probably in a far better position to weigh the various factors affecting potential publication than are most reviewers. Editors tend to send manuscripts to reviewers who have special expertise for consultation on certain points. In supplying this information, reviewers may have their own ideas of what should or should not be published, but editors often are not aware of the fabric of reviewers' editorial philosophy.

Third, it is probably inimical to true academic freedom to train reviewers. It is apparent, however, that the development of constructive reviewers is too often haphazard and uncertain. Good reviewing should be recognized by journals, institutions, and societies.
We must find ways to groom reviewers without creating undue bias or suppressing scientific freedom.

Fourth, while the reliability of decisions in evaluating manuscripts and adjudicating grants is poor in both cases, the results are clearly more deleterious with grants. If one wants to publish a new manuscript, there will certainly be some way to do it eventually - even if it is not in one's preferred journal. A similar situation does not prevail with respect to the awarding of grants: no grant, no money. Reality in many areas of research dictates that if the federal government does not fund certain kinds of research, it just won't be done. This has brought the peer-review process under even more scrutiny, and the sense of frustration and arbitrariness that some "pink sheet" recipients feel is not going to be assuaged by the findings. The remedies suggested for the manuscript review process need more dramatic implementation with respect to grants. The adjudication of grants is a far more social and political process than manuscript review. Steps must be taken to humanize the grant review process by putting reviewers' names on their opinions. The National Science Foundation (NSF) experience in trying to create blind applications did not work initially, but the approach remains worth trying.

Fifth, Cicchetti's finding that reviewers can agree more easily on less desirable research than on more desirable research parallels the situation now usually extant in the individual review. As a general rule, reviewers spend far too little time being constructive, collegial, and consultative in their reviews. One sure-fire way to limit the impact of our character defects and make the review process more constructive would be the aforementioned requirement that reviewers reveal their identities in all cases. The temptation to be petty or take cheap shots when reviewing others' work would thereby diminish.
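(An illustrative aside, not part of the original commentary: the reviewer agreement discussed throughout is conventionally quantified with a chance-corrected statistic such as Cohen's kappa, which is near zero when two reviewers concur no more often than chance would predict. The following minimal sketch uses invented accept/reject votes for illustration only.)

```python
# Illustrative sketch: chance-corrected agreement (Cohen's kappa) between
# two reviewers voting accept ("A") or reject ("R") on the same manuscripts.
# All vote data below are hypothetical.

def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters judging the same items."""
    assert len(ratings_a) == len(ratings_b) and ratings_a
    n = len(ratings_a)
    categories = set(ratings_a) | set(ratings_b)
    # Observed agreement: fraction of items on which the raters concur.
    p_obs = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    p_exp = sum(
        (ratings_a.count(c) / n) * (ratings_b.count(c) / n)
        for c in categories
    )
    return (p_obs - p_exp) / (1 - p_exp)

# Hypothetical votes on ten manuscripts.
reviewer_1 = ["A", "A", "R", "R", "A", "R", "A", "R", "R", "A"]
reviewer_2 = ["A", "R", "R", "A", "A", "R", "R", "R", "A", "A"]

print(f"kappa = {cohens_kappa(reviewer_1, reviewer_2):.2f}")  # kappa = 0.20
```

Note how a seemingly respectable 60% raw concordance collapses to a kappa of 0.20 once chance agreement is discounted; reported kappas for manuscript and grant review tend to fall in this modest range.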
In closing, I would like to emphasize that the methodological basis of Cicchetti's investigation seems sound. Agreement between reviewers of manuscripts and grants is discouragingly low. The reasons for this are many, but the net result is that this study holds a mirror up to peer review that provides a distinctly unflattering picture. The remedial steps proposed by Cicchetti require urgent attention. I would underscore his suggestion that reviewers be identified, encouraged, and rewarded by whatever means available for providing quality consultation on manuscripts. The situation with grants is far more complex, but even the most stalwart defenders of current federal funding review methods cannot ignore this evidence suggesting that some scientific decisions resulting in funding or nonfunding are probably being made in a nonsystematic way. I would doubt that we are distinguishing between "shades of excellence" any more; a certain degree of caprice has entered the picture. In a complex world that can be helped by our research at a variety of levels, this should open our minds to constructive alternatives.

NOTE
1. The author is affiliated with the Veterans' Affairs Medical Center, Ann Arbor, MI.

Does the need for agreement among reviewers inhibit the publication of controversial findings?

J. Scott Armstrong (a) and Raymond Hubbard (b)
(a) The Wharton School, University of Pennsylvania, Philadelphia, PA 19104
(b) College of Business and Public Administration, Drake University, Des Moines, IA 50311

As Cicchetti indicates, agreement among reviewers is not high. This conclusion is empirically supported by Fiske and Fogg (1990), who reported that two independent reviews of the same papers typically had no critical point in common. Does this imply that journal editors should strive for a high level of reviewer consensus as a criterion for publication? Prior research suggests that such a requirement would inhibit the publication of papers with controversial findings.
We summarize this research and report on a survey of editors.

Prior research. Horrobin (1990) suggests that the primary function of peer review should be to identify new and useful findings, that is, to promote the publication of important innovations. This function is typically subordinated to the quality control aspects of peer review, however. The quality control approach looks for agreement among the reviewers. The result, Horrobin claims, is that competent research yielding relatively unimportant findings is more readily accepted for publication.1 He provides numerous examples of harsh peer review given to important research that presents controversial results.

The popular press often reports difficulties associated with the publication of important research findings. The scanning tunneling microscope (STM) is a case in point. The STM is capable of distinguishing individual atoms and has been hailed as one of the most important inventions of this century. It earned a Nobel Prize in physics for its inventors. Nevertheless, the first attempt to publish the results produced by the STM in 1981 failed because a journal referee found the paper "not interesting enough" (Fisher 1989).

Armstrong (1982c) provides additional examples of lapses in the peer review system, along with summaries of empirical evidence that disconfirming findings about important topics are difficult to publish. Among these, the experimental studies by Goodstein and Brazis (1970) and Mahoney (1977) are of particular interest. They found that reviewers were biased against negative findings: reviewers rejected papers reporting negative results on the grounds of poor methodology while accepting papers with confirmatory outcomes that used the identical methodology. Given the above results, one might expect that if editors rely on consensus among reviewers for their publication decisions, few controversial findings will be published. This problem could be especially serious in social science journals.
These journals generally have low acceptance rates and their editors may decide to publish only manuscripts with high agreement among reviewers.

A survey of journal editors. To assess how journals treat