Braun Tibor, Schubert András (eds.): Peer Review in Scientific Research: Selected Studies from the Literature of the Field (Informatics and Science Analysis Series of the MTAK, 7, 1993)
DOMENIC V. CICCHETTI: The Reliability of Peer Review for Manuscript and Grant Submissions: A Cross-Disciplinary Investigation
Behavioral and Brain Sciences, 14 (1991) 119-186

Abstract: The reliability of peer review of scientific documents and the evaluative criteria scientists use to judge the work of their peers are critically reexamined, with special attention to the consistently low levels of reliability that have been reported. Referees of grant proposals agree much more about what is unworthy of support than about what does have scientific value. In the case of manuscript submissions this seems to depend on whether a discipline (or subfield) is general and diffuse (e.g., cross-disciplinary physics, general fields of medicine, cultural anthropology, social psychology) or specific and focused (e.g., nuclear physics, medical specialty areas, physical anthropology, and behavioral neuroscience). In the former there is also much more agreement on rejection than acceptance, but in the latter both the wide differential in manuscript rejection rates and the high correlation between referee recommendations and editorial decisions suggest that reviewers and editors agree more on acceptance than on rejection. Several suggestions are made for improving the reliability and quality of peer review. Further research is needed, especially in the physical sciences.

Keywords: cross-disciplinary comparisons; evaluation; grant review; manuscript reviews; peer review; quality control; reliability

1. Objectives

This paper will analyze the peer-review process in the evaluation of manuscript submissions and grant applications. First, we will discuss research designs and statistical procedures, and then we will critically review the major studies of peer review across disciplines, providing some reasons and remedies for the low reliability of manuscript and grant reviews as well as some suggestions for future research.

2. Theoretical issues

Gottfredson (1978, p. 920) has stressed the importance of peer evaluation in scientific activity from a Kuhnian standpoint (Kuhn 1962). Until the beginning of the nineteenth century, scientific theory was thought to be an approximation of what Laudan (1984, p. 83) referred to as "absolute truth," "certainty," or "infallible knowledge." By the twentieth century, this view of scientific theory was replaced by the more modest goal of developing theories that were, again in Laudan's (1984, p. 83) terminology, "plausible," "probable," or "well-tested." Laudan (1984, p. 83) notes that this paradigm shift "represents one of the great watersheds in the history of scientific philosophy: the abandonment of the quest for certainty."

Kuhn's ideas about paradigm development and paradigm "shifts," although undergoing reevaluation and reinterpretation almost since their inception (e.g., Boehme 1977; Boehme et al. 1976; Gholson & Barker 1985; Lakatos 1972; Laudan 1984; Mulkay 1977; Price 1963), continue to play a central role in our understanding of the evaluation of scientific work by the community of fellow scientists or "peers" (e.g., Mahoney 1985).

In a classic work, Robert Merton argued that the social system governing both the actions and the mobility of scientists is very fair and objective, supporting a normative model of science (Merton 1973): A scientist's work is judged for scientific merit on the basis of universal scholarly standards rather than by specific biases such as friendship, author affiliation, or nepotism (e.g., see Lindsey 1978, p. 55). As Lindsey (1978, p. 3) reminds us, however, Merton and his students (e.g., Cole & Cole 1973) were focusing mainly on the physical sciences. The normative model does not appear to hold well for either the behavioral or the medical sciences. In fact, as we shall discuss later, there are data to suggest that the model is not entirely appropriate for the physical sciences, either.
Peer review is a system of decision making by referees, editors, and research program directors in evaluating the quality of scientific research. It is here that Merton's normative model applies to the attributes that are used in evaluating papers submitted to professional organizations, manuscripts submitted to scientific journals, and research proposals submitted to funding agencies. These attributes can be derived from either objective judgments (e.g., experimental design) or subjective ones (e.g., importance). The attributes themselves must be distinguished from the criteria (or norms) used to judge them. Thus, scientists might use the criterion "brief," "to