Vörös A. (ed.): Fragmenta Mineralogica et Palaeontologica 14. 1989. (Budapest, 1989)

this manner individual samples with anomalous variables can be easily recognized (see Fig. 6 in HARANGI, 1988; there are a few objects with extremely high SiO2, K2O and P2O5 contents in the data set). If they are unique within the volcanic sequence, we have to omit them from the statistical studies.

Table 1. Steps of cluster analysis (after LE MAITRE, 1982, and O. KOVÁCS, 1987a)

Standardization procedures:
  1. X(i,j) = a(j) · x(i,j)
  2. X(i,j) = x(i,j) / max x(j)
  3. X(i,j) = (x(i,j) - min x(j)) / (max x(j) - min x(j))
  4. X(i,j) = (x(i,j) - mean x(j)) / s.d.(j)
  5. X(i,j) = x(i,j) / mean x(j)

Similarity measures used in geological sciences:
  Euclidean distance
  cosine theta coefficient
  theta coefficient
  correlation coefficient

Linkage methods used in geological sciences:
  single-linkage
  unweighted average
  weighted-pair group average

In most accounts of clustering, variables are standardized prior to computing distance measurements. The Euclidean distance is scale dependent, and it is influenced most strongly by the variable with the greatest magnitude. Therefore, transformation of the data to unit variance is recommended. DAVIS (1973) and LE MAITRE (1972) gave a few guidelines for deciding on the standardization, and O. KOVÁCS (1987a) listed a few procedures. In our experience there is no need to transform major element composition data. Standardization weights each variable equally, whereas in petrochemical classification the weights of the oxides are approximately equivalent to their magnitude.

Next, the user has to decide on the similarity measure and the linkage method to be used. This choice strongly affects the result. The similarity between two objects (samples) can be expressed by various measures, four of which appear to be commonly used in the geological sciences: the Euclidean distance, the cosine theta and theta coefficients, and the correlation coefficient (LE MAITRE 1982).
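As a minimal sketch of standardization procedures 2 to 5 of Table 1, and of the scale dependence of the Euclidean distance noted above, the following Python fragment may help. The oxide values are hypothetical and are not taken from the Mecsek data set; the function names are illustrative only. Procedure 1 is left out because the coefficient a(j) is not defined in this text, and the use of the sample standard deviation (n - 1) in procedure 4 is an assumption.

```python
import math
import statistics


def standardize(column, procedure):
    """Apply standardization procedure 2-5 of Table 1 to one variable j."""
    lo, hi = min(column), max(column)
    mean = statistics.mean(column)
    if procedure == 2:  # divide by the maximum of the variable
        return [x / hi for x in column]
    if procedure == 3:  # rescale to the range 0..1
        return [(x - lo) / (hi - lo) for x in column]
    if procedure == 4:  # zero mean, unit standard deviation
        sd = statistics.stdev(column)  # assumption: sample s.d. (n - 1)
        return [(x - mean) / sd for x in column]
    if procedure == 5:  # divide by the mean of the variable
        return [x / mean for x in column]
    raise ValueError("procedure must be 2, 3, 4 or 5")


def euclidean(a, b):
    """Euclidean distance between two samples given as sequences of variables."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


# Hypothetical oxide contents (wt%) of three samples
sio2 = [48.0, 55.0, 62.0]
k2o = [1.0, 3.0, 2.0]

raw = list(zip(sio2, k2o))
scaled = list(zip(standardize(sio2, 4), standardize(k2o, 4)))

# Distance between the first two samples, before and after procedure 4
d_raw = euclidean(raw[0], raw[1])
d_scaled = euclidean(scaled[0], scaled[1])
```

With these numbers the raw squared distance between the first two samples is 49 + 4, so SiO2 dominates it; after procedure 4 it is 1 + 4, so K2O dominates instead. This is the equal weighting that the text advises against for major element data.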
The Euclidean distance is weighted in favour of the variables with large standard deviation, i.e. generally those with large values. For petrochemical classification this measure is most commonly preferred (EWART and LE MAITRE 1980, UPADHYAYA et al. 1988). The cosine theta coefficient is influenced by the variable ratios of the samples. Two objects are highly correlated, as measured by the correlation coefficient, if they are compositionally alike.

Dendrograms constructed from distinct similarity measures are usually different, although the general grouping can be similar if the data set includes characteristic natural groups. In a single volcanic suite with a compositional change from basic to acidic members, the main groups of rock types are generally well defined. Overlaps can occur only within the transitional rocks. Cluster analysis of the Mecsek volcanics gave a similar cluster structure and general grouping with the correlation and theta coefficients, and likewise with the Euclidean distance on raw data and the cosine theta coefficient. The theta coefficient and the Euclidean distance produced different cluster structures, although the clusters included approximately the same samples.
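The contrast between these measures can be illustrated with a short Python sketch. It uses the usual textbook definitions of the Euclidean distance, the cosine of the angle between two sample vectors, and the Pearson correlation coefficient computed across the variables of two samples; whether these match the exact formulations of LE MAITRE (1982) is an assumption. The theta coefficient is omitted because it is not defined in this text, and the sample compositions are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two samples in variable space."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def cosine_theta(a, b):
    """Cosine of the angle between two sample vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b)


def correlation(a, b):
    """Pearson correlation between two samples, taken across their variables."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [p - mean_a for p in a]
    db = [q - mean_b for q in b]
    num = sum(p * q for p, q in zip(da, db))
    den = math.sqrt(sum(p * p for p in da) * sum(q * q for q in db))
    return num / den


# Hypothetical samples: b has the same variable ratios as a at half the
# magnitude, c has similar magnitudes to a but different ratios.
a = [50.0, 15.0, 10.0, 5.0]
b = [25.0, 7.5, 5.0, 2.5]
c = [50.0, 5.0, 15.0, 10.0]

cos_ab = cosine_theta(a, b)
cos_ac = cosine_theta(a, c)
dist_ab = euclidean_distance(a, b)
dist_ac = euclidean_distance(a, c)
corr_ab = correlation(a, b)
```

Here a and b have identical variable ratios, so their cosine theta coefficient reaches its maximum although their Euclidean distance is larger than that between a and c. The two measures therefore rank the same pairs differently, which is one way in which dendrograms built from them can come to differ.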
