Liszka József (ed.): Az Etnológiai Központ Évkönyve 2000-2001 - Acta Ethnologica Danubiana 2-3. (Dunaszerdahely-Komárom, 2001)
1. Studies - Borsos Balázs: A magyar nyelvterület számítógép segítségével meghatározott kulturális régiói [The cultural regions of the Hungarian language area as determined with the aid of a computer]
er the number of the clusters is, the farther apart (i.e. the less similar) the points that fall into the same group are. The question is at how many remaining clusters the process should be stopped. The ideal number of clusters can be determined by examining the value of the variance at every step of the cluster analysis. A dramatic increase in this value means that the next point differs characteristically from the previous ones; the number of clusters at this step of the process is the ideal one. However, dramatic increases can occur several times during the process, which means that clusters can be determined at different levels of detail.

During the digitisation of the data and before carrying out the analysis we again have to solve some problems.

1. In the database the variants of the different variables were coded with numbers, but as the variants have no numerical value, the program had to be written so that it would handle them not as numbers but as symbols.

2. An empty space at a given co-ordinate can mean three different things:
a) at that place the local variant of the given cultural phenomenon was not collected;
b) at that place the local variant of the given cultural phenomenon does not exist at all;²
c) at that place the most common variant of the given cultural phenomenon occurs and only the special variants are shown.
The last problem can be solved by comparing the maps, as this happens only on maps that deal with sub-variants of a certain cultural phenomenon. Still, it is important to decide whether a datum is missing for cultural reasons (in which case the missing datum is to be regarded as a piece of information too) or because of problems in collecting the data. Here the cluster analysis itself helps us: it places the settlements where too many data are missing because of collection problems into a separate cluster, so that they do not influence the later analysis to any significant extent.

3. If two or more variants of a cultural phenomenon occurred in the same place, the different map compilers had different ideas on how to show them. On one map every possible constellation gets a separate symbol; on another the constellation is shown by the presence of all the symbols, and thus all the values, of the basic variants that occur at the given place, and the constellation itself does not get a separate symbol. These two kinds of maps cannot be compared, so before the cluster and correlation analysis the separate symbols of constellations had to be converted into the basic variants.

4. Although the differences among the values of a given variable ought to be qualitatively equal, in many cases they are not. In some cases two points of view were mixed when defining the variants of the cultural phenomenon shown on a map. In other cases very similar variants of a cultural phenomenon got separate symbols, while fairly different variants were merged under a single symbol.

² Even this is not a general characteristic of the Atlas, as on some maps the fact that the illustrated cultural phenomenon does not occur in a certain settlement is shown with a separate symbol (e.g. map No. 350: Material of which butter is made and terms for the remaining liquid substance).
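The stopping rule described above, together with the requirement that variants be handled as symbols rather than as numbers, can be illustrated with a minimal Python sketch. It is not the program used in the study. As an assumption, it replaces the variance criterion with the average-linkage merge distance, computed from the share of variables on which two settlements carry different symbols. The sample data and all function names are hypothetical.

```python
from itertools import combinations

def mismatch(a, b):
    """Share of variables on which two settlements carry different symbols.
    Symbols are only tested for equality, never treated as quantities."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def agglomerate(rows):
    """Average-linkage agglomerative clustering (an assumed stand-in for the
    variance criterion). Returns one (clusters_before_merge, merge_distance)
    pair per merging step."""
    clusters = [[i] for i in range(len(rows))]
    history = []
    while len(clusters) > 1:
        def linkage(pair):
            p, q = pair
            total = sum(mismatch(rows[i], rows[j])
                        for i in clusters[p] for j in clusters[q])
            return total / (len(clusters[p]) * len(clusters[q]))
        # Merge the two closest clusters (p < q, so deleting q is safe).
        p, q = min(combinations(range(len(clusters)), 2), key=linkage)
        history.append((len(clusters), linkage((p, q))))
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return history

def candidate_cluster_counts(history, top=2):
    """Cluster counts present just before the largest jumps in merge distance."""
    jumps = [(history[k][1] - history[k - 1][1], history[k][0])
             for k in range(1, len(history))]
    return [count for _, count in sorted(jumps, reverse=True)[:top]]

# Hypothetical data: six settlements, three map variables coded as symbols.
rows = [
    ("A", "A", "B"), ("A", "A", "B"), ("A", "A", "C"),
    ("D", "E", "F"), ("D", "E", "F"), ("D", "E", "G"),
]
history = agglomerate(rows)
counts = candidate_cluster_counts(history)
print(counts)  # [2, 4]
```

In this toy example the largest jump suggests two clusters and the second largest suggests four, which mirrors the remark that clusters can be determined at different levels of detail.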
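The conversion of constellation symbols into basic variants (point 3) can be sketched in the same way. The legend below is invented for illustration, and the function name is hypothetical. The only assumption is that each map cell is stored as a list of the symbols drawn at that settlement.

```python
# Hypothetical legend of one map: each combined (constellation) symbol
# and the basic variants it stands for.
CONSTELLATIONS = {"X": {"A", "B"}, "Y": {"A", "C"}}

def to_basic_variants(cell, constellations):
    """Return the set of basic variants recorded in one map cell.
    Constellation symbols are expanded; basic symbols are kept as they are."""
    variants = set()
    for symbol in cell:
        variants |= constellations.get(symbol, {symbol})
    return frozenset(variants)

# Map of the first kind: the co-occurrence has a symbol of its own.
first_kind = to_basic_variants(["X"], CONSTELLATIONS)
# Map of the second kind: the basic symbols are simply drawn side by side.
second_kind = to_basic_variants(["A", "B"], CONSTELLATIONS)
print(first_kind == second_kind)  # True
```

After this step both kinds of maps describe a settlement by the same set of basic variants, so their cells can be compared directly.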