Vízügyi Közlemények, 1983 (Volume 65)

Issue 3 - Shorter studies, communications, reports

Lineáris regressziós kapcsolatok változóinak függetlensége/függősége (Independence/dependence of the variables of linear regression relationships), p. 443

Independence and interdependence of variables involved in linear regression
by Dr. Z. HANKÓ

The most relevant parameter of a relationship established between stochastic variables by linear regression is the empirical regression coefficient determined from the sample. Where the conditions for applying linear regression are satisfied (e.g. the joint distribution of the variables is normal), the regression coefficient is itself a (normally distributed) random variable. This makes it possible to estimate, by means of the regression coefficient, the degree of independence and/or interdependence between the variables involved in the correlation.

Besides the relative variances of the variables, the regression coefficient (and its standard deviation) also depends on the correlation coefficient, Eqs. (1) to (4). The absolute value of the latter may range from zero to unity. Zero indicates the absence of any relationship, whereas unity implies a functional relationship between the variables. Intermediate values indicate, approximately linearly, the closeness (as well as the linearity) of the correlation.

The difference between the actual regression coefficient and the two kinds of limit value provides a basis for estimating the risk at which the actual value differs from the limit value considered, Eqs. (2/a, b; 3/a, b and 5-7). Recall that if the risk of difference is higher than 5%, the difference is insignificant (of random character), whereas if it is lower than 0.1%, the difference is highly significant.

Since the numerical value of the standard abscissa of the normal distribution is known, Eqs. (5/a and 6/a) can be used to construct Fig. 1 for different magnitudes of risk. In a coordinate system of correlation coefficient vs. number of data pairs, Fig. 1 shows the curves connecting the points of 5, 1 and 0.1% risk of no correlation and of functional relationship. The sub-fields enclosed by the curves can be classified according to the quality of no correlation and/or functional relationship.

Starting from this classification of the sub-fields, it is further possible to classify the independence and/or interdependence of the variables (Table I). This has been used in constructing Figs. 2 and 3. Fig. 2 is used to classify the correlation of variables assumed to be interdependent, and Fig. 3 that of variables assumed to be independent. In the ideal case, assuming a multiple linear regression, the interdependence between the "dependent" variable and any "independent" variable is classified as excellent (Fig. 2), and the independence between any two "independent" variables is also classified as excellent (Fig. 3). These figures thus directly serve practitioners in classifying the quality of the correlation found by regression.

Figs. 2 and 3 also contain two controversial sub-fields (n < 18 and n > 48). Cases where the number of data pairs in the sample is small must be avoided (the desirable number of data pairs is around n = 30). The controversial sub-field where n > 48 implies that the independence or uniformity of the sample elements is questionable, or that a systematic error (trend, periodicity) is involved. Checks of this kind are evidently not superfluous even where n lies between 18 and 48, so that the interdependence and/or independence of the data can be established positively on a different basis.

* * *

Independence/dependence of the variables of linear regression relationships (Unabhängigkeit/Abhängigkeit der Variablen von linearen Regressionsbeziehungen)
by Dr.-Ing. Zoltán HANKÓ

The most characteristic parameter of a linear regression relationship between stochastic variables is the empirical regression coefficient determined from a sample. Where the conditions for applying linear regression are met (e.g. where the joint distribution of the variables is normal), the regression coefficient itself can also be regarded as a (normally distributed) random variable. This makes it possible to estimate, with the help of the regression coefficient, the degree of dependence/independence between the variables of the relationship.
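The summary refers to Eqs. (1) to (7) without reproducing them, so the following is only a rough sketch of the general idea and not the paper's own equations. It assumes the standard textbook relations b = r * s_y / s_x and SE(b) = (s_y / s_x) * sqrt((1 - r^2) / (n - 2)), and it treats b / SE(b) as a standard normal abscissa, in the spirit of the normal approximation mentioned above. All function names are hypothetical. Only the comparison with the "no correlation" limit (r = 0) is sketched; the comparison with the functional limit (|r| = 1) is not attempted here.

```python
from math import sqrt
from statistics import NormalDist


def regression_summary(x, y):
    """Return slope b, correlation r and the textbook standard error of b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    r = sxy / sqrt(sxx * syy)
    b = sxy / sxx  # identical to r * s_y / s_x
    # max() guards against a tiny negative value caused by rounding when |r| is 1
    se_b = sqrt(syy / sxx) * sqrt(max(0.0, 1.0 - r * r) / (n - 2))
    return b, r, se_b


def risk_of_no_correlation(b, se_b):
    """Two-sided risk (normal approximation, an assumption of this sketch)
    that a slope as large as |b| arises although the true slope is zero."""
    if se_b == 0.0:
        return 0.0  # exact functional relationship in the sample
    z = abs(b) / se_b
    return 2.0 * (1.0 - NormalDist().cdf(z))


def critical_r(n, risk):
    """|r| at which the slope differs from zero exactly at the given risk.

    Follows from z = r * sqrt(n - 2) / sqrt(1 - r^2) solved for r.
    """
    z = NormalDist().inv_cdf(1.0 - risk / 2.0)
    return z / sqrt(z * z + n - 2)


if __name__ == "__main__":
    # Threshold curves in the (r, n) plane for the three risk levels named
    # in the summary; the n values are the ones the summary mentions.
    for n in (18, 30, 48):
        row = [round(critical_r(n, risk), 3) for risk in (0.05, 0.01, 0.001)]
        print(n, row)
```

Evaluating `critical_r` over a range of n for the risks 5%, 1% and 0.1% yields curves of the kind the summary describes for the "no correlation" side of Fig. 1, under the assumptions stated above. For small samples a Student t distribution would normally replace the normal approximation.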
