Big encyclopedia of oil and gas. Statistical significance of regression and correlation parameters



The significance of the model for solving specific research problems lies in the fact that it allows us to give a quantitative assessment of hidden parameters reflecting the dynamics of two-product systems. When solving such problems, the notions of internal product (product of the first kind) and external product (product of the second kind) may change. Thus, in the model of protein biosynthesis constructed by V.M. Glushkov and his colleagues (1979), the role of products of the first and second kind is played by regulatory and structural proteins; in the model of the immune response, by stem cells and lymphocytes, respectively; in the model of regulation of heart contractions, by substances delivered to myocardiocytes through the coronary vessels and through the aorta, respectively.

The assessment of the significance of the model is given through the F-criterion and R2 for each equation separately.

The assumption about the significance of the model is based on two provisions.

All this does not detract from the significance of the model. Naturally, without notes the existence of music is unthinkable.

Finally, the significance of the contractual model as such was limited as much as possible by the fact that almost all the rules in force in this area were of an absolutely mandatory (imperative) nature.

The use of variance analysis in addition to regression allows us to evaluate not only the significance of the model as a whole, but also the significance of particular dependencies.

From the data presented it also follows that when drilling harder rocks, the significance of the model is higher. The proof of the significance of the resulting model confirms the hypothesis of a linear dependence between the parameters under consideration.

Despite the successes in the development of decision-making theory, it will apparently long remain in an intermediate position between art, the decision-maker's inherent ability to make decisions, and science as a system of principles, general provisions, procedures and methods. However, this does not reduce the relevance of the book: the number of human-computer systems will increase, the importance of decision-making in complex situations will grow, and it will become increasingly difficult for a person to solve the corresponding problems using the old (exact and probabilistic) methods. Therefore, the importance of models that use formalized uncertainties based on ideas other than the mathematics of chance can only increase.

With the inductive approach, characteristic of the modeling process within the framework of business analysis, the model is obtained by generalizing observations of individual particular facts whose consideration is deemed important for decision-making. Models are developed inductively to solve specific problems of economic management. The models take into account the specific, historically formed properties of the process being modeled. The main problem in drawing up inductive models is selecting, from a set of individual observations, those that determine the essence of the decision being made, and presenting their structure and connections in a formalized form. The significance of inductive models is that, by simplifying the description of relationships, the information contained in a large set of observations is presented in a visual and concise form. The quality of inductive models is determined not by the accuracy with which complex reality is copied through symbolic systems, but by how far it is possible, on the one hand, to simplify the model so as to achieve a solution to the problem at an acceptable cost and, on the other hand, to reflect the basic properties of reality.

If these types of labor agreements fix the level of wages, then when the market level deviates from the level expected by workers and employers when they signed the contract, it would be optimal for both workers and employers to change the fixed nominal wage. Therefore, given that labor market conditions are constantly changing, it would be logical to assume that over time such employment agreements will cease to exist. Workers and employers will come to expect that nominal wages need to be adjusted every day, so that nominal wages will fluctuate elastically in response to the dynamics of supply and demand in the labor market. Indeed, consistent with such criticism is the sharp decline in union activity in US industries in the late 1970s and 1980s. Of course, nonunion workers often have formal or informal labor agreements with their employers, but some economists believe that this decline in the share of unionized workers is evidence of the declining importance of the collective bargaining model for the U.S. economy.

The coefficient of determination is a statistic because its values are calculated from observed data. Based on the coefficient of determination, a statistical procedure is constructed that checks how significant the linear relationship between the factors is.

Statistics that test the significance of the entire regression equation are:

We get:

Increasing values of one statistic correspond to increasing values of the other; therefore, a hypothesis that is not accepted when = is not accepted if the inequality is satisfied, where

The probability of incorrectly rejecting the hypothesis is equal.

Let's calculate the critical values for different numbers of observations.

Consider a simple linear regression, so

Critical values obtained depending on the number of observations:

That is, with a large number of observations, even small deviations of the actual value from 0 turn out to be sufficient for recognizing the statistical significance of the regression coefficient, that is, for considering the explanatory variable meaningful.

The value coincides with the square of the correlation coefficient between the variables; the same conclusion is true for the correlation coefficient:
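This identity is easy to verify numerically. The sketch below uses synthetic data (the series and coefficients are invented for illustration) and checks that in a simple regression with an intercept, R2 equals the squared sample correlation:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2.0 * x + rng.normal(scale=0.5, size=50)

# Fit simple OLS y = a + b*x by least squares
b, a = np.polyfit(x, y, 1)
y_hat = a + b * x

# Coefficient of determination
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

# Squared sample correlation between x and y
r = np.corrcoef(x, y)[0, 1]

print(np.isclose(r2, r ** 2))  # True: R^2 equals r^2 in simple regression
```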

Let us now consider the coefficients of determination R2 for the full and the reduced model. In the full model the value of R2 is always greater than in the reduced one, because in the full model with m explanatory variables we minimize the sum

for all coefficient values. When considering a reduced model, for example one without the m-th explanatory variable, we look for the minimum of the sum

for all coefficient values; the resulting minimum value cannot be greater than the value obtained by minimizing the sum of squared deviations over all coefficients, including the coefficient of the m-th variable. This is where this property of the coefficient comes from.

For the convenience of the model selection procedure, it is proposed to use instead its adjusted form,

which introduces a penalty associated with an increase in the number of explanatory variables. We get:

Thus, of the competing models, the one for which this value is maximal is recognized as the best.
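Under the usual textbook definition of the adjusted coefficient (an assumption here, since the formula itself was lost from the text), the selection rule can be sketched as follows; the R2 values are invented for illustration:

```python
def adj_r2(r2, n, k):
    """Adjusted R^2 for n observations and k explanatory variables
    (one common textbook form; the intercept is not counted in k)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Adding a variable raises R^2 slightly, but the adjusted version
# penalizes the extra parameter, so the smaller model wins here.
n = 30
small = adj_r2(0.800, n, 2)   # two regressors
large = adj_r2(0.805, n, 3)   # three regressors, barely higher R^2
print(small > large)          # True: the tiny R^2 gain does not survive the penalty
```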

When comparing competing models, if the estimation is done on the same number of observations, then comparing models by this magnitude is equivalent to comparing them by the value of or . In this case, the alternative model with the minimum value (or ) is selected.

In addition to the adjusted coefficient of determination, when choosing one of several alternative models, information criteria are used, such as the Schwarz criterion and the Akaike criterion, which also "penalize" an increase in the number of explanatory variables, but in slightly different ways.

Akaike criterion (Akaike's information criterion, AIC). Using this criterion, a linear model with explanatory factors, constructed from observations, is compared with the value

where the first term contains the residual sum of squares. As the number of explanatory variables increases, the first term decreases and the second term increases, so from the alternative models we select the model with the smallest criterion value. Thus, a compromise is achieved between the residual sum of squares and the number of explanatory factors.

Schwarz criterion (Schwarz's information criterion, SC, SIC). Using this criterion, a linear model with explanatory factors, built from observations, is compared with the value

And here, just as with the Akaike criterion, an increase in the number of explanatory factors decreases the first term on the right-hand side and increases the second. Of the full and reduced alternative models, the one with the smallest value is selected.
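Both criteria can be sketched numerically. The exact functional form differs between textbooks and packages; the variant below (log of per-observation RSS plus a penalty term) is one common choice, and the RSS values are invented for illustration:

```python
import numpy as np

def aic(rss, n, k):
    # One common textbook form of the Akaike criterion
    return np.log(rss / n) + 2 * k / n

def sic(rss, n, k):
    # Schwarz criterion; the penalty grows with ln(n)
    return np.log(rss / n) + k * np.log(n) / n

n = 100
# Hypothetical comparison: the extra regressor cuts RSS only slightly,
# so both criteria prefer the smaller model.
print(aic(50.0, n, 3), sic(50.0, n, 3))
print(aic(49.5, n, 4), sic(49.5, n, 4))
```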

07/25/16 Irina Anichina


In this article we will talk about how to understand whether we have built a high-quality model. After all, only a high-quality model will give us high-quality forecasts.

Prognoz Platform has an extensive list of models for construction and analysis. Each model has its own specifics and is used under different conditions.

The “Model” object allows you to build the following regression models:

  • Linear regression (least squares estimation);
  • Linear regression (instrumental variable estimation);
  • Binary choice model (maximum likelihood estimation);
  • Nonlinear regression (nonlinear least squares estimation).

Let's start with a linear regression model. Much of what is said below will apply to the other model types as well.

Linear regression model (OLS estimation)

y = b0 + b1·x1 + … + bk·xk + e,

where y is the explained series, x1, …, xk are the explanatory series, e is the vector of model errors, and b0, b1, …, bk are the model coefficients.

So where to look?

Model coefficients

For each coefficient, the "Identified Equation" panel calculates a number of statistics: the standard error, the t-statistic, and the probability of the coefficient's insignificance. The latter is the most universal: it shows the probability that the factor corresponding to the given coefficient is insignificant and can be removed from the model.

We open the panel and look at the last column, since it is the one that immediately tells us about the significance of the coefficients.

There should be no factors with a high probability of insignificance in the model.

As you can see, when excluding the last factor, the model coefficients remained virtually unchanged.

Possible problems: What to do if, according to your theoretical model, a factor with a high probability of insignificance must be present? There are other ways to determine the significance of coefficients. For example, take a look at the factor correlation matrix.

Correlation matrix

The "Factor Correlation" panel contains the correlation matrix between all model variables and also builds a cloud of observations for a selected pair of variables.

The correlation coefficient shows the strength of a linear relationship between two variables. It varies from -1 to 1. Closeness to -1 indicates a negative linear relationship, closeness to 1 a positive one.

The observation cloud allows you to visually determine whether the dependence of one variable on another is linear.

If there are factors that strongly correlate with each other, exclude one of them. If you wish, instead of a regular linear regression model, you can build a model with instrumental variables, including factors excluded due to correlation in the list of instrumental variables.

The correlation matrix is not meaningful for a nonlinear regression model because it shows only the strength of linear dependencies.

Quality criteria

In addition to checking each coefficient of the model, it is important to know how good it is overall. To do this, calculate the statistics located in the “Statistical characteristics” panel.

Determination coefficient (R2) is the most common statistic for assessing the quality of a model. R2 is calculated using the following formula:

R2 = 1 - Σi (yi - ŷi)2 / Σi (yi - ȳ)2,

where n is the number of observations; yi are the values of the explained variable; ȳ is the average value of the explained variable; ŷi are the model values constructed from the estimated parameters.

R2 takes a value from 0 to 1 and shows the proportion of explained variance of the explained series. The closer R2 is to 1, the better the model and the smaller the unexplained proportion.

Possible problems: The problem with using R2 is that its value does not decrease when factors are added to the equation, no matter how poor they are. It is guaranteed to equal 1 if we add as many factors to the model as we have observations. Therefore, comparing models with different numbers of factors using R2 makes no sense.
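The claim that R2 cannot fall when a factor is added is easy to check with a small least-squares experiment on synthetic data (the series below are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
junk = rng.normal(size=n)          # a factor unrelated to y

def r2(X, y):
    """R^2 of an OLS fit of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

X1 = np.column_stack([np.ones(n), x])          # constant + real factor
X2 = np.column_stack([np.ones(n), x, junk])    # plus a useless factor
print(r2(X1, y) <= r2(X2, y))  # True: R^2 cannot fall when a factor is added
```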

For a more adequate assessment of the model, the adjusted coefficient of determination (Adj R2) is used. As the name suggests, this indicator is an adjusted version of R2 that imposes a "penalty" for each added factor:

Adj R2 = 1 - (1 - R2)·(n - 1)/(n - k - 1),

where k is the number of factors included in the model.

The coefficient Adj R2 also takes values from 0 to 1, but it will never be greater than R2.

An analogue of the t-statistic for the coefficients is the Fisher statistic (F-statistic). However, while the t-statistic tests the hypothesis that one coefficient is insignificant, the F-statistic tests the hypothesis that all factors (except the constant) are insignificant. The value of the F-statistic is also compared with a critical value, and for it we can likewise obtain the probability of insignificance. It is worth understanding that this test checks the hypothesis that all factors are insignificant simultaneously. Therefore, even in the presence of insignificant factors, the model as a whole can be significant.
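A hedged sketch of how the F-statistic and its probability of insignificance can be computed from R2 (the R2, n and k values are invented for illustration):

```python
from scipy import stats

def f_test(r2, n, k):
    """F-statistic for H0: all k slope coefficients are zero
    (model with a constant, n observations)."""
    f = (r2 / k) / ((1 - r2) / (n - k - 1))
    p = stats.f.sf(f, k, n - k - 1)   # probability of insignificance
    return f, p

f, p = f_test(r2=0.75, n=50, k=3)
print(f, p)  # a large F and a tiny p-value: the factors are jointly significant
```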

Possible problems: Most statistics are constructed for the case when the model includes a constant. However, in Prognoz Platform we have the opportunity to remove the constant from the list of estimated coefficients. It is worth understanding that such manipulations lead to some characteristics taking unacceptable values. Thus, R2 and Adj R2 can take negative values in the absence of a constant, and in that case they can no longer be interpreted as a share taking values from 0 to 1.

For models without a constant, Prognoz Platform calculates non-centered coefficients of determination (R2 and Adj R2). The modified formula brings their values into the range from 0 to 1 even in a model without a constant.

Let's look at the values of the described criteria for the model above:

As we can see, the coefficient of determination is quite large, but there is still a significant amount of unexplained variance. Fisher's statistics indicate that the set of factors we have chosen is significant.

Comparative criteria

In addition to criteria that allow us to talk about the quality of the model itself, there are a number of characteristics that allow us to compare models with each other (provided that we are explaining the same series over the same period).

Most regression models reduce to the problem of minimizing the sum of squared residuals (SSR). Thus, by comparing models on this indicator, one can determine which model better explains the series under study: the better model corresponds to the smaller sum of squared residuals.

Possible problems: Note that as the number of factors increases, this indicator, like R2, tends to its boundary value (for SSR the boundary value is obviously 0).

Some models reduce to maximizing the logarithm of the likelihood function (LogL). For a linear regression model, these problems lead to the same solution. Information criteria are constructed on the basis of LogL and are often used to solve the problem of selecting both regression and smoothing models:

  • Akaike information criterion (Akaike Information Criterion, AIC)
  • Schwarz criterion (Schwarz Criterion, SC)
  • Hannan-Quinn criterion (Hannan-Quinn Criterion, HQ)

All criteria take into account the number of observations and the number of model parameters and differ from each other in the form of the "penalty function" for the number of parameters. The rule for information criteria is: the best model has the lowest criterion value.

Let’s compare our model with its first version (with an “extra” coefficient):

As you can see, although this model gave a smaller sum of squared residuals, it turned out to be worse in terms of the information criteria and the adjusted coefficient of determination.

Residue analysis

A model is considered to be of good quality if the model residuals do not correlate with each other. Otherwise, factors not taken into account in the model exert a constant unidirectional impact on the explained variable. This affects the quality of the model's estimates, making them inefficient.

To check the residuals for first-order autocorrelation (dependence of the current value on previous ones), the Durbin-Watson statistic (DW) is used. Its value ranges from 0 to 4. In the absence of autocorrelation, DW is close to 2. Closeness to 0 indicates positive autocorrelation, closeness to 4 negative autocorrelation.
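The DW statistic is simple enough to sketch directly; the two synthetic residual series below illustrate the "close to 2" and "close to 0" cases:

```python
import numpy as np

def durbin_watson(resid):
    """DW = sum of squared first differences of the residuals
    divided by the sum of squared residuals."""
    d = np.diff(resid)
    return (d @ d) / (resid @ resid)

rng = np.random.default_rng(2)
white = rng.normal(size=500)   # uncorrelated residuals -> DW near 2
trend = np.cumsum(white)       # strongly positively autocorrelated -> DW near 0
print(durbin_watson(white))
print(durbin_watson(trend))
```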

As it turns out, our model contains autocorrelation of residuals. You can get rid of autocorrelation by applying the “Difference” transformation to the explained variable or by using another type of model - the ARIMA model or the ARMAX model.

Possible problems: The Durbin-Watson statistic is not applicable to models without a constant, nor to models that use lagged values of the explained variable as factors. In these cases the statistic may show the absence of autocorrelation when it exists.

Linear regression model (instrumental variables method)

The linear regression model with instrumental variables is:

y = b0 + b1·x̃1 + … + bk·x̃k + e,
x̃j = c0j + c1j·z1 + … + clj·zl + φj,

where y is the explained series, x1, …, xk are the explanatory series, x̃1, …, x̃k are the explanatory series modeled using the instrumental variables, z1, …, zl are the instrumental variables, e and φ are the vectors of model errors, b0, b1, …, bk are the model coefficients, and c0j, c1j, …, clj are the coefficients of the models for the explanatory series.

The scheme for checking the quality of the model is similar; the only addition to the quality criteria is the J-statistic, an analogue of the F-statistic that takes the instrumental variables into account.

Binary choice model

The explained variable in the binary choice model is a value that takes only two values: 0 or 1. The model has the form:

y = F(b0 + b1·x1 + … + bk·xk) + e,

where y is the explained series, x1, …, xk are the explanatory series, e is the vector of model errors, b0, b1, …, bk are the model coefficients, and F is a non-decreasing function that returns values from 0 to 1.

The model coefficients are estimated by maximizing the likelihood function. For this model, the following quality criteria are relevant:

  • McFadden's coefficient of determination (McFadden R2), an analogue of the usual R2;
  • LR-statistic and its probability, an analogue of the F-statistic;
  • Comparative criteria: LogL, AIC, SC, HQ.

Nonlinear regression

By a nonlinear regression model we mean a model of the form:

y = f(x1, …, xk, b) + e,

where y is the explained series, x1, …, xk are the explanatory series, e is the vector of model errors, and b is the vector of model coefficients.

The model coefficients are calculated by minimizing the sum of squared residuals. For this model, the same criteria are relevant as for linear regression, except for checking the correlation matrix. Note also that the F-statistic will test whether the model as a whole is significant compared to the model y = b0 + e, even if the original function f(x1, …, xk, b) contains no term corresponding to the constant.

Results

Let’s summarize and present a list of tested characteristics in the form of a table:

I hope this article was useful to readers! Next time we will talk about other types of models, namely ARIMA, ARMAX.

The quality of the model will be assessed using the Student and Fisher criteria by comparing the calculated values ​​with the tabulated ones.

To assess the quality of the model using Student's criterion, the actual value of this criterion (t_obs) is compared with the critical value t_cr, which is taken from the table of t values for the given significance level (α = 0.05) and the number of degrees of freedom (n - 2).

If t_obs > t_cr, then the resulting value of the pair correlation coefficient is considered significant.
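The comparison can be sketched as follows (the r and n values are illustrative, not the ones from this problem):

```python
from scipy import stats

def corr_t_test(r, n, alpha=0.05):
    """t-statistic for H0: rho = 0 and the two-sided critical value
    with n - 2 degrees of freedom."""
    t_obs = r * (n - 2) ** 0.5 / (1 - r * r) ** 0.5
    t_cr = stats.t.ppf(1 - alpha / 2, n - 2)
    return t_obs, t_cr

t_obs, t_cr = corr_t_test(r=0.6, n=30)
print(t_obs > t_cr)  # True: r = 0.6 with 30 observations is significant at 5%
```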

The critical value at and is equal to .

Let's check the significance of the coefficient of determination using Fisher's F-criterion.

Let's calculate the F statistic according to the formula:

m = 3 is the number of parameters in the regression equation;

N = 37 is the number of observations in the sample population.

The statistical distribution of the F-statistic is modeled by the Fisher distribution with and degrees of freedom. The critical value of this statistic for the given and degrees of freedom is equal to .

Fisher criterion:

F_calc      F_cr       Regression equation
8916.383    3.276      adequate

Thus, the model explains 99.8% of the total variance of the trait Y. This indicates that the fitted model is adequate.
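The tabulated critical value can be checked against the Fisher distribution directly; with m = 3 parameters and n = 37 observations the degrees of freedom are (m - 1, n - m) = (2, 34):

```python
from scipy import stats

m, n, alpha = 3, 37, 0.05
f_cr = stats.f.ppf(1 - alpha, m - 1, n - m)
print(round(f_cr, 3))     # close to 3.276, the tabulated value used above

f_calc = 8916.383
print(f_calc > f_cr)      # True: the regression is recognized as significant
```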


Calculation of predicted values and the sum of squared deviations.

Let's enter into cell Q2 the formula =$F$54*N2+$E$54*O2 (calculation of predicted values), then copy it to cells Q3:Q38. Into cell R2, enter the formula =(P2-Q2)^2 (calculation of the squared deviation), then copy it into cells R3:R38, and calculate the sum of the resulting values in cell R39.
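The same spreadsheet computation can be sketched in Python; the coefficient values below are hypothetical stand-ins for cells $F$54 and $E$54, and only three sample rows of the factor columns are used:

```python
import numpy as np

# Hypothetical stand-ins for the spreadsheet ranges:
# x2, x5 are the factor columns, y the observed values,
# b2, b5 the estimated coefficients from cells $F$54 and $E$54.
b2, b5 = 0.5, 0.6
x2 = np.array([605.1, 620.1, 862.1])
x5 = np.array([2063.2, 2143.7, 2406.4])
y = np.array([1626.7, 1602.5, 1982.7])

y_pred = b2 * x2 + b5 * x5    # the analogue of =$F$54*N2+$E$54*O2 per row
sq_dev = (y - y_pred) ** 2    # the analogue of =(P2-Q2)^2 per row
print(sq_dev.sum())           # the analogue of the total in cell R39
```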

X2      X5      Y       y(x)      (Y - y(x))^2
605.1 2063.2 1626.7 1589.7 1367.523
620.1 2143.7 1602.5 1650.5 2303.318
2447.7 1880.7 1914.5 1144.709
862.1 2406.4 1982.7 1876.9 11189.53
958.4 2592.9 2026.7 106.5821
1488.9 2193.9 2180.4 182.342
1231.5 2529.7 2152.1 2020.4 17335.88
1429.6 2644.9 2133.1 8814.026
1679.5 2793.7 2344.4 2277.8 4436.216
1326.2 2669.2 2341.7 2135.8 42415.15
1456.8 2211.9 2282.7 5014.463
2523.6 2990.5 2629.8 2543.9 7377.384
2659.8 2017.5 2059.0 1722.637
923.8 2636.6 2009.4 2053.4 1939.955
1173.3 2943.1 2312.8 2792.24
1156.7 2890.9 2400.1 2272.4 16298.85
1450.2 3051.5 2508.1 2432.0 5784.146
1845.2 2684.1 2633.3 2581.453
1566.4 3052.6 2736.6 2449.8 82275.65
1729.7 3349.7 2824.5 2689.8 18152.31
1987.3 3456.3 2880.2 2804.9 5676.928
1902.7 3731.2 2812.9 2992.6 32297.9
1839.1 3517.8 2704.2 2828.0 15336.69
3953.7 3823.1 3224.2 3358.1 17922.28
1351.2 3482.9 2584.7 2731.6 21584.07
1185.3 3347.6 2466.7 2609.0 20246.66
1715.5 3585.4 2928.3 2859.2 4768.047
1536.4 3678.3 3036.4 2900.8 18389.81
1823.1 3801.6 3021.1 3032.3 124.6986
2452.1 4002.1 3237.6 3269.8 1034.273
2076.6 3990.3 3247.1 3206.5 1647.633
2129.2 3436.9 3375.5 3767.099
2502.7 4154.2 3472.8 3387.8 7220.377
2238.7 4322.7 3504.1 3472.0 1028.291
2417.6 4623.1 3357.1 3716.7 129321.2
3838.4 4817.9 4034.7 4065.3 937.7363
1468.6 3450.4 3585.0 18128.14
532666.2



Report form

Options

Tariffs for advertising and magazine characteristics
Magazine name Y, tariff (one page of color advertising), USD. X 1, planned audience, thousand people X 2, percentage of men X 3, median family income, dollars
Audubon 25 315 51,1 38 787
Better Homes & Gardens 198 000 34 797 22,1
Business Week 68,1 63 667
Cosmopolitan 15 452 17,3 44 237
Elle 55 540 12,5 47 211
Entrepreneur 40 355 2 476 60,4 47 579
Esquire 71,3 44 715
Family Circle 147 500 24 539 38 759
first For Women 28 059 3 856 3,6 43 850
Forbes 59 340 68,8 66 606
Fortune 3 891 68,8 58 402
Glamor 85 080 7,8
Golf Digest 6 250 78,9
Good Housekeeping 166 080 25 306 12,6 38 335
Gourmet 49 640 29,6 57 060
Harper's Bazaar 52 805 2 621 11,5 44 992
Inc. 70 825 66,9
Kiplinger's Personal Finance 65,1 63 876
Ladies' Home Journal 127 000 6,8
Life 63 750 14 220 46,9
Mademoiselle 55 910
Martha Stewart's Living 93 328 4 849 16,6
McCalls 7,6 33 823
Money 98 250 60,6
Motor Trend 79 800 5 281 88,5 48 739
National Geographic 44 326
Natural History
Newsweek 148 800 20 720 53,5 53 025
Parents Magazine 72 820 18,2
PC Computing 40 675 57 916
People 125 000 33 668
Popular Mechanics 86,9
Reader's Digest 42,4 38 060
Redbook 95 785 13 212 8,9 41 156
Rolling Stone 78 920 8 638 59,8 43 212
Runner's World 36 850 2 078 62,9 60 222
Scientific American 37 500 2 704
Seventeen 71 115 5 738 37 034
Ski 32 480 2 249 64,5 58 629
Smart Money 42 900 2 224 63,4
Smithsonian 73 075 8 253 47,9
Soap Opera Digest 35 070 7 227 10,3
Sports Illustrated 162 000 78,8 45 897
Sunset 56 000 5 276 38,7 52 524
Teen 53 250 3 057 15,4
The New Yorker 62 435 3 223 48,9
Time 162 000 22 798 52,4
True Story 12,2
TV Guide 42,8 37 396
U.S. News & World Report 98 644 9 825 57,5 52 018
Vanity Fair 67 890 4 307 27,7
Vogue 63 900 12,9 44 242
Woman's Day 137 000 22 747 6,7
Working Woman 87 500 6,3 44 674
YM 73 270 14,4 43 696
Average value 83 534 39,7 47 710
Standard deviation 25,9 10 225

Control questions

Paired Regression

1. What is meant by pairwise regression?

2. What problems are solved when constructing a regression equation?

3. What methods are used to select the type of regression model?

4. What functions are most often used to construct a paired regression equation?

5. What is the form of the system of normal equations of the method of least squares in the case of linear regression?

6. How is the determination index calculated and what does it show?

7. How is the significance of a regression equation checked?

8. How is the significance of regression equation coefficients checked?

9. The concept of a confidence interval for regression coefficients.

10. The concept of point and interval forecasts based on the linear regression equation.

11. How are the elasticity coefficient E and the average elasticity coefficient Ē calculated, and what do they show?

Multiple regression

1. What is meant by multiple regression?

2. How does a multiple linear regression model differ from a paired linear regression model? Write down the multiple linear regression equation.

3. What problems are solved when constructing a regression equation?

4. What problems are solved when specifying a model?

5. What are the requirements for the factors included in the regression equation?

6. What is meant by collinearity of factors?

7. How is collinearity checked?

8. What approaches are used to overcome interfactor correlation?

9. Which functions are most often used to construct a multiple regression equation?

10. What formula is used to calculate the multiple correlation index?

11. How is the multiple determination index calculated?

12. What is the coefficient of determination? How can it be used to assess the adequacy of the model?

13. What does a low value of the multiple correlation coefficient mean?

14. How is the significance of the regression equation and individual coefficients checked?

15. How are hypotheses constructed to test the significance of model parameters?

16. How are partial regression equations constructed?

17. How are average partial elasticity coefficients calculated?

18. How are confidence intervals constructed for model parameters?

19. What is meant by homoscedasticity of a series of residuals?

20. How is the hypothesis about the homoscedasticity of a number of residuals tested?

21. What is the dependent variable called in a model?

22. What are the independent variables called in the model?

23. Name the main method for constructing a model.

24. Write a multiple regression model in general form with 3 independent variables.

25. Write down the sum of squared deviations of the model (formula)

26. What is RSS? (definition and formula)

27. How to check the significance of the constructed model as a whole?

28. How to check the significance of the coefficient for the variable X_3?

29. Formulate the economic meaning of the coefficient, for example, of the variable X_5.

30. What is a "short model" of multiple regression?



The initial data characterize the selling price of a certain product at certain points in time. A regression model of the dynamics of this indicator needs to be built. Factors believed to influence this value include the selling price of a substitute product, the sales volume of the product, the amount of advertising costs, and average advertising costs.

The selling price is a dependent quantity, let’s denote it Y.

Factors influencing (presumably) the value of Y will be denoted by X i: X 1 – price of the substitute product, X 2 – sales volume, X 3 – volume of advertising costs, X 4 – average advertising costs.

Initial data

Testing the significance of the model using the likelihood ratio test (Wald test) begins with putting forward the main hypothesis:

To test this hypothesis, the sample statistic

LR = 2(lnL - lnL0)    (4.7.1)

is calculated. Here lnL is the maximum value of the logarithm of the likelihood function, and lnL0 is the value of the logarithm of the likelihood function if the main hypothesis is true.

If the main hypothesis is true, then the sample statistic (4.7.1) is distributed according to the χ2 law with (m - 1) degrees of freedom. The boundary K of the right-hand critical region is found from tables of chi-square critical points for the significance level (1 - b) and (m - 1) degrees of freedom. If the inequality holds:

then the main hypothesis is rejected, the alternative hypothesis is accepted, and we say that the model is statistically significant. Otherwise, the hypothesis that the model is not significant is accepted, and the model is revised.
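The test above can be sketched as follows (the log-likelihood values and m below are invented for illustration):

```python
from scipy import stats

def lr_test(lnL, lnL0, m, alpha=0.05):
    """Likelihood-ratio test of the model's overall significance:
    LR = 2(lnL - lnL0) is compared with the chi-square critical point
    with m - 1 degrees of freedom."""
    lr = 2 * (lnL - lnL0)
    crit = stats.chi2.ppf(1 - alpha, m - 1)
    return lr, crit

# Hypothetical log-likelihood values for a model with m = 4 parameters
lr, crit = lr_test(lnL=-120.3, lnL0=-135.8, m=4)
print(lr > crit)  # True: the model is recognized as statistically significant
```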

For binary choice models, the significance of the factors is checked by testing, for each factor xi, i = 1, …, (m - 1), hypotheses of the form:

The sample statistics used to test these hypotheses have an asymptotically normal distribution and are called z-statistics. The boundary of the two-sided critical region is found using Laplace tables at the given significance level (1 - b).

If the inequality holds:

|z| < K,

then the main hypothesis that the coefficient βi does not differ significantly from zero is accepted, and it is concluded that the corresponding factor is insignificant for the model.

For binary choice models, the concept of the coefficient of determination is not defined. However, the so-called pseudo-coefficient of determination is defined for them, although it no longer characterizes the explanatory power of the model in the usual sense.

Definition 4.7.1. The pseudo-coefficient of determination is the following value:

Definition 4.7.2. The McFadden likelihood ratio index is the characteristic:

It should be emphasized that if the parameters of the binary choice model do not differ significantly from zero, then both introduced coefficients are equal to zero.

In the lecture, we looked at nonlinear regression models, in particular models for a binary dependent variable. We examined these models for two regression functions: logit (using the logistic function) and probit (using the distribution function of the standard normal law). Parameter estimates for such regression functions are obtained by the maximum likelihood method. The model is tested using the Wald test, which is based on a statistic that has a chi-square distribution. When studying multivariate regression models, we interpreted the parameter estimates bj as the marginal effect of the independent variables on y. Let's return to binary choice models. If we try to find the derivative of P(Y=1|X), we arrive at the following expression:

where Z = b0 + b1x1 + ... + bm-1xm-1.

By the theorem on the derivative of a complex function, and from the density property (the derivative of the distribution function is the distribution density f(Z)), we obtain:

or, using the second notation for parameter estimates:

∂P(Y=1|X)/∂xj = bj·f(Z)

As before, bj denotes estimates of unknown parameters.

Then, we can reason as follows: the distribution density is always non-negative, therefore the sign of the derivative

will depend only on the sign of the parameter estimates, but will be a function of all independent variables. Moreover, if the parameter estimate is positive, then an increase in the variable xj will lead to an increase in the probability

and if the parameter estimate is negative, then, accordingly, the indicated probability will decrease.

Comment. If the factor x is a binary variable, then the concept of a marginal effect cannot be introduced for it.

For each quantitative variable x, the so-called average marginal effect is introduced. To compute it, sample means are calculated for the quantitative variables and the share of "1" values for the binary ones, and these are substituted into the expression for the distribution density in place of the variables.
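A sketch of this computation for a probit model (all parameter estimates and sample means below are hypothetical):

```python
import numpy as np
from scipy import stats

# Hypothetical probit estimates: b0 is the constant, b1, b2 the slopes
b = np.array([0.4, 1.2, -0.8])

# The "average" observation, with a leading 1 for the constant;
# for a binary factor the share of ones would be used instead of the mean
x_bar = np.array([1.0, 0.35, 0.6])

z_bar = b @ x_bar
# Average marginal effect of x_j: b_j * f(z_bar),
# where f is the standard normal density
ame = b[1:] * stats.norm.pdf(z_bar)
print(ame)  # the signs follow the signs of the parameter estimates
```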

Another question for discussion: how do we predict the value of y after estimating the parameters of the logit (probit) model? For example, proceed as follows: substitute the found parameter estimates and the values of xj into Z and calculate its value. If Z > 0, then we consider Y = 1; if Z < 0, then Y = 0. Remark. We considered the situation where the variable y was measured on a nominal scale but took only two values: 0 and 1. In the general case, when y can take several values, for example 0, 1, 2, 3, multinomial (in y!) logit or probit is used. In addition, y may be measured on an ordinal scale; in that case Stata's ordered logit (probit), ologit (oprobit), is used.

Comment. Very often in research it is necessary to work with a truncated sample. For example, if household incomes are studied, there are situations when respondents with very high incomes (for example, more than 1 million rubles) must be excluded from the study; that is, the sample is truncated.

In such cases, Tobit models are used.

F(b0 + b1x1 + ... + bm-1xm-1)

F(b0 + b1x1 + ... + bm-1xm-1) - (F(b0 + b1x1 + ... + bm-1xm-1))2