The concept of a saddle point. Kuhn-Tucker theorem. Karush-Kuhn-Tucker conditions

The central place in the theory of nonlinear programming is occupied by the Kuhn-Tucker theorem. Let a nonlinear programming problem be given:

find the maximum value of the function Z = f(x1, x2, ..., xn) subject to the constraints

gi(x1, x2, ..., xn) ≥ 0, i = 1, 2, ..., m;  xj ≥ 0, j = 1, 2, ..., n.   (4.1)

Let us compose the Lagrange function for this problem:

F(X, Λ) = f(X) + Σ λi gi(X),  Λ = (λ1, ..., λm), λi ≥ 0.   (4.2)

If the regularity condition is satisfied (there exists at least one point X for which gi(X) > 0 for every i), then the following theorem holds.

Theorem. A vector X(0) ≥ 0 is an optimal solution of problem (4.1) if and only if there exists a vector Λ(0) ≥ 0 such that

F(X, Λ(0)) ≤ F(X(0), Λ(0)) ≤ F(X(0), Λ) for all X ≥ 0, Λ ≥ 0.   (4.3)

The point (X(0), Λ(0)) is called a saddle point of the function F(X, Λ), and the theorem is called the saddle point theorem. Let us prove the sufficiency of conditions (4.3).

Proof. Let X(0) ≥ 0 and Λ(0) ≥ 0 form a saddle point of the function F(X, Λ). Substituting the expression (4.2) for F(X, Λ) into (4.3), we get

f(X) + Σ λi(0) gi(X) ≤ f(X(0)) + Σ λi(0) gi(X(0)) ≤ f(X(0)) + Σ λi gi(X(0))

for all X ≥ 0, Λ ≥ 0.

The right inequality is valid for any Λ ≥ 0; therefore

gi(X(0)) ≥ 0, i = 1, ..., m, and Σ λi(0) gi(X(0)) = 0.

Then from the left inequality we get

f(X) + Σ λi(0) gi(X) ≤ f(X(0)).

Since for any X satisfying (4.1) we have λi(0) ≥ 0 and gi(X) ≥ 0, so that Σ λi(0) gi(X) ≥ 0, it follows that

f(X(0)) ≥ f(X).

Thus, the point X(0) satisfies (4.1), and at every other point satisfying (4.1) the objective function takes a value not greater than at X(0).

This statement reduces the solution of the NLP problem to finding the saddle points of the Lagrange function F(X, Λ).

The proof of the necessity of conditions (4.3) is not considered due to its complexity.



If f(X) and gi(X) are differentiable functions, then conditions (4.3) are equivalent to the following local Kuhn-Tucker conditions:

∂F/∂xj ≤ 0,  xj(0) · ∂F/∂xj = 0,  xj(0) ≥ 0,  j = 1, 2, ..., n;   (4.4)

∂F/∂λi ≥ 0,  λi(0) · ∂F/∂λi = 0,  λi(0) ≥ 0,  i = 1, 2, ..., m.   (4.5)

Each partial derivative of the Lagrange function appearing here, ∂F/∂xj or ∂F/∂λi, is evaluated at the point (X(0), Λ(0)), where

X(0) = (x1(0), x2(0), ..., xn(0)),  Λ(0) = (λ1(0), λ2(0), ..., λm(0)).

These conditions can be written in vector form:

∇X F(X(0), Λ(0)) ≤ 0,  X(0) · ∇X F(X(0), Λ(0)) = 0,  X(0) ≥ 0;

∇Λ F(X(0), Λ(0)) ≥ 0,  Λ(0) · ∇Λ F(X(0), Λ(0)) = 0,  Λ(0) ≥ 0.

Example. Find max Z = -x1² - x2² subject to the constraints

2x1 + x2 ≥ 2,  2x1 + x2 ≤ 8,  x1 + x2 ≤ 6,  x1 ≥ 0, x2 ≥ 0.

Let us show that there exists a vector Λ(0) ≥ 0 for which the Kuhn-Tucker conditions (4.4), (4.5) are satisfied at the optimum point X(0) = (0.8; 0.4) for the function F(X, Λ):

F(X, Λ) = -x1² - x2² + λ1(2x1 + x2 - 2) + λ2(8 - 2x1 - x2) + λ3(6 - x1 - x2).

According to conditions (4.5), λ2 and λ3 must take zero values: substituting x1 = 0.8 and x2 = 0.4 into the expressions

8 - 2x1 - x2 = 6 > 0 and 6 - x1 - x2 = 4.8 > 0,

we obtain values greater than zero, and by the complementarity condition λi · ∂F/∂λi = 0 the corresponding multipliers must therefore vanish. The multiplier λ1, on the other hand, can take a non-zero value, since

2x1 + x2 - 2 = 2·0.8 + 0.4 - 2 = 0.

In accordance with (4.4), the derivatives

∂F/∂x1 = -2x1 + 2λ1 - 2λ2 - λ3 and ∂F/∂x2 = -2x2 + λ1 - λ2 - λ3

must take zero values, since the coordinates of the vector X(0) are different from zero. With λ2 = λ3 = 0 this gives λ1 = 0.8. Therefore, at the point (X(0), Λ(0)) with Λ(0) = (0.8; 0; 0) the Kuhn-Tucker conditions are satisfied, and it is indeed an extremum point.
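For illustration, the check just carried out can also be reproduced numerically. The following Python sketch (not part of the original text) takes the problem data exactly as stated in the example and evaluates the constraints, the stationarity conditions and the complementary slackness products at the point (0.8; 0.4) with Λ(0) = (0.8; 0; 0):

import numpy as np

# Candidate optimum and multipliers from the example (used here only for verification)
x = np.array([0.8, 0.4])
lam = np.array([0.8, 0.0, 0.0])

g = lambda x: np.array([2*x[0] + x[1] - 2,            # g1 >= 0
                        8 - 2*x[0] - x[1],            # g2 >= 0
                        6 - x[0] - x[1]])             # g3 >= 0

grad_f = np.array([-2*x[0], -2*x[1]])                 # gradient of Z = -x1^2 - x2^2
grad_g = np.array([[ 2.0,  1.0],                      # gradients of g1, g2, g3
                   [-2.0, -1.0],
                   [-1.0, -1.0]])

stationarity = grad_f + grad_g.T @ lam                # dF/dx1, dF/dx2 of the Lagrangian
complementarity = lam * g(x)                          # lambda_i * dF/dlambda_i

print("g(x)            =", g(x))                      # [0.  6.  4.8] -> feasible, g1 active
print("stationarity    =", stationarity)              # [0. 0.]       -> condition (4.4)
print("complementarity =", complementarity)           # [0. 0. 0.]    -> condition (4.5)

The printed values confirm that only the first constraint is active, the derivatives in (4.4) vanish, and all products in (4.5) are zero.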

Let's consider the Kuhn-Tucker conditions in a slightly different form.

Suppose we have an optimization problem with constraints in the form of equalities:

z = f(x1, x2, ..., xn) → min

under the conditions

g1(x1, x2, ..., xn) = 0,

g2(x1, x2, ..., xn) = 0,

. . .

gm(x1, x2, ..., xn) = 0.

The conditional minimum point of the function f coincides with a saddle point of the Lagrange function

L(x, λ) = f(x) + Σ λj gj(x).

In this case the saddle point must provide a minimum with respect to the variables xi and a maximum with respect to the variables λj.

The necessary conditions for a stationary point are that the first-order partial derivatives with respect to all variables vanish:

∂L/∂xi = ∂f/∂xi + Σ λj ∂gj/∂xi = 0,  i = 1, ..., n;

∂L/∂λj = gj(x) = 0,  j = 1, ..., m.

Note that the second group of equations implies that only feasible points can satisfy the necessary conditions.

To obtain a sufficient condition for the existence of a minimum, one must additionally require that the Hessian of the objective function be positive definite.

Let us now consider the general mathematical programming problem:

Z = f(X) → min,

under the conditions

gj(X) ≤ bj,  j = 1, ..., m.

The inequality constraints can be converted into equality constraints by adding a slack variable uj to each of them:

gj(X) + uj² = bj,  j = 1, ..., m.

Let us form the Lagrange function:

L(X, λ, u) = f(X) + Σ λj (gj(X) + uj² - bj).

Then the necessary conditions for a minimum take the form

∂L/∂xi = ∂f/∂xi + Σ λj ∂gj/∂xi = 0,  i = 1, ..., n;

∂L/∂λj = gj(X) + uj² - bj = 0,  j = 1, ..., m;

∂L/∂uj = 2 λj uj = 0,  j = 1, ..., m.

The second group of equations can be transformed by discarding the slack variables and returning to inequality constraints, which reproduces the constraints of the original problem. The third group can be multiplied by uj/2, after which the slack variables are eliminated by expressing them from the second group of equations.
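Written out explicitly (a short derivation sketch, under the sign convention assumed in the reconstruction above), the manipulation of the third group of equations yields the complementary slackness conditions:

∂L/∂uj = 2 λj uj = 0  ⇒  (multiplying by uj/2)  λj uj² = 0;

substituting uj² = bj - gj(X) from the second group of equations gives

λj (bj - gj(X)) = 0,  j = 1, ..., m.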

There is one more condition that must be satisfied at the minimum point, namely λj ≥ 0; it follows from an analysis of the economic (physical) meaning of the coefficients of the Lagrange function.

It can be shown that the Lagrange coefficient at the minimum point measures the sensitivity of the optimal value to the right-hand side of the corresponding constraint:

λj = -∂f*/∂bj,

where λj is the Lagrange coefficient at the minimum point and f* is the optimal value of the objective function.

Obviously, as bj increases, the admissible region expands, which means that the minimum value can only decrease; that is, the derivative ∂f*/∂bj must be negative (non-positive). Therefore, at the conditional minimum point λj ≥ 0.

We finally obtain the necessary conditions for the conditional minimum:

∂f/∂xi + Σ λj ∂gj/∂xi = 0,  i = 1, ..., n;

gj(X) ≤ bj,  j = 1, ..., m;

λj (bj - gj(X)) = 0,  j = 1, ..., m;

λj ≥ 0,  j = 1, ..., m.

The expressions in the second line ensure that the optimal point is feasible.

The third line carries the following information. If a constraint is active (i.e. the expression in parentheses is equal to zero), the corresponding Lagrange coefficient may be strictly positive; a positive Lagrange coefficient means that the constraint is binding, i.e. it is exactly this constraint that stops further improvement of the objective function. If a constraint is not active (i.e. the expression in parentheses is not equal to zero), the corresponding Lagrange coefficient must be equal to zero: such a constraint is not binding and does not affect further improvement of the objective function.

For a conditional maximum point the Lagrange coefficients change sign to the opposite one (since, as the admissible region expands, the optimal value of the objective function at a conditional maximum point can only increase).

The above conditions are equivalent to the Kuhn-Tucker theorem and often bear the same name.

A sufficient condition for a minimum (maximum) is the convexity (concavity) of the objective function at a stationary point. This means that the Hessian must be positive (negative) definite.
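As an illustration of how the conditions listed above can be verified for a candidate point, here is a small Python sketch; the helper function and the sample problem are hypothetical and serve only to show the mechanics of the check:

import numpy as np

def check_kkt(grad_f, grad_g, g, b, x, lam, tol=1e-8):
    """Check the necessary minimum conditions for: min f(x) s.t. g_j(x) <= b_j."""
    stationarity = grad_f(x) + grad_g(x).T @ lam          # first line of the conditions
    feasibility  = g(x) - b                               # second line: must be <= 0
    slackness    = lam * (b - g(x))                       # third line: must be == 0
    return (np.all(np.abs(stationarity) < tol)
            and np.all(feasibility <= tol)
            and np.all(np.abs(slackness) < tol)
            and np.all(lam >= -tol))                      # fourth line: lambda >= 0

# Hypothetical example: min (x1-2)^2 + (x2-2)^2  s.t.  x1 + x2 <= 2
grad_f = lambda x: np.array([2*(x[0] - 2), 2*(x[1] - 2)])
grad_g = lambda x: np.array([[1.0, 1.0]])
g      = lambda x: np.array([x[0] + x[1]])
b      = np.array([2.0])

print(check_kkt(grad_f, grad_g, g, b, x=np.array([1.0, 1.0]), lam=np.array([2.0])))  # True

For the sample data the candidate point (1, 1) with λ = 2 passes all four groups of conditions, consistent with it being the conditional minimum of the sample problem.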

A summary of the material in this chapter can be viewed in two presentations:

file “Nonlinear programming”;

file "Kuhn-Tucker Theorem".

Slides 10-14 of the presentation “Kuhn-Tucker Theorem” show an example of solving the Kuhn-Tucker problem.

4.5. Control questions

(Developed by Afanasyev M.Yu. and Suvorov B.P.)

Question 1. Given a real function f(x) defined on a segment of real numbers S. Let x1 and x2 be points of this segment and 0 ≤ λ ≤ 1.

Which of the following inequalities is a condition for the convexity of a function?

Possible answers:

Question 2. Given a real function f(x) defined on a segment of real numbers S. Let x1 and x2 be points of this segment and 0 ≤ λ ≤ 1.

Which of the following inequalities is a condition for a function to be strictly concave?

Possible answers:

Question 3. Function

1) convex;

2) strictly convex;

3) concave;

4) strictly concave;

5) convex and concave.

Question 4. Function

3) concave; 4) strictly concave;

5) convex and concave.

Question 5. Function

1) convex; 2) neither convex nor concave;

3) strictly convex; 4) concave;

5) convex and concave.

Question 6. A new model of high-speed motorcycle, the "Snail", is sold by a company at a price of (30 - 2x) thousand dollars per unit, where x is the number of motorcycles sold. Variable manufacturing costs are 6 thousand dollars per unit and fixed costs are 30 thousand dollars. Maximize the firm's profit for the week.

Now suppose that a change in the sales tax rate results in an additional tax of 4 thousand dollars for each motorcycle sold.

How will the optimal production of motorcycles change compared to the initial situation?

(Solve using the Lagrange function; a short verification sketch is given after the answer options.)

Possible answers:

1) will increase by 2; 2) will decrease by 2;

3) will not change; 4) will increase by 1;

5) will decrease by 1.
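For Question 6, the arithmetic can be checked with a short sketch (added here for illustration; profit is measured in thousands of dollars, as the question suggests):

import sympy as sp

x, tax = sp.symbols('x tax', nonnegative=True)
profit = (30 - 2*x)*x - 6*x - 30 - tax*x      # revenue - variable costs - fixed costs - tax

x_opt = sp.solve(sp.diff(profit, x), x)[0]    # stationary point of the concave profit function
print(x_opt.subs(tax, 0))                     # 6 -> optimal output without the extra tax
print(x_opt.subs(tax, 4))                     # 5 -> optimal output with the 4-thousand tax

The optimal output drops from 6 to 5 motorcycles, i.e. it decreases by 1.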

Question 7. Suppose you have 2 weeks (14 days) of holiday to spend in the Canary Islands and in Nice. Let your utility function have the form 2KN - 3K² - 4N², where K and N are the numbers of days you spend in the Canary Islands and in Nice, respectively.

How many days should you spend in Nice to maximize your utility function?

(To solve, use the Lagrange function. Round the result to the nearest integer. Check whether the Kuhn-Tucker optimality conditions are met.)

Possible answers:

1) 3; 2) 4; 3) 5; 4) 6; 5) 7.

Question 8. For the problem in Question 7, find the value of the dual estimate of the constraint.

(Round the result to the nearest integer.)

Possible answers:

1) 41; 2) 34; 3) 29; 4) 39; 5) 44.

Question 9. A monopolist plans a production and sales program for the next period. The prices are p1 = 14 - 0.25x1 (for product 1) and p2 = 14 - 0.5x2 (for product 2), where x1 and x2 are the sales volumes of the products. Assume that all produced output is sold. The maximum total sales volume is 57.

What is the optimal output of product 2?

Possible answers:

1) 36.4; 2) 30.7; 3) 26.3; 4) 20.6; 5) 41.8.

Question 10. The owner of a small enterprise has 100 thousand rubles for the next month, which he can spend on increasing fixed assets K (purchasing equipment) at a price of 1 thousand rubles per unit or on purchasing additional labour L at a price of 50 rubles per hour. The increase in finished output, which can be sold at 10 thousand rubles per unit, is determined by the production function F(K, L) = L^(2/7) · K^(2/5).

How much money should be spent on increasing fixed assets?

Possible answers:

1) 74.36 thousand rubles; 2) 58.33 thousand rubles; 3) 63.44 thousand rubles;

4) 45.66 thousand rubles; 5) 39.77 thousand rubles.

Answers to the questions:

1 - 4; 2 - 1; 3 - 4; 4 - 5; 5 - 2; 6 - 5; 7 - 4; 8 - 2; 9 - 4; 10 - 2.

In the previous section the Kuhn-Tucker conditions were constructed for constrained optimization problems. Using the Lagrange multiplier method, an intuitive idea was obtained that the Kuhn-Tucker conditions are closely related to the necessary optimality conditions. In this section rigorous formulations of the necessary and sufficient conditions for the optimality of a solution of a nonlinear programming problem are considered.

Theorem 1. Necessity of the Kuhn-Tucker conditions

Consider the nonlinear programming problem (0)-(2). Let the objective and constraint functions be differentiable, and let x* be a feasible solution of this problem. Denote the index set of the constraints that are active at x*, and let the gradients of these active constraints be linearly independent. If x* is an optimal solution of the nonlinear programming problem, then there exists a vector of multipliers that, together with x*, forms a solution of the Kuhn-Tucker problem (3)-(7).

The requirement that these gradients be linearly independent is known as the linear independence condition and essentially represents a regularity condition on the admissible region, which is almost always satisfied for optimization problems encountered in practice. In general, however, checking the linear independence condition is very difficult, since it requires the optimal solution of the problem to be known in advance. At the same time, the linear independence condition is always satisfied for nonlinear programming problems that have the following properties.

  • 1. All constraints in the form of equalities and inequalities contain linear functions.
  • 2. All inequality constraints contain concave functions, all equality constraints contain linear functions, and there exists at least one feasible point x located in the interior of the region defined by the inequality constraints; in other words, there exists a point x at which all inequality constraints hold as strict inequalities.

If the linear independence condition is not satisfied at the optimum point, the Kuhn-Tucker problem may have no solution.

Minimize

under restrictions

Fig. 1 shows the region of feasible solutions of the nonlinear problem formulated above. It is clear that this problem has an optimal solution. Let us show that the linear independence condition is not satisfied at the optimum point.

Fig. 1.

It is easy to see that these vectors are linearly dependent, i.e. the linear independence condition is not satisfied at this point.

Let us write down the Kuhn-Tucker conditions and check whether they are satisfied at the point (1, 0). Conditions (3), (6) and (7) take the following form:

At this point, equation (11) requires one value of the multiplier, whereas equation (14) gives another. Therefore, the optimum point is not a Kuhn-Tucker point.

Note that a violation of the linear independence condition does not necessarily mean that a Kuhn-Tucker point does not exist. To confirm this, let us replace the objective function of this example by another function. In this case the optimum is still attained at the point (1, 0), at which the linear independence condition is not satisfied. The Kuhn-Tucker conditions (12)-(16) remain unchanged, and equation (11) takes the form

It is easy to check that this point is a Kuhn-Tucker point, i.e. it satisfies the Kuhn-Tucker conditions.
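Since the constraint functions of this example are not reproduced above, the following sketch uses hypothetical gradients only to show how the linear independence (regularity) condition can be checked numerically at a candidate point:

import numpy as np

# Hypothetical gradients of the constraints that are active at the candidate point.
# The regularity condition requires these rows to be linearly independent.
active_grads = np.array([
    [0.0, -1.0],     # gradient of the first active constraint (assumed)
    [0.0,  3.0],     # gradient of the second active constraint (assumed)
])

rank = np.linalg.matrix_rank(active_grads)
print("linearly independent:", rank == active_grads.shape[0])   # False: rank 1 < 2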

The theorem on the necessity of the Kuhn-Tucker conditions allows us to identify non-optimal points. In other words, Theorem 1 can be used to prove that a given feasible point satisfying the linear independence condition is not optimal if it does not satisfy the Kuhn-Tucker conditions. On the other hand, if the Kuhn-Tucker conditions are satisfied at the point, there is no guarantee that the optimal solution of the nonlinear problem has been found. As an example, consider the following nonlinear programming problem.

The following theorem establishes the conditions under which the Kuhn-Tucker point automatically corresponds to the optimal solution to a nonlinear programming problem.

Theorem 2. Sufficiency of the Kuhn-Tucker conditions

Consider the nonlinear programming problem (0)-(2). Let the objective function be convex, let all inequality constraints contain concave functions, and let the equality constraints contain linear functions. Then, if there exists a solution satisfying the Kuhn-Tucker conditions (3)-(7), the corresponding x* is an optimal solution of the nonlinear programming problem.

If the conditions of Theorem 2 are met, then finding the Kuhn-Tucker point provides an optimal solution to the nonlinear programming problem.

Theorem 2 can also be used to prove the optimality of a given solution of a nonlinear programming problem. As an illustration, consider the example again:

Minimize

under restrictions

Using Theorem 2, let us prove that the solution is optimal. Since the Hessian matrix of the objective function is positive semidefinite for all x, the objective function is convex. The first inequality constraint contains a linear function, which is both convex and concave at the same time. To show that the function in the second inequality constraint is concave, we compute its Hessian; because this matrix is negative definite, the function is concave. The equality constraint contains a linear function. Consequently, all conditions of Theorem 2 are satisfied; if we show that the point is a Kuhn-Tucker point, we will indeed have established the optimality of the solution. The Kuhn-Tucker conditions for Example 2 are given by relations (22)-(29).

The point satisfies constraints (24)-(26) and is therefore feasible. Equations (22) and (23) then determine the values of the multipliers. Thus, the solution x* = (1, 5) satisfies the Kuhn-Tucker conditions. Since the conditions of Theorem 2 are satisfied, x* = (1, 5) is the optimal solution of the problem from Example 3. Note that there also exist other values of the multipliers that satisfy system (22)-(29).
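The convexity and concavity checks used in this example can be automated by examining the eigenvalues of the corresponding Hessians. The sketch below is an illustration with a hypothetical quadratic function, since the functions of the example itself are not reproduced in the text:

import numpy as np

def definiteness(H, tol=1e-10):
    """Classify a symmetric matrix by the signs of its eigenvalues."""
    eig = np.linalg.eigvalsh(H)
    if np.all(eig > tol):
        return "positive definite"        # strictly convex function
    if np.all(eig >= -tol):
        return "positive semidefinite"    # convex function
    if np.all(eig < -tol):
        return "negative definite"        # strictly concave function
    if np.all(eig <= tol):
        return "negative semidefinite"    # concave function
    return "indefinite"                   # neither convex nor concave

# Hypothetical objective f(x) = 2*x1^2 + x2^2 + x1*x2: its Hessian is constant
H_f = np.array([[4.0, 1.0],
                [1.0, 2.0]])
print(definiteness(H_f))   # positive definite -> f is (strictly) convex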

Notes

  • 1. For problems encountered in practice, the linear independence condition is usually satisfied. If all functions in the problem are differentiable, a Kuhn-Tucker point should be regarded as a possible optimum point. Thus, many nonlinear programming methods converge to a Kuhn-Tucker point. (Here it is appropriate to draw an analogy with unconstrained optimization, where the corresponding algorithms allow one to find a stationary point.)
  • 2. If the conditions of Theorem 2 are met, a Kuhn-Tucker point is at the same time a global minimum point. Unfortunately, checking the sufficient conditions is very difficult and, in addition, applied problems often do not possess the required properties. It should be noted that the presence of even one nonlinear equality constraint violates the assumptions of Theorem 2.
  • 3. The sufficient conditions established by Theorem 2 can be generalized to problems with non-convex functions in the inequality constraints, non-convex objective functions and nonlinear equality constraints. Such generalizations rely on classes of generalized convex functions, such as quasi-convex and pseudo-convex functions.

Theorem 7.2. For the convex programming problem (7.4)-(7.6), whose set of feasible solutions has the regularity property, a plan X* is optimal if and only if there exists a vector Λ* ≥ 0 such that (X*, Λ*) is a saddle point of the Lagrange function.

If, in addition, the objective function and the constraint functions are continuous and differentiable, the Kuhn-Tucker theorem can be supplemented with analytical expressions that give a necessary and sufficient condition for a point (X*, Λ*) to be a saddle point of the Lagrange function, i.e. a solution of the convex programming problem. These expressions have the form of the local conditions (4.4), (4.5) given earlier, in which the values of the corresponding partial derivatives of the Lagrange function are calculated at the saddle point.

The Kuhn-Tucker theorem occupies a central place in the theory of nonlinear programming. It is also valid for so-called quadratic programming problems, i.e. problems in which the objective function is the sum of a linear and a quadratic form,

Z = Σ cj xj + Σ Σ djk xj xk,

and the constraints are linear:

Σ aij xj ≤ bi,  i = 1, ..., m;  xj ≥ 0,  j = 1, ..., n.

In the general case there is no efficient computational method for determining the global extremum of a nonlinear programming problem if it is not known whether every local extremum is also global. In convex and quadratic programming problems the set of feasible solutions is convex, so if the objective function is concave, any local maximum is also a global one.

Example 7.3. Using the Kuhn-Tucker theorem, let us find the maximum of the objective function under the given restrictions.

We first solve the problem graphically (Fig. 7.2) and find the optimum point (see the solution below for more details). Let us show that there exists a vector Y0 ≥ 0 for which the Kuhn-Tucker conditions are satisfied at the optimum point. We compose the Lagrange function.

The optimum is found from the system of equations: from the second equation we express one of the variables, substitute it into the first equation of the system and differentiate with respect to the remaining variable. We then substitute the coordinates of the optimum point into the partial derivatives of the Lagrange function with respect to the multipliers. For the inactive constraints these values are greater than zero, while by the condition the corresponding products must equal zero, so the corresponding multipliers are zero. The multiplier of the active constraint, on the other hand, can take a non-zero value, and it is found from the stationarity conditions. All the Kuhn-Tucker conditions are thus satisfied; therefore, this point is an extremum point.

Fig. 7.2. Graphical solution of the nonlinear programming problem of Example 7.3

A necessary optimality condition for a nonlinear programming problem can be formulated as follows. Let X* be an optimal solution of problem (7.4)-(7.6), and let the functions f and gi be differentiable at this point. If X* is a non-singular point of the admissible set of problem (7.4)-(7.6), then it is a Kuhn-Tucker point of this problem.

Thus, if the admissible set of problem (7.4)-(7.6) has no singular points, and the functions F(M), gi(M) (i = 1, ..., m) are differentiable at all of its points, then the optimal solution of this problem should be sought among the Kuhn-Tucker points.

As is known from mathematical analysis, a point of the set of feasible solutions of a system of equations and inequalities is called a singular point of this set if the gradients of those constraint functions that turn into equalities at it (the active constraints) are linearly dependent there. For example, a point at which the gradients of the active constraints are proportional to one another is a singular point of the corresponding set: indeed, such gradients form a linearly dependent system.

The gradient of a function at a point M0 is the vector whose coordinates are equal to the values of the partial derivatives of the function at the point M0. This vector shows the direction of the fastest growth of the function.

The necessary optimality condition is of little use for solving most nonlinear programming problems, because only in rare cases is it possible to find all the Kuhn-Tucker points of a nonlinear programming problem.

A sufficient optimality condition for a nonlinear programming problem: if (X*, Λ*) is a saddle point of the Lagrange function for problem (7.4)-(7.6), then X* is an optimal solution of this problem.

This condition is not necessary in the general case: the Lagrange function may have no saddle points while the original nonlinear programming problem nevertheless has an optimal solution.

Various methods are known that allow one to find an approximate optimal solution of a nonlinear programming problem. However, these methods give a sufficiently good approximation only for convex programming problems, where every local extremum is also global. In general, gradient methods are understood as methods in which the movement towards the optimum point of a function follows the direction of the gradient of this function, i.e. starting from some point, successive transitions are made to other points until an acceptable solution of the original problem is reached. Gradient methods can be divided into two groups.

The first group includes methods in which the points examined do not leave the region of feasible solutions of the problem. The most common of these is the Frank-Wolfe method; this group also includes Rosen's projected gradient method and Zoutendijk's method of feasible directions.

The second group includes methods in which the points examined may or may not belong to the region of feasible solutions; as a result of the iterative process, however, a point of the region of feasible solutions is found that gives an acceptable solution.

Of these methods, the most commonly used are the penalty function method and the Arrow-Hurwicz method.

When solving problems by gradient methods, including those mentioned above, the iterative process is continued until the gradient of the function at the next point becomes equal to zero, or until ‖grad f‖ < ε, where ε is a sufficiently small positive number characterizing the accuracy of the resulting solution.

Solving complex nonlinear programming problems by gradient methods involves a large amount of computation and is advisable only with the use of a computer.

Using an example, we will show the general meaning of gradient methods for solving nonlinear programming problems.

Let there be a function of two variables f(x1, x2). Let the initial values of the variables be x1(0), x2(0), and let the corresponding value of the function be f(x1(0), x2(0)). If this point is not an extremum, then new values of the variables x1(1) and x2(1) are determined so that the next value of the function is closer to the optimum than the previous one.

How are the increments Δx1 and Δx2 determined? At the current point the direction of change of the function towards the extremum is determined using the gradient. The larger the value of a partial derivative, the faster the function changes towards the extremum. Therefore the increments Δx1 and Δx2 are chosen proportional to the values of the partial derivatives ∂f/∂x1 and ∂f/∂x2 at the current point:

Δx1 = t · ∂f/∂x1,  Δx2 = t · ∂f/∂x2,

where t is a quantity called the step, which sets the scale of the changes in x1 and x2.
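A minimal Python sketch of this scheme is shown below; the test function and all numerical values are hypothetical and chosen only for illustration (the function is constructed so that its maximum lies at the point (3; 2), which is where Example 7.4 below states its maximum to be):

import numpy as np

def grad_ascent(grad, x0, t=0.1, tol=1e-6, max_iter=1000):
    """Move along the gradient with a fixed step t until the gradient is small."""
    x = np.array(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:          # stopping rule ||grad f|| < eps
            break
        x = x + t * g                        # (Δx1, Δx2) = t * (df/dx1, df/dx2)
    return x

# Hypothetical concave test function with maximum at (3, 2): f = -(x1-3)^2 - (x2-2)^2
grad_f = lambda x: np.array([-2*(x[0] - 3), -2*(x[1] - 2)])

print(grad_ascent(grad_f, x0=[0.0, 0.0], t=0.1))   # approximately [3. 2.]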

Example 7.4.

Suppose it is required to maximize a function of two variables whose maximum is attained at the point (3; 2).

Starting from the initial point and the chosen step, we compute the partial derivatives, obtain the increments of the variables and pass to a new point. Let us perform one more iteration: at the new point, with the same step, we again compute the partial derivatives and obtain the next approximation.
Fig. 7.3. Geometric interpretation of two steps of the gradient method for Example 7.4

Fig. 7.3 shows the movement along the gradient carried out in this example. The ratio of the partial derivatives indicates the direction of change of the function towards the maximum; if the gradient is taken with a minus sign, the movement is towards the minimum.

A specific difficulty of gradient methods is the choice of the step size t. If the steps are too small, the search for the optimum takes a very long time; if the step is chosen too large, the optimum can be overshot. The intermediate problem of choosing the optimal step size is solved by appropriate methods. If the step t is determined only approximately, then, as a rule, the iterations arrive not at the optimum itself but in its neighbourhood. Gradient methods find local optima; when searching for a global optimum by a gradient method there is therefore often doubt whether the optimum found is global. This is a drawback of the method when solving non-convex programming problems.

Self-test questions

    Statement of the nonlinear programming problem.

    The process of finding a solution to a nonlinear programming problem using its geometric interpretation.

    Algorithm of the Lagrange method for solving a nonlinear programming problem.

    Application of the Lagrange method to solving a nonlinear programming problem in the case where the connection conditions are inequalities.

    Auxiliary definitions and theorems necessary for considering the central theorem of nonlinear programming - the Kuhn-Tucker theorem.

    Kuhn-Tucker theorem.

    Necessary and sufficient optimality condition for a nonlinear programming problem.

    The general meaning of gradient methods for approximate solution of nonlinear programming problems.

    Groups of gradient methods for approximate solution of nonlinear programming problems.

The Kuhn-Tucker theorems are a generic name for statements that generalize Lagrange's theorem to optimization problems with constraints in the form of inequalities, i.e. problems of the following type:

f(x) → max,
gj(x) ≥ 0, j = 1, ..., m,   (*)
x = (x1, ..., xn) ∈ X.

Here f: X → R is (in accordance with established terminology) the objective function, gj: X → R, j = 1, ..., m, are the constraint functions, and X ⊆ Rⁿ is an open set.

Theorem 196 (John's theorem in terms of a saddle point):

Let the functions f(·), g1(·), ..., gm(·) be concave and let x̄ be a solution of problem (*) such that x̄ ∈ int X. Then there exist Lagrange multipliers λj ≥ 0, j = 0, ..., m, not all equal to zero, such that x̄ is a solution of the problem of maximizing in x the Lagrangian of the problem (defined below) with these multipliers.

We present these statements for the case when the functions f, gj are differentiable (Kuhn-Tucker theorems in differential form).

Recall that the function

L(x, λ) = λ0 f(x) + Σ λj gj(x)

is called the Lagrange function (Lagrangian) of this problem, and the coefficients λj are the Lagrange multipliers.

The following statement holds.

Theorem 197 (John's theorem for differentiable functions):

Let x̄ be a solution of problem (*) such that x̄ ∈ int X and the functions f(·), g1(·), ..., gm(·) are differentiable at the point x̄. Then there exist Lagrange multipliers λj ≥ 0, j = 0, ..., m, not all equal to zero, such that the following relations (the Kuhn-Tucker conditions) are satisfied:

∂L(x̄, λ)/∂xi = 0, i = 1, ..., n;

λj gj(x̄) = 0, j = 1, ..., m (conditions of complementary slackness).

Note that the conditions of complementary slackness can be written in the form

gj(x̄) λj = 0, j = 1, ..., m.

From these conditions it follows that if a Lagrange multiplier is positive (λj > 0), then the corresponding constraint is satisfied as an equality at the solution of the problem (at x = x̄), i.e. gj(x̄) = 0; in other words, this constraint is active. On the other hand, if gj(x̄) > 0, then the corresponding Lagrange multiplier λj is equal to zero.

If in problem (*) some of the constraints have the form of non-negativity restrictions on some of the variables xi, then one need not introduce Lagrange multipliers for them, writing these constraints out separately:

gj(x) ≥ 0, j = 1, ..., m,   (**)
xi ≥ 0, i ∈ P ⊆ {1, ..., n}.

At an interior point (in the sense that x̄ ∈ int X) the first-order conditions for i ∈ P then take the form

∂L(x̄, λ)/∂xi ≤ 0.

For i ∉ P, as in the case when the problem is written in the form (*), the derivative of the Lagrange function with respect to that variable satisfies ∂L(x̄, λ)/∂xi = 0. In addition, the conditions of complementary slackness

x̄i · ∂L(x̄, λ)/∂xi = 0, i ∈ P,

are satisfied. From the second of these conditions it follows that for x̄i > 0 (i ∈ P) we have ∂L(x̄, λ)/∂xi = 0; on the other hand, if ∂L(x̄, λ)/∂xi < 0, then x̄i = 0.

Another modification of the theorem is associated with the presence of equality constraints in the problem. Denote the set of the corresponding indices by E. The problem then takes the following form:

gj(x) ≥ 0, j ∈ {1, ..., m} \ E,
gj(x) = 0, j ∈ E,   (***)
xi ≥ 0, i ∈ P ⊆ {1, ..., n}.

At the same time, John's theorem drops the requirement that all Lagrange multipliers be non-negative: the Lagrange multipliers λj for j ∈ E may have an arbitrary sign.

John's theorem does not guarantee that the Lagrange multiplier of the objective function, λ0, is non-zero. However, if λ0 = 0, then the Kuhn-Tucker conditions characterize not the solution of the problem under consideration but the structure of the set of constraints at the point x̄, and the theorem has no direct relation to the maximization of the function f(·) that interests us, since the gradient of f(·) itself drops out of the Kuhn-Tucker conditions.

It is therefore important to characterize conditions that guarantee λ0 > 0. Such conditions are called regularity conditions.

In the case when the problem under consideration is convex, one of the regularity conditions is the so-called Slater condition: there exists a point x̃ ∈ X at which all the inequality constraints are satisfied strictly, gj(x̃) > 0.

In the case when the objective function and the constraints of the problem are differentiable, the simplest regularity condition is formulated in terms of the gradients of the constraint functions: the gradients of the constraints active at the point x̄ must be linearly independent. (The non-negativity restrictions, if present, should also be included among the constraints considered here.)

Let us denote by A the set of indices of those constraints that are active at the optimum point x̄ (including the indices of all equality constraints), i.e.

gj(x̄) = 0, j ∈ A.

Then, if the gradients ∇gj(x̄), j ∈ A, of these constraints are linearly independent, we have λ0 > 0. This condition is called the Kuhn-Tucker regularity condition.

Note that if λ0 > 0, then without loss of generality we may take λ0 = 1, which is what is usually done.

The corresponding theorem is called the (direct) Kuhn-Tucker theorem.

Theorem 198 (direct Kuhn-Tucker theorem, necessary condition of optimality):

Let the functions f(·), g1(·), ..., gm(·) be differentiable and let x̄ be a solution of problem (*) such that x̄ ∈ int X and the Kuhn-Tucker regularity condition is satisfied. Then there exist Lagrange multipliers λj ≥ 0, j = 1, ..., m, such that with λ0 = 1 the following relations are satisfied:

∂L(x̄, λ)/∂xi = 0, i = 1, ..., n;

λj gj(x̄) = 0, j = 1, ..., m.

This theorem is easily reformulated for problems (**) and (***); the same modifications of the Kuhn-Tucker conditions as in John's theorem are required.

The stationarity condition

∂L(x̄, λ)/∂xi = 0, i = 1, ..., n,

can be rewritten as

∇f(x̄) = Σ λj (-∇gj(x̄)).

This relation shows that at the optimum point the gradient of the objective function is a linear combination of the antigradients of the constraints, and all the coefficients of this linear combination are non-negative. Fig. 17.1 illustrates this property. Intuitively, the idea behind it is that if some coefficient of the linear combination were negative, it would be possible to increase the value of the objective function by moving along the corresponding constraint.

One of the converse versions of the Kuhn-Tucker theorem states that if the functions f(·), gk(·) are concave, then the fulfilment of these conditions at a feasible solution x̄ (i.e. a point satisfying the constraints) for some Lagrange multipliers meeting the requirements of the direct theorem guarantees that x̄ is a solution of the problem.

Theorem 199 (converse Kuhn-Tucker theorem / sufficient condition of optimality):

Let f(·) be a differentiable concave function, g1(·), ..., gm(·) differentiable quasi-concave functions, let the set X be convex and let the point x̄ be feasible in problem (*), with x̄ ∈ int X. Let, in addition, there exist Lagrange multipliers λj ≥ 0, j = 1, ..., m, such that with λ0 = 1 the following relations are satisfied:

∂L(x̄, λ)/∂xi = 0, i = 1, ..., n;

λj gj(x̄) = 0, j = 1, ..., m.

Then x̄ is a solution of problem (*).

The theorem can be reformulated in an obvious way for problems (**) and (***). For problem (***) the equality constraints can only be linear (this is because an equality constraint gj(x) = 0 has to be represented by two inequality constraints, gj(x) ≥ 0 and -gj(x) ≥ 0, each of which must be given by a quasi-concave function, and this is possible only if the constraint is linear).

In another version of the sufficient optimality condition the assumption that the objective function is concave is replaced by the assumption of quasi-concavity with the additional condition ∇f(x̄) ≠ 0.

The Kuhn-Tucker theorem generalizes methods for solving optimization problems in two directions:

from linear programming to the nonlinear case, which by analogy received the not entirely apt name of "nonlinear programming";

from nonlinear constraints in the form of equalities to constraints in the form of inequalities.

The Karush-Kuhn-Tucker method and conditions (Karush-Kuhn-Tucker conditions, KKT) are formally necessary conditions for optimality in a nonlinear programming problem. The necessary conditions also include the so-called regularity conditions for stationary points: the non-degeneracy of the set of constraint gradients. The method is a generalization of the Lagrange multiplier method to the case of inequality constraints.

Kuhn and Tucker generalized the Lagrange multiplier method (used to construct optimality criteria for problems with equality constraints) to the general nonlinear programming problem with constraints in the form of both equalities and inequalities.

The main tool in nonlinear programming remains the Lagrange function, obtained by transferring the constraints into the objective function. The Kuhn-Tucker conditions are also derived from this principle.

William Karush found the necessary conditions for the general case of equality and inequality constraints in his master's thesis of 1939. Harold Kuhn and Albert Tucker independently arrived at the same conclusions later, in 1951, and their results were more general.

If x*, under the imposed constraints, is a solution of the problem, then there exists a vector of Lagrange multipliers λ such that for the Lagrange function L(x, λ) the following conditions are satisfied:

- stationarity: ∇x L(x*, λ) = 0;

- complementary slackness: λi gi(x*) = 0, i = 1, ..., m;

- non-negativity: λi ≥ 0, i = 1, ..., m.

The listed necessary conditions for a minimum of the function are not sufficient in the general case. There are several variants of additional conditions that make them sufficient:

- the simple condition: 1) the point is stationary, 2) the complementary slackness condition is satisfied, 3) the Lagrange multipliers ...;

- Slater's condition (weaker): 1) the point is stationary, 2) the complementary slackness condition is satisfied, 3) ... .
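For a concrete candidate point the multipliers in the stationarity condition can be estimated numerically, for instance by non-negative least squares over the constraint gradients. The sketch below is an illustration on a hypothetical problem (minimize (x1-2)² + (x2-1)² subject to x1 + x2 ≤ 2, x1 ≥ 0, x2 ≥ 0), not an example taken from the text:

import numpy as np
from scipy.optimize import nnls

# Candidate optimum (found by hand): x* = (1.5, 0.5); only x1 + x2 <= 2 is active there.
x = np.array([1.5, 0.5])

grad_f = np.array([2*(x[0] - 2), 2*(x[1] - 1)])   # gradient of the objective at x*
grad_g = np.array([[ 1.0,  1.0],                  # gradient of x1 + x2 - 2 <= 0
                   [-1.0,  0.0],                  # gradient of -x1 <= 0
                   [ 0.0, -1.0]])                 # gradient of -x2 <= 0
g = lambda x: np.array([x[0] + x[1] - 2, -x[0], -x[1]])

# Stationarity: grad_f + sum_i lambda_i * grad_g_i = 0 with lambda_i >= 0,
# so we solve grad_g.T @ lam = -grad_f in the non-negative least-squares sense.
lam, residual = nnls(grad_g.T, -grad_f)
print("lambda   =", lam)           # approx. [1. 0. 0.] -> only the active constraint gets weight
print("residual =", residual)      # approx. 0          -> stationarity holds
print("slackness=", lam * g(x))    # approx. [0. 0. 0.] -> complementary slackness holds

A near-zero residual together with non-negative multipliers and vanishing slackness products indicates that the candidate point satisfies the listed conditions.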

Let us proceed directly to the proof of the Kuhn-Tucker theorem.

If a continuous function of n variables x = (x1, ..., xn), F(x), has a maximum at the point x_opt, then there exists ε > 0 such that for all x from the ε-neighbourhood of the point x_opt

F(x) ≤ F(x_opt), i.e. F(x) - F(x_opt) ≤ 0.

Let us choose increments of xj of two kinds along the j-th coordinate:

Δxj = xj - xj,opt > 0 and Δxj = xj - xj,opt < 0.

Dividing F(x) - F(x_opt) ≤ 0 by Δxj, we obtain a quotient that is non-positive for Δxj > 0 and non-negative for Δxj < 0. Passing in these relations to the limit as Δxj → 0, we get ∂F/∂xj ≤ 0 and ∂F/∂xj ≥ 0 at the point x_opt. From these relations it follows that

∂F/∂xj = 0, j = 1, ..., n.   (3.6.1)

A similar relation can be obtained for the case of a minimum of the function. Thus, conditions (3.6.1) are necessary for the function F(x) to attain a maximum or a minimum at the point x_opt: if there is an extremum, then conditions (3.6.1) are satisfied. But the vanishing of all the derivatives at the point x_opt does not yet ensure the existence of an extremum there, i.e. conditions (3.6.1) are not sufficient. Geometrically this means that with a zero derivative a function of one variable may have an inflection point rather than a maximum (or minimum), a function of two variables may have a saddle point rather than an extremum, and so on. Therefore the points x_opt at which relations (3.6.1) are satisfied are called stationary points.

Note that condition (3.6.1) was obtained thanks to the possibility of giving the variable xj increments of both signs, which is where the two inequalities for Δxj > 0 and Δxj < 0 come from. If the admissible range of xj is limited to non-negative values, xj ≥ 0, then inside the region, where xj > 0, condition (3.6.1) remains valid, since increments of both signs are allowed there. On the boundary of the region xj ≥ 0, where xj = 0, only a positive increment Δxj > 0 is allowed; one can speak only of a one-sided derivative, and instead of (3.6.1) the following necessary condition for a maximum is obtained:

∂F/∂xj ≤ 0 at xj = 0.

Accordingly, the necessary condition for a minimum on the boundary of the region, xj = 0, is written as

∂F/∂xj ≥ 0 at xj = 0.
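These two cases (interior point and boundary point) can be combined into a single complementary form, which makes the connection with the Kuhn-Tucker conditions stated earlier explicit; for the maximum problem with xj ≥ 0 the summary reads:

∂F/∂xj ≤ 0,  xj · ∂F/∂xj = 0,  xj ≥ 0,  j = 1, ..., n,

i.e. the derivative vanishes at every coordinate with xj > 0 and may only be non-positive where xj = 0 (with the inequalities reversed for a minimum).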

b) Conditional extremum problem

When determining a conditional extremum of a function, i.e. when it is required to find the maximum (or minimum) of a function F(x) under the constraints

gi(x) = bi, i = 1, ..., m,

that is, F(x) → max (min) subject to gi(x) = bi, the method of Lagrange multipliers is used which, as in the classical calculus of variations, consists in introducing the Lagrange function

Φ(x, λ) = F(x) - Σ λi [gi(x) - bi],   (3.6.2)

where the λi are the undetermined Lagrange multipliers.



Considering the function as a special case of a functional, we obtain that the necessary conditions for an extremum are found by direct differentiation of relation (3.6.2) and are written in the form

∂Φ/∂xj = ∂F/∂xj - Σ λi ∂gi/∂xj = 0, j = 1, ..., n;

∂Φ/∂λi = bi - gi(x) = 0, i = 1, ..., m.

If we introduce the corresponding vectors, these relations can be rewritten as

grad Φ = grad F - λ grad φ = 0;   (3.6.6)

φ(x) = b,

where φ denotes the vector of constraint functions (gi) and the equality of vectors to zero is understood componentwise.



Figure 3.6.1 - Explanation of the conditional extremum problem

When n = 2 and m = 1, the geometric problem of finding a conditional extremum reduces (Fig. 3.6.1) to finding the point A at which the curve φ = b touches one of the level curves F = const.