Optimal resource allocation by the dynamic programming method. The principle of optimality. The Bellman equation. The matrix-chain multiplication problem: given matrices ..., it is required to minimize the number of scalar operations needed to multiply them

ABSTRACT


Introduction


Dynamic programming is an optimization method adapted to operations in which the decision-making process can be divided into stages (steps). Such operations are called multi-step.

The development of dynamic programming began in the 1950s and is associated with the name of Richard Ernest Bellman.

While linear programming models can be used in economics to make large-scale planning decisions in complex situations, dynamic programming models are applied to problems of a much smaller scale:

- when developing inventory management rules;

- when distributing investment resources between alternative projects;

- when drawing up schedules for the routine and major repair of complex equipment and for its replacement, etc.


1. General formulation of the dynamic programming problem


Consider a controlled process, for example, the process of distributing funds between enterprises, using resources over a number of years, replacing equipment, etc. As a result of the control, the system (control object) S is transferred from the initial state s_0 to the state s_n. Let the control be divided into n steps, i.e., a decision is made sequentially at each step, and the control transferring the system S from the initial state to the final one is a set of n step-by-step management decisions.

Let us denote by X_k the management decision at the kth step (k = 1, 2, …, n). The variables X_k satisfy certain restrictions and in this sense are called admissible (X_k may be a number, a point in n-dimensional space, or a qualitative feature).

Let X = (X_1, X_2, …, X_n) be a control transferring the system S from the state s_0 to the state s_n. Denote by s_k the state of the system (characterized by a certain set of parameters and their specific values) after the kth control step. The state s_k at the end of the kth step depends only on the previous state s_{k-1} and on the management decision X_k at the kth step (i.e., it does not depend directly on earlier states and decisions). This requirement is called the "no aftereffect" condition and can be expressed by the following equations of state:

s_k = φ_k(s_{k-1}, X_k), k = 1, 2, …, n. (1)

Thus, we obtain a sequence of states s_0, s_1, …, s_{k-1}, s_k, …, s_{n-1}, s_n. The n-step management process can then be depicted schematically as follows:

s_0 --X_1--> s_1 --X_2--> … --X_k--> s_k --…--> --X_n--> s_n

Let the efficiency indicator of the kth step be expressed by some function:

f_k = f_k(s_{k-1}, X_k), k = 1, 2, …, n, (2)

and let the efficiency of the entire multi-step process under consideration be the following additive function:

Z = f_1(s_0, X_1) + f_2(s_1, X_2) + … + f_n(s_{n-1}, X_n). (3)


Then the step-by-step optimization problem (dynamic programming problem) is formulated as follows: determine such an admissible control X that transfers the system S from the state s0 to the state sn, at which the objective function Z takes the largest (smallest) value.
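The statement above can be collected into one compact formula. This is a restatement in standard notation, not part of the original text: the state equations act as constraints, and the additive objective is optimized over the admissible step controls.

```latex
\max_{X_1,\dots,X_n} \; Z = \sum_{k=1}^{n} f_k(s_{k-1}, X_k)
\quad \text{subject to} \quad
s_k = \varphi_k(s_{k-1}, X_k), \qquad k = 1,\dots,n,
```

with $s_0$ given and each $X_k$ ranging over its admissible set; for cost-type objectives the maximum is replaced by a minimum.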

A dynamic programming problem has the following features:

1. The optimization problem is interpreted as an n-step control process.

2. The objective function is equal to the sum of the objective functions of each step.

3. The choice of control at the kth step depends only on the state of the system at that step and does not affect the previous steps (no feedback).

4. The state s_k after the kth control step depends only on the previous state s_{k-1} and the control X_k ("no aftereffect").

5. At each step, the control X_k depends on a finite number of control variables, and the state s_k depends on a finite number of parameters.


2. Optimality principle and Bellman equations


The optimality principle was first formulated by Richard Ernest Bellman in 1953 (given here as interpreted by E.S. Wentzel):

Whatever the state of the system as a result of any number of steps, the control at the next step must be chosen so that, together with the optimal control at all subsequent steps, it leads to the optimal gain over all remaining steps, including this one.

R.E. Bellman also formulated the conditions under which the principle holds. The main requirement is that the control process must be without feedback, i.e., the control at a given step must not affect the preceding steps.

Consider the general dynamic programming problem stated above. At every step except the last, for any state of the system s_{k-1}, the management decision X_k must be chosen "with caution," since this choice affects the subsequent state of the system s_k.

At the last step, given the state of the system s_{n-1}, the management decision X_n can be planned locally optimally, i.e., based only on the considerations of this step.

Consider the last, nth step:

s_{n-1} — the state of the system at the beginning of the nth step;

s_n — the final state of the system;

X_n — the control at the nth step;

f_n(s_{n-1}, X_n) — the objective function (gain) of the nth step.

According to the optimality principle, X_n must be chosen so that, for any state of the system s_{n-1}, the optimum of the objective function at this step is attained.

Let us denote by Z*_n(s_{n-1}) the optimum (for definiteness we will take the maximum) of the objective function — the efficiency indicator of the nth step — provided that at the beginning of the last step the system S was in an arbitrary state s_{n-1} and the control at the last step was optimal.

Z*_n(s_{n-1}) is called the conditional maximum of the objective function at the nth step and is determined by the following formula:

Z*_n(s_{n-1}) = max_{X_n} f_n(s_{n-1}, X_n). (5)


Maximization is carried out over all admissible controls Xn.

The control X_n at which this maximum is achieved also depends on s_{n-1} and is called the conditional optimal solution at the nth step. Let us denote it by X*_n(s_{n-1}).

Having solved the one-dimensional local optimization problem given by equation (5), we obtain the two functions Z*_n(s_{n-1}) and X*_n(s_{n-1}) for all possible states s_{n-1}.

Let us now consider the two-step problem: we add the (n−1)th step to the nth.

For any state s_{n-2}, an arbitrary management decision X_{n-1}, and optimal control at the nth step, the value of the objective function over the two last steps is calculated by the formula:

f_{n-1}(s_{n-2}, X_{n-1}) + Z*_n(s_{n-1}). (6)

According to Bellman's optimality principle, for any s_{n-2} the solution must be chosen so that, together with the optimal control at the last (nth) step, it leads to the optimum of the objective function over the last two steps. Therefore, it is necessary to find the optimum of expression (6) over all admissible management decisions X_{n-1}:

Z*_{n-1}(s_{n-2}) = max_{X_{n-1}} { f_{n-1}(s_{n-2}, X_{n-1}) + Z*_n(s_{n-1}) }. (7)


Z*_{n-1}(s_{n-2}) is called the conditional maximum of the objective function under optimal control at the last two steps. Note that the expression in curly brackets in formula (7) depends only on s_{n-2} and X_{n-1}, since s_{n-1} can be found from the equations of state (1) with k = n − 1:

s_{n-1} = φ_{n-1}(s_{n-2}, X_{n-1}).


The corresponding control X_{n-1} at the (n−1)th step is denoted by X*_{n-1}(s_{n-2}) and is called the conditional optimal control at the (n−1)th step.

The conditional optima of the objective function under optimal control over the (n−k+1) remaining steps, from the kth to the end, provided that at the beginning of the kth step the system was in the state s_{k-1}, are determined similarly:

Z*_k(s_{k-1}) = max_{X_k} { f_k(s_{k-1}, X_k) + Z*_{k+1}(s_k) }, k = n−1, n−2, …, 1. (8)


The control X_k at the kth step at which the maximum in (8) is achieved is denoted by X*_k(s_{k-1}) and is called the conditional optimal control at the kth step.

Equations (5) and (8) are called the recurrent Bellman equations (backward scheme). The process of solving them is called conditional optimization.

As a result of conditional optimization we obtain two sequences:

Z*_n(s_{n-1}), Z*_{n-1}(s_{n-2}), …, Z*_1(s_0) — the conditional maxima of the objective function at the last step, the last two steps, …, at all n steps;

X*_n(s_{n-1}), X*_{n-1}(s_{n-2}), …, X*_1(s_0) — the conditional optimal controls at the nth, (n−1)th, …, 1st steps.

Using these sequences, we can find the solution of the dynamic programming problem for given n and s_0. The maximum of the objective function is Z_max = Z*_1(s_0); the optimal controls and states are then read off in the forward direction: X*_1 = X*_1(s_0), s*_1 = φ_1(s_0, X*_1), X*_2 = X*_2(s*_1), and so on up to X*_n = X*_n(s*_{n-1}).

As a result, we obtain the optimal solution of the dynamic programming problem: X* = (X*_1, X*_2, …, X*_n).
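The backward scheme just described — conditional optimization by equations (5) and (8), followed by the forward readout of the optimal controls — can be sketched in code. This is a minimal illustrative sketch, not from the original text: the state and control sets are assumed finite, and the names `solve_dp`, `f`, `phi` are ours.

```python
def solve_dp(n, states, controls, f, phi, s0):
    """Backward conditional optimization (Bellman recurrences),
    then forward readout of the unconditional optimal controls.

    n        -- number of steps
    states   -- finite collection of possible states s_k
    controls -- controls(k, s): admissible controls X_k in state s at step k
    f        -- f(k, s, x): step gain f_k(s_{k-1}, X_k)
    phi      -- phi(k, s, x): equation of state s_k = phi_k(s_{k-1}, X_k)
    s0       -- initial state
    """
    Z = {s: 0 for s in states}            # Z*_{n+1} is identically zero
    X = []                                # X[k-1][s] = conditional optimal control
    for k in range(n, 0, -1):             # steps n, n-1, ..., 1
        Zk, Xk = {}, {}
        for s in states:
            value, x = max(((f(k, s, x) + Z[phi(k, s, x)], x)
                            for x in controls(k, s)), key=lambda t: t[0])
            Zk[s], Xk[s] = value, x
        Z = Zk
        X.insert(0, Xk)
    # forward readout: X*_1(s0), then follow the equations of state
    plan, s = [], s0
    for k in range(1, n + 1):
        x = X[k - 1][s]
        plan.append(x)
        s = phi(k, s, x)
    return Z[s0], plan
```

For the resource allocation problem of the next section one would take states {0, h, …, s_0}, controls(k, s) = {0, h, …, s} and phi(k, s, x) = s − x.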

Using similar reasoning, one can construct a direct scheme of conditional optimization: the Bellman recurrence equations are then written from the first step to the last, each expressing the conditionally optimal gain over the first k steps through the gain over the first k − 1 steps.

The optimal solution of the problem is in this case read off in the reverse order: first the control at the last step, then at the (n−1)th, and so on back to the first.
Thus, constructing a dynamic programming model and solving a problem on its basis can in general be represented as the following stages:

1. Choose a method of dividing the management process into steps.

2. Define the state parameters s_k and the control variables X_k at each step; write the equations of state.

3. Introduce the objective functions of the kth step and the total objective function, as well as the conditional optima Z*_k(s_{k-1}) and the conditional optimal controls X*_k(s_{k-1}) at the kth step.

4. Write the Bellman recurrence equations according to the backward or direct scheme and, after performing conditional optimization, obtain the two sequences Z*_k(s_{k-1}) and X*_k(s_{k-1}).

5. Determine the optimal value of the objective function and the optimal solution.


3. Resource Allocation Problem


There is a certain amount of resources s_0 that must be distributed among n business entities for current activities during the period under review (a month, quarter, half-year, year, etc.) in order to obtain the maximum total profit. The size of the investment x_i (i = 1, 2, …, n) in the activities of each economic entity is a multiple of a certain value h. Each economic entity, depending on the volume of funds x_i used during the period, brings a profit f_i(x_i), which does not depend on the investment of resources in the other economic entities.

Let us represent the process of resource distribution between the business entities as an n-step management process (the step number coincides with the conditional number of the business entity). Let s_k (k = 1, 2, …, n) be the state parameter — the amount of funds remaining after the kth step for distribution among the remaining (n − k) business entities. Then the equations of state can be written in the form:

s_k = s_{k-1} − x_k, k = 1, 2, …, n.


Let us introduce the function Z*_k(s_{k-1}) — the conditionally optimal total profit received from the kth, (k+1)th, …, nth economic entities if resources in the amount s_{k-1} (0 ≤ s_{k-1} ≤ s_0) are distributed among them optimally. The set of possible management decisions at the kth step can then be represented as x_k ∈ {0, h, 2h, …, s_{k-1}}.

Then the recurrence equations of R.E. Bellman (backward scheme) take the form:

Z*_n(s_{n-1}) = f_n(s_{n-1}) (at the last step all remaining funds are invested);

Z*_k(s_{k-1}) = max_{x_k ∈ {0, h, …, s_{k-1}}} { f_k(x_k) + Z*_{k+1}(s_{k-1} − x_k) }, k = n−1, …, 1. (13)


Example. There is an amount of resources s_0 = 100 that must be distributed among n = 4 business entities for current activities during the period under review (a month) so as to obtain the maximum total profit. The size of the investment x_i in the activities of each economic entity is a multiple of h = 20 and is specified by the vector Q. Each economic entity, depending on the volume of funds x_i used during the period, brings a profit f_i(x_i), which does not depend on the investment of resources in the other economic entities.

It is necessary to determine how much resources should be allocated to each enterprise in order for the total profit to be greatest.

Solution. Let us write Bellman's recurrence equations (backward scheme) for n = 4:

Z*_4(s_3) = f_4(s_3);

Z*_k(s_{k-1}) = max_{x_k ∈ {0, 20, …, s_{k-1}}} { f_k(x_k) + Z*_{k+1}(s_{k-1} − x_k) }, k = 3, 2, 1.


Let us determine the conditional maximums in accordance with (13); the calculation results are presented in Table 1.


Table 1. Calculation of conditional optima

s_{k-1} | x_k | s_k | k = 3        | k = 2        | k = 1
   0    |   0 |   0 | 0+0 = 0*     | 0+0 = 0*     | 0+0 = 0*
  20    |   0 |  20 | 0+20 = 20    | 0+22 = 22*   | 0+22 = 22*
  20    |  20 |   0 | 22+0 = 22*   | 17+0 = 17    | 14+0 = 14
  40    |   0 |  40 | 0+33 = 33    | 0+42 = 42*   | 0+42 = 42*
  40    |  20 |  20 | 22+20 = 42*  | 17+22 = 39   | 14+22 = 36
  40    |  40 |   0 | 21+0 = 21    | 20+0 = 20    | 26+0 = 26
  60    |   0 |  60 | 0+46 = 46    | 0+55 = 55    | 0+59 = 59*
  60    |  20 |  40 | 22+33 = 55*  | 17+42 = 59*  | 14+42 = 56
  60    |  40 |  20 | 21+20 = 41   | 20+22 = 42   | 26+22 = 48
  60    |  60 |   0 | 37+0 = 37    | 32+0 = 32    | 35+0 = 35
  80    |   0 |  80 | 0+30 = 30    | 0+68 = 68    | 0+72 = 72
  80    |  20 |  60 | 22+46 = 68*  | 17+55 = 72*  | 14+59 = 73*
  80    |  40 |  40 | 21+33 = 54   | 20+42 = 62   | 26+42 = 68
  80    |  60 |  20 | 37+20 = 57   | 32+22 = 54   | 35+22 = 57
  80    |  80 |   0 | 67+0 = 67    | 61+0 = 61    | 52+0 = 52
 100    |   0 | 100 | 0+42 = 42    | 0+87 = 87*   | 0+87 = 87*
 100    |  20 |  80 | 22+30 = 52   | 17+68 = 85   | 14+72 = 86
 100    |  40 |  60 | 21+46 = 67   | 20+55 = 75   | 26+59 = 85
 100    |  60 |  40 | 37+33 = 70   | 32+42 = 74   | 35+42 = 77
 100    |  80 |  20 | 67+20 = 87*  | 61+22 = 83   | 52+22 = 74
 100    | 100 |   0 | 58+0 = 58    | 72+0 = 72    | 61+0 = 61

Each cell contains f_k(x_k) + Z*_{k+1}(s_k), with Z*_4(s_3) = f_4(s_3); an asterisk marks the conditional maximum Z*_k(s_{k-1}), attained at the conditional optimal control x*_k(s_{k-1}).

Based on the results of conditional optimization, we determine the optimal allocation of resources: Z_max = Z*_1(100) = 87 at x*_1 = 0; then s_1 = 100 and x*_2 = 0; s_2 = 100 and x*_3 = 80; s_3 = 20 and x*_4 = 20.

Thus, the optimal resource allocation is:

x*_1 = 0, x*_2 = 0, x*_3 = 80, x*_4 = 20,

which provides the greatest profit in the amount of 87 conventional monetary units.

Answer: the optimal allocation of resources is x*_1 = 0, x*_2 = 0, x*_3 = 80, x*_4 = 20, which provides the greatest profit of 87 conventional monetary units.
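The calculation in Table 1 can be checked mechanically. The sketch below is not part of the original text: the profit values f_i(x_i) are those recoverable from the computations in Table 1, and, unlike the table (which invests everything remaining at the last step), the last step here is allowed any admissible investment; the optimum of 87 is unchanged.

```python
H = 20  # investments are multiples of h = 20

# Profit f_i(x) for x = 0, 20, 40, 60, 80, 100
# (values recovered from the arithmetic in Table 1).
profit = {
    1: [0, 14, 26, 35, 52, 61],
    2: [0, 17, 20, 32, 61, 72],
    3: [0, 22, 21, 37, 67, 58],
    4: [0, 20, 33, 46, 30, 42],
}

def allocate(s0, profit, h=H):
    """Backward conditional optimization for the discrete
    resource allocation problem, then forward readout."""
    n = len(profit)
    states = range(0, s0 + h, h)
    Z = {s: 0 for s in states}          # gain after the last step is zero
    X = {}                              # X[k, s] = conditional optimal control
    for k in range(n, 0, -1):
        Zk = {}
        for s in states:
            best, x_best = max((profit[k][x // h] + Z[s - x], x)
                               for x in range(0, s + h, h))
            Zk[s] = best
            X[k, s] = x_best
        Z = Zk
    plan, s = [], s0
    for k in range(1, n + 1):
        x = X[k, s]
        plan.append(x)
        s -= x
    return Z[s0], plan

best, plan = allocate(100, profit)
print(best, plan)   # -> 87 [0, 0, 80, 20]
```

The readout reproduces the answer above: invest nothing in the first two entities, 80 in the third, and 20 in the fourth, for a total profit of 87.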


Conclusion


Dynamic programming is an area of mathematical programming comprising a set of techniques and tools for finding the optimal solution, optimizing each step in the system, and developing a management strategy; that is, the management process can be represented as a multi-step process. Through step-by-step planning, dynamic programming makes it possible not only to simplify the solution of a problem but also to solve problems to which the methods of mathematical analysis cannot be applied. The simplification is achieved by significantly reducing the number of options examined: instead of solving a complex multivariate problem once, the step-by-step planning method solves relatively simple problems many times. In planning a step-by-step process, one proceeds from the interests of the entire process as a whole, i.e., when making a decision at a particular stage, the final goal must always be kept in mind.

However, dynamic programming also has its disadvantages. Unlike linear programming, where the simplex method is universal, dynamic programming has no universal method: each problem has its own difficulties, and in each case the most suitable solution method must be found. Another disadvantage is the complexity of solving multidimensional problems.

A dynamic programming problem must satisfy two conditions: the first is usually called the condition of absence of aftereffect, and the second the condition of additivity of the objective function. In practice there are planning problems in which random factors play a significant role, influencing both the state of the system and the gain; one therefore distinguishes deterministic and stochastic dynamic programming problems. In a deterministic problem, the optimal control is unique and is specified in advance as a rigid program of actions. In a stochastic problem, the optimal control is random and is selected during the process itself, depending on the random situation. In a deterministic scheme, passing through the process in stages from end to beginning also yields a whole series of conditional optimal controls at each stage, but of all these controls only one is ultimately implemented. This is not so in a stochastic scheme: each of the conditional optimal controls may actually be implemented if the preceding course of the random process brings the system to the corresponding state.

The principle of optimality is the basis of the step-by-step solution of dynamic programming problems. Typical representatives of the economic problems of dynamic programming are the so-called production and storage problems, investment distribution problems, production scheduling problems, and others. Dynamic programming is used in planning the activities of an enterprise with allowance for changes in the demand for products over time, and in the optimal distribution of resources between enterprises by direction or over time. Any description of the characteristics of dynamic programming and of the types of problems that can be formulated within its framework must necessarily be very general and somewhat vague, since the variety of problems fitting the dynamic programming scheme is immense. Only the study of a large number of examples gives a clear understanding of its structure.


Bibliography

  1. Smirnov Yu.N., Shibanova E.V. (comps.) Economic and Mathematical Models and Methods. Linear Programming: A textbook for students of economic specialties. Naberezhnye Chelny: KamPI Publishing House, 2004. 81 p.
  2. Kremer N.Sh., Putko B.A., Trishin I.M., Friedman M.N. Operations Research in Economics: A textbook for universities / Ed. N.Sh. Kremer. Moscow: UNITY, 2000. 407 p.
  3. Kuznetsov A.V., Sakovich V.A., Kholod N.I. Higher Mathematics: Mathematical Programming: A textbook / Ed. A.V. Kuznetsov. Minsk: Vysheishaya Shkola, 1994. 286 p.

The dynamic programming method allows many economic problems to be solved successfully (see, for example,). Let us consider one of the simplest of them. We have at our disposal a certain reserve of funds (resources) K, which must be distributed among n enterprises. Each enterprise, when funds x_i are invested in it, generates an income that depends on x_i, i.e., an income representing some function f_i(x_i). All the functions f_i(x_i) are given (and are, of course, non-decreasing).

The question is, how should funds K be distributed among enterprises so that in total they generate maximum income?

This problem is easily solved by the dynamic programming method. Although its formulation contains no mention of time, the operation of distributing the funds can still be mentally unfolded in some sequence, taking the first step to be the investment of funds in the first enterprise, the second step the investment in the second, and so on.

The managed system S in this case is the funds, or resources, being distributed. The state of the system before each step is characterized by one number S — the available stock of funds not yet invested. In this problem, the "step controls" are the funds allocated to the enterprises. It is required to find the optimal control, i.e., the set of numbers x_1, x_2, …, x_n for which the total income is maximal:

W = f_1(x_1) + f_2(x_2) + … + f_n(x_n) → max.

Let us solve this problem first in general, formulaic form, and then for specific numerical data. For each step, let us find the conditional optimal gain (from this step to the end) if we approach this step with a stock of funds S. Denote the conditional optimal gain by W_i(S), and the corresponding conditional optimal control — the funds invested in the ith enterprise — by x_i(S).

Let us start the optimization from the last step. Suppose we approach this step with a remainder of funds S. What should we do? Obviously, invest the entire amount S in the last enterprise. Therefore, the conditional optimal control at the nth step is to give the last enterprise all the available funds:

x_n(S) = S,

and the conditional optimal gain is

W_n(S) = f_n(S).

Assigning S a whole range of values (spacing them fairly closely), we will know W_n(S) for each value of S. The last step is thus optimized.

Let us move on to the penultimate step. Suppose we approach it with a stock of funds S, and denote by W_{n-1}(S) the conditional optimal gain at the last two steps (the last of which is already optimized). If we allocate funds x_{n-1} to the (n−1)th enterprise at this step, the amount S − x_{n-1} will be left for the last step. Our gain at the last two steps will be

f_{n-1}(x_{n-1}) + W_n(S − x_{n-1}),

and we need to find the x_{n-1} at which this gain is maximal:

W_{n-1}(S) = max over x_{n-1} ≤ S of { f_{n-1}(x_{n-1}) + W_n(S − x_{n-1}) }.

The max sign means that the maximum of the expression in curly brackets is taken over all possible values of x_{n-1} (we cannot invest more than S). This maximum is the conditional optimal gain at the last two steps, and the corresponding conditional optimal control x_{n-1}(S) is the value of x_{n-1} at which this maximum is achieved.

Continuing in this way, we will finally reach the first enterprise. Here we no longer need to vary the values of S: we know for certain that the stock of funds before the first step equals K:

W = W_1(K) = max over x_1 ≤ K of { f_1(x_1) + W_2(K − x_1) }. (13.4)

So, the maximum gain (income) from all the enterprises has been found. It now remains to "read the recommendations": the value x_1 at which maximum (13.4) is achieved is the optimal control at the 1st step.

After we invest these funds in the first enterprise, we will have K − x_1 left. "Reading" the recommendation for this value of S, we allocate the optimal amount of funds x_2(K − x_1) to the second enterprise, and so on to the end.

Now let us solve a numerical example. The initial stock of funds is K = 10 (standard units), and it must be optimally distributed among five enterprises. For simplicity, we assume that only whole amounts of funds are invested. The income functions are given in Table 13.1.

Table 13.1

In each column, starting from a certain amount of investment, income stops increasing (in reality this corresponds to each enterprise being able to absorb only a limited amount of funds).

Let us carry out the conditional optimization as described above, starting from the last, 5th step. Each time we approach the next step with some stock of funds S, we try allocating one or another amount of funds to this step, take the gain at this step from Table 13.1, add it to the already optimized gain at all subsequent steps to the end (bearing in mind that the funds remaining are reduced by exactly the amount allocated), and find the investment at which this sum reaches its maximum. That investment is the conditional optimal control at this step, and the maximum itself is the conditional optimal gain.

Table 13.2 shows the results of the conditional optimization for all steps. The table is constructed as follows: the first column gives the values of the stock of funds S with which we approach the given step. The table is further divided into five pairs of columns corresponding to the step numbers.

Table 13.2

The first column of each pair gives the value of the conditional optimal control, the second the conditional optimal gain. The table is filled in from left to right, top to bottom. The decision at the fifth, last, step is forced: all remaining funds are allocated; at all the other steps the decision must be optimized. As a result of the sequential optimization of the 5th, 4th, 3rd, 2nd and 1st steps, we obtain a complete list of recommendations for optimal control and the unconditional optimal gain W for the entire operation — in this case it equals 5.6. In the last two columns of Table 13.2 only one row is filled in, because we know the state of the system before the first step exactly: S = K = 10. The optimal controls are framed at every step. Thus we reach the final conclusion: of the ten units, we must allocate two to the first enterprise, five to the second, two to the third, none to the fourth, and one to the fifth. With this distribution the income will be maximal and equal to 5.6.
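Since the data of Table 13.1 did not survive extraction, the procedure can still be illustrated with hypothetical income functions. The sketch below is ours, not from the original text: the incomes are invented small integers (think of them as tenths of a unit), K, the enterprise count, and the answer are therefore not those of Tables 13.1–13.2; the code builds exactly the kind of per-step table described above — a pair (x_i(S), W_i(S)) for every stock S — and then reads off the recommendations.

```python
# Hypothetical income table (NOT Table 13.1): income[i][x] is the
# income of enterprise i+1 when x whole units are invested in it.
K = 5
income = [
    [0, 5, 10, 14, 17, 19],
    [0, 8, 11, 13, 14, 14],
    [0, 6, 12, 15, 16, 16],
]

n = len(income)
# cond[i][S] = (x_i(S), W_i(S)), built from the last step backwards
cond = [dict() for _ in range(n)]
for S in range(K + 1):                 # last step: invest everything
    cond[n - 1][S] = (S, income[n - 1][S])
for i in range(n - 2, -1, -1):
    for S in range(K + 1):
        x, w = max(((x, income[i][x] + cond[i + 1][S - x][1])
                    for x in range(S + 1)), key=lambda t: t[1])
        cond[i][S] = (x, w)

# "read the recommendations" from step 1 forward
S, plan = K, []
for i in range(n):
    x = cond[i][S][0]
    plan.append(x)
    S -= x
print(cond[0][K][1], plan)   # -> 30 [2, 1, 2]
```

With these invented incomes the optimal plan invests 2, 1, and 2 units in the three enterprises for a total income of 30; with the real Table 13.1 data the same procedure yields the 5.6 reported above.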


Let us give a mathematical formulation of the optimality principle. For simplicity, we assume that the initial state x_0 and the final state x_T of the system are given. Denote by z_1(x_0, u_1) the value of the goal function at the first stage for the initial state x_0 under the control u_1; by z_2(x_1, u_2) the corresponding value of the goal function at the second stage only; …; by z_i(x_{i-1}, u_i) its value at the ith stage; …; by z_N(x_{N-1}, u_N) its value at the Nth stage. Obviously,

Z = z_1(x_0, u_1) + z_2(x_1, u_2) + … + z_N(x_{N-1}, u_N). (1)

It is necessary to find the optimal control u* = (u_1*; u_2*; …; u_N*) that delivers the extremum of the goal function (1) under the given restrictions.

To solve this problem, we immerse it in a family of similar ones. Let us introduce some notation: consider the domains of definition of the analogous problems over the last stage, the last two stages, and so on, up to the domain of definition of the original problem. Denote by F_1(x_{N-1}), F_2(x_{N-2}), …, F_k(x_{N-k}), …, F_N(x_0), respectively, the conditionally optimal values of the goal function over the last stage, the last two stages, …, the last k stages, …, all N stages.

Let us start with the last stage. Let x_{N-1} be the possible states of the system at the beginning of the Nth stage. We find:

F_1(x_{N-1}) = max_{u_N} z_N(x_{N-1}, u_N). (2)

For the last two stages we get

F_2(x_{N-2}) = max_{u_{N-1}} { z_{N-1}(x_{N-2}, u_{N-1}) + F_1(x_{N-1}) }. (3)

Likewise:

F_3(x_{N-3}) = max_{u_{N-2}} { z_{N-2}(x_{N-3}, u_{N-2}) + F_2(x_{N-2}) }, (4)

…

F_k(x_{N-k}) = max_{u_{N-k+1}} { z_{N-k+1}(x_{N-k}, u_{N-k+1}) + F_{k-1}(x_{N-k+1}) }, (5)

…

F_N(x_0) = max_{u_1} { z_1(x_0, u_1) + F_{N-1}(x_1) }. (6)

Expression (6) is a mathematical representation of the optimality principle. Expression (5) is the general form of the conditionally optimal value of the goal function for the last k stages. Expressions (2)–(6) are called the functional Bellman equations. Their recurrent nature is clearly visible: to find the optimal control over N stages, one needs to know the conditionally optimal control over the preceding N − 1 stages, and so on. For this reason the functional equations are often called the Bellman recurrence relations.

    1. Features of dynamic programming problems

Based on the above, we can highlight the following features of dynamic programming problems.

  • We consider a system whose state at each step is determined by the vector x_t. Further changes in its state depend only on the current state x_t and do not depend on how the system arrived at that state. Such processes are called processes without aftereffect.
  • At each step, one decision u_t is selected, under whose influence the system passes from the previous state x_{t-1} to a new state x_t. The new state is a function of the state at the beginning of the interval, x_{t-1}, and of the decision u_t adopted at the beginning of the interval: x_t = x_t(x_{t-1}, u_t).
  • It is required to find an admissible control u_t for each step t so as to obtain the extreme value of the goal function over all T steps.

Any valid sequence of actions for each step that transfers the system from the initial state to the final state is called a control strategy. A control strategy that results in an extreme value of the objective function is called optimal.

The geometric interpretation of the dynamic programming problem is as follows. Let n be the dimension of the state space. At each moment of time the coordinates of the system have quite specific values. As the time t changes, the values of the coordinates of the state vector may change. Let us call the transition of the system from one state to another the trajectory of its motion in the state space. Such a transition is carried out by acting on the state coordinates. The space in which the states of the system serve as coordinates is called the phase space. The dynamic programming problem can be interpreted especially clearly when the state space is two-dimensional. The region of possible states is then depicted by some figure, and the initial and final states of the system by the points x_0 and x_T (Fig. 1). Control is the influence that transfers the system from the initial state to the final one. For many economic problems the initial or final state is not known; only the region X_0 or X_T to which these points belong is known.

Figure 1

Then admissible controls transfer points from the region X 0 to X T . The dynamic programming problem can be geometrically formulated as follows: find a phase trajectory starting in the region X 0 and ending in the region X T for which the goal function reaches an extreme value. If the initial and final states of a dynamic programming problem are known, then we speak of a problem with fixed ends. If the initial and final regions are known, then we speak of a problem with free ends.

2. RESOURCE DISTRIBUTION PROBLEM

2.1 General statement of the problem

Let's consider the application of the dynamic programming method using the example of the distribution of funds between six reconstruction objects of a city water utility company:

1. Central pumping and filtration station;

2. Eastern pumping and filtration station;

3. Water pumping station;

4. Central aeration station;

5. Eastern aeration station;

6. Country aeration station.

The total amount of funds provided for development is at most 195 thousand hryvnia. Technical and economic calculations established that, as a result of the reconstruction and depending on the amount of funds spent, the facilities will have the productivity shown in Table 1.1. It is required to determine the optimal distribution of funds between the reconstruction objects that ensures the maximum increase in their productivity. The optimization criterion in this problem is thus the total productivity of the reconstructed facilities.

Table 1.1. Input data for the productivity of the reconstruction objects. Columns: object serial number; volume of resources allocated for the development of the facility (thousand UAH); productivity of the facility after development (thousand m³).
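The two passes the program performs below (conditional optimization backward, then unconditional optimization forward to recover the allocation itself) can be sketched as follows. The productivity figures in the usage example are hypothetical stand-ins, not the Table 1.1 data:

```cpp
#include <vector>

struct Plan {
    int total;               // maximum total productivity
    std::vector<int> alloc;  // resource given to each object
};

// prod[k][u]: (hypothetical) productivity of object k when given u units.
// best[k][s] is the conditionally optimal productivity of objects k..n-1
// with s units left; arg[k][s] records the amount given to object k there.
Plan allocate(const std::vector<std::vector<int>>& prod, int budget) {
    int n = static_cast<int>(prod.size());
    std::vector<std::vector<int>> best(n + 1, std::vector<int>(budget + 1, 0));
    std::vector<std::vector<int>> arg(n, std::vector<int>(budget + 1, 0));
    for (int k = n - 1; k >= 0; --k)          // conditional optimization
        for (int s = 0; s <= budget; ++s)
            for (int u = 0; u <= s && u < (int)prod[k].size(); ++u) {
                int v = prod[k][u] + best[k + 1][s - u];
                if (v > best[k][s]) { best[k][s] = v; arg[k][s] = u; }
            }
    Plan p{best[0][budget], {}};
    for (int k = 0, s = budget; k < n; ++k) { // unconditional optimization
        p.alloc.push_back(arg[k][s]);
        s -= arg[k][s];
    }
    return p;
}
```

With prod = {{0, 3, 5}, {0, 4, 6}} and a budget of 2 units, the sketch returns a total of 7 with one unit given to each object.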

2.2 Block diagram of the program

Figure 1. Main program. Its data: QtObj – number of objects; QtRes – number of resource levels; effMatrix – object productivity matrix; distVector – vector of allocated resources. Step 1: conditional optimization. Step 2: unconditional optimization (for i = QtObj-1 down to 0, the result vector is formed).

Figure 2. Data entry. distVector is the resource vector and effMatrix the productivity matrix; input is accepted only once all matrix elements have been entered and the productivity values are non-negative.

Figure 3. Conditional optimization. From QtObj, QtRes, the productivity matrix and the resource vector distVect, the output matrix outMatrix (the maxima of the goal function) is formed: for the first enterprise the value is taken directly; otherwise the maximum is sought and stored (maxItem = temp; outMatrix[i][j] = maxItem).

2.3 Program algorithm structure

1. Data input – class DataDlg.

Variable class members:

//vector for storing the amounts of resources
std::vector<int> distVector;

//object productivity matrix
int** effMatrix;

//function to convert a string to a number
int StringToInt(CString);

//function to check the correctness of the entered data
BOOL FillMatrix();

//resource cleanup function after closing the window
virtual BOOL DestroyWindow();

//dialog initialization function
virtual BOOL OnInitDialog();

2. Calculating results – the main class of the program, courseWorkDlg.

Variable class members:

int Value; //performance value
int MaxIndex; //maximum index in the resource vector
int Facility; //enterprise
int Recource; //allocated resource
Item** outMatrix; //maximum target matrix
std::vector<int> resVector; //vector of results
void BuildOutMatrix(int**, std::vector<int>); //function for generating the goal matrix (conditional optimization)
afx_msg void OnBnClickedButton1(); //handler for the "Calculate" button, which starts the calculation process
virtual BOOL DestroyWindow(); //clearing program resources

3. Outputting results – class Report.

The purpose of this class is to output the result vector in tabular form.

2.4 Results of the program

Screenshots illustrate the operation of the program: entering data on the productivity of the reconstruction objects; the case when not all fields are filled in; the case when an incorrect character is entered; correct data entry; and the displayed result of the program.

Application.

Program listing

int DataDlg::StringToInt(CString str)
{
    USES_CONVERSION;
    const wchar_t* s = T2CW(str);
    return _wtoi(s);
}

// are all fields filled in?
BOOL DataDlg::FillMatrix()
{
    bool flag = true;
    CString str;
    for (int i = 0; i < Cells.GetSize(); i++) {
        for (int j = 0; j < Cells.GetAt(i)->Edits.GetSize(); j++) {
            CEdit* temp = Cells.GetAt(i)->Edits.GetAt(j);
            if (temp->m_hWnd != NULL) {
                temp->GetWindowText(str);
                if (str.IsEmpty()) {
                    MessageBox(L"You need to fill in all fields", L"Error", MB_ICONERROR | MB_OK);
                    flag = false;
                }
            }
        }
    }
    return flag;
}

Description of work

The purpose of this work is to implement on a computer a solution to the problem of optimal allocation of funds for the expansion of production.
Coursework objectives:
Consider the theoretical aspects of solving dynamic programming problems, the recurrent nature of problems of this type, and Bellman's principle of optimality.
Develop the algorithm, its block diagrams, and its structure.
Implement the developed algorithm on a computer in the selected programming language.

Content

INTRODUCTION
1. Dynamic programming
1.1 Basic concepts
1.2 Principles of dynamic programming. Bellman functional equations
1.3 Features of dynamic programming problems
2. Resource allocation problem
2.1 General statement of the problem
2.2 Block diagram of the program
2.3 Program algorithm structure
2.4 Result of the program
Conclusion
Bibliography


Posted on http://www.allbest.ru/

Resource allocation problem using dynamic programming method

To expand the production capacity of three enterprises A, B and C, a certain amount of additional electricity is allocated: x_0 = 8 units. Electricity can be released in the form of 1, 2, 3, 4, 5, 6, 7 or 8 units. By investing x_i units of electricity in the development of the i-th enterprise, one can obtain an income y_i at that enterprise. Different variants x_i(k) of allocating the additional electricity exist; they bring incomes y_i(k), k = 1, …, n. The possible options for the development of the enterprises are given in Table 1. The total income over all enterprises should be maximal, i.e. y = Σ y_i(k) → max.

Table 1. Enterprise development options

Option k

Enterprise A

Enterprise B

Enterprise C

Mathematical statement of the problem:

y = Σ y_i(k) → max

Σ x_i(k) ≤ x_0

Solution:

Let us begin the solution procedure from the last, third stage (Table 2), at which investments are allocated to enterprise C. The conditionally optimal control at the third stage is sought as the solution of the equation

g_C(S_2) = max_k f_C(x_C(k)), x_C(k) ≤ S_2, k = 1, 2, 3, 4

Table 2. Conditionally optimal solutions (step 3)

State

Control

There are four possibilities for investing funds, i.e. four step controls: x_C(1) = 0 units, x_C(2) = 1 unit, x_C(3) = 2 units, x_C(4) = 3 units; and nine theoretically possible states S_2 of the system preceding the allocation of funds to enterprise C, namely the volumes of investment not yet distributed before the 3rd stage: 0, 1, 2, 3, 4, 5, 6, 7, 8.

Suppose the system was in the state S_2 = 2. Then, for the step control x_C(2) = 1, the income f_C(2) equals 3 units (Table 3), and the step control x_C(3) = 2 is optimal for this state, giving the conditionally maximum gain g_C(S_2) = 5. If the system was in the state S_2 = 3, then all step controls are admissible: x_C(1) = 0 units, x_C(2) = 1 unit, x_C(3) = 2 units, x_C(4) = 3 units; the optimal control is x_C(4) = 3, which provides the conditionally maximum gain g_C(S_2) = 6.

Table 3

All possible states preceding the 3rd stage are filled in similarly. The optimal values of the indicators are highlighted in bold in the tables.

Next, the second stage is considered in the same way (Table 4): the allocation of investments to enterprise A. At the second stage the total gain is the sum of the gains obtained at the third and second stages and is given by the relation:

g_A(S_1) = max_k [f_A(x_A(k)) + g_C(S_2)], x_A(k) ≤ S_1, k = 1, 2, 3, 4

Thus, for the state S_1 = 3 with step control x_A(2) = 1 we obtain:

g_A(S_1) = max_k [f_A + g_C] = 4 + 5 = 9,

where f_A = 4 is taken from Table 1 and g_C = 5 from Table 3. All states are filled in similarly.

Table 4. Conditionally optimal solutions (step 2)

State

f_A + g_C

Control

Here situations arise in which the optimal solution is not unique. Thus, in the state S_1 = 3, the step controls x_A(2) = 1 and x_A(3) = 2 are both conditionally optimal, giving the same gain g_A(S_1) = 9.

Table 5. Unconditionally optimal solutions (step 1)

At the first stage (Table 5), the allocation of investments to enterprise B, there is only one preceding state of the system, corresponding to the initial state S_0 = 8. The unconditionally optimal gain is determined by the expression:

y* = g_B(S_0) = max_k [f_B(x_B(k)) + g_A(S_1)], x_B(k) ≤ S_0 = x_0, k = 1, 2, 3, 4, 5

Unconditionally optimal controls that ensure maximum income can be different.
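When several unconditionally optimal controls exist, the full set of optimal variants can be confirmed by exhaustive enumeration as a cross-check of the DP tables. A sketch with hypothetical income tables follows (the actual figures of Table 1 are not reproduced here; allOptima is an illustrative name):

```cpp
#include <array>
#include <vector>

// Brute-force cross-check of the DP tables: enumerate every distribution
// of x0 electricity units among enterprises A, B, C and keep all variants
// that attain the maximum total income. fA[u], fB[u], fC[u] are the
// (hypothetical) incomes from giving u units to the respective enterprise.
std::vector<std::array<int, 3>> allOptima(const std::vector<int>& fA,
                                          const std::vector<int>& fB,
                                          const std::vector<int>& fC,
                                          int x0) {
    int best = -1;
    std::vector<std::array<int, 3>> opts;
    for (int a = 0; a < (int)fA.size() && a <= x0; ++a)
        for (int b = 0; b < (int)fB.size() && a + b <= x0; ++b)
            for (int c = 0; c < (int)fC.size() && a + b + c <= x0; ++c) {
                int y = fA[a] + fB[b] + fC[c];
                if (y > best) { best = y; opts.clear(); }  // new record found
                if (y == best) opts.push_back({a, b, c});  // tie: keep all
            }
    return opts;
}
```

With the toy tables fA = {0, 2}, fB = {0, 2}, fC = {0, 1} and x0 = 1, the call returns two optimal variants, (0, 1, 0) and (1, 0, 0), mirroring the non-uniqueness found above.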

The scheme for finding all optimal variants of the distribution of investments between the enterprises (Table 6) is presented in Figure 1.

Table 6. Optimal distribution of investments.

Figure 1. Scheme of optimal distribution of investments between enterprises

Conclusion: having solved the resource allocation problem by the dynamic programming method, we identified two optimal variants of resource distribution.
