[Math] How to solve this LP problem as a Dynamic Programming problem

dynamic programminglinear programmingoperations research

The standard form LP problem is

$$\min -3x_1-7x_2-10x_3 \text{ s.t. }$$
$$x_3\leq 2$$
$$40x_3+40x_2+20x_1\leq 180$$
$$x_1,x_2,x_3\geq 0$$

My last lecture covered the Bellman equation with stationary discounted case getting $J^*(x)=\min_u\left\{g(x,u)+\alpha J^*(f(x,u))\right\}$ and the general form of the DP algorithm: $J_N(x_N)=g_N(x_N)$ and $J_k(x_k)=\min_{u_k}\left\{g_k(x_k,u_k)+J_{k+1}(f_k(x_k,u_k))\right\}$ where $k=0,1,…,N-1$.

I am trying to solve this LP as Dynamic Programming problem but I cannot see where to start. Is the $g(x)=-3x_1-7x_2-10x_3$? What is the $J_k(x_k)$ here? Does this hold $J_k(x_k)=J(x)$?

Solution to the problem solved as LP

By simplex, the optimum is $x_B=(x_2,x_3)=(2.5,2)$ with the $c_B'B^{-1}b=-37.5$.

Best Answer

The basic idea is to use Backward Induction. The marginal benefits are $b=(b_1,b_2,b_3)=(0.15,0.175,0.25)$. So we will first look at $x_3$ with the largest marginal gain, then $x_2$ and lastly $x_1$. The optimality is based on the optimality principle of Dynamic programming in the following examples: suppose a-b-c is optimal from the state A to the state C, then b-c must be optimal from the state B to the state C.

Suppose 1-2-3 is optimal, then 1-2 and 2-3 must be optimal. 2-3 means the choice of $x_3$ when $x_3\leq 2$ so $40*2\leq 180$ so this is optimal choice. Now with 100 resources, 1-2 is optimal because $2*40\leq 100$ and $3*40\not \leq 100$. Now because all resources are used at end when $x_1=1$, we have found the optimal solution, the example with the integers -- similarly with the real numbers.

Integer problem

We use everything of $x_3$ so $x_3=2$ because of the constraint $x_3\leq 2$. Then we will look at $x_2$ and use that 2 units because otherwise $40*2+40*2=200\leq 180$, breaking a consraint. So the last unit is $x_1=1$ hence $x=(1,2,2)$ when $x\in\mathbb Z^3$.

Real problem

Similarly as above but the step 2 is different where we can use 2.5 units instead of just 2 units. If $x\in\mathbb R^3$ then $x=(0,2.5,2)$, the same result as you got with the Simplex.

Best Answer

Related Solutions

[Math] Cost-to-go form of Dynamic Programming algorithm

[Math] Using simplex method to show that a linear program has no finite optimal solution

Related Question