271x Filetype PDF File size 0.33 MB Source: www.artima.com
Chapter 2 Dynamic Programming and Optimal Control of Discrete-Time Systems
2.1 Closed Loop Optimization
Example: Inventory Control
Weconsider the minimization of the expected cost of ordering quantities in order to meet a
stochastic demand.
Rule: The ordering is only possible at discrete time instants t0 < t1 < ··· < tN−1; N ∈ lN:
Excess demand is backlogged and filled as additional inventory becomes available.
x : available stock
k
uk : order at time tk; 0 ≤ k ≤ N−1
wk : demand
Hence, the stock evolves according to the discrete-time system
x =x +u −w ; 0≤k≤N−1:
k+1 k k k
Costs at time tk :
r(x ) holding cost (excess inventory) or shortage cost (unsatisfied demand)
k
c purchasing cost (cost per ordered unit)
Terminal cost:
R(x ) left over inventory
N
Total cost:
N−1
E(R(x )+ X (r(x )+cu )):
N k k
k=0
Optimal Control of Discrete-Time Systems (DTS)
Consider the minimization problem
N−1
(DTS) minJ (x ); J (x ) := E(g (x )+ X g (x ;µ (x );w ))
π∈Π π 0 π 0 N N k k k k k
k=0
subject to x =f (x ;µ (x );w ); 0 ≤ k ≤ N−1
k+1 k k k k k
gk : Sk × Ck × Dk → lR; 0 ≤ k ≤ N−1 cost functionals
gN : SN → lR terminal cost
f : S ×C ×D →S ; 0≤k≤N−1 discrete dynamics
k k k k k+1
S ; 0 ≤ k ≤ N state spaces x ∈S states
k k k
C ; 0 ≤ k ≤ N−1 control spaces u ∈U (x )⊂C controls
k k k k k
Dk; 0 ≤ k ≤ N disturbance spaces wk∈Dk disturbances (random)
π = {µ0;···;µN−1} control policy
µk : Sk → Sk; 0 ≤ k ≤ N−1 control laws
Π set of admissible control policies
∗
J (x ) = minJ (x ) optimal cost function (optimal value function)
0 π∈Π π 0
∗ ∗ ∗ ∗
If there exists π ∈ Π such that J (x ) = J (x ); then π is called an optimal policy.
π 0 0
Open-loop minimization (ordering decisions are made at time t0)
N−1
minimize E(R(x )+ X (r((x )+cu ))
N k k
k=0
subject to x =x +u −w ; 0≤k≤N−1
k+1 k k k
Closed-loop minimization (ordering decisions are made at time tk)
Determine a control policy π = {µ }N−1; µ = µ (x ) such that
k k=0 k k k
N−1
minimize J (x ) = E(R(x )+ X (r(x )+cµ (x )))
π 0 N k k k
k=0
subject to x =x +µ (x )−w ; 0≤k≤N−1
k+1 k k k k
Dynamic Programming is about the solution of closed-loop minimization problems.
no reviews yet
Please Login to review.