优化-拉格朗日乘子法简述

来源：互联网发布：java @apect 编辑：程序博客网时间：2024/06/02 04:42

Lagrange Multipliers are used to solve the optimal value of multivariate functions under a group of constraints. By lagrange multipliers, we can convert an optimal problem with d variables and k constraints to one with d+k variables without constraint.

Equality Constraint
Suppose x∈Rd, we would like to solve some optimal value x∗ s. t. minxf(x) and g(x)=0, i.e.
$min x f (x), s . t . g (x) = 0$
For simplicity, we omit the geometric explanation of the optimal problem. Define the Lagrange multiplier λ≠0 s.t.
$\nabla f (x) + λ \nabla g (x) = 0$
Define the corresponding Lagrange funcntion as
$L (x, λ) = f (x) + λ g (x)$
So
$\nabla x L (x, λ) = \nabla f (x) + λ \nabla g (x) = 0$
$\nabla λ L (x, λ) = g (x) = 0$
Obviously, the original optimal problem is converted to the new optimal problem with no constraint.
Inequality Constraint
In the inequality constraint case, the optimal problem can be defined as
$min x f (x), s . t . g (x) \leq 0$
When g(x)<0, the constraint g(x)≤0 makes no sense which means that we can take ∇f(x)=0 to solve the optimal problem. In addition, when g(x)=0, the inequality constraint degrades to the equality constraint.
In summary, the KKT (Karush-Kuhn-Tucker) conditions are always satisfied:
$⎧ ⎩ ⎨ ⎪ ⎪ g (x) \leq 0; λ \geq 0; λ g (x) = 0$
With efficiency concerned, we show the solution of the optimal problem together with next problem. Go on.
Multi-constraints
Consider an optimal problem with m equality constraints and n inequality constraints. In addition there is a feasible region D⊂Rd:
$min x s . t . f (x) h i (x) = 0, i = 1, 2, \dots, m, g j (x) \leq 0, j = 1, 2, \dots, n$
Define the lagrange function as
$L (x, λ, μ) = f (x) + \sum i = 1 m λ i h i (x) + \sum j = 1 n μ j g j (x)$
Since there are inequality constraints, the KKT condition is followed:
$⎧ ⎩ ⎨ ⎪ ⎪ ⎪ ⎪ g j (x) \leq 0; μ j \geq 0; μ j g j (x) = 0.$
Solving the original optimal problem, also known as primal problem, can be achieved by solving the corresponding dual problem. Then the Lagrange dual function Γ:Rm×Rn→R is defined as
$Γ (λ, μ) = inf x \in D L (x, λ, μ) = inf x \in D ⎛ ⎝ f (x) + \sum i = 1 m λ i h i (x) + \sum j = 1 n μ j g j (x) ⎞ ⎠$
Evidently, ∑mi=1λihi(x)+∑nj=1μjgj(x)≤0. Let x~∈D, then
$Γ (λ, μ) = inf x \in D L (x, λ, μ) \leq L (x ~, λ, μ) \leq f (x ~)$ That is to say, the dual function shows the lower bound of the value of the primal problem. Then the dual problem is given by
$max λ, μ Γ (λ, μ) s . t . μ \geq 0$
The dual problem is always a convex optimal problem, regardless of the convexity of the primal problem.

Let p∗ be the optimal value of the primal problem, d∗ be the optimal value of the dual problem. It has been shown that d∗≤p∗ which is also known as weak duality. If d∗=p∗, then strong duality holds. Generally, the strong duality does not always hold. However, when the primal problem is a convex optimal problem, i.e. f(x),gj(x) are convex functions, hi(x) are affine functions and there exists a least one x~ making the inequality constraints strictly hold, the strong duality holds. In this case, take the derivative of the Lagrange function over x, λ and μ and set it to be zero. Then we can solve the dual problem as well as the primal problem.

0 0