Complexity: Zero-Sum Games, the Minimax Theorem, and LP Duality

Source: Internet · Editor: 程序博客网 · Time: 2024/05/21 17:28

Complexity of 2-Player Zero-sum Game

Lecturer: Constantinos Daskalakis

Games and Equilibria

Penalty Shot Game

| Dive \ Kick | Left | Right |
| Left | 1, -1 | -1, 1 |
| Right | -1, 1 | 1, -1 |
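This zero-sum game has a mixed-strategy Nash equilibrium. Consider the expected payoff $\sum_{i,j} c_{i,j} x_i y_j$, which for the goalkeeper here is $(x_1\ x_2) \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix} (y_1\ y_2)^T$, where $x$ and $y$ are the mixed strategies of the goalkeeper and the kicker. At equilibrium both players play $(1/2, 1/2)$.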
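As a quick check of the claim above, the following sketch evaluates the expected payoff $x^T R y$ with exact fractions and confirms that, against a uniform opponent, every pure strategy earns the same amount. The names `expected_payoff`, `half`, `row_deviations` and `col_deviations` are illustrative and not from the lecture.

```python
from fractions import Fraction as F

# Goalkeeper's (row player's) payoffs from the table above.
# The kicker's payoffs are the negatives, since the game is zero-sum.
R = [[F(1), F(-1)],
     [F(-1), F(1)]]

def expected_payoff(M, x, y):
    """Return sum_{i,j} M[i][j] * x[i] * y[j], i.e. x^T M y."""
    return sum(M[i][j] * x[i] * y[j]
               for i in range(len(x)) for j in range(len(y)))

half = [F(1, 2), F(1, 2)]
pure = [[F(1), F(0)], [F(0), F(1)]]

value = expected_payoff(R, half, half)
# Payoff to the goalkeeper if one side switches to a pure strategy
# while the other keeps playing (1/2, 1/2).
row_deviations = [expected_payoff(R, e, half) for e in pure]
col_deviations = [expected_payoff(R, half, e) for e in pure]

print(value, row_deviations, col_deviations)
```

All five numbers coincide, so neither player can improve by deviating from $(1/2, 1/2)$.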

[von Neumann '28]: An equilibrium exists in every two-player zero-sum game ($R + C = 0$).

[Dantzig '40s]: In fact, this follows from strong LP duality.

[Khachiyan '79]: An equilibrium can be computed in polynomial time.

[B. 56++]: Dynamics converge.

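Penalty Shot Game (not zero-sum)

| Dive \ Kick | Left | Right |
| Left | 2, -1 | -1, 1 |
| Right | -1, 1 | 1, -1 |

Here, in the Nash equilibrium, the kicker (column player) plays $(2/5, 3/5)$, while the goalkeeper (row player) still plays $(1/2, 1/2)$.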
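The equilibrium of this 2x2 game can be recovered from the indifference conditions: each player mixes so that the opponent is indifferent between their two pure strategies. The sketch below assumes a fully mixed equilibrium exists (nonzero denominators, probabilities strictly between 0 and 1); the function name `mixed_equilibrium_2x2` is hypothetical.

```python
from fractions import Fraction as F

def mixed_equilibrium_2x2(R, C):
    """Fully mixed equilibrium of a 2x2 game via indifference.

    R, C: integer payoff matrices of the row and column player.
    Assumes a fully mixed equilibrium exists; this is not checked.
    """
    # p = P(row plays first row), chosen so the column player is indifferent:
    # p*C[0][0] + (1-p)*C[1][0] == p*C[0][1] + (1-p)*C[1][1]
    p = F(C[1][1] - C[1][0], C[0][0] - C[0][1] - C[1][0] + C[1][1])
    # q = P(column plays first column), chosen so the row player is indifferent:
    # q*R[0][0] + (1-q)*R[0][1] == q*R[1][0] + (1-q)*R[1][1]
    q = F(R[1][1] - R[0][1], R[0][0] - R[0][1] - R[1][0] + R[1][1])
    return (p, 1 - p), (q, 1 - q)

# Non-zero-sum penalty shot game from the table above.
R = [[2, -1], [-1, 1]]   # goalkeeper (row player)
C = [[-1, 1], [1, -1]]   # kicker (column player)
x, y = mixed_equilibrium_2x2(R, C)
print(x, y)
```

Note that each player's mix is determined by the opponent's payoffs: raising the goalkeeper's payoff to 2 changes the kicker's strategy, not the goalkeeper's.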

[Nash ‘50/’51]: An equilibrium exists in every finite game.

The proof uses Kakutani's/Brouwer's fixed point theorem, and no constructive proof has been found in 70+ years.

The same is true for economic equilibria: given supplies of different goods, every agent maximizes utility and no good is over-demanded.

Equilibrium:

A pair $(x, y)$ of randomized strategies such that no player has an incentive to deviate if the other does not:

$$x^T R y \geq x'^T R y, \quad \forall x'$$

$$x^T C y \geq x^T C y', \quad \forall y'$$
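The two inequalities can be checked mechanically. Since each player's expected payoff is linear in their own mixed strategy, it is enough to test deviations to pure strategies. The following is a minimal sketch using exact fractions; `payoff` and `is_equilibrium` are illustrative names, and with floating-point inputs a small tolerance would be needed.

```python
from fractions import Fraction as F

def payoff(M, x, y):
    """Expected payoff x^T M y."""
    return sum(M[i][j] * x[i] * y[j]
               for i in range(len(x)) for j in range(len(y)))

def is_equilibrium(R, C, x, y):
    """Check x^T R y >= x'^T R y and x^T C y >= x^T C y' for all pure x', y'."""
    n, m = len(x), len(y)
    row_now, col_now = payoff(R, x, y), payoff(C, x, y)
    # Row player's payoff for each pure row i against y.
    row_ok = all(sum(R[i][j] * y[j] for j in range(m)) <= row_now
                 for i in range(n))
    # Column player's payoff for each pure column j against x.
    col_ok = all(sum(C[i][j] * x[i] for i in range(n)) <= col_now
                 for j in range(m))
    return row_ok and col_ok

# Non-zero-sum penalty shot game.
R = [[2, -1], [-1, 1]]
C = [[-1, 1], [1, -1]]
half = [F(1, 2), F(1, 2)]
good = is_equilibrium(R, C, half, [F(2, 5), F(3, 5)])
bad = is_equilibrium(R, C, half, half)
print(good, bad)
```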

Minimax Theory

Minimax Theorem [von Neumann '28]

Suppose $X$ and $Y$ are compact (closed and bounded) convex sets, and $f: X \times Y \to \mathbb{R}$ is a continuous function that is convex-concave, i.e., $f(\cdot, y)$ is convex for every fixed $y$ and $f(x, \cdot)$ is concave for every fixed $x$. Then:

$$\min_{x \in X} \max_{y \in Y} f(x, y) = \max_{y \in Y} \min_{x \in X} f(x, y)$$
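One direction of this equality holds for any function and needs no convexity; it is worth writing out because it shows that the content of the theorem is the reverse inequality. For any fixed $x' \in X$ and $y' \in Y$:

$$\min_{x \in X} f(x, y') \;\leq\; f(x', y') \;\leq\; \max_{y \in Y} f(x', y)$$

The left side does not depend on $x'$ and the right side does not depend on $y'$, so taking the maximum over $y'$ on the left and the minimum over $x'$ on the right gives

$$\max_{y \in Y} \min_{x \in X} f(x, y) \;\leq\; \min_{x \in X} \max_{y \in Y} f(x, y)$$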

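Proof that every two-player zero-sum game has a Nash equilibrium:

Let $(R, C)$ be an $n \times m$ game with

$R + C = 0$

$X = \Delta_n = \{x : \sum_i x_i = 1,\ x_i \geq 0\}$

$Y = \Delta_m$

In a zero-sum game, take $f(x, y) = x^T C y$
- how much the row player pays the column player
Then $(x^*, y^*)$ is an equilibrium, where
- $x^* \in \arg\min_{x \in X} \max_{y \in Y} f(x, y)$ and $y^* \in \arg\max_{y \in Y} \min_{x \in X} f(x, y)$
  For all $x \in X$ and $y \in Y$:
  $$x^T C y^* \geq \min_{x'} x'^T C y^* = \max_{y'} \min_{x'} x'^T C y' = \min_{x'} \max_{y'} x'^T C y' = \max_{y'} x^{*T} C y' \geq x^{*T} C y$$
  Taking $x = x^*$ and $y = y^*$ shows that every term equals $x^{*T} C y^*$, so $x^*$ minimizes against $y^*$ and $y^*$ maximizes against $x^*$.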

Existence of Equilibrium in Zero-Sum Game [von Neumann’28]

In two-player zero-sum (R+C=0) games, equilibrium always exists.

Proof:

Let $f(x, y) = x^T C y$ (the payoff of the column player). $f$ is bilinear, hence continuous and convex-concave, and $\Delta_n$, $\Delta_m$ are compact convex sets, so $f$ satisfies the Minimax Theorem. Take $(x^*, y^*)$ with

$$f(x^*, y^*) = x^{*T} C y^* = \min_{x \in X} \max_{y \in Y} f(x, y) = \max_{y \in Y} \min_{x \in X} f(x, y)$$

Then,

$$x^{*T} C y^* = \max_{y \in Y} f(x^*, y) \geq x^{*T} C y', \quad \forall y'$$

$$x^{*T} C y^* = -x^{*T} R y^* \;\Rightarrow\; x^{*T} R y^* = -\min_{x \in X} f(x, y^*) \geq x'^T R y^*, \quad \forall x'$$

Presidential Elections
| Clinton \ Trump | Morality | Tax Cuts |
| Economy | +3, -3 | -1, +1 |
| Society | -2, +2 | +1, -1 |
Suppose Clinton commits to strategy (x1,x2)

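$E[\text{Morality}] = -3x_1 + 2x_2$ (Trump's expected payoff from playing Morality)

$E[\text{Tax Cuts}] = x_1 - x_2$ (Trump's expected payoff from playing Tax Cuts)

Trump: best-responds and gets $\max(-3x_1 + 2x_2,\ x_1 - x_2)$

Clinton: gets $-\max(-3x_1 + 2x_2,\ x_1 - x_2) = \min(3x_1 - 2x_2,\ -x_1 + x_2)$, so she chooses $(x_1, x_2) \in \arg\max \min(3x_1 - 2x_2,\ -x_1 + x_2)$, which is a maximin problem

If Clinton is forced to commit to $(x_1, x_2)$, she solves $\arg\max_{(x_1, x_2)} \min(3x_1 - 2x_2,\ -x_1 + x_2)$, i.e. $\arg\max_x \min_j (x^T R)_j$, which can be written as a linear program: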

$\max z$

s.t. $3x_1 - 2x_2 \geq z$

$-x_1 + x_2 \geq z$

$x_1 + x_2 = 1$

$x_1, x_2 \geq 0$
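With only two rows, the LP above can be solved exactly without an LP solver. Writing $t = x_1$, the guaranteed payoff $g(t) = \min_j \big(t R_{1j} + (1 - t) R_{2j}\big)$ is concave and piecewise linear, so its maximum over $[0, 1]$ lies at an endpoint or where two column lines cross. The sketch below enumerates those candidates; `maximin_two_rows` is a hypothetical helper for this special case, not a general LP method.

```python
from fractions import Fraction as F

# Clinton's payoffs: rows Economy/Society, columns Morality/Tax Cuts.
R = [[3, -1],
     [-2, 1]]

def maximin_two_rows(R):
    """Maximize z s.t. (x^T R)_j >= z for all j, x in the simplex, for a 2-row R."""
    m = len(R[0])

    def g(t):
        # Payoff guaranteed by playing (t, 1 - t) against a best-responding opponent.
        return min(t * R[0][j] + (1 - t) * R[1][j] for j in range(m))

    candidates = {F(0), F(1)}
    for j in range(m):
        for k in range(j + 1, m):
            # Solve t*R[0][j] + (1-t)*R[1][j] == t*R[0][k] + (1-t)*R[1][k] for t.
            denom = (R[0][j] - R[1][j]) - (R[0][k] - R[1][k])
            if denom != 0:
                t = F(R[1][k] - R[1][j], denom)
                if 0 <= t <= 1:
                    candidates.add(t)

    best = max(candidates, key=g)
    return (best, 1 - best), g(best)

x, z = maximin_two_rows(R)
print(x, z)
```

For the table above this yields $x = (3/7, 4/7)$ with value $z = 1/7$. For larger games a general LP solver would be the natural choice.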

No matter what Trump does, Clinton can guarantee 1/7 to herself by playing (3/7, 4/7)

No matter what Clinton does, Trump can guarantee -1/7 to himself by playing (2/7, 5/7)

i.e. (3/7, 4/7) is a best response to (2/7, 5/7) and vice versa
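These guarantees are easy to verify directly. The sketch below takes Clinton's mix $(3/7, 4/7)$ and Trump's mix $(2/7, 5/7)$ and computes Clinton's expected payoff against each pure strategy of the opponent; Trump's payoff is the negative. Variable names are illustrative.

```python
from fractions import Fraction as F

# Clinton's payoffs (rows Economy/Society, columns Morality/Tax Cuts);
# Trump's payoffs are -R because the game is zero-sum.
R = [[F(3), F(-1)],
     [F(-2), F(1)]]
x = [F(3, 7), F(4, 7)]   # Clinton's mixed strategy
y = [F(2, 7), F(5, 7)]   # Trump's mixed strategy

# Clinton's expected payoff when she plays x and Trump plays pure column j.
clinton_vs_cols = [sum(x[i] * R[i][j] for i in range(2)) for j in range(2)]
# Clinton's expected payoff when she plays pure row i and Trump plays y.
clinton_rows_vs_y = [sum(R[i][j] * y[j] for j in range(2)) for i in range(2)]

print(clinton_vs_cols, clinton_rows_vs_y)
```

Every entry equals 1/7: whatever Trump plays against $x$, Clinton gets 1/7, and whatever Clinton plays against $y$, Trump gets $-1/7$. Neither side can gain by deviating, which is exactly the mutual best-response property.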

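The two players' LPs are in fact duals of each other, so their optimal values coincide by strong linear programming duality. This can also be seen from the perspective of the minimax theorem.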
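To make the duality concrete, here are the two programs side by side for the election game. Trump's program is not written out in the notes; it is derived here from the payoff table, with $(y_1, y_2)$ denoting Trump's mix over Morality and Tax Cuts and $w$ an upper bound on Clinton's payoff (both symbols are notation introduced for this sketch).

$$\begin{array}{llll}
\text{Clinton:} & \max z & \text{Trump:} & \min w \\
\text{s.t.} & 3x_1 - 2x_2 \geq z & \text{s.t.} & 3y_1 - y_2 \leq w \\
& -x_1 + x_2 \geq z & & -2y_1 + y_2 \leq w \\
& x_1 + x_2 = 1 & & y_1 + y_2 = 1 \\
& x_1, x_2 \geq 0 & & y_1, y_2 \geq 0
\end{array}$$

Clinton's optimum is $z = 1/7$ at $x = (3/7, 4/7)$ and Trump's optimum is $w = 1/7$ at $y = (2/7, 5/7)$. The two optimal values agree, as strong duality requires, and Trump's own payoff is $-w = -1/7$.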

Method 2: approach the problem directly from the minimax formulation
