DP (Dynamic programming)

Source: Internet. Editor: 程序博客网. Time: 2024/06/16 15:04

1. Fibonacci sequence

F(0) = 1, F(1) = 1

F(n+1) = F(n)+F(n-1)

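The most convenient approach is of course recursion, but naive recursion needs a deep call stack and recomputes the same values many times. The minimum-memory version only needs two variables holding the previous two terms (plus a temporary).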

int F(int n)
{
    int a=1, b=1, t;
    if( n ==0 || n ==1) return 1;
    for( int i=2; i<=n; i++)
    {
        t = a+b;
        a = b;
        b = t;
    }
    return t;
}

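This is very simple, but the idea of using auxiliary storage to keep previously computed results is typical of dynamic programming.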

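2. Maximum value contiguous subsequence
Given a sequence of real numbers A[i], find the range l..m such that Sum(A[l]..A[m]) is maximal.
Suppose that for index j-1 the largest sum of a contiguous run ending at j-1 is M[j-1].
Then for index j, the largest sum of a run ending at j is max(M[j-1]+A[j], A[j]), and the answer is the largest M[j] over all j.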

double m[N], max_sum;
m[0]=a[0];
max_sum=m[0];
for(int i=1; i<N; i++)
{
m[i]=max(m[i-1]+a[i], a[i]);
max_sum = max(max_sum, m[i]);
}

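Now extend the problem to two dimensions: given a(1..n,1..m), find the sub-matrix of a whose elements have the largest sum.
The problem above used the auxiliary array m[i]; this time a three-dimensional auxiliary array is needed.
Define m(i,j,k) := the largest sum of a sub-matrix that spans columns i..j and has a(k,i..j) as its last row, with m(i,j,1)=sum{a(1,i..j)} for the first row.
Then m(i,j,k)=max{0,m(i,j,k-1)} + sum{a(k,i..j)}
If the row sums sum{a(k,i..j)} are maintained incrementally, there are O(n*m^2) states at O(1) each, which is O(n^3) for a square matrix.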
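
The recurrence m(i,j,k)=max{0,m(i,j,k-1)} + sum{a(k,i..j)} can be sketched as follows. This is only an illustration under some assumptions: the function name maxSubmatrixSum is made up for this example, the matrix is 0-indexed, non-empty and rectangular, and the third dimension of m is collapsed into one running value because only m(i,j,k-1) is ever needed.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Largest sum over all non-empty sub-matrices of a.
// Hypothetical helper; assumes a is non-empty and every row has the same length.
double maxSubmatrixSum(const std::vector<std::vector<double>>& a) {
    const std::size_t rows = a.size(), cols = a[0].size();
    double best = a[0][0];
    for (std::size_t i = 0; i < cols; ++i) {
        // rowSum[k] = a[k][i] + ... + a[k][j] for the current column range i..j
        std::vector<double> rowSum(rows, 0.0);
        for (std::size_t j = i; j < cols; ++j) {
            double cur = 0.0;  // plays the role of m(i,j,k-1)
            for (std::size_t k = 0; k < rows; ++k) {
                rowSum[k] += a[k][j];
                cur = std::max(0.0, cur) + rowSum[k];  // m(i,j,k)
                best = std::max(best, cur);
            }
        }
    }
    return best;
}
```

Extending the column range one column at a time keeps each row sum at O(1) per step, so the three nested loops account for the whole running time.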

3. coins

Given a list of N coins, their values (V₁, V₂, ... , V_N), and the total sum S. Find the minimum number of coins the sum of which is S (we can use as many coins of one type as we want), or report that it's not possible to select coins in such a way that they sum up to S.

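i. Sort the coin values in ascending order. The table below assumes a coin of value 1 is available, so that every amount can be formed: build an array V[0..N] with V[0]=1 and V[i]<V[i+1] for every i. (If no coin of value 1 actually exists, adding one changes the problem; in that case treat unreachable amounts as infinity and report failure when C[N][S] is still infinite.)
ii. Make an auxiliary array C, where C[i][j] is the minimum number of coins needed to make change for the amount j using coins 0 through i, for j=0..S and i=0..N. It is easy to see that, because V[0]=1, every C[0][j] equals j/1=j.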
iii. for i=1 to N
        for j=0 to S
            if( V[i]>j || C[i-1][j]<1+C[i][j-V[i]])
                C[i][j]=C[i-1][j];
            else
                C[i][j]=1+C[i][j-V[i]];

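If the coins that were used need to be recorded, add another array used[i][j] that tells whether the coin value V[i] appears in the minimum set for amount j, and set it in the else branch.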
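
For the original statement of the problem, where a coin of value 1 may be missing and "not possible" is a valid answer, a one-dimensional variant of the table is often enough. The sketch below is an assumption-laden illustration rather than the method of the linked notes: minCoins is a made-up name, coin values are assumed positive, S is assumed non-negative, and INT_MAX stands in for "unreachable".

```cpp
#include <climits>
#include <vector>

// Minimum number of coins summing to S, or -1 if S cannot be formed.
// Hypothetical helper; assumes every value is positive and S >= 0.
int minCoins(const std::vector<int>& values, int S) {
    const int INF = INT_MAX;            // marks amounts that cannot be formed
    std::vector<int> best(S + 1, INF);  // best[j] = fewest coins that sum to j
    best[0] = 0;
    for (int j = 1; j <= S; ++j)
        for (int v : values)
            if (v <= j && best[j - v] != INF && best[j - v] + 1 < best[j])
                best[j] = best[j - v] + 1;
    return best[S] == INF ? -1 : best[S];
}
```

With coins {1, 3, 5} and S = 11 this gives 3 (5+5+1); with the single coin {2} and S = 3 it reports -1.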

http://condor.depaul.edu/~rjohnson/algorithm/coins.pdf

4. Maximum size square sub-matrix with all 1s

Given a binary matrix, find out the maximum size square sub-matrix with all 1s.
For example, consider the below binary matrix.

   0  1  1  0  1
   1  1  0  1  0
   0  1  1  1  0
   1  1  1  1  0
   1  1  1  1  1
   0  0  0  0  0

The maximum square sub-matrix with all set bits is

    1  1  1
    1  1  1
    1  1  1

Algorithm:
Let the given binary matrix be M[R][C]. The idea of the algorithm is to construct an auxiliary size matrix S[][] in which each entry S[i][j] represents size of the square sub-matrix with all 1s including M[i][j] and M[i][j] is the rightmost and bottommost entry in sub-matrix.

1) Construct a size matrix S[R][C] for the given M[R][C].

     a) Copy the first row and the first column as they are from M[][] to S[][]

     b) For other entries, use following expressions to construct S[][]

         If M[i][j] is 1 then

            S[i][j] = min(S[i][j-1], S[i-1][j], S[i-1][j-1]) + 1

         Else /*If M[i][j] is 0*/

            S[i][j] = 0

2) Find the maximum entry in S[R][C]

3) Using the value and coordinates of the maximum entry in S[][], print the sub-matrix of M[][]

For the given M[R][C] in above example, constructed S[R][C] would be:

   0  1  1  0  1
   1  1  0  1  0
   0  1  1  1  0
   1  1  2  2  0
   1  2  2  3  1
   0  0  0  0  0

The value of maximum entry in above matrix is 3 and coordinates of the entry are (4, 3). Using the maximum value and its coordinates, we can find out the required sub-matrix.

The code is as follows:

#include<stdio.h>
#define bool int
#define R 6
#define C 5
int min(int a, int b, int c); /* forward declaration: min() is defined after its first use */
void printMaxSubSquare(bool M[R][C])
{
int i,j;
int S[R][C];
int max_of_s, max_i, max_j;
/* Set first column of S[][]*/
for(i = 0; i < R; i++)
     S[i][0] = M[i][0];
/* Set first row of S[][]*/
for(j = 0; j < C; j++)
     S[0][j] = M[0][j];
/* Construct other entries of S[][]*/
for(i = 1; i < R; i++)
{
    for(j = 1; j < C; j++)
    {
      if(M[i][j] == 1)
        S[i][j] = min(S[i][j-1], S[i-1][j], S[i-1][j-1]) + 1;
      else
        S[i][j] = 0;
    }
}
/* Find the maximum entry, and indexes of maximum entry
     in S[][] */
max_of_s = S[0][0]; max_i = 0; max_j = 0;
for(i = 0; i < R; i++)
{
    for(j = 0; j < C; j++)
    {
      if(max_of_s < S[i][j])
      {
         max_of_s = S[i][j];
         max_i = i;
         max_j = j;
      }
    }
}
printf("\n Maximum size sub-matrix is: \n");
for(i = max_i; i > max_i - max_of_s; i--)
{
    for(j = max_j; j > max_j - max_of_s; j--)
    {
      printf("%d ", M[i][j]);
    }
    printf("\n");
}
}
/* UTILITY FUNCTIONS */
/* Function to get minimum of three values */
int min(int a, int b, int c)
{
int m = a;
if (m > b)
    m = b;
if (m > c)
    m = c;
return m;
}
/* Driver function to test above functions */
int main()
{
bool M[R][C] = {{0, 1, 1, 0, 1},
                   {1, 1, 0, 1, 0},
                   {0, 1, 1, 1, 0},
                   {1, 1, 1, 1, 0},
                   {1, 1, 1, 1, 1},
                   {0, 0, 0, 0, 0}};
printMaxSubSquare(M);
getchar();
}

This problem has many variants.

(1) The largest rectangle under a histogram

http://www.informatik.uni-ulm.de/acm/Locals/2003/html/judge.html

Given: An integer array represents a histogram
Goal: Find the largest rectangle under the histogram.

Complexity O(N) where N is the size of the given array.

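The input is an integer array h[i]. Any rectangle of maximal area must have a lowest bar h[k], i.e. the height of the rectangle equals h[k]; starting from bar k at that height, the rectangle extends left as far as its left edge and right as far as its right edge. So each bar can be expanded to the leftmost and rightmost positions such that every bar in between is at least as tall as this bar, which gives an area s[i], and the answer is the largest s[i].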

#include <stdio.h>

const int MAX = 100005;
int n;                                // number of bars; the caller fills h[1..n] with non-negative heights, n+1 < MAX
long long h[MAX];
long long left[MAX], right[MAX];      // left[i] = j: leftmost index that bar i reaches at its own height

void Solve ()
{
    int i;
    long long temp, max;
    h[0] = h[n+1] = -1;               // sentinels below every bar, so both while loops stop
    for (i=1; i<=n; i++)
    {
        left[i] = right[i] = i;
    }
    for (i=1; i<=n; i++)
    {
        while ( h[left[i]-1] >= h[i] )
            left[i] = left[left[i]-1];
    }
    for (i=n; i>0; i--)
    {
        while ( h[right[i]+1] >= h[i] )
            right[i] = right[right[i]+1];
    }
    max = 0;
    for (i=1; i<=n; i++)
    {
        temp = h[i] * (right[i] - left[i] + 1);
        if ( temp > max )
            max = temp;
    }
    printf("%lld\n", max);
}

(2) Maximum subarray with all 1’s. (Generalization of problem 1)

http://www.drdobbs.com/184410529

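Given: A two-dimensional array b (M rows, N columns) of Boolean values ("0" and "1")
Goal: Find the largest (most elements) rectangular subarray containing all ones.
Key observation: scan the columns one at a time starting from one side (say the right), maintaining a histogram whose bar for each row is the length of the run of 1s that starts at the current column. For each histogram, use the solution above to find the largest rectangle. Updating the histogram takes O(M), solving it takes O(M), and this is done N times.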
Complexity O(MN).
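
The histogram reduction can be sketched as below. Several things here are assumptions made for the example: the names largestInHistogram and maxOnesRectangle are hypothetical, the scan goes row by row (bars grow upward) instead of column by column, which is the same idea transposed, and the histogram step uses a stack of indices rather than the left[]/right[] arrays shown earlier.

```cpp
#include <algorithm>
#include <cstddef>
#include <stack>
#include <vector>

// Largest rectangle under a histogram, using a stack of bar indices
// whose heights are strictly increasing from bottom to top.
long long largestInHistogram(const std::vector<int>& h) {
    std::stack<std::size_t> st;
    long long best = 0;
    const std::size_t n = h.size();
    for (std::size_t i = 0; i <= n; ++i) {
        const int cur = (i == n) ? 0 : h[i];  // a zero-height bar flushes the stack at the end
        while (!st.empty() && h[st.top()] >= cur) {
            const long long height = h[st.top()];
            st.pop();
            const std::size_t left = st.empty() ? 0 : st.top() + 1;
            best = std::max(best, height * static_cast<long long>(i - left));
        }
        st.push(i);
    }
    return best;
}

// Largest all-ones rectangle in a 0/1 matrix b; assumes all rows have equal length.
long long maxOnesRectangle(const std::vector<std::vector<int>>& b) {
    if (b.empty()) return 0;
    std::vector<int> height(b[0].size(), 0);  // run of 1s ending at the current row
    long long best = 0;
    for (const auto& row : b) {
        for (std::size_t c = 0; c < row.size(); ++c)
            height[c] = row[c] ? height[c] + 1 : 0;
        best = std::max(best, largestInHistogram(height));
    }
    return best;
}
```

Each bar is pushed and popped at most once per histogram, so one histogram costs time linear in its width and the whole scan stays at O(MN).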

(3) Imagine you have a square matrix, where each cell is filled with either black or white. Design an algorithm to find the maximum subsquare such that all four borders are filled with black pixels. (variation of problem 3)

http://careercup.com/question?id=2445

(4) Given an NxN matrix of positive and negative integers, write code to find the sub-matrix with the largest possible sum. (Kadane's 2D)

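5. Max-min distance

Given an array of M integers in the range 1 to N, find a subset of size K such that the minimum distance over all pairs of numbers in the subset is maximized.

First sort the array.

DP equation: f(k,a,b) = max{{for a<i<=b-k}min{dis(a,i), f(k-1,i,b)}}
Each state f(k,a,b) can be computed in O(log(b-a)) by binary search on i, since dis(a,i) grows with i while f(k-1,i,b) does not. There are O(k*n*n) such states, so the complexity is O(n*n*klogn).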

6. Edit distance

Three allowed operations: delete a character, insert a character, replace a character.
Now given two words - word1 and word2 – find the minimum number of steps required to convert word1 to word2

The following recurrence relations define the edit distance, d(s1,s2), of two strings s1 and s2:

d('', '') = 0 -- '' = empty string

d(s, '') = d('', s) = |s| -- i.e. length of s

d(s1+ch1, s2+ch2)

= min( d(s1, s2) + if ch1=ch2 then 0 else 1 fi,

d(s1+ch1, s2) + 1,

d(s1, s2+ch2) + 1 )

The first two rules above are obviously true, so it is only necessary to consider the last one. Here, neither string is the empty string, so each has a last character, ch1 and ch2 respectively. Somehow, ch1 and ch2 have to be explained in an edit of s1+ch1 into s2+ch2. If ch1 equals ch2, they can be matched for no penalty, i.e. 0, and the overall edit distance is d(s1,s2). If ch1 differs from ch2, then ch1 could be changed into ch2, i.e. 1, giving an overall cost d(s1,s2)+1. Another possibility is to delete ch1 and edit s1 into s2+ch2, d(s1,s2+ch2)+1. The last possibility is to edit s1+ch1 into s2 and then insert ch2, d(s1+ch1,s2)+1. There are no other alternatives. We take the least expensive, i.e. min, of these alternatives.

The recurrence relations imply an obvious ternary-recursive routine. This is not a good idea because it is exponentially slow, and impractical for strings of more than a very few characters.

Examination of the relations reveals that d(s1,s2) depends only on d(s1',s2') where s1' is shorter than s1, or s2' is shorter than s2, or both. This allows the dynamic programming technique to be used.

A two-dimensional matrix, m[0..|s1|,0..|s2|] is used to hold the edit distance values:

m[i,j] = d(s1[1..i], s2[1..j])

m[0,0] = 0

m[i,0] = i, i=1..|s1|

m[0,j] = j, j=1..|s2|

m[i,j] = min(m[i-1,j-1]

+ if s1[i]=s2[j] then 0 else 1 fi,

m[i-1, j] + 1,

m[i, j-1] + 1 ), i=1..|s1|, j=1..|s2|

m[,] can be computed row by row. Row m[i,] depends only on row m[i-1,]. The time complexity of this algorithm is O(|s1|*|s2|). If s1 and s2 have a similar length, about n say, this complexity is O(n^2), much better than exponential!
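
Since row m[i,] depends only on row m[i-1,], two rows of storage are enough. A minimal sketch, assuming byte-wise comparison of std::string characters and using the made-up name editDistance:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Edit distance with insert, delete and replace each costing 1.
// Keeps only rows m[i-1,] (prev) and m[i,] (cur) of the matrix.
int editDistance(const std::string& s1, const std::string& s2) {
    std::vector<int> prev(s2.size() + 1), cur(s2.size() + 1);
    for (std::size_t j = 0; j <= s2.size(); ++j)
        prev[j] = static_cast<int>(j);                  // m[0,j] = j
    for (std::size_t i = 1; i <= s1.size(); ++i) {
        cur[0] = static_cast<int>(i);                   // m[i,0] = i
        for (std::size_t j = 1; j <= s2.size(); ++j) {
            const int subst = prev[j - 1] + (s1[i - 1] == s2[j - 1] ? 0 : 1);
            cur[j] = std::min({subst, prev[j] + 1, cur[j - 1] + 1});
        }
        std::swap(prev, cur);
    }
    return prev[s2.size()];
}
```

For example, editDistance("kitten", "sitting") is 3. The space used is O(|s2|) instead of O(|s1|*|s2|); the full matrix is still needed if the edit script itself has to be recovered.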

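7. Length of the longest increasing subsequence

Given a(1..n), find the length of the longest increasing subsequence, i.e. a(b(0)),...,a(b(m)) with b(i)<b(i+1) and a(b(i))<a(b(i+1)).
DP handles this one too. First, an O(n^2) algorithm:
f(i) := the length of the longest increasing subsequence ending at a(i). (f(1)=1)
f(i) := max{1,f(j)+1} (j=1..i-1 and a(j)<a(i))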

#include <iostream>
using namespace std;

int main()
{
    int i,j,n,a[100],b[100],max; // assumes n <= 100
    while(cin>>n)
    {
        for(i=0;i<n;i++)
            cin>>a[i];
        b[0]=1; // initialization: the longest increasing subsequence ending at a[0] has length 1
        for(i=1;i<n;i++)
        {
            b[i]=1; // b[i] is at least 1
            for(j=0;j<i;j++)
                if(a[i]>a[j]&&b[j]+1>b[i])
                    b[i]=b[j]+1;
        }
        for(max=i=0;i<n;i++) // the answer for the whole sequence is the largest b[i]
            if(b[i]>max)
                max=b[i];
        cout<<max<<endl;
    }
    return 0;
}

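In the code above each state transition costs O(n): finding the largest b[j] among the elements a[j]<a[k] that precede a[k] uses a sequential search. If the sequential search can be replaced by a binary search, each transition costs O(lg(n)) and the total complexity drops to O(n lg(n)). Define another array c whose elements satisfy c[b[k]]=a[k]; that is, when the increasing subsequence has length b[k], its last element is c[b[k]]=a[k].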

#include <iostream>
using namespace std;

int find(int *a,int len,int n) // if the return value is x, then a[x]>=n>a[x-1]
{
    int left=0,right=len,mid=(left+right)/2;
    while(left<=right)
    {
        if(n>a[mid]) left=mid+1;
        else if(n<a[mid]) right=mid-1;
        else return mid;
        mid=(left+right)/2;
    }
    return left;
}

void fill(int *a,int n) // set a[0..n] to 1000, a sentinel assumed larger than every input value
{
    for(int i=0;i<=n;i++)
        a[i]=1000;
}

int main()
{
    int max,i,j,n,a[100],b[100],c[100]; // assumes n <= 98 and 0 <= a[i] < 1000
    while(cin>>n)
    {
        fill(c,n+1);
        for(i=0;i<n;i++)
            cin>>a[i];
        c[0]=-1;                 // ........ 1
        c[1]=a[0];               // ........ 2
        b[0]=1;                  // ........ 3
        for(i=1;i<n;i++)         // ........ 4
        {
            j=find(c,n+1,a[i]);  // ........ 5
            c[j]=a[i];           // ........ 6
            b[i]=j;              // ........ 7
        }
        for(max=i=0;i<n;i++)     // ........ 8
            if(b[i]>max)
                max=b[i];
        cout<<max<<endl;
    }
    return 0;
}

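For this program, the loop invariants from Introduction to Algorithms help with understanding.
loop invariant: 1. After each iteration, c is monotonically increasing. (This property is what makes binary search possible.)
                2. After each iteration, c[i] holds the last element of an increasing subsequence of length i; if there are several such subsequences, it holds the smallest last element. (Property 3 depends on this one.)
                3. After each iteration, b[i] holds the length of the longest increasing subsequence ending at a[i].
initialization: 1. Before the loop, c[0]=-1, c[1]=a[0], and every other element of c is 1000, so c is monotonically increasing;
                2. Before the loop, c[1]=a[0] holds the last element of the increasing subsequence of length 1; there is only one such subsequence at this point, so c[1] is also the smallest;
                3. Before the loop, b[0]=1, and the longest increasing subsequence ending at a[0] has length 1.
maintenance:    1. If c is monotonically increasing before an iteration, then during that iteration c changes only on the line marked 6. Because c was increasing on entry, the property of find (see its comment) gives c[j+1]>c[j]>=a[i]>c[j-1]; so after c[j] is updated to a[i], c[j+1]>c[j]>c[j-1] still holds, i.e. c is still monotonically increasing;
                2. Within the loop, c changes only on the line marked 6. Since c[j]>=a[i], updating c[j] to a[i] can only make c[j] smaller, never larger. c[j] was the smallest such value before the iteration, and it is replaced by the even smaller a[i], so it is still the smallest;
                3. Within the loop, b[i] changes on the line marked 7. By invariant 2 and the return value j of find, c[j-1]<a[i]<=c[j]. So c[j-1] is less than a[i], and the increasing subsequences ending at c[j-1] have the greatest possible length, namely j-1; appending a[i] after c[j-1] gives the longest increasing subsequence ending at a[i], of length (j-1)+1=j;
termination: After the last iteration (i=n-1), b[0],b[1],...,b[n-1] have all been computed, i.e. the lengths of the longest increasing subsequences ending at a[0],a[1],...,a[n-1]; the loop on the line marked 8 then yields the length of the longest increasing subsequence of the whole array.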

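A closer look at the code above shows that after each iteration, if c[1],c[2],c[3],...,c[len] are the entries computed so far, then the longest increasing subsequence found so far has length len. The code can therefore be simplified further: the auxiliary array b is no longer needed, and the loop on the line marked 8 can be dropped.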

#include <iostream>
using namespace std;

int find(int *a,int len,int n) // modified binary search over a[0..len]: returns x with a[x]>=n, or len+1 if n is larger than all of them
{
    int left=0,right=len,mid=(left+right)/2;
    while(left<=right)
    {
        if(n>a[mid]) left=mid+1;
        else if(n<a[mid]) right=mid-1;
        else return mid;
        mid=(left+right)/2;
    }
    return left;
}

int main()
{
    int n,a[100],c[101],i,j,len; // len: largest index of c computed so far; assumes 1 <= n <= 100 and a[i] >= 0
    while(cin>>n)
    {
        for(i=0;i<n;i++)
            cin>>a[i];
        c[0]=-1;
        c[1]=a[0];
        len=1; // only c[1] is known so far, so the longest increasing subsequence has length 1
        for(i=1;i<n;i++)
        {
            j=find(c,len,a[i]);
            c[j]=a[i];
            if(j>len) // len must grow; by the binary search, j can exceed len by at most 1
                len=j; // update len
        }
        cout<<len<<endl;
    }
    return 0;
}
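
The same idea can also be written with the standard library's std::lower_bound in place of the hand-written find. This is a sketch with a hypothetical function name lisLength, not a replacement for the derivation above; it drops the c[0]=-1 sentinel and the fixed array sizes, so negative values and empty input are handled too.

```cpp
#include <algorithm>
#include <vector>

// Length of the longest strictly increasing subsequence of a.
int lisLength(const std::vector<int>& a) {
    // tails[k] = smallest possible last element of an increasing
    // subsequence of length k+1 (the array c above, shifted by one index)
    std::vector<int> tails;
    for (int x : a) {
        auto it = std::lower_bound(tails.begin(), tails.end(), x);  // first element >= x
        if (it == tails.end())
            tails.push_back(x);  // x extends the longest subsequence found so far
        else
            *it = x;             // x is a smaller tail for that length
    }
    return static_cast<int>(tails.size());
}
```

For the input 10 9 2 5 3 7 101 18 this gives 4 (for instance 2 3 7 18).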

8. Longest common substring

#include <cstring>

int commstr(char *str1, char *str2)
/* returns the length of the longest common substring of str1 and str2 */
{
    int len1=strlen(str1),len2=strlen(str2),row,col,max=0;
    int **pf = new int*[len1+1]; // dynamically allocate a two-dimensional array as auxiliary space
    for (row=0; row<len1+1; row++)
        pf[row] = new int[len2+1];
    // initialize the first column and the first row
    for (row=0; row<len1+1; row++)
        pf[row][0] = 0;
    for (col=0; col<len2+1; col++)
        pf[0][col] = 0;
    // pf[row][col] = length of the longest common suffix of the first row chars of str1 and the first col chars of str2
    for (row=1; row<=len1; row++)
        for (col=1;col<=len2; col++)
        {
            if (str1[row-1] == str2[col-1])
            {
                pf[row][col] = pf[row-1][col-1] + 1;
                max = pf[row][col] > max ? pf[row][col] : max;
            }
            else
                pf[row][col] = 0;
        }
    // release the auxiliary space
    for (row=0; row<len1+1; row++)
        delete[] pf[row];
    delete[] pf;
    return max;
}

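The DP above should be easy enough to follow, but its cost is fairly high: O(MN) time, and the O(MN) space also becomes large for long strings. When the common substring of several strings is needed, the cost gets even worse. In that case the generalized suffix tree (GST) approach can be used: all suffixes of the N given source strings are built into one tree, which has the following properties:
   1. Each node of the tree is a string; the root is the empty string "".
   2. Every suffix can be expressed as a path starting from the root
    (concatenating the node strings along the path yields the suffix).
   3. Note in particular that every substring is a prefix of some suffix. Since every suffix is expressed by a path from the root, any substring can be obtained by following such a path from the root one character at a time.
   4. To support finding common substrings, each node must also record which source strings it belongs to.
   From these definitions it is easy to see that if we find the deepest node in the GST that belongs to all the strings, then concatenating the node strings on the path from the root to that node gives the LCS (here, the longest common substring).
   As an example, all suffixes of the three strings 【abcde cdef ccde】 are listed below:
(Note: .1 means the suffix comes from the first string abcde; likewise .2 and .3 mean it comes from cdef and ccde.)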
   abcde.1
   bcde.1
   cde.1
   de.1
   e.1
   cdef.2
   def.2
   ef.2
   f.2
   ccde.3
   cde.3
   de.3
   e.3
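   The resulting GST is shown below
(Note: .1 means the node belongs to the first string; .123 means it belongs to the first, second and third source strings.)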
    --/_______【abcde.1】
        |
        |_____【bcde.1】         .....the deepest node labeled .123
        |                        :
        |_____【c.123】____【de.123】_______【f.2】
        |               |
        |               |__【cde.3】
        |
        |_____【de.123】___【f.2】
        |
        |_____【e.123】____【f.2】
        |
        |_____【f.2】
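   The substring cde represented by the node 【de.123】 that the dotted line points to is exactly the LCS.
   These are only the basic concepts; in practice one also needs concrete algorithms for building the GST and for finding the LCS. Construction and lookup can essentially be done in O(n) time.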

9. Knapsack problem

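10. Regular-expression (wildcard) matching

The KMP algorithm is the real workhorse for string matching, but if only the wildcards * and ? need to be matched, the following works.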

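Let x[i] denote the i-th character of the string x; note that indices start from 1 here. Define a function Match[i, j] that tells whether the length-i prefix of the pattern x matches the length-j prefix of the string s. The analysis gives the following recurrence:
Match[i,j] = Match[i-1, j-1], if x[i] = '?'
= true if any of Match[i-1, 0..j] is true, if x[i] = '*'
= Match[i-1, j-1] and (x[i] = s[j]), if x[i] is not a wildcard
The boundary conditions of the recurrence are
Match[0,0] = true,
Match[0,j] = false, for j > 0
Match[i,0] = Match[i-1,0], if x[i] = '*'
= false, otherwise
From the recurrence and boundary conditions it is easy to write a dynamic programming algorithm that decides whether the wildcard pattern x matches the string s. The '*' case can be evaluated in O(1) as Match[i-1,j] or Match[i,j-1], which makes the complexity of the algorithm O(mn).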
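
A minimal sketch of this recurrence follows. The function name wildcardMatch is made up for the example, '?' is assumed to match exactly one character and '*' any run of characters (including none), and the '*' case is evaluated in O(1) per cell as match[i-1][j] || match[i][j-1], which is equivalent to testing whether any of Match[i-1, 0..j] is true.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns true if the whole string s matches the wildcard pattern x.
bool wildcardMatch(const std::string& x, const std::string& s) {
    const std::size_t m = x.size(), n = s.size();
    // match[i][j]: does the length-i prefix of x match the length-j prefix of s?
    std::vector<std::vector<char>> match(m + 1, std::vector<char>(n + 1, 0));
    match[0][0] = 1;
    for (std::size_t i = 1; i <= m; ++i) {
        const char p = x[i - 1];
        match[i][0] = (p == '*') && match[i - 1][0];
        for (std::size_t j = 1; j <= n; ++j) {
            if (p == '*')
                // '*' matches nothing (row above) or absorbs s[j] (cell to the left)
                match[i][j] = match[i - 1][j] || match[i][j - 1];
            else
                match[i][j] = match[i - 1][j - 1] && (p == '?' || p == s[j - 1]);
        }
    }
    return match[m][n] != 0;
}
```

For instance, the pattern a*b matches aXXb and ab, while a?c matches abc but not ac.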

11. Shortest path

12. Partition problems

[Problem] Given an array of "n" random integers and an integer k<=n. Find the k numbers such that the minimum difference of all the possible pairs of k numbers is maximized (maximum among other minimum differences for various possible selections of k numbers).

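After sorting, the two end points must both be chosen, which leaves k-2 points to pick in between.
Let f(1, n, k) be the largest minimum difference for k numbers in the closed range 1~n, and suppose the second point is at position i:
f(1, n, k) = max(2<=i<=n+2-k) min{f(1, i, 2), f(i, n, k-1)}
f(x, y, 2) = a[y]-a[x]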
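
This recurrence can be sketched as a plain O(k*n^2) table. The code rests on a few assumptions: the name maxMinDifference is hypothetical, indices are 0-based, 2 <= k <= n, the differences fit in an int, and the table is indexed by the left end point only because the right end point is always the last element.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Largest achievable minimum pairwise difference when choosing k values from a.
// Hypothetical helper; assumes 2 <= k <= a.size().
int maxMinDifference(std::vector<int> a, int k) {
    std::sort(a.begin(), a.end());
    const int n = static_cast<int>(a.size());
    // g[i] = f(i, n-1, c): best value when c numbers are chosen from a[i..n-1]
    // and both a[i] and a[n-1] are chosen. Only entries with i + c <= n are valid.
    std::vector<int> g(n, 0), next(n, 0);
    for (int i = 0; i + 2 <= n; ++i)
        g[i] = a[n - 1] - a[i];                       // f(x, y, 2) = a[y] - a[x]
    for (int c = 3; c <= k; ++c) {
        for (int i = 0; i + c <= n; ++i) {
            int best = 0;
            for (int j = i + 1; j + (c - 1) <= n; ++j)  // j is the second chosen point
                best = std::max(best, std::min(a[j] - a[i], g[j]));
            next[i] = best;
        }
        std::swap(g, next);
    }
    return g[0];
}
```

With the values 1 2 4 8 9 and k = 3 the result is 3 (choose 1, 4, 9).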

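[Problem] max-mindist: an array has n elements, all positive. Split the array into k non-empty segments, each contiguous. Find the minimum possible value of the largest segment sum.

Let f(1, n, k) be the minimum largest sum when the closed range 1~n is split into k segments, and suppose the first segment is 1~i:
f(1, n, k) = min(1<=i<=n+1-k) max{f(1, i, 1), f(i+1, n, k-1)}
f(x, y, 1) = a[x]+a[x+1]+...+a[y]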
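
A sketch of this second recurrence, again as an O(k*n^2) table. The assumptions: minMaxSegmentSum is a made-up name, indices are 0-based, all elements are positive, 1 <= k <= n, and prefix sums are used so that f(x, y, 1) costs O(1).

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Smallest possible value of the largest segment sum when a is cut into
// k contiguous non-empty segments. Assumes positive elements and 1 <= k <= a.size().
long long minMaxSegmentSum(const std::vector<int>& a, int k) {
    const int n = static_cast<int>(a.size());
    std::vector<long long> prefix(n + 1, 0);          // prefix[i] = a[0] + ... + a[i-1]
    for (int i = 0; i < n; ++i) prefix[i + 1] = prefix[i] + a[i];
    // f[x] = f(x, n-1, c): best value when a[x..n-1] is split into c segments.
    // Only entries with x + c <= n are valid.
    std::vector<long long> f(n + 1, 0), next(n + 1, 0);
    for (int x = 0; x < n; ++x) f[x] = prefix[n] - prefix[x];   // c = 1
    for (int c = 2; c <= k; ++c) {
        for (int x = 0; x + c <= n; ++x) {
            long long best = -1;                      // -1 means "not set yet"; real sums are positive
            // the first segment is a[x..i]; the other c-1 segments need at least c-1 elements
            for (int i = x; i + c <= n; ++i) {
                const long long cand = std::max(prefix[i + 1] - prefix[x], f[i + 1]);
                if (best < 0 || cand < best) best = cand;
            }
            next[x] = best;
        }
        std::swap(f, next);
    }
    return f[0];
}
```

For 7 2 5 10 8 and k = 2 the result is 18 (7 2 5 | 10 8).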

For the first problem, written with general end points as in section 5: f(k,head,end) = max_{for i in head+1, end-k}min{dis(head,i), f(k-1,i,end)}