ACM: Getting Started with Probability DP

Source: Internet; Editor: 程序博客网; Time: 2024/05/20 21:59

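Probability DP simply applies dynamic programming to problems about probabilities and expected values. In essence it is no different from ordinary DP, except that it may draw on some probability theory. So, on to the practice problems.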

Introductory problem 1: HDOJ 3853. The problem statement follows:

LOOPS

Time Limit: 15000/5000 MS (Java/Others) Memory Limit: 125536/65536 K (Java/Others)
Total Submission(s): 2534 Accepted Submission(s): 1033

Problem Description

Akemi Homura is a Mahou Shoujo (Puella Magi/Magical Girl).

Homura wants to help her friend Madoka save the world. But because of the plot of the Boss Incubator, she is trapped in a labyrinth called LOOPS.

The planform of the LOOPS is a rectangle of R*C grids. There is a portal in each grid except the exit grid. It costs Homura 2 magic power to use a portal once. The portal in a grid G(r, c) will send Homura to the grid below G (grid(r+1, c)), the grid on the right of G (grid(r, c+1)), or even G itself at respective probability (How evil the Boss Incubator is)!
At the beginning Homura is in the top left corner of the LOOPS ((1, 1)), and the exit of the labyrinth is in the bottom right corner ((R, C)). Given the probability of transmissions of each portal, your task is help poor Homura calculate the EXPECT magic power she need to escape from the LOOPS.

Input

The first line contains two integers R and C (2 <= R, C <= 1000).

The following R lines, each contains C*3 real numbers, at 2 decimal places. Every three numbers make a group. The first, second and third number of the cth group of line r represent the probability of transportation to grid (r, c), grid (r, c+1), grid (r+1, c) of the portal in grid (r, c) respectively. Two groups of numbers are separated by 4 spaces.

It is ensured that the sum of three numbers in each group is 1, and the second numbers of the rightmost groups are 0 (as there are no grids on the right of them) while the third numbers of the downmost groups are 0 (as there are no grids below them).

You may ignore the last three numbers of the input data. They are printed just for looking neat.

The answer is ensured no greater than 1000000.

Terminal at EOF

Output

A real number at 3 decimal places (round to), representing the expect magic power Homura need to escape from the LOOPS.

Sample Input

2 2
0.00 0.50 0.50    0.50 0.00 0.50
0.50 0.50 0.00    1.00 0.00 0.00

Sample Output

6.000

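Problem summary:

The grid has r rows and c columns. The start is (1, 1) and the exit is (r, c). Each move either stays in place, goes right, or goes down, and every move costs 2 magic power. Given the probabilities of the three moves at every cell, find the expected magic power spent to leave the labyrinth.

Analysis:

To apply dynamic programming, first choose a state that describes the subproblem. Here the natural choice is dp[i][j], the expected magic power needed to get from (i, j) to (r, c). Next, check that the states can be related to one another correctly. By the definition of expectation and its linearity, the transition is:

dp[i][j] = p[i][j][1]*dp[i][j] + p[i][j][2]*dp[i][j+1] + p[i][j][3]*dp[i+1][j] + 2

where p[i][j][k] is the probability of choosing move k at (i, j): k = 1 stays, k = 2 goes right, k = 3 goes down. Moving the dp[i][j] term to the left and dividing gives:

dp[i][j] = (p[i][j][2]*dp[i][j+1] + p[i][j][3]*dp[i+1][j] + 2) / (1 - p[i][j][1])

The division is only valid when p[i][j][1] < 1. A cell with p[i][j][1] = 1 can never be left, so it has to be skipped. Finally, the boundary is dp[r][c] = 0, because at (r, c) no magic power is needed to reach (r, c). The table can then be filled backwards from (r, c), and the answer is dp[1][1].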
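As a check on the recurrence, the sample can be worked by hand. The sketch below uses the sample's probabilities (cell (1,1): stay 0, right 0.5, down 0.5; cell (1,2): stay 0.5, down 0.5; cell (2,1): stay 0.5, right 0.5) and writes E_{i,j} as shorthand for dp[i][j]; that notation is introduced here and is not part of the original post.

```latex
\begin{align*}
E_{2,2} &= 0 \\
E_{1,2} &= \frac{0.5 \cdot E_{2,2} + 2}{1 - 0.5} = 4 \\
E_{2,1} &= \frac{0.5 \cdot E_{2,2} + 2}{1 - 0.5} = 4 \\
E_{1,1} &= \frac{0.5 \cdot E_{1,2} + 0.5 \cdot E_{2,1} + 2}{1 - 0} = 6
\end{align*}
```

The result E_{1,1} = 6 agrees with the sample output 6.000.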

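Source code:

#include <cstdio>
#include <cmath>
using namespace std;

// p[i][j][k], k = 1..3: probability of staying, moving right, moving down.
// The last dimension is 4 so that index 3 stays in bounds.
double p[1005][1005][4], dp[1005][1005];

int main()
{
    int r, c;
    while (scanf("%d%d", &r, &c) == 2)
    {
        for (int i = 1; i <= r; ++i)
            for (int j = 1; j <= c; ++j)
                for (int k = 1; k <= 3; ++k)
                    scanf("%lf", &p[i][j][k]);
        dp[r][c] = 0;                          // boundary
        for (int i = r; i > 0; --i)            // fill from back to front
            for (int j = c; j > 0; --j)
            {
                if (i == r && j == c)
                    continue;
                if (fabs(1 - p[i][j][1]) < 1e-9)
                {
                    dp[i][j] = 0;              // trap cell: skip to avoid dividing by zero
                    continue;
                }
                // transition; remember to add the 2 spent on this move
                dp[i][j] = (p[i][j][2]*dp[i][j+1] + p[i][j][3]*dp[i+1][j] + 2)
                         / (1 - p[i][j][1]);
            }
        printf("%.3f\n", dp[1][1]);
    }
    return 0;
}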

Introductory problem 2: HDOJ 4405. The problem statement follows:

Aeroplane chess

Time Limit: 2000/1000 MS (Java/Others) Memory Limit: 32768/32768 K (Java/Others)
Total Submission(s): 1563 Accepted Submission(s): 1072

Problem Description

Hzz loves aeroplane chess very much. The chess map contains N+1 grids labeled from 0 to N. Hzz starts at grid 0. For each step he throws a dice(a dice have six faces with equal probability to face up and the numbers on the faces are 1,2,3,4,5,6). When Hzz is at grid i and the dice number is x, he will moves to grid i+x. Hzz finishes the game when i+x is equal to or greater than N.

There are also M flight lines on the chess map. The i-th flight line can help Hzz fly from grid Xi to Yi (0<Xi<Yi<=N) without throwing the dice. If there is another flight line from Yi, Hzz can take the flight line continuously. It is granted that there is no two or more flight lines start from the same grid.

Please help Hzz calculate the expected dice throwing times to finish the game.

Input

There are multiple test cases.
Each test case contains several lines.
The first line contains two integers N(1≤N≤100000) and M(0≤M≤1000).
Then M lines follow, each line contains two integers Xi,Yi(1≤Xi<Yi≤N).
The input end with N=0, M=0.

Output

For each test case in the input, you should output a line indicating the expected dice throwing times. Output should be rounded to 4 digits after decimal point.

Sample Input

2 0
8 3
2 4
4 5
7 8
0 0

Sample Output

1.1667
2.3441

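Problem summary:

There are cells numbered 0 to n. The start is cell 0 and the game ends on reaching any position >= n. Before each move a fair six-sided die with faces 1 to 6 is thrown, and the piece advances by the number shown. Some cells are connected by flight lines: if a is connected to b, then on reaching a the piece flies straight to b. Find the expected number of die throws needed to finish.

Analysis:

This problem closely resembles the previous one. The move probabilities only differ in where they come from: before they were given directly, here they come from the die, so each outcome has probability 1/6. The other difference is flight, for example from a to b without throwing the die. Both conditions affect how we solve the problem. Following the previous problem, let dp[i] be the expected number of further throws when standing on cell i, so dp[n] = 0. For the transition, cell i either has a flight line or it does not. If it does, dp[i] = dp[fly[i]], where fly[i] is the cell that i flies to. Otherwise, by the definition of expectation,

dp[i] = 1/6 * (dp[i+1] + dp[i+2] + dp[i+3] + dp[i+4] + dp[i+5] + dp[i+6]) + 1

where any position at or beyond n counts as dp[n] = 0, and the final +1 accounts for the current throw. The table is again filled from back to front, and the answer is dp[0].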

Source code:

#include <cstdio>
using namespace std;

const int MAXN = 100005;
double dp[MAXN];
int    fly[MAXN];

int main()
{
    int n, m;
    while (scanf("%d%d", &n, &m) == 2 && (n || m))
    {
        for (int i = 0; i <= n; ++i)
        {
            dp[i]  = 0;
            fly[i] = -1;
        }
        int a, b;
        for (int i = 0; i < m; ++i)
        {
            scanf("%d%d", &a, &b);
            fly[a] = b;
        }
        for (int i = n - 1; i >= 0; --i)
        {
            if (fly[i] != -1)
                dp[i] = dp[fly[i]];               // transition 1: take the flight
            else
            {
                for (int j = 1; j <= 6; ++j)      // transition 2: throw the die
                    if (i + j >= n)               // the move may overshoot n; always use dp[n]
                        dp[i] += 1.0/6 * dp[n];
                    else
                        dp[i] += 1.0/6 * dp[i+j];
                ++dp[i];                          // count the current throw
            }
        }
        printf("%.4f\n", dp[0]);
    }
    return 0;
}

Introductory problem 3: POJ 2096. The problem statement follows:

Collecting Bugs

Time Limit: 10000MS Memory Limit: 64000K
Total Submissions: 2634 Accepted: 1284
Case Time Limit: 2000MS Special Judge

Description

Ivan is fond of collecting. Unlike other people who collect post stamps, coins or other material stuff, he collects software bugs. When Ivan gets a new program, he classifies all possible bugs into n categories. Each day he discovers exactly one bug in the program and adds information about it and its category into a spreadsheet. When he finds bugs in all bug categories, he calls the program disgusting, publishes this spreadsheet on his home page, and forgets completely about the program.
Two companies, Macrosoft and Microhard are in tight competition. Microhard wants to decrease sales of one Macrosoft program. They hire Ivan to prove that the program in question is disgusting. However, Ivan has a complicated problem. This new program has s subcomponents, and finding bugs of all types in each subcomponent would take too long before the target could be reached. So Ivan and Microhard agreed to use a simpler criteria --- Ivan should find at least one bug in each subsystem and at least one bug of each category.
Macrosoft knows about these plans and it wants to estimate the time that is required for Ivan to call its program disgusting. It's important because the company releases a new version soon, so it can correct its plans and release it quicker. Nobody would be interested in Ivan's opinion about the reliability of the obsolete version.
A bug found in the program can be of any category with equal probability. Similarly, the bug can be found in any given subsystem with equal probability. Any particular bug cannot belong to two different categories or happen simultaneously in two different subsystems. The number of bugs in the program is almost infinite, so the probability of finding a new bug of some category in some subsystem does not reduce after finding any number of bugs of that category in that subsystem.
Find an average time (in days of Ivan's work) required to name the program disgusting.

Input

Input file contains two integer numbers, n and s (0 < n, s <= 1 000).

Output

Output the expectation of the Ivan's working days needed to call the program disgusting, accurate to 4 digits after the decimal point.

Sample Input

1 2

Sample Output

3.0000

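Problem summary:

There are n bug categories and s subsystems, with an unlimited supply of bugs, and exactly one bug is found each day. Find the expected number of days (the average number of days) until at least one bug has been found in every subsystem and at least one bug of every category has been found.

Analysis:

First fix the state: let dp[i][j] be the expected number of further days needed once bugs of i categories have been found across j subsystems. Clearly dp[n][s] = 0. Next derive the transitions. From the meaning of dp[i][j], the bug found on the next day falls into exactly one of four cases:

1. A new category in a new subsystem, leading to dp[i+1][j+1]

2. A new category in an already-seen subsystem, leading to dp[i+1][j]

3. An already-seen category in a new subsystem, leading to dp[i][j+1]

4. An already-seen category in an already-seen subsystem, staying at dp[i][j]

The probabilities of these four cases are, in order: p1 = (n-i)*(s-j) / (n*s), p2 = (n-i)*j / (n*s), p3 = i*(s-j) / (n*s), p4 = i*j / (n*s).

Putting this together, and again using the definition of expectation, dp[i][j] = p1*dp[i+1][j+1] + p2*dp[i+1][j] + p3*dp[i][j+1] + p4*dp[i][j] + 1. Moving the p4 term to the left and dividing gives dp[i][j] = (p1*dp[i+1][j+1] + p2*dp[i+1][j] + p3*dp[i][j+1] + 1) / (1-p4). The answer is dp[0][0]. Incidentally, this is really the same problem as the first one; the next states are just slightly less obvious.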

Source code:

#include <cstdio>
#include <cstring>
using namespace std;

const int MAXN = 1005;
double dp[MAXN][MAXN];

int main()
{
    int n, s;
    while (scanf("%d%d", &n, &s) == 2)
    {
        double p1, p2, p3, p4;
        memset(dp, 0, sizeof(dp));
        for (int i = n; i >= 0; --i)
            for (int j = s; j >= 0; --j)
            {
                if (i == n && j == s)
                    continue;                     // boundary: dp[n][s] = 0
                p1 = 1.0*(n-i)*(s-j) / (n*s);
                p2 = 1.0*(n-i)*j     / (n*s);
                p3 = 1.0*i*(s-j)     / (n*s);
                p4 = 1.0*i*j         / (n*s);
                dp[i][j] = (p1*dp[i+1][j+1] + p2*dp[i+1][j]  // state transition
                         + p3*dp[i][j+1]    + 1)
                         / (1-p4);
            }
        printf("%.4f\n", dp[0][0]);
    }
    return 0;
}

Introductory problem 4: POJ 3071. The problem statement follows:

Football

Time Limit: 1000MS Memory Limit: 65536K
Total Submissions: 3147 Accepted: 1593

Description

Consider a single-elimination football tournament involving 2ⁿ teams, denoted 1, 2, …, 2ⁿ. In each round of the tournament, all teams still in the tournament are placed in a list in order of increasing index. Then, the first team in the list plays the second team, the third team plays the fourth team, etc. The winners of these matches advance to the next round, and the losers are eliminated. After n rounds, only one team remains undefeated; this team is declared the winner.

Given a matrix P = [p_ij] such that p_ij is the probability that team i will beat team j in a match determine which team is most likely to win the tournament.

Input

The input test file will contain multiple test cases. Each test case will begin with a single line containing n (1 ≤ n ≤ 7). The next 2ⁿ lines each contain 2ⁿ values; here, the jth value on the ith line represents p_ij. The matrix P will satisfy the constraints that p_ij = 1.0 − p_ji for all i ≠ j, and p_ii = 0.0 for all i. The end-of-file is denoted by a single line containing the number −1. Note that each of the matrix entries in this problem is given as a floating-point value. To avoid precision problems, make sure that you use either the double data type instead of float.

Output

The output file should contain a single line for each test case indicating the number of the team most likely to win. To prevent floating-point precision issues, it is guaranteed that the difference in win probability for the top two teams will be at least 0.01.

Sample Input

2
0.0 0.1 0.2 0.3
0.9 0.0 0.4 0.5
0.8 0.6 0.0 0.6
0.7 0.5 0.4 0.0
-1

Sample Output

2

Hint

In the test case above, teams 1 and 2 and teams 3 and 4 play against each other in the first round; the winners of each match then play to determine the winner of the tournament. The probability that team 2 wins the tournament in this case is:

P(2 wins) = P(2 beats 1)P(3 beats 4)P(2 beats 3) + P(2 beats 1)P(4 beats 3)P(2 beats 4)
= p₂₁p₃₄p₂₃ + p₂₁p₄₃p₂₄
= 0.9 · 0.6 · 0.4 + 0.9 · 0.4 · 0.5 = 0.396.

The next most likely team to win is team 3, with a 0.372 probability of winning the tournament.

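Problem summary:

There are 2^n teams and n rounds of single elimination. In each round the surviving teams are paired up in index order and the losers are eliminated. For example, with teams 1, 2, 3, 4, round one pairs 1 with 2 and 3 with 4; if 1 and 3 win, they meet in round two while 2 and 4 are out. Find the team most likely to win the tournament.

Analysis:

Start with the state: dp[i][j] is the probability that team j wins its match in round i, having won every earlier round, so dp[0][j] = 1. For the transition, team j can only win round i if it won round i-1, which has probability dp[i-1][j], and it must then beat whichever opponent it faces. That opponent must also have won round i-1 to be present, so opponent k contributes p[j][k]*dp[i-1][k], where p[j][k] is the probability that j beats k. Summing over every team k that j can face in round i gives the transition:

dp[i][j] = dp[i-1][j] * (sum over k of dp[i-1][k]*p[j][k])

There is one catch, which comes from the elimination rule. The two sides of a round-i match cannot have met before: if they had, one of them would already be eliminated and could not be playing now. So the transition needs a condition guaranteeing that round i is the first time j and k can meet. Writing the bracket in binary, with teams numbered from 0, reveals the pattern: j and k first meet in round i exactly when j>>(i-1) equals (k>>(i-1))^1.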
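The bit condition is easier to trust after listing the opponents it selects. The following sketch assumes teams are numbered from 0 and rounds from 1; the helper names `meetsInRound` and `opponents` are introduced here for illustration and are not part of the original post.

```cpp
#include <vector>

// True when 0-indexed teams j and k can first meet in the given round
// (rounds are numbered from 1).
bool meetsInRound(int j, int k, int round)
{
    // After dropping the low (round-1) bits, both teams must sit in
    // sibling blocks, i.e. their block numbers differ only in the last bit.
    return (j >> (round - 1)) == ((k >> (round - 1)) ^ 1);
}

// All teams that team j could face in the given round, among 2^n teams.
std::vector<int> opponents(int j, int round, int n)
{
    std::vector<int> result;
    for (int k = 0; k < (1 << n); ++k)
        if (meetsInRound(j, k, round))
            result.push_back(k);
    return result;
}
```

With four teams (n = 2), `opponents(0, 1, 2)` contains only team 1, while `opponents(0, 2, 2)` contains teams 2 and 3. This matches the bracket in the problem's hint once 1 is added back to each index.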

Source code:

#include <cstdio>
#include <cstring>
using namespace std;

const int MAXN = 150;
double dp[MAXN][MAXN], p[MAXN][MAXN];

int main()
{
    int n, num;
    while (scanf("%d", &n) == 1 && n != -1)
    {
        num = 1 << n;
        memset(dp, 0, sizeof(dp));
        for (int i = 0; i < num; ++i)
            dp[0][i] = 1;
        for (int i = 0; i < num; ++i)
            for (int j = 0; j < num; ++j)
                scanf("%lf", &p[i][j]);
        for (int i = 1; i <= n; ++i)
            for (int j = 0; j < num; ++j)
                for (int k = 0; k < num; ++k)
                    if ((j>>(i-1)) == ((k>>(i-1))^1))                    // transition condition
                        dp[i][j] += dp[i-1][j] * (dp[i-1][k]*p[j][k]);   // state transition
        int ans = 0;
        for (int i = 1; i < num; ++i)      // find the team with the highest win probability
            if (dp[n][i] > dp[n][ans])
                ans = i;
        printf("%d\n", ans+1);             // teams are 1-indexed in the output
    }
    return 0;
}

1 0