distinct subsequence
Source: Internet  Editor: 程序博客网  Date: 2024/06/05 21:10
A subsequence of a given sequence is just the given sequence with some elements (possibly none) left out. Formally, given a sequence X = x_1 x_2 … x_m, another sequence Z = z_1 z_2 … z_k is a subsequence of X if there exists a strictly increasing sequence <i_1, i_2, …, i_k> of indices of X such that for all j = 1, 2, …, k, we have x_{i_j} = z_j. For example, Z = bcdb is a subsequence of X = abcbdab with corresponding index sequence <2, 3, 5, 7>.
In this problem your job is to write a program that counts the number of occurrences of Z in X as a subsequence such that each has a distinct index sequence.
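To make the definition concrete, here is a minimal sketch that checks whether a given index sequence really witnesses Z as a subsequence of X. The helper name isWitness and the choice of 1-based indices (to match the example above) are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns true if idx (1-based, as in the text) is strictly increasing,
// stays inside X, and picks out exactly the characters of Z.
// isWitness is a hypothetical helper name, not part of the problem statement.
bool isWitness(const std::string& X, const std::string& Z,
               const std::vector<std::size_t>& idx) {
    if (idx.size() != Z.size()) return false;
    std::size_t prev = 0;  // 0 is below every valid 1-based index
    for (std::size_t j = 0; j < idx.size(); ++j) {
        if (idx[j] <= prev || idx[j] > X.size()) return false;  // not increasing or out of range
        if (X[idx[j] - 1] != Z[j]) return false;                // x_{i_j} must equal z_j
        prev = idx[j];
    }
    return true;
}
```

Calling isWitness("abcbdab", "bcdb", {2, 3, 5, 7}) accepts the index sequence from the example. The counting problem asks how many distinct index sequences pass this check.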
The LeetCode problem is as follows:
Given a string S and a string T, count the number of distinct subsequences of T in S.
A subsequence of a string is a new string which is formed from the original string by deleting some (can be none) of the characters without disturbing the relative positions of the remaining characters. (i.e., "ACE" is a subsequence of "ABCDE" while "AEC" is not).
Here is an example: S = "rabbbit", T = "rabbit"
Return 3.
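Approach 1: recursion
If the current characters match, add to the result the number of ways to match the rest of S and T after that index.
If the current characters differ, advance the pointer in S and recurse.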
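#include <string>
using namespace std;

class Solution {
private:
    int cnt;
    int len_s;
    int len_t;

public:
    Solution() : cnt(0) {}

    // Count every way to match T[idx_ts..] inside S[idx_ss..].
    void Count(const string& S, const string& T, int idx_ss, int idx_ts) {
        if (idx_ts == len_t) {  // all of T matched: one more index sequence
            cnt++;
            return;
        }
        for (int i = idx_ss; i < len_s; i++) {
            if (S[i] == T[idx_ts]) {
                Count(S, T, i + 1, idx_ts + 1);
            }
        }
    }

    int numDistinct(string S, string T) {
        cnt = 0;  // reset so repeated calls do not accumulate
        len_s = S.length();
        len_t = T.length();
        Count(S, T, 0, 0);
        return cnt;
    }
};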
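The recursion above revisits the same (position in S, position in T) pairs many times, so its running time grows with the answer itself. One possible remedy, which is not part of the original post, is memoization: cache the result for each pair. The sketch below keeps the same loop structure; the names numDistinctMemo and ways are hypothetical, and it assumes the counts fit in a long long.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// ways(i, j): number of ways to match T[j..] using characters of S[i..].
// memo[i][j] caches the result; -1 means "not computed yet".
static long long ways(const std::string& S, const std::string& T,
                      std::size_t i, std::size_t j,
                      std::vector<std::vector<long long>>& memo) {
    if (j == T.size()) return 1;  // all of T matched
    if (memo[i][j] != -1) return memo[i][j];
    long long total = 0;
    for (std::size_t k = i; k < S.size(); ++k) {
        if (S[k] == T[j]) total += ways(S, T, k + 1, j + 1, memo);
    }
    memo[i][j] = total;
    return total;
}

// Hypothetical wrapper name; i ranges over 0..|S| and j over 0..|T|-1.
long long numDistinctMemo(const std::string& S, const std::string& T) {
    std::vector<std::vector<long long>> memo(
        S.size() + 1, std::vector<long long>(T.size(), -1));
    return ways(S, T, 0, 0, memo);
}
```

With the cache, each (i, j) pair is computed once, at the cost of a table of size (|S| + 1) × |T|.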
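Approach 2: DP
If the current characters match, dp[i][j] is the sum of the number of ways that use S[i] (dp[i-1][j-1]) and the number of ways that do not use S[i] (dp[i-1][j]).
If the current characters differ, dp[i][j] = dp[i-1][j].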
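#include <string>
#include <vector>
using namespace std;

class Solution {
private:
    int len_s;
    int len_t;

public:
    // dp[i][j]: number of ways T[0..j] occurs as a subsequence of S[0..i].
    int Count(const string& S, const string& T) {
        if (len_t == 0) return 1;  // the empty string matches exactly once
        if (len_s == 0) return 0;  // nothing to match against
        // vector instead of a variable-length array, which is not standard C++
        vector<vector<int>> dp(len_s, vector<int>(len_t, 0));
        if (S[0] == T[0]) {
            dp[0][0] = 1;
        }
        // first column: occurrences of T[0] in S[0..i]
        for (int i = 1; i < len_s; i++) {
            dp[i][0] = dp[i-1][0];
            if (T[0] == S[i]) {
                dp[i][0]++;
            }
        }
        for (int i = 1; i < len_s; i++) {
            for (int j = 1; j < len_t && j <= i; j++) {
                if (S[i] != T[j]) {
                    dp[i][j] = dp[i-1][j];
                } else {
                    // dp[i-1][j-1]: use S[i], as S[i] == T[j]
                    // dp[i-1][j]  : don't use S[i]
                    dp[i][j] = dp[i-1][j-1] + dp[i-1][j];
                }
            }
        }
        return dp[len_s-1][len_t-1];
    }

    int numDistinct(string S, string T) {
        len_s = S.length();
        len_t = T.length();
        return Count(S, T);
    }
};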
I found a very good DP solution, but I have a hard time understanding it. Can anybody explain how this DP works?
int numDistinct(string S, string T) {
    vector<int> f(T.size()+1);
    // set the last element to 1.
    f[T.size()]=1;
    for(int i=S.size()-1; i>=0; --i){
        for(int j=0; j<T.size(); ++j){
            f[j]+=(S[i]==T[j])*f[j+1];
            printf("%d\t", f[j] );
        }
        cout<<"\n";
    }
    return f[0];
}
Answer
First, try to solve the problem yourself to come up with a naive implementation:
Let's say that S.length = m and T.length = n. Let's write S{i} for the suffix of S starting at index i. For example, if S = "abcde", then S{0} = "abcde", S{4} = "e", and S{5} = "". We use a similar definition for T.
Let N[i][j] be the number of distinct subsequences for S{i} and T{j}, that is, the number of ways T{j} occurs as a subsequence of S{i}. We are interested in N[0][0] (because those are both full strings).
There are two easy cases: N[i][n] for any i, and N[m][j] for j < n. How many subsequences are there for "" in some string S? Exactly 1. How many for some non-empty T in ""? Only 0.
Now, given some arbitrary i and j, we need to find a recursive formula. There are two cases.
If S[i] != T[j], then S[i] cannot be matched with T[j], so we know that N[i][j] = N[i+1][j]. (I hope you can verify this for yourself; I aim to explain the cryptic algorithm above in detail, not this naive version.)
If S[i] = T[j], we have a choice. We can either 'match' these characters and go on with the next characters of both S and T, or we can ignore the match (as in the case that S[i] != T[j]). Since we have both choices, we need to add the counts there: N[i][j] = N[i+1][j] + N[i+1][j+1].
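The two cases and the two easy boundary cases translate almost word for word into a recursive function. The sketch below is one way to write that naive version; the name countFrom is an assumption for illustration, it takes exponential time in the worst case, and it assumes the counts fit in a long long.

```cpp
#include <cstddef>
#include <string>

// countFrom(S, T, i, j) mirrors N[i][j]: the number of ways T{j} occurs
// as a subsequence of S{i}. Exponential time; for illustration only.
long long countFrom(const std::string& S, const std::string& T,
                    std::size_t i, std::size_t j) {
    if (j == T.size()) return 1;  // N[i][n] = 1: "" occurs exactly once
    if (i == S.size()) return 0;  // N[m][j] = 0 for j < n
    long long result = countFrom(S, T, i + 1, j);  // ignore S[i]
    if (S[i] == T[j]) {
        result += countFrom(S, T, i + 1, j + 1);   // match S[i] with T[j]
    }
    return result;
}
```

The answer to the original problem is countFrom(S, T, 0, 0), which corresponds to N[0][0].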
In order to find N[0][0] using dynamic programming, we need to fill the N table. We first need to set the boundary of the table:
N[m][j] = 0, for 0 <= j < n    // row m
N[i][n] = 1, for 0 <= i <= m   // column n
Because of the dependencies in the recursive relation, we can fill the rest of the table looping i backwards and j forwards:
for (int i = m-1; i >= 0; i--) {
    for (int j = 0; j < n; j++) {
        if (S[i] == T[j]) {
            N[i][j] = N[i+1][j] + N[i+1][j+1];
        } else {
            N[i][j] = N[i+1][j];
        }
    }
}
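Putting the boundary values and the fill loop together gives a complete function. This is a minimal sketch under a few assumptions: the name numDistinctTable is made up for this example, the table is stored as a vector of vectors, and long long is used on the assumption that the counts fit in it.

```cpp
#include <string>
#include <vector>

// Fills the full (m+1) x (n+1) table N described in the text.
long long numDistinctTable(const std::string& S, const std::string& T) {
    const int m = static_cast<int>(S.size());
    const int n = static_cast<int>(T.size());
    // Zero-initialized, so N[m][j] = 0 for j < n already holds.
    std::vector<std::vector<long long>> N(m + 1,
                                          std::vector<long long>(n + 1, 0));
    for (int i = 0; i <= m; ++i) {
        N[i][n] = 1;  // the empty suffix of T occurs exactly once
    }
    for (int i = m - 1; i >= 0; --i) {      // i backwards
        for (int j = 0; j < n; ++j) {       // j forwards
            N[i][j] = N[i + 1][j];          // ignore S[i]
            if (S[i] == T[j]) {
                N[i][j] += N[i + 1][j + 1]; // match S[i] with T[j]
            }
        }
    }
    return N[0][0];
}
```

This uses O(m·n) time and memory; each row depends only on the row below it.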
We can now use the most important trick of the algorithm: we can use a 1-dimensional array f, with the invariant in the outer loop: f = N[i+1]. This is possible because of the way the table is filled. If we apply this to my algorithm, this gives:
f[j] = 0, for 0 <= j < n
f[n] = 1
for (int i = m-1; i >= 0; i--) {
    for (int j = 0; j < n; j++) {
        if (S[i] == T[j]) {
            f[j] = f[j] + f[j+1];
        } else {
            f[j] = f[j];
        }
    }
}
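Why is the in-place update safe? Because j runs forwards, f[j+1] has not been overwritten yet when f[j] is updated, so it still holds the value from row i+1. The following sketch checks this by filling the 2-dimensional table and the 1-dimensional array side by side and asserting that they agree after every outer iteration. The function name numDistinctChecked is hypothetical, and the check exists only to illustrate the invariant.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Fills N and f in lock-step and asserts that f equals row N[i]
// once the inner loop for that i has finished.
long long numDistinctChecked(const std::string& S, const std::string& T) {
    const int m = static_cast<int>(S.size());
    const int n = static_cast<int>(T.size());
    std::vector<std::vector<long long>> N(m + 1,
                                          std::vector<long long>(n + 1, 0));
    for (int i = 0; i <= m; ++i) N[i][n] = 1;
    std::vector<long long> f(n + 1, 0);
    f[n] = 1;  // f starts out equal to row N[m]
    for (int i = m - 1; i >= 0; --i) {
        for (int j = 0; j < n; ++j) {
            N[i][j] = N[i + 1][j] + (S[i] == T[j] ? N[i + 1][j + 1] : 0);
            // f[j] still holds N[i+1][j], and f[j+1] still holds
            // N[i+1][j+1] because index j+1 is updated later in this pass.
            if (S[i] == T[j]) f[j] += f[j + 1];
        }
        assert(f == N[i]);  // the 1-D array now equals row i of the table
    }
    return f[0];
}
```

If the inner loop ran with j backwards instead, f[j+1] would already contain the new row's value and the assertion could fail.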
We're almost at the algorithm you gave. First of all, we don't need to initialize f[j] = 0. Second, we don't need assignments of the type f[j] = f[j].
Since this is C++ code, we can rewrite the snippet
if (S[i] == T[j]) {
    f[j] += f[j+1];
}
to
f[j] += (S[i] == T[j]) * f[j+1];
and that's all. This yields the algorithm:
f[n] = 1
for (int i = m-1; i >= 0; i--) {
    for (int j = 0; j < n; j++) {
        f[j] += (S[i] == T[j]) * f[j+1];
    }
}
#include <cstdio>
#include <iostream>
#include <string>
#include <vector>

using namespace std;

// vector<int> f(n) value-initializes its elements, so every entry of a
// builtin type starts at 0. The standard specifies this primarily to
// support exactly this kind of use in templates.
//
// Note that this differs from a local variable such as `int x;`, which is
// left uninitialized (a behavior inherited from C).
int numDistinct(string S, string T) {
    vector<int> f(T.size() + 1);  // every element is initialized to 0
    f[T.size()] = 1;              // set the last element to 1
    for (int i = S.size() - 1; i >= 0; --i) {
        cout << "i = " << i << "\t";
        // traverse T and compare each character with S[i]
        for (int j = 0; j < T.size(); ++j) {
            f[j] += (S[i] == T[j]) * f[j+1];
            printf("%d\t", f[j]);
        }
        cout << "\n";
    }
    return f[0];
}

int main() {
    string S = "rabbbitr";
    string T = "rabit";
    cout << numDistinct(S, T) << endl;
}
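Running the program prints one row of f values for each index i of S, followed by the final count (3 for these inputs).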
for(int i = 0; i <= m; i++) { N[i][n] = 1; }
. The big difference is that that way is operational: I provide an 'algorithm' for how to set the values, whereas the way in the post is declarative: I only care about the values, not about how to achieve them. That's a more mathematical way of writing it. – Vincent van der Weele Apr 28 at 6:19