Distinct subsequence

Source: Internet · Editor: 程序博客网 · Date: 2024/06/06 15:45

Given a string, count the number of its distinct subsequences (including the empty subsequence). A subsequence of a string is a new string formed from the original by deleting some of its characters without disturbing the relative positions of the remaining characters.
For example, "AGH" is a subsequence of "ABCDEFGH" while "AHG" is not.

Input

First line of input contains an integer T which is equal to the number of test cases. You are required to process all test cases. Each of next T lines contains a string s.

Output

Output consists of T lines. The i-th line of the output is the number of distinct subsequences of the i-th input string. Since this number could be very large, output ans % 1000000007, where ans is the number of distinct subsequences.

Example

Input:
3
AAA
ABCDEFG
CODECRAFT

Output:
4
128
496

Constraints and Limits

T ≤ 100, length(S) ≤ 100000
All input strings shall contain only uppercase letters.

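It is easy to see that the distinct subsequences of AAA are "", "A", "AA", and "AAA", four in total. When all characters of the string S are distinct and S has length n, the number of distinct subsequences is 2^n. For example, S = ABC has 2^3 = 8 subsequences. When some character occurs at least twice in S, the count is smaller than 2^n.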
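To check these small cases by hand, a brute-force enumeration can help. The sketch below is not part of the intended solution: it tries every subset of positions and collects the resulting strings in a set, so it is only usable for very short strings. The name bruteForceDistinct is made up for this illustration.

```cpp
#include <set>
#include <string>

// Brute-force check for short strings only: enumerate every subset of
// positions (2^n bitmasks) and collect the resulting strings in a set.
// bruteForceDistinct is a hypothetical helper name, not from the original post.
long long bruteForceDistinct(const std::string& s) {
    const int n = static_cast<int>(s.size());    // assumed small, e.g. n <= 20
    std::set<std::string> seen;
    for (unsigned mask = 0; mask < (1u << n); ++mask) {
        std::string sub;
        for (int i = 0; i < n; ++i) {
            if (mask & (1u << i)) sub += s[i];    // keep character i
        }
        seen.insert(sub);                          // duplicates collapse here
    }
    return static_cast<long long>(seen.size());
}
```

With length(S) up to 100000 this is far too slow for the real problem, but it is a convenient reference for testing a faster solution on tiny inputs.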
Solution:

It's a classic dynamic programming problem.

Let:

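dp[i] = number of distinct subsequences ending with a[i] that were not already counted at an earlier position
sum[i] = dp[0] + dp[1] + ... + dp[i]. So sum[n] will be your answer.
last[c] = last position at which character c has appeared so far (0 if it has not appeared yet).

The empty string has one subsequence (itself), so dp[0] = 1 and sum[0] = 1.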

read a
n = strlen(a)
dp[0] = 1, sum[0] = 1
for i = 1 to n
    dp[i] = sum[i - 1] - sum[last[a[i]] - 1]    (subtract nothing if a[i] has not appeared before)
    sum[i] = sum[i - 1] + dp[i]
    last[a[i]] = i
return sum[n]

Explanation

dp[i] = sum[i - 1] - sum[last[a[i]] - 1]

Initially, we assume we can append a[i] to every subsequence formed from the previous characters (that is, sum[i - 1] of them), but this might violate the condition that the counted subsequences need to be distinct. Remember that last[a[i]] gives us the last position at which a[i] has appeared so far. The only subsequences we overcount are those that the previous occurrence of a[i] was appended to, so we subtract those (sum[last[a[i]] - 1]). If a[i] has not appeared before, nothing is subtracted.

sum[i] = sum[i - 1] + dp[i]
last[a[i]] = i

Update these values as per their definition.
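As a minimal C++ sketch of the recurrence above (my own rendering, not code from the original answer): it assumes the input contains only uppercase letters, as the constraints state, and uses a 0-indexed std::string, so a[i - 1] stands in for a[i]. The function name countDistinctPrefixSum and the constant MOD are names chosen for this example.

```cpp
#include <string>
#include <vector>

const long long MOD = 1000000007LL;

// Prefix-sum DP: sum[i] = dp[0] + ... + dp[i], answer is sum[n].
// Assumes the string holds only 'A'..'Z' (per the problem constraints).
long long countDistinctPrefixSum(const std::string& a) {
    const int n = static_cast<int>(a.size());
    std::vector<long long> sum(n + 1, 0);
    std::vector<int> last(26, 0);        // 0 = not seen yet; positions are 1-based
    sum[0] = 1;                          // dp[0] = 1: the empty subsequence
    for (int i = 1; i <= n; ++i) {
        const int c = a[i - 1] - 'A';    // the string itself is 0-indexed
        const int p = last[c];
        long long dp = sum[i - 1];       // append a[i] to everything so far
        if (p != 0) dp -= sum[p - 1];    // remove what the previous occurrence produced
        dp = (dp % MOD + MOD) % MOD;     // keep the value in [0, MOD)
        sum[i] = (sum[i - 1] + dp) % MOD;
        last[c] = i;
    }
    return sum[n];
}
```

Only the running prefix sums are stored here, since dp[i] is never read again after sum[i] is updated.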

If your indexing starts from 0, use a[i - 1] wherever I used a[i]. Also remember to wrap your computations in a mod function if you're going to submit code. This should be implemented like this:

mod(x) = (x % m + m) % m

This form correctly handles negative values in languages (such as C/C++) where x % m can be negative when x is negative.
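A small C++ sketch of that helper, assuming m is the problem's modulus 1000000007; the names kMod and modNonNegative are chosen for this example only.

```cpp
const long long kMod = 1000000007LL;

// Maps any x (possibly negative) into the range [0, kMod).
// In C++11 and later, x % kMod keeps the sign of x, so -3 % kMod is -3;
// adding kMod and reducing again brings the result back into range.
long long modNonNegative(long long x) {
    return (x % kMod + kMod) % kMod;
}
```

The subtraction step in the recurrence is where this matters: the difference of two values that were each already reduced can be negative even though the true count is not.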

Additionally:

There exists an easier solution to this problem.

The idea is: if all characters of the string are distinct, the total number of subsequences is 2^n. Now, if we find a character that has already occurred before, we should consider only its last occurrence (otherwise the subsequences won't be distinct). So we have to subtract the number of subsequences due to its previous occurrence.

My implementation is like this:

read s
dp[0] = 1
len = strlen(s)
for (i = 1; i <= len; i++) {
    dp[i] = (dp[i - 1] * 2)
    if (last[s[i]] != 0) dp[i] = (dp[i] - dp[last[s[i]] - 1])
    last[s[i]] = i
}
print dp[len]
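One way this could look as compilable C++, with the modulus applied at each step. This is a sketch under the problem's assumptions (uppercase letters only, answer modulo 1000000007); the function name countDistinctDoubling is made up, and s[i - 1] replaces s[i] because std::string is 0-indexed.

```cpp
#include <string>
#include <vector>

// Doubling DP: dp[i] = number of distinct subsequences of the first i characters.
// Assumes the string holds only 'A'..'Z' (per the problem constraints).
long long countDistinctDoubling(const std::string& s) {
    const long long MOD = 1000000007LL;
    const int len = static_cast<int>(s.size());
    std::vector<long long> dp(len + 1, 0);
    std::vector<int> last(26, 0);        // 0 = not seen yet; positions are 1-based
    dp[0] = 1;                           // only the empty subsequence
    for (int i = 1; i <= len; ++i) {
        const int c = s[i - 1] - 'A';
        dp[i] = (dp[i - 1] * 2) % MOD;   // every old subsequence, with and without s[i]
        if (last[c] != 0) {
            // Remove the subsequences already produced by the previous occurrence.
            dp[i] = ((dp[i] - dp[last[c] - 1]) % MOD + MOD) % MOD;
        }
        last[c] = i;
    }
    return dp[len];
}
```

For example, countDistinctDoubling("CODECRAFT") gives 496, matching the sample output: the count doubles at every character except the second C, where dp[0] = 1 is subtracted.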
