【Leetcode】Longest Palindromic Substring

Source: Internet  Published under: data mining papers  Editor: 程序博客网  Time: 2024/04/30 15:11

【Problem】Given a string S, find the longest palindromic substring in S. You may assume that the maximum length of S is 1000, and that there exists one unique longest palindromic substring.

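【Solution】

Manacher's linear-time algorithm

Use an auxiliary array arr[n], where arr[i] records the length of the palindromic substring centered at str[i], counted from the center outward with the center itself included (as in the examples further down). When computing arr[i], the values arr[0...i-1] are already known and can be reused. The core of Manacher's algorithm: let mx record the rightmost boundary reached by any palindromic substring computed so far, and let id record the center of that palindrome. This makes it possible to reuse what is already known about palindromes nested inside a larger palindrome.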

[Figure: lps02]

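Suppose id and mx are already known. When computing the length of the palindrome centered at str[i], the green region is already known to be a palindrome, so we can reuse arr[j], the length of the palindrome centered at str[j], where j = 2*id - i is the mirror of i about id. In the case shown in the figure above, the comparison can therefore start from the position the arrow points to. There is a second case: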

[Figure: lps01]

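In this case arr[j], the length of the palindrome centered at str[j], cannot be used directly, because the palindrome centered at id has only been verified as far as the position marked by the green arrow. The information we can rely on is therefore mx-i, and the comparison continues with the characters beyond mx-i.

Here is one example of each case: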

index: 0123456789
str:   ceabadabac
arr:   1112141?

When computing the "?" entry, arr[7], we have id = 5 and mx = 8. So arr[7] can start from the initial value arr[2*id-7] = arr[3] = 2, after which we compare str[7-2] with str[7+2], and so on.

index: 0123456789
str:   cdabadabac
arr:   1113141?

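When computing the "?" entry, arr[7], we again have id = 5 and mx = 8. This time arr[7] cannot start from arr[2*id-7] = arr[3] = 3, because the palindrome centered at id covers only the blue part of the figure (lps03). So arr[7] can only start from mx-i = 8-7 = 1, and the comparison continues from there to update arr[7]. (Since index mx = 8 itself lies inside the palindrome centered at id, mx-i+1 = 2 would also be a valid starting value; starting from 1 simply re-checks one pair of characters.)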
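The two examples above can be reproduced with a small sketch. It is only an illustration under assumptions of my own: the function name oddRadii is hypothetical, it handles odd-length palindromes only (no separator characters yet), it treats mx as the last index covered by the palindrome centered at id (inclusive), and it therefore starts from min(arr[j], mx - i + 1) rather than the more conservative mx - i.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// arr[i] = number of characters from the center str[i] to the edge of the
// longest odd-length palindrome centered there (the center itself counts).
// oddRadii is a hypothetical name used only for this sketch.
std::vector<int> oddRadii(const std::string& str) {
    const int n = static_cast<int>(str.size());
    std::vector<int> arr(n, 1);
    int id = 0;   // center of the palindrome that reaches furthest right
    int mx = -1;  // last index covered by that palindrome (inclusive); -1 = none yet
    for (int i = 0; i < n; ++i) {
        if (i <= mx) {
            int j = 2 * id - i;  // mirror of i about id
            // Reuse arr[j], but never trust anything beyond index mx.
            arr[i] = std::min(arr[j], mx - i + 1);
        }
        // Expand by direct comparison, checking both bounds first.
        while (i - arr[i] >= 0 && i + arr[i] < n &&
               str[i - arr[i]] == str[i + arr[i]]) {
            ++arr[i];
        }
        // Move id/mx when the new palindrome reaches further right.
        if (i + arr[i] - 1 > mx) {
            mx = i + arr[i] - 1;
            id = i;
        }
    }
    return arr;
}
```

For "ceabadabac" the first seven values come out as 1 1 1 2 1 4 1 and the "?" entry as 2; for "cdabadabac" they are 1 1 1 3 1 4 1 and again 2.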

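Manacher's linear-time algorithm becomes clear once you work through it on paper.

There is also an explanation by an expert from abroad. It is quite a read, although it still did not make everything click for me...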

An O(N) Solution (Manacher’s Algorithm):
First, we transform the input string, S, to another string T by inserting a special character ‘#’ in between letters. The reason for doing so will become clear shortly.

For example: S = “abaaba”, T = “#a#b#a#a#b#a#”.
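As a minimal sketch of this transformation (the function name insertSeparators is a hypothetical choice, and it assumes ‘#’ does not occur in S):

```cpp
#include <string>

// Build T by placing '#' before, between, and after the letters of s.
// Assumption: '#' does not appear in s itself.
std::string insertSeparators(const std::string& s) {
    std::string t = "#";
    for (char c : s) {
        t += c;
        t += '#';
    }
    return t;
}
```

With this layout the letter S[k] sits at index 2k + 1 of T, and T always has odd length 2 * |S| + 1.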

To find the longest palindromic substring, we need to expand around each Ti such that Ti-d … Ti+d forms a palindrome. You should immediately see that d is the length of the palindrome itself centered at Ti.

We store the intermediate results in an array P, where P[ i ] equals the length of the palindrome centered at Ti. The longest palindromic substring then corresponds to the maximum element in P.

Using the above example, we populate P as below (from left to right):

T = # a # b # a # a # b # a #
P = 0 1 0 3 0 1 6 1 0 3 0 1 0

Looking at P, we immediately see that the longest palindrome is “abaaba”, as indicated by P6 = 6.
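The table can be checked with a plain expand-around-center pass over T. This is a deliberately naive quadratic sketch, not the linear algorithm itself, and expandAll is a hypothetical name:

```cpp
#include <string>
#include <vector>

// P[i] = largest d such that T[i-d .. i+d] is a palindrome.
// Naive O(n^2) expansion, used here only to reproduce the table.
std::vector<int> expandAll(const std::string& T) {
    const int n = static_cast<int>(T.size());
    std::vector<int> P(n, 0);
    for (int i = 0; i < n; ++i) {
        int d = 0;
        // Grow d while the next pair of characters exists and matches.
        while (i - d - 1 >= 0 && i + d + 1 < n &&
               T[i - d - 1] == T[i + d + 1]) {
            ++d;
        }
        P[i] = d;
    }
    return P;
}
```

Given the index i of the maximum element, the palindrome in S starts at (i - P[i]) / 2 and has length P[i]; for i = 6 that is start 0 and length 6, i.e. “abaaba”.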

Did you notice that by inserting special characters (#) in between letters, palindromes of both odd and even lengths are handled gracefully? (Please note: This is to demonstrate the idea more easily and is not necessarily needed to code the algorithm.)

Now, imagine that you draw an imaginary vertical line at the center of the palindrome “abaaba”. Did you notice the numbers in P are symmetric around this center? That’s not all: try another palindrome, “aba”, and the numbers reflect a similar symmetric property. Is this a coincidence? The answer is yes and no. This is only true subject to a condition, but anyway, we have made great progress, since we can avoid recomputing part of the P[ i ] values.

Let us move on to a slightly more sophisticated example with some more overlapping palindromes, where S = “babcbabcbaccba”.


The image above shows T transformed from S = “babcbabcbaccba”. Assume that you have reached a state where table P is partially completed. The solid vertical line indicates the center (C) of the palindrome “abcbabcba”. The two dotted vertical lines indicate its left (L) and right (R) edges respectively. You are at index i and its mirrored index around C is i’. How would you calculate P[ i ] efficiently?

Assume that we have arrived at index i = 13, and we need to calculate P[ 13 ] (indicated by the question mark ?). We first look at its mirrored index i’ around the palindrome’s center C, which is index i’ = 9.


The two green solid lines above indicate the covered region by the two palindromes centered at i and i’. We look at the mirrored index of i around C, which is index i’. P[ i' ] = P[ 9 ] = 1. It is clear that P[ i ] must also be 1, due to the symmetric property of a palindrome around its center.

As you can see above, it is very obvious that P[ i ] = P[ i' ] = 1, which must be true due to the symmetric property around a palindrome’s center. In fact, all three elements after C follow the symmetric property (that is, P[ 12 ] = P[ 10 ] = 0, P[ 13 ] = P[ 9 ] = 1, P[ 14 ] = P[ 8 ] = 0).


Now we are at index i = 15, and its mirrored index around C is i’ = 7. Is P[ 15 ] = P[ 7 ] = 7?

Now we are at index i = 15. What’s the value of P[ i ]? If we follow the symmetric property, the value of P[ i ] should be the same as P[ i' ] = 7. But this is wrong. If we expand around the center at T15, it forms the palindrome “a#b#c#b#a”, which is actually shorter than what is indicated by its symmetric counterpart. Why?


Colored lines are overlaid around the center at index i and i’. Solid green lines show the region that must match for both sides due to symmetric property around C. Solid red lines show the region that might not match for both sides. Dotted green lines show the region that crosses over the center.

It is clear that the two substrings in the region indicated by the two solid green lines must match exactly. Areas across the center (indicated by dotted green lines) must also be symmetric. Notice carefully that P[ i' ] is 7 and it expands all the way across the left edge (L) of the palindrome (indicated by the solid red lines), which does not fall under the symmetric property of the palindrome anymore. All we know is P[ i ] ≥ 5, and to find the real value of P[ i ] we have to do character matching by expanding past the right edge (R). In this case, since T[ 21 ] ≠ T[ 1 ], we conclude that P[ i ] = 5.

Let’s summarize the key part of this algorithm as below:

if P[ i' ] ≤ R – i,
then P[ i ] ← P[ i' ]
else P[ i ] ≥ R – i. (In this case we have to expand past the right edge (R) to find P[ i ].)
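Both branches of this summary collapse into one min() when written as code. The helper below is a hypothetical sketch (initialP is not a name from the article); it returns only the starting value for P[ i ], and the caller is still expected to try expanding past it:

```cpp
#include <algorithm>
#include <vector>

// Starting value for P[i], given the current center C and right edge R.
// Hypothetical helper for illustration; expansion must still follow.
int initialP(const std::vector<int>& P, int C, int R, int i) {
    if (i >= R) return 0;       // i is outside the known palindrome: no information
    int mirror = 2 * C - i;     // i' in the text
    // The mirrored value is only trustworthy up to the right edge R.
    return std::min(P[mirror], R - i);
}
```

With C = 11 and R = 20 as in the example, i = 13 gives min(P[ 9 ], 7) = 1 and i = 15 gives min(P[ 7 ], 5) = 5.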

See how elegant it is? If you are able to grasp the above summary fully, you already obtained the essence of this algorithm, which is also the hardest part.

The final part is to determine when should we move the position of C together with R to the right, which is easy:

If the palindrome centered at i does expand past R, we update C to i, (the center of this new palindrome), and extend R to the new palindrome’s right edge.

In each step, there are two possibilities. If P[ i' ] ≤ R – i, we set P[ i ] to P[ i' ], which takes exactly one step. Otherwise we attempt to change the palindrome’s center to i by expanding it starting at the right edge, R. Extending R (the inner while loop) takes at most a total of N steps, and positioning and testing each center takes a total of N steps too. Therefore, this algorithm is guaranteed to finish in at most 2*N steps, giving a linear time solution.
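To make the step-count argument concrete, here is a sketch that follows the article's conventions (P, C, R) and counts every character comparison. The names ManacherResult and manacherCounted are hypothetical, N is taken to be the length of T, and ‘#’ is assumed not to occur in the input.

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct ManacherResult {
    std::string longest;    // longest palindromic substring of s
    long long comparisons;  // character comparisons performed on T
    std::size_t lengthT;    // length of the transformed string T
};

// Sketch of the algorithm in the article's notation, instrumented to
// count comparisons. Assumption: '#' does not appear in s.
ManacherResult manacherCounted(const std::string& s) {
    std::string t = "#";
    for (char c : s) { t += c; t += '#'; }
    const int n = static_cast<int>(t.size());
    std::vector<int> P(n, 0);
    int C = 0, R = 0;  // center and right edge of the furthest-reaching palindrome
    long long comparisons = 0;
    for (int i = 1; i < n; ++i) {
        if (i < R) P[i] = std::min(P[2 * C - i], R - i);
        // Try to expand; a failed comparison ends the attempt.
        while (i - P[i] - 1 >= 0 && i + P[i] + 1 < n) {
            ++comparisons;
            if (t[i - P[i] - 1] != t[i + P[i] + 1]) break;
            ++P[i];
        }
        if (i + P[i] > R) { C = i; R = i + P[i]; }
    }
    int best = 0;
    for (int i = 1; i < n; ++i) {
        if (P[i] > P[best]) best = i;
    }
    return {s.substr((best - P[best]) / 2, P[best]), comparisons, t.size()};
}
```

Every successful comparison pushes the right edge forward and every center fails at most once, so the counter stays within twice the length of T.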



【Code】C++ code written by an expert online, based on Manacher's linear-time idea. Complexity O(n), runtime 12 ms.

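    #include <algorithm>
    #include <string>
    #include <vector>
    using namespace std;

    class Solution {
    public:
        string longestPalindrome(string s) {
            if (s.empty()) return s;
            int size = s.size();
            // t = "#s0#s1#...#": every palindrome in t has odd length.
            string t(2 * size + 1, '#');
            for (int i = 0; i < size; ++i) t[2 * i + 1] = s[i];
            int n = t.size();
            // p[i]: radius (center included) of the palindrome centered at t[i].
            vector<int> p(n, 1);
            int mx = 0, id = 0, MAX = 0, center = 0;
            for (int i = 1; i < n; i++) {
                p[i] = mx > i ? min(p[2 * id - i], mx - i) : 1;
                // Check both bounds before comparing (the original read t[-1]).
                while (i - p[i] >= 0 && i + p[i] < n && t[i + p[i]] == t[i - p[i]]) p[i]++;
                if (i + p[i] > mx) mx = i + p[i], id = i;  // mx is one past the right edge
                if (p[i] > MAX) MAX = p[i], center = i;
            }
            return s.substr((center - MAX + 1) / 2, MAX - 1);
        }
    };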

