最长不重复子串的长度&最长重复子串

来源:互联网 发布:smtp端口号是多少 编辑:程序博客网 时间:2024/04/28 11:55
求给定的某一个字符串中的最长不重复子串的长度
例如字符串s为:“abcdefgegcsgcasse”,其最长的不重复子串为“abcdefg”,长度为7

最长不重复子串的解法一:

       设置一个辅助数据结构(如map)记录每个字符最后一次出现的位置;遍历字符串中的每个字符,如果在map中没有出现,则不重复子串的长度+1,并更新最大字符串的长度值; 如果在map中已经出现过,则更新当前字符在map中的位置和当前不重复子串的长度,并根据更新的长度来更新最大字符串的长度;这样就可以求出该字符串中的最大不重复子串的最大长度;

但是最长不重复子串可能有多个,如何将所有子串打印出来;可以设立一个二维数组或者二维vector,来存储所有不重复子字符串的长度长度和首位置,然后遍历这这个数据结构打印最长不重复子串;

最长不重复子串的解法二:

We want to finish this task in one scan. We can maintain a hashtable to detect if we have seen any character. We start from the beginning and walk through the array until we hit a repetition, let’s say, at position i. We then have two piece of valuable information right now:

1. s[0..i-1] is a candidate of the longest substring without repetition. Therefore we can update the max length.
2. There exists a position 0<=j<i-1
,  such that s[j] == s[i]. Substring starting from j is not a candidate because it has at least one repetition: s[j] and s[i]. Any substring ended at j will be shorter than the substring s[0..i-1] so we don’t need to look at them.

Then we can remove every elements in the hashtable from 0 to j, and make the start position of next candidate to j + 1. We keep walking and repeating this process until we hit the last element.

class Solution {public:    int lengthOfLongestSubstring(string s) {        unsigned int maxLen = 0;        set<char> hashtable;        int candidateStartIndex = 0;        for(unsigned int iter = 0; iter < s.size(); ++iter) {            char c = s[iter];            if(hashtable.find(c) != hashtable.end()) {                maxLen = max(maxLen, iter - candidateStartIndex);                while(s[candidateStartIndex] != s[iter]) {                    hashtable.erase(s[candidateStartIndex]);                    candidateStartIndex++;                }                candidateStartIndex++;            } else {                hashtable.insert(c);            }        }        maxLen = max(maxLen, s.size() - candidateStartIndex);        return maxLen;    }};

最长重复子串的解法:

大方向有两个:一个是后缀数组/后缀树, 另外一个就是KMP算法;

KMP解法:遍历字符串字符i,分别计算以i位置字符为首字符的字符串的next数组,遍历next数组选取最大的元素,更新最大重复子串的长度;打印方法可以参考不重复子串的做法;

http://hi.baidu.com/fangm/blog/item/58fd1a4c20a5eafdd72afcd0.html