187. Repeated DNA Sequences

Source: Internet  Editor: 程序博客网  Time: 2024/06/05 17:56
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",
Return: ["AAAAACCCCC", "CCCCCAAAAA"].



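The task is to find every substring of length 10 that occurs more than once in the given string. Comparing the substrings directly as strings will time out, so each window needs to be encoded as a number. The string contains only the four characters A, C, G, and T, so they can be mapped to 0, 1, 2, and 3. Compute one value for every run of ten characters and check whether that value has already been seen exactly once; if it has, the sequence is a repeat.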
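As a small illustration of the encoding idea, the sketch below packs one 10-character window into a single integer. It treats the four digits as base 4, which is an assumption on my part: any base of at least 4 gives distinct values, but base 4 keeps every result below 4^10 = 1048576. The names `nucleotideCode` and `encodeWindow` are hypothetical helpers for this example only.

```cpp
#include <cstdint>
#include <string>

// Map one nucleotide to a digit: A=0, C=1, G=2, T=3.
// Returns -1 for any other character (an illustrative convention).
int nucleotideCode(char c) {
    switch (c) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
        default:  return -1;
    }
}

// Pack the 10 characters starting at `pos` into one base-4 number.
// Assumes pos + 10 <= s.size() and that s contains only A, C, G, T.
std::uint32_t encodeWindow(const std::string& s, std::size_t pos) {
    std::uint32_t value = 0;
    for (std::size_t j = pos; j < pos + 10; ++j) {
        value = value * 4 + static_cast<std::uint32_t>(nucleotideCode(s[j]));
    }
    return value;
}
```

Two windows receive the same value only if they contain the same ten characters, so comparing the integers is equivalent to comparing the substrings.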

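Code:
#include <map>
#include <string>
#include <vector>
using namespace std;

class Solution {
public:
    vector<string> findRepeatedDnaSequences(string s) {
        vector<string> res;
        // Fewer than 11 characters cannot hold two 10-letter windows.
        if (s.size() < 11) return res;

        // Map each nucleotide to a base-4 digit.
        char char_map[128] = {0};
        char_map['A'] = 0;
        char_map['C'] = 1;
        char_map['G'] = 2;
        char_map['T'] = 3;

        map<int, int> nums;  // window code -> number of times seen so far
        for (size_t i = 0; i + 10 <= s.size(); ++i) {
            // Encode the 10 characters starting at i as a base-4 number.
            // Every code is below 4^10 = 1048576, so it always fits in an int.
            int num = 0;
            for (size_t j = i; j < i + 10; ++j) {
                num = num * 4 + char_map[(unsigned char)s[j]];
            }
            // Report a sequence exactly once, on its second occurrence.
            if (nums[num]++ == 1) {
                res.push_back(s.substr(i, 10));
            }
        }
        return res;
    }
};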
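The solution above re-reads all ten characters for every window. A possible variant, sketched here under my own assumptions rather than taken from the original post, stores each nucleotide in two bits and slides the window one character at a time: shift the code left by two bits, add the new character, and mask to the low 20 bits so the oldest character drops out. The name `findRepeatedRolling` and the use of `unordered_map` are choices made for this example.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Sliding-window variant: each window's 20-bit code is derived from the
// previous one in O(1). Assumes s contains only A, C, G, T.
std::vector<std::string> findRepeatedRolling(const std::string& s) {
    std::vector<std::string> res;
    const std::size_t kLen = 10;
    if (s.size() <= kLen) return res;  // need at least two windows

    const std::uint32_t kMask = (1u << 20) - 1;  // 10 chars x 2 bits each
    std::unordered_map<std::uint32_t, int> seen;  // code -> times seen
    std::uint32_t code = 0;

    for (std::size_t i = 0; i < s.size(); ++i) {
        std::uint32_t digit = 0;  // 'A' (and anything unexpected) maps to 0
        switch (s[i]) {
            case 'C': digit = 1; break;
            case 'G': digit = 2; break;
            case 'T': digit = 3; break;
            default: break;
        }
        // Append the new character and drop the one that left the window.
        code = ((code << 2) | digit) & kMask;

        if (i + 1 < kLen) continue;  // first full window ends at index 9
        // Report each repeated sequence once, on its second occurrence.
        if (seen[code]++ == 1) {
            res.push_back(s.substr(i + 1 - kLen, kLen));
        }
    }
    return res;
}
```

For the example string from the problem statement, this returns the same two sequences in the same order as the expected output.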
