Solving 187. Repeated DNA Sequences with Bitwise Operations
Source: Internet  Editor: 程序博客网  Date: 2024/06/06 20:33
Problem
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA. Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT", return: ["AAAAACCCCC", "CCCCCAAAAA"].
Problem analysis:
Given a DNA string, find every substring of exactly 10 characters that occurs more than once.
Approach:
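The simplest solution is brute force: two nested loops that compare substrings, with complexity O(m*n). That times out, so one could try the KMP algorithm to bring the cost down. A closer reading of the problem shows, however, that the string contains only the four letters A, C, G, and T. Look at their ASCII codes in binary: A is 0100 0001, C is 0100 0011, G is 0100 0111, and T is 0101 0100. The low three bits (001, 011, 111, and 100) are all different, so three bits are enough to identify a letter. A substring has 10 letters, so it fits in 30 bits, and the mask 0x3FFFFFFF keeps exactly those low 30 bits. (0x7FFFFFFF would keep 31 bits and let one bit of the letter that just left the window leak into the value, so it should not be used here.) First load the first nine characters of S; reading the tenth then yields the hash value of the first substring, which is stored in a hash table by incrementing its count. After that, each step shifts the value left by 3 bits, masks it, and appends the next character, which gives the hash value of the next substring. If its count in the hash table is already 1, the substring has appeared exactly once before, so it is added to the result. The count is incremented on every lookup, so a substring that shows up a third or fourth time is not added to the result again.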
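The encoding described above can be sketched in isolation. This is a minimal illustration, not the post's submitted code; the helper names baseCode, packWindow, and slide are hypothetical and chosen only for this example. It assumes the input contains nothing but A, C, G, and T.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Low three bits of the ASCII code: A -> 1, C -> 3, G -> 7, T -> 4.
// (Hypothetical helper, named for this sketch only.)
inline std::uint32_t baseCode(char c) {
    return static_cast<std::uint32_t>(c) & 7u;
}

// Pack a 10-letter window into 30 bits, first letter in the highest bits.
std::uint32_t packWindow(const std::string& window) {
    assert(window.size() == 10);
    std::uint32_t key = 0;
    for (char c : window) {
        key = (key << 3) | baseCode(c);
    }
    return key;
}

// Slide the window by one letter: shift left by 3, mask to 30 bits so the
// oldest letter falls off, then append the code of the next letter.
std::uint32_t slide(std::uint32_t key, char next) {
    return ((key << 3) & 0x3FFFFFFFu) | baseCode(next);
}
```

With these helpers, sliding the key of "AAAAACCCCC" by the letter 'A' gives the same value as packing "AAAACCCCCA" directly, which is the property the rolling update relies on.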
Accepted code
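#include <cstdint>
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

class Solution {
public:
    vector<string> findRepeatedDnaSequences(string s) {
        unordered_map<uint32_t, int> m;  // 30-bit window code -> times seen
        vector<string> r;
        size_t i = 0, ss = s.size();
        if (ss < 10) return r;  // no 10-letter window exists
        uint32_t t = 0;  // unsigned, so the left shift cannot overflow a signed int
        // Preload the first nine letters, three bits each.
        while (i < 9) t = (t << 3) | (s[i++] & 7);
        // Append one letter per step; the mask drops the letter that left the window.
        while (i < ss) {
            t = ((t << 3) & 0x3FFFFFFF) | (s[i++] & 7);
            // Report a window only when it is seen for the second time.
            if (m[t]++ == 1) r.push_back(s.substr(i - 10, 10));
        }
        return r;
    }
};

int main() {
    string s = "AAAAAAAAAAA";
    Solution ss;
    vector<string> result = ss.findRepeatedDnaSequences(s);
    for (size_t i = 0; i < result.size(); ++i) {
        cout << result[i] << endl;
    }
    return 0;
}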
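For checking the bitwise version on small inputs, a plain substring-counting variant can serve as a reference. This is a different, simpler technique (the hash table is keyed by the 10-letter string itself rather than by a 30-bit code) and it is not part of the original solution; the function name findRepeatedBySubstring is made up for this sketch. It uses the same "count equals 1" test to report each repeated window once.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Reference version: count every 10-letter window by its text.
// (Hypothetical name; intended only as a cross-check on small inputs.)
std::vector<std::string> findRepeatedBySubstring(const std::string& s) {
    std::unordered_map<std::string, int> seen;  // window text -> times seen
    std::vector<std::string> repeated;
    for (std::size_t i = 0; i + 10 <= s.size(); ++i) {
        std::string window = s.substr(i, 10);
        // Post-increment returns the old count, so this fires only on the
        // second occurrence and never again for the same window.
        if (seen[window]++ == 1) repeated.push_back(window);
    }
    return repeated;
}
```

On the example from the problem statement this returns "AAAAACCCCC" followed by "CCCCCAAAAA", and any correct bitwise version should return the same windows in the same order.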