每天一道LeetCode-----给定字符串s和字符数组words，在s中找到words出现的位置，words内部字符串顺序无要求

来源：互联网发布：2016中国汽车产业数据编辑：程序博客网时间：2024/06/04 20:13

Substring with Concatenation of All Words

原题链接Substring with Concatenation of All Words
这里写图片描述
题目意思是给定字符串s和字符数组words，在s中寻找words出现的位置，words内部的字符串顺序没有要求

此问题可以直接顺序搜索，假设s的长度为m，words的长度为n，words中每个单词的长度为w，复杂度是O((m-n*w)*n)，需要利用两个unordered_map<string,int>，一个记录每个单词总共需要多少个，一个记录当前遍历到的单词数，如果遇到不在words中的单词，就break从下一个开始找

另一种方法是使用滑动窗口，也是利用两个unordered_map<string, int>。原理是当不匹配是移动起始点，每一移动w个距离。不匹配有两种情况

当前单词不是words中的单词，重新开始找，初始化负责记录遍历到的单词数的那个map
当前单词word是words中的单词，但是这个单词已经够了，不再需要这个单词了。此时需要将窗口向右移动，直到出现第一个word为止，从这个word后面开始记录

举个例子，

s = "barfoobarfoothebar";words = ["bar","bar","foo","the"];unordered_map<string, int> counts;/* * 记录每个单词需要的次数 * bar      foo     the * 2        1       1 */unordered_map<string, int> seen;/* * 记录当前找到的单词数 * bar      foo     the * 0        0       0 */int count;/* 记录当前找到的单词数，0，目标大小为words的大小，4 */int startIndex;/* 开始位置，0，找到就把它添加到结果的vector中 *//* 1. 从s起点开始找，每3一判断，第一个bar，"bar"是要找的单词（在counts中），给seen对应为加一，此时seen["bar"]=1, count=1; 2. 接着找第二个单词"foo"，也在counts中，给seen对应位加一，此时seen["foo"]=1, count=2; 3. 继续找"bar",也在counts中，给seen加一，此时seen["bar"]=2，count=3; 4. 继续找"foo",也在counts中，给seen加一，此时seen["foo"]=2,但是counts["foo"]=1，表示"foo"只需要一个，此时不满足条件，移动    1. 从startIndex开始删除单词，直到找到第一个"foo"    2. 删掉"bar"，此时seen["bar"]=1,count=1，startIndex=0+3=3;    3. 删掉"foo"，此时seen["foo"]=1，不改变count，因为当不满足时，没有增加count，startIndex=6;    4. 找到第一个"foo"了，继续查找单词 5. 继续找"the"，在counts中，给seen加一，此时seen["the"]=1，count=2 6. ... 7. 最终找到count==n，将startIndex=6添加到结果中 8. 再右移一个单词，删掉"bar"，此时seen["bar"]=1,count=3,startIndex=9 9. 继续寻找 10. ... 11. 遇到单词"abc"，不在counts中，重置seen，count，startIndex 12. 继续寻找 13. ... 14. 结束 */

代码如下

class Solution {public:    vector<int> findSubstring(string s, vector<string>& words) {        vector<int> res;        if(s.size() == 0 || words.size() == 0)            return res;        unordered_map<string, int> dict;        for(auto &str : words)            ++dict[str];        int n = s.size();        int cnt = words.size();        int wl = words[0].size();        /*          * 每个单词长度为wl，那么只需要循环wl次，每次移动wl个为         * 0, wl, 2*wl, 3*wl ...         * 1, 1+wl, 1+2*wl, 1+3*wl...         * 2, 2+wl, 2+2*wl, 2+3*wl...         * ...         */        for(int i = 0; i < wl; ++i)        {            int left = i;            int count = 0;            unordered_map<string, int> tdict; //seen            /* 每次移动wl个 */            for(int j = i; j <= n - wl; j+=wl)            {                string str = s.substr(j, wl);                /* 如果是需要的单词，就添加到seen中 */                if(dict.count(str))                {                    tdict[str]++;                    /* 如果已经不需要了，就不用增加count */                    if (tdict[str] <= dict[str])                         count++;                    else                    {                          /* 如果当前这个单词是多余的，超出了数量限制，就移动窗口 */                          while (tdict[str] > dict[str]) {                            string str1 = s.substr(left, wl);                            tdict[str1]--;                            /* 因为多余的那个单词没有增加count，所以也不需要减count */                            if (tdict[str1] < dict[str1])                                 count--;                            left += wl;                        }                    }                    /* 如果到所有，右移 */                    if (count == cnt) {                        res.push_back(left);                        // advance one word                        tdict[s.substr(left, wl)]--;                        count--;                        left += wl;                    }                }                /* 不是需要的单词，重置所有 */                else                {                    tdict.clear();                    count = 0;                    left = j + wl;                }            }        }        return res;    }};

啊啊啊啊下午的心情一点也不好-.-

阅读全文

0 0