LeetCode 438. Find All Anagrams in a String

Source: Internet | Editor: 程序博客网 | Time: 2024/05/21 16:14

@小花

Problem description:

Given a string s and a non-empty string p, find all the start indices of p's anagrams in s.

The strings consist of lowercase English letters only, and the length of both strings s and p will not be larger than 20,100.

The order of output does not matter.

Example 1:

Input:
s: "cbaebabacd" p: "abc"

Output:
[0, 6]

Explanation:
The substring with start index = 0 is "cba", which is an anagram of "abc".
The substring with start index = 6 is "bac", which is an anagram of "abc".

Example 2:

Input:
s: "abab" p: "ab"

Output:
[0, 1, 2]

Explanation:
The substring with start index = 0 is "ab", which is an anagram of "ab".
The substring with start index = 1 is "ba", which is an anagram of "ab".
The substring with start index = 2 is "ab", which is an anagram of "ab".

Explanation of the problem:

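Given two strings, we need to find the start index of every substring of s that is an anagram of p. As the examples above show, a matching substring has to satisfy two requirements: first, its length equals the length of p; second, it is made up of the same kinds of characters as p, with the same count for each kind. From this it is fairly clear that the key ideas for this problem are a hash table and a sliding window.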
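The two requirements can be checked directly for a single pair of strings. The following is a minimal sketch, not part of the original solution: the class name AnagramCheck and the method isAnagram are hypothetical, and it assumes both strings contain only 'a' to 'z', as the problem guarantees.

```java
class AnagramCheck {
    // Hypothetical helper: true when a and b are anagrams of each other.
    // Assumes both strings contain only lowercase letters 'a'..'z'.
    static boolean isAnagram(String a, String b) {
        if (a.length() != b.length()) {
            return false; // requirement 1: same length
        }
        int[] counts = new int[26];
        for (int i = 0; i < a.length(); i++) {
            counts[a.charAt(i) - 'a']++; // count the characters of a
            counts[b.charAt(i) - 'a']--; // cancel out the characters of b
        }
        // requirement 2: every per-letter count must cancel to zero
        for (int c : counts) {
            if (c != 0) {
                return false;
            }
        }
        return true;
    }
}
```

Calling this check on every window of s would work, but it redoes the counting from scratch each time; the sliding window avoids that by updating the counts as the window moves.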

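The problem is rated easy, but in practice it is not that simple. Its tag is Hash Table, so a first attempt might use Java's built-in hash structures: put the chars of each string into a container and compare the containers with containsAll. Another first attempt is a nested-loop traversal. These approaches either fail to cover all cases (containsAll, for instance, ignores how many times each character occurs) or exceed the time limit. So we need an array-based hash structure to store the counts. Here is the code first: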
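To see why the containsAll idea misses cases, here is a small sketch. The class ContainsAllPitfall and its methods are hypothetical names used only for illustration, and the exact container the author tried is not stated, so a List<Character> is assumed here.

```java
import java.util.ArrayList;
import java.util.List;

class ContainsAllPitfall {
    // Copies the characters of s into a list (duplicates are kept).
    static List<Character> toList(String s) {
        List<Character> chars = new ArrayList<>();
        for (char c : s.toCharArray()) {
            chars.add(c);
        }
        return chars;
    }

    // Naive comparison: each side contains every character of the other.
    // containsAll only tests membership, so character counts are ignored.
    static boolean looksEqualByContainsAll(String a, String b) {
        List<Character> x = toList(a);
        List<Character> y = toList(b);
        return x.containsAll(y) && y.containsAll(x);
    }
}
```

With this check, "aab" and "abb" look equal even though they are not anagrams of each other, which is the kind of case the naive approach gets wrong.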

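    import java.util.ArrayList;
    import java.util.List;

    public class Solution {
        public List<Integer> findAnagrams(String s, String p) {
            List<Integer> result = new ArrayList<>();
            if (s == null || s.length() == 0 || p == null || p.length() == 0)
                return result;
            // hash[c] = how many more occurrences of character c the window still needs
            int[] hash = new int[256];
            for (char c : p.toCharArray()) {
                hash[c]++;
            }
            int left = 0, right = 0, count = p.length();
            while (right < s.length()) {
                // Extend the window to the right and decrement that character's hash value.
                // A positive value before the decrement means p still needs this character,
                // so count is reduced.
                if (hash[s.charAt(right++)]-- > 0)
                    count--;
                // count is the indicator: 0 means the window matches p's counts exactly.
                if (count == 0)
                    result.add(left);
                // Once the window is as long as p, move the left pointer to the right.
                // The departing character's hash value was decremented by the first if,
                // so it is restored here: the hash value goes back up, and count goes
                // back up too if the character was one that p needed.
                if (right - left == p.length() && hash[s.charAt(left++)]++ >= 0)
                    count++;
            }
            return result;
        }
    }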

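The code is short, but it uses several ideas. First, the hash array has length 256: that is enough for every 8-bit character code (standard ASCII itself defines only 128 codes), so each index holds the count for one character. Since the input contains only lowercase letters, 26 entries would also be enough. Second, for the window, two pointers mark its left and right boundaries. Third, count is a counter whose meaning is the number of characters of p that the current window has not yet matched.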

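As the window extends to the right, if the hash value at the new character's position is greater than 0, adding that character brings the window one character closer to p, so count decreases by one. Then, if count is 0, meaning the difference is zero, the position of the left boundary is recorded and added to the result list. Next comes moving the left pointer, which only starts once the window length equals the length of p. When the left boundary moves, count is incremented only if the hash value at the departing character's position is non-negative. The same logic applies on the way in: if adding a character drives its hash entry negative, count does not change.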
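The same steps can also be written without side effects inside the conditions, which may make the walkthrough easier to follow. This is a sketch of an equivalent formulation, not the author's code: the names ExplicitWindow, need and missing are hypothetical, and it assumes lowercase input so that a 26-entry array is enough.

```java
import java.util.ArrayList;
import java.util.List;

class ExplicitWindow {
    static List<Integer> findAnagrams(String s, String p) {
        List<Integer> result = new ArrayList<>();
        if (s == null || p == null || p.isEmpty() || s.length() < p.length()) {
            return result;
        }
        // need[c] = how many more of letter c the window still needs (may go negative)
        int[] need = new int[26];
        for (char c : p.toCharArray()) {
            need[c - 'a']++;
        }
        int missing = p.length(); // characters of p not yet matched by the window
        int left = 0;
        for (int right = 0; right < s.length(); right++) {
            // Step 1: take the character at right into the window.
            int in = s.charAt(right) - 'a';
            if (need[in] > 0) {
                missing--; // p still needed this character
            }
            need[in]--;
            // Steps 2 and 3 only apply once the window is as long as p.
            if (right - left + 1 == p.length()) {
                if (missing == 0) {
                    result.add(left); // window is an anagram of p
                }
                // Drop the character at left and restore its count.
                int out = s.charAt(left) - 'a';
                if (need[out] >= 0) {
                    missing++; // the window loses a character that p needed
                }
                need[out]++;
                left++;
            }
        }
        return result;
    }
}
```

For the two examples from the problem statement, ExplicitWindow.findAnagrams("cbaebabacd", "abc") returns [0, 6] and ExplicitWindow.findAnagrams("abab", "ab") returns [0, 1, 2]. Checking missing == 0 only for full windows is safe because missing can reach zero only when the window holds exactly as many characters as p.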

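Finally, once the right boundary has moved across the whole string, the method returns the collection of left indices of all windows that satisfied the condition.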

0 0