算法细节系列（20）：Word Ladder系列

来源：互联网发布：淘宝货到付款怎么退款编辑：程序博客网时间：2024/06/07 19:33

算法细节系列（20）：Word Ladder系列

详细代码可以fork下Github上leetcode项目，不定期更新。

题目摘自leetcode：
1. Leetcode 127: Word Ladder
2. Leetcode 126: Word Ladder II

Leetcode 127: Word Ladder

Problem:

Given two words (beginWord and endWord), and a dictionary’s word list, find the length of shortest transformation sequence from beginWord to endWord, such that:

Only one letter can be changed at a time.

Each transformed word must exist in the word list. Note that beginWord is not a transformed word.

Example:

Given:
beginWord = “hit”
endWord = “cog”
wordList = [“hot”,”dot”,”dog”,”lot”,”log”,”cog”]
As one shortest transformation is “hit” -> “hot” -> “dot” -> “dog” -> “cog”,
return its length 5.

Note:

Return 0 if there is no such transformation sequence.

All words have the same length.

All words contain only lowercase alphabetic characters.

You may assume no duplicates in the word list.

You may assume beginWord and endWord are non-empty and are not the same.

这道题其实不难，但要想到这种解法却要费一番周折，如果对最短路径搜索熟悉的话，相信你一眼就能看出答案了，并且我们要论证一点，为什么最短路径算法对这道题来说是正确解法。

我的思路：
DFS，把所有编辑距离为1的单词连接在一块，构建一个MAP（邻接矩阵）。这样之后，我们就可以从beginWord开始DFS搜索了，中间需要状态记录。代码如下：

public int ladderLength(String beginWord, String endWord, List<String> wordList) {        Map<String, List<String>> map = new HashMap<>();        map.put(beginWord, new ArrayList<>());        for (String word : wordList){            map.put(word, new ArrayList<>());        }        for (String key : map.keySet()){            List<String> container = map.get(key);            for (String word : wordList){                if (oneDiff(key, word)){                    container.add(word);                }            }            map.put(key, container);        }        int ans = helper(map, beginWord, endWord, new HashSet<>());        return ans >= 1 << 30 ? 0 : ans;    }    private int helper(Map<String, List<String>> map,String beginWord, String endWord, Set<String> visited){        if (visited.contains(beginWord)) return 1<<30;        visited.add(beginWord);        for (String find : map.get(beginWord)){            if (find.equals(endWord)){                visited.remove(beginWord);                return 2;            }        }        int min = Integer.MAX_VALUE;        for (String find : map.get(beginWord)){            int x = 1 + helper(map, find, endWord, visited);            min = Math.min(min, x);        }        visited.remove(beginWord);        return min;    }    private boolean oneDiff(String a, String b){        if (a.equals(b)) return false;        char[] aa = a.toCharArray();        char[] bb = b.toCharArray();        int oneDiff = 0;        for (int i = 0; i < aa.length; i++){            if (aa[i] != bb[i]){                oneDiff ++;                if (oneDiff >= 2) return false;            }        }        return true;    }

代码没有多大问题，典型的DFS+状态回溯，遍历搜索每一条到达endWord的路径，找寻最短路径。但很可惜TLE了，直观上来看是因为为了拿到到endWord的最短路径，我们需要遍历每一条到endWord的路径，这是递归求解的一个特点。但实际情况，我们可以省去某些点的遍历。

就那这个问题来说，如从beginWord开始搜索，如

beginWord = "hit"endWord = "cog"wordList = ["hot","dot","dog","lot","log","cog"]wordList中编辑距离为1的单词有：a. hot此时BFS搜索与"hot"最近距离的单词，有：a. dotb. lot再BFS搜索"dot"时，有：a. cog所以我们只需要BFS三次就能得到正确答案，而DFS中，需要DFS至少三次。

上述过程就是经典的Dijkstra算法，代码如下：

public int ladderLength(String beginWord, String endWord, List<String> wordList) {        List<String> reached = new ArrayList<>();        reached.add(beginWord);        Set<String> wordSet = new HashSet<>(wordList);        if(!wordSet.contains(endWord)) return 0;        wordSet.add(endWord);        int distance = 1;        while (!reached.contains(endWord)){ //到达该目的地            List<String> toAdd = new ArrayList<>();            for (String each : reached){                for (int i = 0; i < each.length(); i++){                    char[] chars = each.toCharArray();                    for (char c = 'a';  c <= 'z'; c++){                        chars[i] = c;                        String wd = new String(chars);                        if (wordSet.contains(wd)){                            toAdd.add(wd);                            wordSet.remove(wd); //记录访问状态                        }                    }                }            }            distance ++;            if (toAdd.size() == 0) return 0; //没有编辑距离为1的单词            reached = toAdd;        }        return distance;    }

Leetcode 126: Word Ladder II

Problem:

Given two words (beginWord and endWord), and a dictionary’s word list, find all shortest transformation sequence(s) from beginWord to endWord, such that:

Only one letter can be changed at a time

Each transformed word must exist in the word list. Note that beginWord is not a transformed word.

Example:

Given:
beginWord = “hit”
endWord = “cog”
wordList = [“hot”,”dot”,”dog”,”lot”,”log”,”cog”]
Return
[
[“hit”,”hot”,”dot”,”dog”,”cog”],
[“hit”,”hot”,”lot”,”log”,”cog”]
]

Note:

Return an empty list if there is no such transformation sequence.

All words have the same length.

All words contain only lowercase alphabetic characters.

You may assume no duplicates in the word list.

You may assume beginWord and endWord are non-empty and are not the same.

这道题的思路让我对DFS和BFS有了一些基本理解，但还不够深刻，咋说呢，我没想到BFS和DFS还可以分工合作，BFS用来快速求出最小distance，而DFS则用来遍历所有路径，两种遍历方法各有长处，综合起来就能解决该问题了，所以我写了一个版本，代码如下：

public List<List<String>> findLadders(String beginWord, String endWord, List<String> wordList) {        Map<String, List<String>> map = new HashMap<>();        map.put(beginWord, new ArrayList<>());        for (String word : wordList){            map.put(word, new ArrayList<>());        }        for (String key : map.keySet()){            List<String> container = map.get(key);            for (String word : wordList){                if (oneDiff(key, word)){                    container.add(word);                }            }            map.put(key, container);        }        int distance = bfs(beginWord, endWord, wordList);        List<List<String>> ans = new ArrayList<>();        dfs(map, beginWord, endWord, ans, new ArrayList<>(), distance);        return ans;    }    private void dfs(Map<String, List<String>> map,String beginWord, String endWord, List<List<String>> ans, List<String> path, int distance){        path.add(beginWord);        if (distance == 0){            path.remove(path.size()-1);             return;        }        if (beginWord.equals(endWord)){            ans.add(new ArrayList<>(path));            path.remove(path.size()-1);            return;        }        for (String find : map.get(beginWord)){            dfs(map, find, endWord, ans, path, distance-1);        }        path.remove(path.size()-1);    }    private int bfs(String beginWord, String endWord, List<String> wordList) {        List<String> reached = new ArrayList<>();        reached.add(beginWord);        Set<String> wordSet = new HashSet<>(wordList);        if(!wordSet.contains(endWord)) return 0;        wordSet.add(endWord);        int distance = 1;        while (!reached.contains(endWord)){ //达到该目的地            List<String> toAdd = new ArrayList<>();            for (String each : reached){                for (int i = 0; i < each.length(); i++){                    char[] chars = each.toCharArray();                    for (char c = 'a';  c <= 'z'; c++){                        chars[i] = c;                        String wd = new String(chars);                        if (wordSet.contains(wd)){                            toAdd.add(wd);                            wordSet.remove(wd);                        }                    }                }            }            distance ++;            if (toAdd.size() == 0) return 0;            reached = toAdd;        }        return distance;    }    private boolean oneDiff(String a, String b){        if (a.equals(b)) return false;        char[] aa = a.toCharArray();        char[] bb = b.toCharArray();        int oneDiff = 0;        for (int i = 0; i < aa.length; i++){            if (aa[i] != bb[i]){                oneDiff ++;                if (oneDiff >= 2) return false;            }        }        return true;    }

思路相当清楚了，以为能够AC，结果发现TLE了，说明该题对时间的要求很高，从上述代码我们也能发现一些基本问题，如BFS遍历时可以构建MAP，而不用单独构建MAP，非常耗时。其次，最关键的问题在于DFS，此版本的DFS没有进行剪枝处理，剪枝能省去很多时间，所以我还需要对BFS进行改进。

思路：
首先，我们来看看上述代码构建图的一个模型，如下图所示：
alt text

很明显，如果我们对BFS没有做任何限制，我们拿到的邻接表一定是上述探头斯，而此时如果用DFS进行搜索时，如从“hot”开始，它会搜索：

一条可能的搜索路径：hot ---> dot ---> dog ---> cog但与此同时DFS还会搜索路径：hot ---> dot ---> tot ---> hot上述路径很明显不需要DFS，但因为边的相连，使得这种没必要的搜索也将继续。

所以一个优化点就在于，好马不吃回头草，存在环路的回头草绝对不是达到endWord的最短路径。很遗憾，邻接表无法表示这种非环的图，所以想法就是用一个Map<String,Integer>来记录到达每个单词的最短路径，一旦map中有该单词，就不再更新最短路径（避免环路搜索）

所以BFS代码如下：

private int bfs(String beginWord, String endWord, Set<String> wordDict, Map<String, Integer> distanceMap,            Map<String, List<String>> map) {        if (!wordDict.contains(endWord))            return 0;        map.put(beginWord, new ArrayList<>());        for (String word : wordDict) {            map.put(word, new ArrayList<>());        }        Queue<String> queue = new LinkedList<>();        queue.offer(beginWord);        distanceMap.put(beginWord, 1);        while (!queue.isEmpty()) {            int count = queue.size();            boolean foundEnd = false;            // 这种循环遍历很有意思，看作一个整体            for (int i = 0; i < count; i++) {                String cur = queue.poll();                int curDistance = distanceMap.get(cur);                List<String> neighbors = getNeighbors(cur, wordDict);                if (neighbors.size() == 0)                    return 0;                for (String neighbor : neighbors) {                    map.get(cur).add(neighbor);                    //存在环的情况，不去更新最短路径                    if (!distanceMap.containsKey(neighbor)) {                        distanceMap.put(neighbor, curDistance + 1);                        if (endWord.equals(neighbor)) {                            foundEnd = true;                        } else {                            queue.offer(neighbor);                        }                    }                }            }            //一旦抵到了endWord，我们就放弃建立后续的图            if (foundEnd)                break;        }        return distanceMap.get(endWord);    }

上述代码在BFS时，与endWord无关的那些结点都丢弃掉了，且解决了有环路的情况。图结构如下所示：
alt tetx

这样在DFS构建路径时，它的速度就比原先要快得多。在BFS中还需要注意一个函数【getNeighbors()】，刚开始我写的这版程序也超时了，苦思许久都找不到原因，后来才发现是getNeighbors的玄机，它在建立邻接表时，一定要使用【HashSet】的搜索方法，而不要用原生的【List】的搜索方法。

所以完整代码如下：

    public List<List<String>> findLadders(String beginWord, String endWord, List<String> wordList) {        Map<String, List<String>> map = new HashMap<>();        Map<String, Integer> distanceMap = new HashMap<>();        Set<String> wordDict = new HashSet<>(wordList);        wordDict.add(beginWord);        int distance = bfs(beginWord, endWord, wordDict, distanceMap, map);        List<List<String>> ans = new ArrayList<>();        if (distance == 0)            return ans;        dfs(map, beginWord, endWord, ans, new ArrayList<>(), distance, distanceMap);        return ans;    }    private void dfs(Map<String, List<String>> map, String beginWord, String endWord, List<List<String>> ans,            List<String> path, int distance, Map<String, Integer> distanceMap) {        path.add(beginWord);        if (distance == 0) {            path.remove(path.size() - 1);            return;        }        if (beginWord.equals(endWord)) {            ans.add(new ArrayList<>(path));            path.remove(path.size() - 1);            return;        }        for (String find : map.get(beginWord)) {            if (!distanceMap.containsKey(find))                continue;            if (distanceMap.get(beginWord) + 1 == distanceMap.get(find))                dfs(map, find, endWord, ans, path, distance - 1, distanceMap);        }        path.remove(path.size() - 1);    }    private int bfs(String beginWord, String endWord, Set<String> wordDict, Map<String, Integer> distanceMap,            Map<String, List<String>> map) {        if (!wordDict.contains(endWord))            return 0;        map.put(beginWord, new ArrayList<>());        for (String word : wordDict) {            map.put(word, new ArrayList<>());        }        Queue<String> queue = new LinkedList<>();        queue.offer(beginWord);        distanceMap.put(beginWord, 1);        while (!queue.isEmpty()) {            int count = queue.size();            boolean foundEnd = false;            for (int i = 0; i < count; i++) {                String cur = queue.poll();                int curDistance = distanceMap.get(cur);                List<String> neighbors = getNeighbors(cur, wordDict);                if (neighbors.size() == 0)                    return 0;                for (String neighbor : neighbors) {                    map.get(cur).add(neighbor);                    if (!distanceMap.containsKey(neighbor)) {                        distanceMap.put(neighbor, curDistance + 1);                        if (endWord.equals(neighbor)) {                            foundEnd = true;                        } else {                            queue.offer(neighbor);                        }                    }                }            }            if (foundEnd)                break;        }        return distanceMap.get(endWord);    }    private List<String> getNeighbors(String word, Set<String> wordList) {        List<String> ans = new ArrayList<>();        for (int i = 0; i < word.length(); i++) {            char[] cc = word.toCharArray();            for (char c = 'a'; c <= 'z'; c++) {                cc[i] = c;                String newWord = new String(cc);                if (wordList.contains(newWord)) {                    if (newWord.equals(word))                        continue;                    ans.add(newWord);                }            }        }        return ans;    }

DFS是一个典型的回溯+剪枝的递归方法，凡是函数返回的地方，我们都需要进行状态还原，注意再注意。

阅读全文

0 0