Edit Distance: Minimum Edits Using Only Insert, Delete, and Replace

Source: Internet. Editor: 程序博客网. Time: 2024/04/30 03:14

Given two words word1 and word2, find the minimum number of steps required to convert word1 to word2. (each operation is counted as 1 step.)

You have the following 3 operations permitted on a word:

a) Insert a character
b) Delete a character
c) Replace a character

Examples (word1, word2: edit distance):

a, b: 1

a, bcd: 3

ab, bc: 2

ab, bab: 1

bb, bab: 1

abb, bba: 2

eat, sea: 2

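Approach: this is a standard dynamic programming problem. Let d[i][j] be the edit distance between the first i letters of word1 and the first j letters of word2, so that d[word1.length][word2.length] is the edit distance between the two whole words. Below, i always indexes into word1 and j always indexes into word2.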

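The base cases are d[i][0] = i and d[0][j] = j, because the edit distance between any word and the empty word is the word's own length. For example, the edit distance between "abc" and "" is 3.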

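When the i-th letter of word1 equals the j-th letter of word2, no edit is needed for this pair, so the edit distance equals that between the first i-1 letters of word1 and the first j-1 letters of word2, i.e. d[i][j] = d[i-1][j-1].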

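When the i-th letter of word1 differs from the j-th letter of word2, there are three possible operations:

1. Replace the letter: d[i][j] = d[i-1][j-1] + 1;

2. Insert the letter of word2 into word1: d[i][j] = d[i][j-1] + 1;

3. Delete the letter from word1: d[i][j] = d[i-1][j] + 1;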
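
Putting the base cases and the three operations together, the rules can be restated compactly as one recurrence. This is only a restatement of the rules described here, with word1[i] meaning the i-th letter of word1, counted from 1:

```latex
d[i][0] = i, \qquad d[0][j] = j,
\qquad
d[i][j] =
\begin{cases}
d[i-1][j-1], & \text{if } \mathrm{word1}[i] = \mathrm{word2}[j],\\[4pt]
1 + \min\bigl(d[i-1][j-1],\; d[i][j-1],\; d[i-1][j]\bigr), & \text{otherwise.}
\end{cases}
```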

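The +1 in each of the three operations above counts the one edit (replace, insert, or delete) performed on top of the smaller subproblem. Since we want the fewest steps, d[i][j] is the minimum of the three candidates. The dynamic programming code is as follows: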

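public int minDistance(String word1, String word2) {
    int m = word1.length();
    int n = word2.length();
    // d[i][j]: edit distance between the first i letters of word1
    // and the first j letters of word2
    int[][] d = new int[m + 1][n + 1];
    // Base cases: turning a prefix into the empty word (or the reverse)
    for (int i = 1; i <= m; i++) {
        d[i][0] = i;
    }
    for (int j = 1; j <= n; j++) {
        d[0][j] = j;
    }
    for (int i = 1; i <= m; i++) {
        for (int j = 1; j <= n; j++) {
            if (word1.charAt(i - 1) == word2.charAt(j - 1)) {
                // Letters match: no edit needed for this pair
                d[i][j] = d[i - 1][j - 1];
            } else {
                // Minimum of replace, insert, delete
                d[i][j] = Math.min(d[i - 1][j - 1] + 1,
                        Math.min(d[i][j - 1] + 1, d[i - 1][j] + 1));
            }
        }
    }
    return d[m][n];
}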
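
To make the table concrete, here is a small sketch that fills the same table for the "eat" / "sea" example and prints every row. The class name EditDistanceTable and the helper buildTable are made up for this illustration; the recurrence is the one described in this post.

```java
import java.util.Arrays;

class EditDistanceTable {
    // Hypothetical helper: returns the full (m+1) x (n+1) table, where
    // d[i][j] is the edit distance between the first i letters of word1
    // and the first j letters of word2.
    static int[][] buildTable(String word1, String word2) {
        int m = word1.length(), n = word2.length();
        int[][] d = new int[m + 1][n + 1];
        for (int i = 1; i <= m; i++) d[i][0] = i;
        for (int j = 1; j <= n; j++) d[0][j] = j;
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (word1.charAt(i - 1) == word2.charAt(j - 1)) {
                    d[i][j] = d[i - 1][j - 1];
                } else {
                    d[i][j] = 1 + Math.min(d[i - 1][j - 1],
                            Math.min(d[i][j - 1], d[i - 1][j]));
                }
            }
        }
        return d;
    }

    public static void main(String[] args) {
        // Rows correspond to prefixes of "eat", columns to prefixes of "sea".
        for (int[] row : buildTable("eat", "sea")) {
            System.out.println(Arrays.toString(row));
        }
        // Prints:
        // [0, 1, 2, 3]
        // [1, 1, 1, 2]
        // [2, 2, 2, 1]
        // [3, 3, 3, 2]
    }
}
```

The bottom-right entry is 2, which agrees with the eat, sea example: insert "s" at the front and delete the trailing "t".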
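Dynamic programs of this kind can usually be optimized for space: d[i][j] depends only on d[i-1][j-1], d[i][j-1], and d[i-1][j], so a single one-dimensional array is enough. See the condensed version from LeetCode: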

#include <algorithm>
#include <string>
#include <vector>
using namespace std;

class Solution {
public:
    int minDistance(string word1, string word2) {
        int m = word1.length(), n = word2.length();
        // cur[i] holds d[i][j] for the current j; only one column is kept
        vector<int> cur(m + 1, 0);
        for (int i = 1; i <= m; i++)
            cur[i] = i;
        for (int j = 1; j <= n; j++) {
            int pre = cur[0];  // d[i-1][j-1] for i = 1
            cur[0] = j;
            for (int i = 1; i <= m; i++) {
                int temp = cur[i];  // save d[i][j-1] before overwriting it
                if (word1[i - 1] == word2[j - 1])
                    cur[i] = pre;
                else
                    cur[i] = min(pre + 1, min(cur[i] + 1, cur[i - 1] + 1));
                pre = temp;
            }
        }
        return cur[m];
    }
};
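
As a quick sanity check, the same one-array idea can be ported to Java and run against the examples listed near the top of this post. This is a sketch, not the LeetCode code itself; the class name EditDistanceOneRow is an assumption made for this illustration.

```java
class EditDistanceOneRow {
    // One-array version: cur[i] holds d[i][j] for the current j.
    static int minDistance(String word1, String word2) {
        int m = word1.length(), n = word2.length();
        int[] cur = new int[m + 1];
        for (int i = 1; i <= m; i++) cur[i] = i;   // d[i][0] = i
        for (int j = 1; j <= n; j++) {
            int pre = cur[0];                      // d[0][j-1]
            cur[0] = j;                            // d[0][j] = j
            for (int i = 1; i <= m; i++) {
                int temp = cur[i];                 // d[i][j-1], saved before overwrite
                if (word1.charAt(i - 1) == word2.charAt(j - 1)) {
                    cur[i] = pre;                  // letters match: take d[i-1][j-1]
                } else {
                    // pre = replace, cur[i] = insert, cur[i-1] = delete
                    cur[i] = 1 + Math.min(pre, Math.min(cur[i], cur[i - 1]));
                }
                pre = temp;                        // becomes d[i-1][j-1] for the next i
            }
        }
        return cur[m];
    }

    public static void main(String[] args) {
        String[][] pairs = {
            {"a", "b"}, {"a", "bcd"}, {"ab", "bc"}, {"ab", "bab"},
            {"bb", "bab"}, {"abb", "bba"}, {"eat", "sea"}
        };
        for (String[] p : pairs) {
            System.out.println(p[0] + ", " + p[1] + ": " + minDistance(p[0], p[1]));
        }
        // Prints distances 1, 3, 2, 1, 1, 2, 2, one pair per line.
    }
}
```

The array here is sized by word1. Swapping the arguments so that the shorter word sizes the array would reduce memory further; this works because the edit distance is symmetric under these three operations.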