LeetCode OJ - Edit Distance

Source: Internet  Editor: 程序博客网  Time: 2024/06/05 09:55

Problem: Given two words word1 and word2, find the minimum number of steps required to convert word1 to word2. (Each operation is counted as 1 step.)

You have the following 3 operations permitted on a word:

a) Insert a character
b) Delete a character
c) Replace a character

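  • First attempt: analysis

    First take the absolute difference of the lengths of S1 and S2, which gives the number of insert or delete steps. Then walk through S1 once and use find to look up each letter of S1 in S2. On a hit, increment a counter (it counts the letters that need no replacement once the lengths are equal) and overwrite that letter in S2 with '#'. The code is as follows: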

#include <iostream>
#include <string>
using namespace std;

int minDistance(string word1, string word2) {
    unsigned len1 = word1.size();
    unsigned len2 = word2.size();
    unsigned minLen = len1 > len2 ? len2 : len1;
    // Steps spent inserting or deleting characters: the length difference.
    unsigned minStep = len1 > len2 ? len1 - len2 : len2 - len1;
    cout << "the steps of insert a character or delete a character: " << minStep << endl;
    // Number of letters of word1 that were also found in word2.
    unsigned tempLen = 0;
    for (auto it1 = word1.begin(); it1 != word1.end(); ++it1)
    {
        string::size_type pos2 = word2.find(*it1);
        if (pos2 != string::npos)
        {
            ++tempLen;
            word2[pos2] = '#';  // mark the letter as used
        }
    }
    // Every unmatched position is counted as one replacement.
    minStep += (minLen - tempLen);
    return minStep;
}

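Result: wrong! On LeetCode, 768 / 1146 test cases passed. The reason is that the order of the letters in S1 and S2 also has to match (otherwise how could they be equal ╮(╯▽╰)╭). For example, with S1 = "abcd" and S2 = "adefgh", the algorithm above adds gh to S1 (2 steps) and then replaces bc in S1 with ef (2 steps), for a result of 4, but S1 is then "aefdgh" != S2. A sequence that does reach S2 is to delete bc from S1 (S1 = "ad", 2 steps) and then append efgh (S1 = "adefgh" = S2, 4 steps), 6 steps in total. The true minimum is 5: replace b, c, d with d, e, f (3 steps), then append gh (2 steps). Either way, 4 is too low.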
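
To see the failure concretely, here is a minimal sketch that puts the first attempt next to an exhaustive check. `firstAttempt` is the function above without its debug output; `bruteForce` is a hypothetical helper that is not part of the original post. It tries all three operations recursively, so it is exponential and only suitable for short strings.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// The first attempt, minus the debug print: it counts letters shared by
// both words but ignores the order in which they appear.
int firstAttempt(std::string word1, std::string word2) {
    unsigned len1 = word1.size();
    unsigned len2 = word2.size();
    unsigned minLen = std::min(len1, len2);
    unsigned steps = len1 > len2 ? len1 - len2 : len2 - len1;
    unsigned matched = 0;
    for (char c : word1) {
        std::string::size_type pos = word2.find(c);
        if (pos != std::string::npos) {
            ++matched;
            word2[pos] = '#';  // mark the letter as used
        }
    }
    return static_cast<int>(steps + (minLen - matched));
}

// Hypothetical checker (an assumption, not from the post): exhaustive
// recursion over insert, delete and replace. Small inputs only.
int bruteForce(const std::string& a, const std::string& b,
               std::size_t i = 0, std::size_t j = 0) {
    if (i == a.size()) return static_cast<int>(b.size() - j);  // insert the rest of b
    if (j == b.size()) return static_cast<int>(a.size() - i);  // delete the rest of a
    if (a[i] == b[j]) return bruteForce(a, b, i + 1, j + 1);   // letters agree, no step
    return 1 + std::min({bruteForce(a, b, i, j + 1),           // insert b[j]
                         bruteForce(a, b, i + 1, j),           // delete a[i]
                         bruteForce(a, b, i + 1, j + 1)});     // replace a[i] with b[j]
}
```

Calling both on "abcd" and "adefgh" shows `firstAttempt` returning a smaller number than `bruteForce`, which cannot happen for a correct answer: the exhaustive search already returns the minimum.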


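  • Second attempt: analysis

    After that I opened the problem's tags and saw that it calls for dynamic programming. Looking it up online, I learned that this is a classic problem in natural language processing: see [reference blog post].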

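    In brief, let dp[i][j] be the minimum edit distance between the first i characters of S1 and the first j characters of S2. From the current state there are three choices, insert, delete and replace, with the recurrences dp[i][j] = dp[i][j-1] + 1 (insert), dp[i][j] = dp[i-1][j] + 1 (delete) and dp[i][j] = dp[i-1][j-1] + 1 (replace). The problem asks for the minimum edit distance, so take the smallest of the three: dp[i][j] = min(dp[i][j-1] + 1, dp[i-1][j] + 1, dp[i-1][j-1] + 1), under the condition that the i-th character of S1 differs from the j-th character of S2. When the two characters are equal no step is needed and dp[i][j] = dp[i-1][j-1]. The corresponding code follows.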

#include <algorithm>
#include <string>
#include <vector>
using namespace std;

int minDistance(string word1, string word2) {
    // One extra row and column for the empty prefix.
    size_t len1 = word1.size() + 1;
    size_t len2 = word2.size() + 1;
    vector<vector<int>> dis(len1, vector<int>(len2, 0));
    // Base case: word1 is empty, so dis[0][j] = j (insert word2's letters one by one).
    for (size_t j = 0; j < len2; j++)
    {
        dis[0][j] = j;
    }
    // Base case: word2 is empty, so dis[i][0] = i (delete word1's letters one by one).
    for (size_t i = 0; i < len1; i++)
    {
        dis[i][0] = i;
    }
    for (size_t i = 1; i < len1; i++)
    {
        for (size_t j = 1; j < len2; j++)
        {
            if (word1[i - 1] == word2[j - 1])
            {
                // Letters agree: no extra step.
                dis[i][j] = dis[i - 1][j - 1];
            }
            else
            {
                // Cheapest of delete, insert and replace.
                dis[i][j] = min(dis[i - 1][j] + 1, min(dis[i][j - 1] + 1, dis[i - 1][j - 1] + 1));
            }
        }
    }
    return dis[len1 - 1][len2 - 1];
}
  • Result: AC! Time and space complexity are both O(n*m), where n and m are the lengths of word1 and word2.
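
As a side note on the space bound: each row of the table only reads the row directly above it, so the O(n*m) table can be cut down to two rows. The following is a minimal sketch of that variant, not the submitted solution; the name `minDistanceTwoRows` is made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Same recurrence as the full-table version, keeping only the previous
// and the current row. Time stays O(n*m); extra space drops to O(m).
int minDistanceTwoRows(const std::string& word1, const std::string& word2) {
    const std::size_t n = word1.size();
    const std::size_t m = word2.size();
    std::vector<int> prev(m + 1), cur(m + 1);
    // Row 0: word1 is empty, so j insertions are needed.
    for (std::size_t j = 0; j <= m; ++j) {
        prev[j] = static_cast<int>(j);
    }
    for (std::size_t i = 1; i <= n; ++i) {
        // Column 0: word2 is empty, so i deletions are needed.
        cur[0] = static_cast<int>(i);
        for (std::size_t j = 1; j <= m; ++j) {
            if (word1[i - 1] == word2[j - 1]) {
                cur[j] = prev[j - 1];                 // letters agree, no step
            } else {
                cur[j] = 1 + std::min({prev[j],       // delete word1[i-1]
                                       cur[j - 1],    // insert word2[j-1]
                                       prev[j - 1]}); // replace
            }
        }
        std::swap(prev, cur);  // the finished row becomes "previous"
    }
    return prev[m];
}
```

The trade-off is that the full table is no longer available afterwards, so this form returns only the distance and cannot be used to trace back which operations were chosen.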