Algorithm Study Notes --- DNA Sorting

Source: Internet  Editor: 程序博客网  Time: 2024/05/16 06:08
DNA Sorting
Time Limit: 1000MS  Memory Limit: 10000K
Total Submissions: 79623  Accepted: 31994

Description

One measure of ``unsortedness'' in a sequence is the number of pairs of entries that are out of order with respect to each other. For instance, in the letter sequence ``DAABEC'', this measure is 5, since D is greater than four letters to its right and E is greater than one letter to its right. This measure is called the number of inversions in the sequence. The sequence ``AACEDGG'' has only one inversion (E and D)---it is nearly sorted---while the sequence ``ZWQM'' has 6 inversions (it is as unsorted as can be---exactly the reverse of sorted). 
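
The definition above can be checked directly by comparing every pair of positions. The following is a minimal sketch, not part of the problem statement; the function name bruteForceInversions is an assumption made for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts the pairs (i, j) with i < j and s[i] > s[j]
// by checking every pair, which takes O(n^2) time for a string of length n.
int bruteForceInversions(const std::string &s)
{
    int inversions = 0;
    for (std::size_t i = 0; i < s.size(); ++i)
    {
        for (std::size_t j = i + 1; j < s.size(); ++j)
        {
            if (s[i] > s[j])   // the earlier letter is greater: one inversion
                ++inversions;
        }
    }
    return inversions;
}
```

Called on the three sequences from the description, it should reproduce the counts quoted there: 5 for DAABEC, 1 for AACEDGG and 6 for ZWQM.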

You are responsible for cataloguing a sequence of DNA strings (sequences containing only the four letters A, C, G, and T). However, you want to catalog them, not in alphabetical order, but rather in order of ``sortedness'', from ``most sorted'' to ``least sorted''. All the strings are of the same length. 

Input

The first line contains two integers: a positive integer n (0 < n <= 50) giving the length of the strings; and a positive integer m (0 < m <= 100) giving the number of strings. These are followed by m lines, each containing a string of length n.

Output

Output the list of input strings, arranged from ``most sorted'' to ``least sorted''. Since two strings can be equally sorted, then output them according to the original order.

Sample Input

10 6
AACATGAAGG
TTTTGGCCAA
TTTGGCCAAA
GATCAGATTT
CCCGGGGGGA
ATCGATGCAT

Sample Output

CCCGGGGGGA
AACATGAAGG
GATCAGATTT
ATCGATGCAT
TTTTGGCCAA
TTTGGCCAAA

Source

East Central North America 1998


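Analysis:
When solving this problem I first tried to store the strings in a two-dimensional array, but that turned out to be very cumbersome to process. I then switched to a struct, and in C++ the string type can be used for the text.
Solution 1: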
#include <iostream>
#include <string>
#include <algorithm>
#include <cstdio>
using namespace std;
typedef struct
{
    string str;
    int count;
}DNA;
//Comparator for sorting: returns true when a has fewer inversions than b
bool compare(const DNA &a,const DNA &b)
{
    return a.count<b.count;
}
int main()
{
    DNA a[100];
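    //m is the string length and n is the number of strings
    //(the problem statement uses these two letters the other way round)
    int m,n;
    cin>>m>>n;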
    int i;
    for(i=0;i<n;i++)
    {
        cin>>a[i].str;
        a[i].count=0;
        int k,j;
        for(j=1;j<m;j++)
        {
            for(k=0;k<j;k++)
            {
                if(a[i].str[j]<a[i].str[k])
                    a[i].count++;
            }
        }


    }
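//stable_sort from the STL keeps equally sorted strings in their input order,
//as the problem requires; plain sort does not guarantee this
    stable_sort(a,a+n,compare);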
    for(i=0;i<n;i++)
    {
        cout<<a[i].str<<endl;
    }
     return 0;
}
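Solution 2:
I found this solution posted online. It counts the inversions of one string of length n in O(n) time, which is faster, so I borrow it here:
Source: http://blog.csdn.net/china8848/article/details/2227131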
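What is the inversion number:
It is the total number of pairs that appear in the opposite order to the standard sequence. For example, if the standard sequence is 1 2 3 4 5, the inversion number of 5 4 3 2 1 is computed as follows. Look at the second element: there is one 5 before the 4, and 5 comes after 4 in the standard sequence, so count 1. Similarly, the third element 3 is preceded by 4 and 5, both of which come after 3 in the standard sequence, so count 2. In the same way, 2 has 3 such elements before it and 1 has 4.
Adding these up gives the inversion number = 1+2+3+4 = 10.
Another example: 2 4 3 1 5. 4 has 0 larger elements before it, 3 has 1, 1 has 3, and 5 has 0.
So the inversion number is 1+3 = 4.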
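
The per-element counting used in the two examples above can be sketched as follows. The function names largerBefore and inversionNumber are assumptions made for illustration; they do not come from the original post.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: for each position j, counts how many earlier
// elements are larger than seq[j] (the "count" recorded for each element).
std::vector<int> largerBefore(const std::vector<int> &seq)
{
    std::vector<int> counts(seq.size(), 0);
    for (std::size_t j = 0; j < seq.size(); ++j)
    {
        for (std::size_t k = 0; k < j; ++k)
        {
            if (seq[k] > seq[j])
                ++counts[j];
        }
    }
    return counts;
}

// Hypothetical helper: the inversion number is the sum of those counts.
int inversionNumber(const std::vector<int> &seq)
{
    const std::vector<int> counts = largerBefore(seq);
    return std::accumulate(counts.begin(), counts.end(), 0);
}
```

For 5 4 3 2 1 the per-element counts are 0 1 2 3 4 (sum 10), and for 2 4 3 1 5 they are 0 0 1 3 0 (sum 4), matching the hand calculation.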
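The idea of counting inversions with merge sort (MergeSort):
Comparing every pair of numbers takes O(n^2) time, which is unacceptable for large n, whereas MergeSort runs in O(nlogn).
During the merge step, one pointer i points to an element l in the left half and another pointer j points to an element r in the right half. When r is smaller than l, every element from i to mid in the left half forms an inversion with r, so it is enough to add mid-i+1 to the inversion total. For an application of this, see the previous solution report.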
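
The merge step described above can be sketched as below. This is an illustrative implementation written for these notes, not the code from the earlier solution report; the names mergeCount, sortCount and countInversionsMerge are assumptions.

```cpp
#include <vector>

// Merges the sorted ranges a[lo..mid] and a[mid+1..hi] and returns the
// number of inversions that have one element in each range.
long long mergeCount(std::vector<int> &a, std::vector<int> &tmp,
                     int lo, int mid, int hi)
{
    long long inversions = 0;
    int i = lo, j = mid + 1, k = lo;
    while (i <= mid && j <= hi)
    {
        if (a[j] < a[i])
        {
            // a[j] is smaller than a[i], and therefore smaller than every
            // remaining left element a[i..mid]: that is mid-i+1 inversions.
            inversions += mid - i + 1;
            tmp[k++] = a[j++];
        }
        else
        {
            tmp[k++] = a[i++];
        }
    }
    while (i <= mid) tmp[k++] = a[i++];
    while (j <= hi) tmp[k++] = a[j++];
    for (k = lo; k <= hi; ++k) a[k] = tmp[k];
    return inversions;
}

// Sorts a[lo..hi] recursively and returns its inversion count.
long long sortCount(std::vector<int> &a, std::vector<int> &tmp, int lo, int hi)
{
    if (lo >= hi) return 0;
    const int mid = lo + (hi - lo) / 2;
    long long inversions = sortCount(a, tmp, lo, mid);
    inversions += sortCount(a, tmp, mid + 1, hi);
    inversions += mergeCount(a, tmp, lo, mid, hi);
    return inversions;
}

// Takes the sequence by value so the caller's data is left unsorted.
long long countInversionsMerge(std::vector<int> a)
{
    std::vector<int> tmp(a.size());
    return sortCount(a, tmp, 0, static_cast<int>(a.size()) - 1);
}
```

Equal elements are taken from the left half first, so they are never counted as inversions. On the two sequences from the previous section this should give 10 and 4.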
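For this problem the characters in a string come from a fixed set, only A C G T, so the inversion number can be computed in O(n). The method is to insert the characters from back to front. The four parameters left_A left_C left_G left_T stand for how much the inversion number grows if an A, C, G or T, respectively, is added on the left (left_A always stays 0, because no letter is smaller than A). Initialize them to 0, then insert the characters from back to front:
If an A is inserted: any C G T inserted in front of it later will add an inversion, so: left_C++;left_G++;left_T++;
If a C is inserted: add left_C to the inversion number; any G T inserted in front of it later will add an inversion, so: left_G++;left_T++;
If a G is inserted: add left_G to the inversion number; any T inserted in front of it later will add an inversion, so: left_T++;
If a T is inserted: add left_T to the inversion number; no character inserted in front of it later adds an inversion.
Hence the code: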
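//Counts the inversions of a string containing only A, C, G, T in one pass
int countInversions(const string &str)
{
    //left_X: how many inversions a letter X adds when inserted on the left
    int left_C=0,left_G=0,left_T=0;
    int count=0;
    int length=str.size();
    int i;
    char a;
    for(i=length-1;i>=0;i--)
    {
        a=str[i];
        switch(a)
        {
        case 'A':
            left_C++;
            left_G++;
            left_T++;
            break;
        case 'C':
            left_G++;
            left_T++;
            count+=left_C;
            break;
        case 'G':
            left_T++;
            count+=left_G;
            break;
        case 'T':
            count+=left_T;
            break;
        }
    }
    return count;
}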
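
To show how the linear count fits into the whole task, here is a minimal sketch that pairs it with a stable sort so that equally sorted strings keep their input order. The names linearInversions and sortBySortedness are assumptions made for illustration; reading the input and printing the result are left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: the back-to-front count described above, valid only
// for strings made up of A, C, G, T.
int linearInversions(const std::string &s)
{
    int leftC = 0, leftG = 0, leftT = 0, count = 0;
    for (std::size_t i = s.size(); i-- > 0; )
    {
        switch (s[i])
        {
        case 'A': ++leftC; ++leftG; ++leftT; break;
        case 'C': count += leftC; ++leftG; ++leftT; break;
        case 'G': count += leftG; ++leftT; break;
        case 'T': count += leftT; break;
        }
    }
    return count;
}

// Hypothetical helper: returns the strings ordered from most sorted to
// least sorted; ties keep their original relative order.
std::vector<std::string> sortBySortedness(const std::vector<std::string> &dna)
{
    std::vector<std::pair<int, std::string> > keyed;
    for (std::size_t i = 0; i < dna.size(); ++i)
        keyed.push_back(std::make_pair(linearInversions(dna[i]), dna[i]));

    // Compare by inversion count only, so stable_sort preserves input order
    // among strings with the same count.
    std::stable_sort(keyed.begin(), keyed.end(),
                     [](const std::pair<int, std::string> &x,
                        const std::pair<int, std::string> &y)
                     { return x.first < y.first; });

    std::vector<std::string> result;
    for (std::size_t i = 0; i < keyed.size(); ++i)
        result.push_back(keyed[i].second);
    return result;
}
```

Applied to the six strings of the sample input, whose inversion counts are 10, 36, 37, 11, 9 and 17, this should produce the order given in the sample output.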
