Merge Sort and Inversion

Source: Internet  Editor: 程序博客网  Time: 2024/06/04 19:14

The merge sort algorithm:

Conceptually, a merge sort works as follows:

  1. Divide the unsorted list into n sublists, each containing 1 element (a list of 1 element is considered sorted).
  2. Repeatedly merge sublists to produce new sublists until there is only 1 sublist remaining. This will be the sorted list.

Pseudocode:

function merge_sort(list m)
    // if list size is 0 (empty) or 1, consider it sorted and return it
    // (using less than or equal prevents infinite recursion for a zero length m)
    if length(m) <= 1
        return m
    // else list size is > 1, so split the list into two sublists
    // 1. DIVIDE Part...
    var list left, right
    var integer middle = length(m) / 2
    for each x in m before middle
        add x to left
    for each x in m after or equal middle
        add x to right
    // recursively call merge_sort() to further split each sublist
    // until sublist size is 1
    left = merge_sort(left)
    right = merge_sort(right)
    // merge the sublists returned from prior calls to merge_sort()
    // and return the resulting merged sublist
    // 2. CONQUER Part...
    return merge(left, right)
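The pseudocode above could be sketched in C++ roughly like this. It is only a minimal sketch under a few assumptions: the name `mergeSort` and the use of `std::vector<int>` are my own choices for illustration, and the standard library's `std::merge` stands in for the `merge(left, right)` step.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical helper mirroring merge_sort(list m) from the pseudocode.
std::vector<int> mergeSort(const std::vector<int>& m) {
    // A list of size 0 or 1 is already sorted.
    if (m.size() <= 1) return m;

    // 1. DIVIDE: elements before middle go left, the rest go right.
    const auto middle = m.begin() + m.size() / 2;
    std::vector<int> left(m.begin(), middle);
    std::vector<int> right(middle, m.end());

    // Recursively sort each half.
    left = mergeSort(left);
    right = mergeSort(right);

    // 2. CONQUER: merge the two sorted halves
    // (std::merge stands in for merge(left, right)).
    std::vector<int> result;
    result.reserve(m.size());
    std::merge(left.begin(), left.end(), right.begin(), right.end(),
               std::back_inserter(result));
    return result;
}
```

For example, `mergeSort({5, 2, 4, 6, 1, 3})` returns the same numbers in ascending order. Copying the halves into new vectors keeps the sketch close to the pseudocode; it is not meant to be the fastest way to do it.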


Tom puts it this way:

  1. Recursively sort the 1st half of the input array.
  2. Recursively sort the 2nd half of the input array.
  3. Merge the two sorted sublists into one.


Pseudocode for Merge

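i = 1; j = 1  ------------------------- 2 operations

for k = 1 to n  ----------------------- 1 op
    if A(i) < B(j)  ------------------- 1 op
        C(k) = A(i)  ------------------ 1 op
        i++  -------------------------- 1 op
    else [B(j) < A(i)]
        C(k) = B(j)
        j++
end

Each iteration of the loop costs at most 4 operations, so together with the 2 initializations a merge that outputs m numbers takes at most 4m + 2 operations.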
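As a rough C++ counterpart of the merge pseudocode, here is a minimal sketch. The function name `mergeSorted` is hypothetical, indices start at 0 instead of 1, and the sketch adds the bounds checks the pseudocode leaves implicit (what to do once A or B runs out). It also uses `<=` instead of `<`, so that equal elements are taken from A first.

```cpp
#include <cstddef>
#include <vector>

// Merge two already-sorted arrays A and B into a new sorted array C.
std::vector<int> mergeSorted(const std::vector<int>& A, const std::vector<int>& B) {
    std::vector<int> C(A.size() + B.size());
    std::size_t i = 0, j = 0;
    for (std::size_t k = 0; k < C.size(); ++k) {
        // Take from A when B is exhausted, or when A still has elements
        // and its current element is not larger than B's.
        if (j == B.size() || (i < A.size() && A[i] <= B[j])) {
            C[k] = A[i++];
        } else {
            // Otherwise A is exhausted or B(j) < A(i): take from B.
            C[k] = B[j++];
        }
    }
    return C;
}
```

Each pass through the loop writes exactly one element of C, which is why the cost grows linearly with the output size.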

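Merge sort running time claim:

Merging an array of m numbers takes <= 4m + 2 <= 6m operations (since m >= 1).

The recursion tree has log2(n) + 1 levels, and the merges on each level handle n numbers in total, i.e. at most 6n operations per level. So merge sort requires <= 6n log2(n) + 6n operations to sort n numbers.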
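To sanity-check the claim numerically, one can add up the merge costs over the whole recursion and compare the total with the bound. The sketch below assumes a simple cost model that is mine, not the course's exact accounting: a merge of m numbers is charged 4m + 2 operations and a single element is charged nothing. The names `mergeSortCost` and `claimedBound` are hypothetical.

```cpp
#include <cmath>

// Total merge cost of sorting n numbers under the assumed model:
// each merge of m numbers costs 4m + 2, a single element costs 0.
long long mergeSortCost(long long n) {
    if (n <= 1) return 0;
    const long long half = n / 2;
    return mergeSortCost(half) + mergeSortCost(n - half) + 4 * n + 2;
}

// The bound from the claim: 6 n log2(n) + 6 n.
double claimedBound(long long n) {
    return 6.0 * n * std::log2(static_cast<double>(n)) + 6.0 * n;
}
```

For n = 8 the model gives 110 operations, comfortably below the bound of 192, which fits the fact that 6m is a loose overestimate of 4m + 2.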






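First, we need to understand what the inversion number means:

In a permutation, if the positions of a pair of numbers are in the opposite order to their sizes, i.e. the earlier number is greater than the later one, the pair is called an inversion. The total number of inversions in a permutation is called its inversion number. A permutation with an even inversion number is called an even permutation; one with an odd inversion number is called an odd permutation. For example, in 2431 the inversions are 21, 43, 41, and 31, so the inversion number is 4 and it is an even permutation.

In other words, for n distinct elements, first fix a standard order among them (for example, for n distinct natural numbers, ascending order can be taken as the standard). Then, in any permutation of these n elements, whenever two elements appear in an order different from the standard order, that counts as 1 inversion. The total number of inversions in a permutation is called the inversion number of the permutation. (Source: Baidu Baike, 百度百科)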
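The definition translates directly into a brute-force check of every pair. The sketch below is only an illustration of the definition; the names `countInversionsBrute` and `isEvenPermutation` are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Count pairs (i, j) with i < j and p[i] > p[j] by checking every pair: O(n^2).
long long countInversionsBrute(const std::vector<int>& p) {
    long long count = 0;
    for (std::size_t i = 0; i < p.size(); ++i)
        for (std::size_t j = i + 1; j < p.size(); ++j)
            if (p[i] > p[j]) ++count;
    return count;
}

// A permutation is even when its inversion number is even.
bool isEvenPermutation(const std::vector<int>& p) {
    return countInversionsBrute(p) % 2 == 0;
}
```

For the permutation 2431, `countInversionsBrute({2, 4, 3, 1})` gives 4 and `isEvenPermutation` returns true, matching the example above.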




Reposted from: http://shanks-chan.blogspot.com/2012/04/merge-sort-inversion.html

merge sort & inversion 

Inversion count
Definition:
For an array a[], a pair of indices (i, j) with i < j and a[i] > a[j] is an inversion.
e.g.
{1,3,5,2,4,6} has 3 inversions,
i.e. (3,2) (5,2) (5,4)

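The first chapter of the Stanford open course covers mergesort as an example of the divide and conquer idea. Last semester, when I sat in on the algorithms class next door, I heard about internal sorting but did not really understand it deeply. Looking at mergesort again now, it really does show the divide and conquer idea to the full: split a big problem into small ones and solve them by calling the function recursively.
The rough idea of mergesort is to split an array evenly into left and right subarrays, call the sort function itself on each, get back two sorted arrays, and then merge them. The time complexity is O(n·log n).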

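With a small change to the mergesort algorithm, we can compute the number of inversions in a set of numbers.
Simulating it naively with nested for loops costs O(n^2), i.e. the number of ways to choose 2 out of n numbers.
Continuing from above: when an array is split into a left array and a right array, its inversions (i, j), with positions numbered from 1, fall into 3 groups:
1. left inversion: i and j are both in the left array, i.e. i, j <= n/2
2. right inversion: i and j are both in the right array, i.e. i, j > n/2
3. split inversion: i is in the left array and j is in the right array, i.e. i <= n/2 ; j > n/2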

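Count the inversions of the left array and of the right array, then add the split inversions, and you get the inversion number of the whole array. The key problem becomes counting the split inversions. Suppose the left and right arrays are already sorted (sorted B[], C[]). While merging them the mergesort way, each time an element is taken from C[] and placed into the merged array (D[]),
add the number of elements still remaining in B (if there were no split inversions, every element of B would be no greater than every element of C and B would be used up first; each element still left in B is greater than the element just taken from C, so it forms an inversion with it).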
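The counting step can be sketched on its own, separate from the recursion. This is a minimal sketch with assumptions of my own: the function name `mergeAndCountSplit` is hypothetical, B and C are assumed to be sorted already, and a 64-bit `long long` is used for the count. Whenever an element of C is copied into D, the number of elements still left in B is added to the total.

```cpp
#include <cstddef>
#include <vector>

// Merge sorted B and C into D and return the number of split inversions,
// i.e. pairs (b, c) with b in B, c in C and b > c.
long long mergeAndCountSplit(const std::vector<int>& B, const std::vector<int>& C,
                             std::vector<int>& D) {
    D.clear();
    D.reserve(B.size() + C.size());
    long long split = 0;
    std::size_t i = 0, j = 0;
    while (i < B.size() && j < C.size()) {
        if (B[i] <= C[j]) {
            D.push_back(B[i++]);
        } else {
            // C[j] is smaller than every element still left in B,
            // so each of those elements forms one inversion with it.
            D.push_back(C[j++]);
            split += static_cast<long long>(B.size() - i);
        }
    }
    // Copy whatever is left; no further inversions are found here.
    while (i < B.size()) D.push_back(B[i++]);
    while (j < C.size()) D.push_back(C[j++]);
    return split;
}
```

With B = {1,3,5} and C = {2,4,6} this returns 3, the same split inversions (3,2) (5,2) (5,4) listed earlier for {1,3,5,2,4,6}.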

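The original course exercise is to count the inversions in a sample test file (txt) of 100000 integers. This mergesort template feels a little odd (because of its parameter list). I started with int count and it overflowed... Eyeballing the result, it did not seem to overflow by much, so I switched to unsigned int and it passed~ (Note that 100000 numbers can have up to 100000 * 99999 / 2 = 4999950000 inversions, which is more than a 32-bit unsigned int can hold, so a 64-bit counter is the safe choice in general.)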

The code:
#include <cstdio>
#include <fstream>
using namespace std;

const int MAXN = 100001;
int tmp[MAXN], a[MAXN];

// a: array to sort; tmp: scratch buffer shared by all calls (saves time by
// not allocating a new array in every recursive call); count: running total.
// unsigned int was enough for the author's data, but the worst case for
// 100000 numbers (4999950000 inversions) needs a 64-bit type.
void mergesort(int a[], int tmp[], int left, int right, unsigned int& count) {
    if (left >= right) return;  // base case: at most one element
    int mid = left + (right - left) / 2;
    mergesort(a, tmp, left, mid, count);
    mergesort(a, tmp, mid + 1, right, count);  // sort both halves recursively
    for (int i = left; i <= right; i++) tmp[i] = a[i];
    int i = left;
    int j = mid + 1;
    for (int k = left; k <= right; k++) {  // merge the two sorted halves
        if (i > mid) a[k] = tmp[j++];         // left half exhausted
        else if (j > right) a[k] = tmp[i++];  // right half exhausted
        else if (tmp[i] > tmp[j]) {
            a[k] = tmp[j++];
            // every element still left in the left half forms an
            // inversion with the element just taken from the right half
            count += mid + 1 - i;
        }
        else a[k] = tmp[i++];
    }
}

int main() {
    ifstream file("d:\\IntegerArray.txt");
    if (!file) {
        printf("exception!\n");
        return 1;
    }
    int n = 0;
    unsigned int count = 0;
    int x;
    while (n < MAXN && file >> x)  // read the integers directly
        a[n++] = x;
    if (n > 0) mergesort(a, tmp, 0, n - 1, count);  // sort only what was read
    printf("%u\n", count);
    return 0;
}
