JavaShowAlgorithm-Median of Two Sorted Arrays

来源:互联网 发布:iphone翻墙用什么软件 编辑:程序博客网 时间:2024/06/15 20:17

题目:

There are two sorted arrays A and B of size m and n respectively. Find the median of the two sorted arrays. The overall run time complexity should be O(log (m+n)).

 题解:

首先我们先明确什么是median,即中位数。

引用Wikipedia对中位数的定义:

计算有限个数的数据的中位数的方法是:把所有的同类数据按照大小的顺序排列。如果数据的个数是奇数,则中间那个数据就是这群数据的中位数;如果数据的个数是偶数,则中间那2个数据的算术平均值就是这群数据的中位数。

因此,在计算中位数Median时候,需要根据奇偶分类讨论。

解决此题的方法可以依照:寻找一个unioned sorted array中的第k大(从1开始数)的数。因而等价于寻找并判断两个sorted array中第k/2(从1开始数)大的数。

特殊化到求median,那么对于奇数来说,就是求第(m+n)/2+1(从1开始数)大的数。

而对于偶数来说,就是求第(m+n)/2大(从1开始数)和第(m+n)/2+1大(从1开始数)的数的算术平均值。

那么如何判断两个有序数组A,B中第k大的数呢?

我们需要判断A[k/2-1]和B[k/2-1]的大小。

如果A[k/2-1]==B[k/2-1],那么这个数就是两个数组中第k大的数。

如果A[k/2-1]<B[k/2-1], 那么说明A[0]到A[k/2-1]都不可能是第k大的数,所以需要舍弃这一半,继续从A[k/2]到A[A.length-1]继续找。当然,因为这里舍弃了A[0]到A[k/2-1]这k/2个数,那么第k大也就变成了,第k-k/2个大的数了。

如果 A[k/2-1]>B[k/2-1],就做之前对称的操作就好。

 这样整个问题就迎刃而解了。

当然,边界条件页不能少,需要判断是否有一个数组长度为0,以及k==1时候的情况。

因为除法是向下取整,并且页为了方便起见,对每个数组的分半操作采取:

int partA = Math.min(k/2,m);
int partB = k - partA;

 为了能保证上面的分半操作正确,需要保证A数组的长度小于B数组的长度。

同时,在返回结果时候,注意精度问题,返回double型的就好。

public static double findMedianSortedArrays(int A[], int B[]) {    int m = A.length;    int n = B.length;    int total = m+n;    if (total%2 != 0)        return (double) findKth(A, 0, m-1, B, 0, n-1, total/2+1);//k传得是第k个,index实则k-1    else {        double x = findKth(A, 0, m-1, B, 0, n-1, total/2);//k传得是第k个,index实则k-1        double y = findKth(A, 0, m-1, B, 0, n-1, total/2+1);//k传得是第k个,index实则k-1        return (double)(x+y)/2;    }} public static int findKth(int[] A, int astart, int aend, int[] B, int bstart, int bend, int k) {    int m = aend - astart + 1;    int n = bend - bstart + 1;        if(m>n)        return findKth(B,bstart,bend,A,astart,aend,k);    if(m==0)        return B[k-1];    if(k==1)        return Math.min(A[astart],B[bstart]);        int partA = Math.min(k/2,m);    int partB = k - partA;    if(A[astart+partA-1] < B[bstart+partB-1])        return findKth(A,astart+partA,aend,B,bstart,bend,k-partA);    else if(A[astart+partA-1] > B[bstart+partB-1])        return findKth(A,astart,aend,B,bstart+partB,bend,k-partB);    else        return A[astart+partA-1];    }
题目是这样的:给定两个已经排序好的数组(可能为空),找到两者所有元素中第k大的元素。另外一种更加具体的形式是,找到所有元素的中位数。本篇文章我们只讨论更加一般性的问题:如何找到两个数组中第k大的元素?不过,测试是用的两个数组的中位数的题目,Leetcode第4题 Median of Two Sorted Arrays
方案1:假设两个数组总共有n个元素,那么显然我们有用O(n)时间和O(n)空间的方法:用merge sort的思路排序,排序好的数组取出下标为k-1的元素就是我们需要的答案。
这个方法比较容易想到,但是有没有更好的方法呢?
方案2:我们可以发现,现在我们是不需要“排序”这么复杂的操作的,因为我们仅仅需要第k大的元素。我们可以用一个计数器,记录当前已经找到第m大的元素了。同时我们使用两个指针pA和pB,分别指向A和B数组的第一个元素。使用类似于merge sort的原理,如果数组A当前元素小,那么pA++,同时m++。如果数组B当前元素小,那么pB++,同时m++。最终当m等于k的时候,就得到了我们的答案——O(k)时间,O(1)空间。
但是,当k很接近于n的时候,这个方法还是很费时间的。当然,我们可以判断一下,如果k比n/2大的话,我们可以从最大的元素开始找。但是如果我们要找所有元素的中位数呢?时间还是O(n/2)=O(n)的。有没有更好的方案呢?
我们可以考虑从k入手。如果我们每次都能够剔除一个一定在第k大元素之前的元素,那么我们需要进行k次。但是如果每次我们都剔除一半呢?所以用这种类似于二分的思想,我们可以这样考虑:

Assume that the number of elements in A and B are both larger than k/2, and if we compare the k/2-th smallest element in A(i.e. A[k/2-1]) and the k-th smallest element in B(i.e. B[k/2 - 1]), there are three results:
(Becasue k can be odd or even number, so we assume k is even number here for simplicy. The following is also true when k is an odd number.)
A[k/2-1] = B[k/2-1]
A[k/2-1] > B[k/2-1]
A[k/2-1] < B[k/2-1]
if A[k/2-1] < B[k/2-1], that means all the elements from A[0] to A[k/2-1](i.e. the k/2 smallest elements in A) are in the range of k smallest elements in the union of A and B. Or, in the other word, A[k/2 - 1] can never be larger than the k-th smalleset element in the union of A and B.

Why?
We can use a proof by contradiction. Since A[k/2 - 1] is larger than the k-th smallest element in the union of A and B, then we assume it is the (k+1)-th smallest one. Since it is smaller than B[k/2 - 1], then B[k/2 - 1] should be at least the (k+2)-th smallest one. So there are at most (k/2-1) elements smaller than A[k/2-1] in A, and at most (k/2 - 1) elements smaller than A[k/2-1] in B.So the total number is k/2+k/2-2, which, no matter when k is odd or even, is surly smaller than k(since A[k/2-1] is the (k+1)-th smallest element). So A[k/2-1] can never larger than the k-th smallest element in the union of A and B if A[k/2-1]<B[k/2-1];
Since there is such an important conclusion, we can safely drop the first k/2 element in A, which are definitaly smaller than k-th element in the union of A and B. This is also true for the A[k/2-1] > B[k/2-1] condition, which we should drop the elements in B.
When A[k/2-1] = B[k/2-1], then we have found the k-th smallest element, that is the equal element, we can call it m. There are each (k/2-1) numbers smaller than m in A and B, so m must be the k-th smallest number. So we can call a function recursively, when A[k/2-1] < B[k/2-1], we drop the elements in A, else we drop the elements in B.


We should also consider the edge case, that is, when should we stop?
1. When A or B is empty, we return B[k-1]( or A[k-1]), respectively;
2. When k is 1(when A and B are both not empty), we return the smaller one of A[0] and B[0]
3. When A[k/2-1] = B[k/2-1], we should return one of them

In the code, we check if m is larger than n to garentee that the we always know the smaller array, for coding simplicy

double findKth(int a[], int m, int b[], int n, int k){//always assume that m is equal or smaller than nif (m > n)return findKth(b, n, a, m, k);if (m == 0)return b[k - 1];if (k == 1)return min(a[0], b[0]);//divide k into two partsint pa = min(k / 2, m), pb = k - pa;if (a[pa - 1] < b[pb - 1])return findKth(a + pa, m - pa, b, n, k - pa);else if (a[pa - 1] > b[pb - 1])return findKth(a, m, b + pb, n - pb, k - pb);elsereturn a[pa - 1];}class Solution{public:double findMedianSortedArrays(int A[], int m, int B[], int n){int total = m + n;if (total & 0x1)return findKth(A, m, B, n, total / 2 + 1);elsereturn (findKth(A, m, B, n, total / 2)+ findKth(A, m, B, n, total / 2 + 1)) / 2;}};


原创粉丝点击