Longest Consecutive Sequence (O(n) Algorithm)
Source: Internet. Editor: 程序博客网. Time: 2024/05/20 20:01
/*****************************************************************************
 * File: LongestRange.java
 * Author: Keith Schwarz (htiek@cs.stanford.edu)
 *
 * An algorithm for finding the longest range of consecutive integers
 * contained in an array of unsorted integers.  For example, if the input
 * array is
 *
 *    16 1 12 5 4 10 2 11 13 3 15
 *
 * the longest range of consecutive integers would be 1, 2, 3, 4, 5, since all
 * of those values appear at least once in the array.  Notice that while the
 * values 10, 11, 12, 13, 15, 16 all appear in the array, the value 14 is
 * missing, so the range 10 - 16 is not a valid range.
 *
 * There are many algorithms we could use to solve this problem.  First, we
 * could sort the array in O(n log n) time, then look for the longest
 * contiguous sorted range in the array using a single pass over the array.
 * This would take O(n log n) time for the sorting, plus an extra O(n) pass
 * over the array, for a net runtime of O(n log n) time.
 *
 * However, since we do know that the elements of the array are integers, we
 * could improve this by using radix sort to get the runtime down to
 * O(n lg U), where U is the largest possible value in the array.
 *
 * Alternatively, we could consider an entirely different approach based on
 * hashing.  Suppose that we could build up a hash table containing a copy of
 * every value in the input array.  Assuming that we have a hash function that
 * works in constant time (a reasonable assumption for fixed-width integers
 * on a standard computer), this means that given any element k of the array,
 * we can query in expected O(1) time whether k + 1 or k - 1 is in the array
 * by simply looking those values up in the hash table.  Using this, we can
 * build a (fairly inefficient) algorithm that works as follows.  First, add
 * all of the array elements to a hash table.  Then, make a second pass over
 * the array.  We can then see how long the range containing the current
 * element x is using the following logic.
 *
 * Every element x is part of some range, which might contain just x, but
 * which may also contain some elements greater than x and some elements less
 * than x.  In other words, given an arbitrary element x, there is some range
 * containing x that looks something like this:
 *
 *           +--------------+---+--------------+
 *           | elements < x | x | elements > x |
 *           +--------------+---+--------------+
 *
 * Given the element x, how can we tell how many elements of its range are
 * greater than x or less than x?  One idea would be as follows.  Remember
 * that given the value of x, we can check whether x + 1 or x - 1 is
 * contained somewhere in the array in (expected) O(1) time.  This means that
 * we can find all of the elements contained in the same range as x as
 * follows: First, look up x + 1 in the hash table.  If it's contained in the
 * table, then look up x + 2.  If x + 2 is in the table, then look up x + 3,
 * etc.  More generally, we can count the number of elements of the range
 * that are greater than x as follows:
 *
 *    Set numGreater = 0
 *    Set i = 1
 *    While x + i is contained in the hash table:
 *       Set numGreater = numGreater + 1
 *       Set i = i + 1
 *
 * This works by counting up from x + 1 as long as the elements are contained
 * in the table, then recording how many elements we found.
 *
 * Similarly, we can count how many elements are smaller than x as follows:
 *
 *    Set numSmaller = 0
 *    Set i = 1
 *    While x - i is contained in the hash table:
 *       Set numSmaller = numSmaller + 1
 *       Set i = i + 1
 *
 * By running both of these loops, we can count the total number of elements
 * in the range containing x.  This gives the following (fairly inefficient)
 * algorithm for finding the longest range:
 *
 *    Set longest = 0
 *    Insert all of the array elements into a hash table.
 *    For each element x of the array, in any order:
 *       Using the above subroutines, compute numGreater and numSmaller
 *       Set longest = max(longest, 1 + numGreater + numSmaller)
 *    Return longest
 *
 * Here, we scan across the array, computing the length of the range
 * containing each number, which is equal to the number of elements above and
 * below the current array element plus one (because we have to account for
 * the array element itself).
 *
 * Now, let's analyze the runtime of this algorithm.  Inserting all n
 * elements into the hash table takes expected O(n) time.  We then make a
 * pass over the array, computing range lengths.  For each element x, the
 * total number of hash lookups required to find the number of values greater
 * than x is equal to g + 1, where g is the number of elements of the range
 * greater than x (since we do a total of g successful queries and 1 failing
 * query).  Similarly, we do l + 1 lookups to find the l elements smaller
 * than x.  This gives a total of g + l + 2 = S(x) + 1 lookups per element,
 * where S(x) = g + l + 1 is the total number of elements in the same range
 * as x.
 *
 * So how does this translate into a total runtime?  Recall that each element
 * belongs to exactly one range, so we can number the ranges contained in the
 * array as R_1, R_2, ..., R_j.  Whenever we process an element in the array,
 * we end up doing |R_i| + 1 hash table lookups, where R_i is the range
 * containing the current array element.  Since there are |R_i| elements in
 * the range R_i, the total number of hash lookups done to process elements
 * in R_i is |R_i|(|R_i| + 1).  Thus the total number of hash lookups done
 * by the function is
 *
 *           j
 *          sum  |R_i|(|R_i| + 1)
 *         i = 1
 *
 * If each array element is in its own range, this takes O(n) time, but if
 * everything is contained in a single range (that is, the array is a
 * permutation of some contiguous range), then there is one range and we end
 * up spending O(n^2) time processing it.  This is much worse than the old
 * algorithm based on sorting!
 *
 * Fortunately, we can improve this runtime bound by using a fairly
 * straightforward optimization.  Once we've computed the length of a given
 * range in the array, it makes no sense to ever recompute it, since that
 * work would be wasted.  To fix this, as we compute the length of a range,
 * we will remove all of the elements of that range from the hash table.
 * Additionally, as we make our scan over the input array, we will first
 * confirm that the array element x is still contained in the table.  If so,
 * then we know that we haven't processed the range containing x yet and
 * should go process it.  Otherwise, the range holding x has already been
 * considered, so we can skip that array element.
 *
 * The new version of the logic looks like this:
 *
 *    Set longest = 0
 *    Insert all of the array elements into a hash table.
 *    For each element x of the array, in any order:
 *       If x is still contained in the hash table:
 *          Using the above subroutines, compute numGreater and numSmaller
 *          In doing so, remove each element found this way from the table.
 *          Set longest = max(longest, 1 + numGreater + numSmaller)
 *    Return longest
 *
 * This version of the algorithm now visits each range once, but introduces
 * one more hash lookup per element.  To count the total number of hash
 * lookups performed during this algorithm, we can split the hash lookups
 * into two classes: lookups done to determine whether to process a range,
 * and lookups done to actually process the range.  There are a total of n
 * lookups in the first category.  In the second category, recall that
 * |R_i| + 1 hash lookups are done when processing range R_i.  Since each
 * range is processed exactly once, the total number of hash lookups done to
 * process ranges is given by
 *
 *           j
 *          sum  (|R_i| + 1) = n + j
 *         i = 1
 *
 * Here, the n term comes from the fact that each element is in exactly one
 * range, so summing the lengths of the ranges gives n elements.  Thus the
 * total number of hash lookups done by this algorithm is 2n + j <= 3n,
 * which is O(n).  The expected runtime of this algorithm is thus O(n).
 *
 * Notice that this algorithm does not have a lg U term in its runtime, even
 * though the hash function must process all the bits of the numbers to
 * hash.  This is because we assume the computer can perform operations on
 * lg U bits in a single operation.  Thus unlike the radix sort-based version
 * of this algorithm, in which the number of bits manifests itself explicitly
 * in the number of iterations required, the hash-based version can
 * conveniently sweep the number of bits used under the rug because the
 * machine can operate on blocks of bits as a unit.
 *
 * There is one final detail to consider here: what happens if an integer
 * overflow or underflow occurs?  For example, suppose that our array holds
 * the values Integer.MAX_VALUE and Integer.MIN_VALUE.  Then a naive
 * implementation would incorrectly report that the longest range has size 2,
 * because when processing Integer.MAX_VALUE it would note that
 * Integer.MAX_VALUE + 1 wraps around to Integer.MIN_VALUE, which is indeed
 * contained in the array.  To fix this, we need to insert some additional
 * checks into our code for finding the number of elements greater than or
 * less than the current array element so that we don't end up overflowing
 * or underflowing.
 */
import java.util.*;

public final class LongestRange {
    /* This is a utility class and should not be instantiated. */
    private LongestRange() {
        /* Not intended to be used */
    }

    /**
     * Returns the length of the longest continuous range of values that are
     * all present in the input array.
     *
     * @param arr The array in which the search should be conducted.
     * @return The length of the longest continuous range of values in arr.
     * @throws NullPointerException if arr is null.
     */
    public static int longestRange(int[] arr) {
        /* Begin by creating a hash table that holds all of the array
         * elements.
         */
        Set<Integer> values = new HashSet<Integer>();
        for (int value: arr)
            values.add(value);

        /* Keep track of the longest range we've seen so far.  Initially,
         * this is the empty range.
         */
        int longest = 0;

        /* Scan across the array, searching for the longest range of values
         * contained in that array.
         */
        for (int value: arr) {
            /* To avoid unnecessary work, don't process this element if
             * we already did.  To mark that it has been processed, we
             * remove the element.  Since Java's Set#remove function
             * returns whether the element was removed successfully, we
             * can combine the test/remove operation into one.
             */
            if (!values.remove(value)) continue;

            /* Track how many total elements are in the range containing
             * the current element.  Initially this is one, because the
             * range contains this element.
             */
            int rangeLength = 1;

            /* See how many elements are greater than the current value
             * and contained in the range.  To avoid integer overflow,
             * at each step we check whether the element we're about to
             * probe is greater than the current element; on an overflow,
             * this will be false.
             */
            for (int i = 1; value + i > value; ++i) {
                /* Again, combine the test/remove operation into
                 * one.
                 */
                if (!values.remove(value + i)) break;
                ++rangeLength;
            }

            /* Using similar logic, see how many elements in the range
             * are smaller than the current value.  On an underflow, the
             * loop condition will be false.
             */
            for (int i = 1; value - i < value; ++i) {
                if (!values.remove(value - i)) break;
                ++rangeLength;
            }

            /* Update the length of the longest range we've seen so far. */
            if (longest < rangeLength) longest = rangeLength;
        }

        /* Hand back the length of the longest range we encountered. */
        return longest;
    }
}
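The header comment mentions a sorting-based alternative that runs in O(n log n) time but does not show it. The following is a minimal sketch of that idea, mainly useful as a baseline for cross-checking the hash-based result on small inputs. The class name `LongestRangeBySorting` and the method name `longestRangeSorted` are hypothetical and not part of the original file; comparing neighbors in `long` arithmetic is one assumed way to sidestep the overflow issue discussed above.

```java
import java.util.Arrays;

/* Hypothetical helper, not part of the original LongestRange.java.  It
 * sketches the sorting-based approach described in the header comment.
 */
final class LongestRangeBySorting {
    private LongestRangeBySorting() {
        /* Utility class; not intended to be instantiated. */
    }

    /* Returns the length of the longest run of consecutive integers present
     * in arr, using an O(n log n) sort followed by a single O(n) pass.
     */
    static int longestRangeSorted(int[] arr) {
        if (arr.length == 0) return 0;

        /* Sort a copy so that the caller's array is left untouched. */
        int[] sorted = arr.clone();
        Arrays.sort(sorted);

        int longest = 1;
        int current = 1;
        for (int i = 1; i < sorted.length; ++i) {
            /* Duplicates neither extend nor break the current run. */
            if (sorted[i] == sorted[i - 1]) continue;

            /* Take the difference in long arithmetic so that the gap
             * between very small and very large ints cannot wrap around.
             */
            if ((long) sorted[i] - sorted[i - 1] == 1) {
                ++current;
            } else {
                current = 1;
            }
            if (current > longest) longest = current;
        }
        return longest;
    }
}
```

On the example array from the header comment (16 1 12 5 4 10 2 11 13 3 15) this sketch returns 5, and on an array holding only Integer.MAX_VALUE and Integer.MIN_VALUE it returns 1, which should agree with what `LongestRange.longestRange` reports for the same inputs.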