Peaks Complexity
Source: Internet | Published: Basic principles of time-series data classification | Editor: 程序博客网 | Time: 2024/05/21 07:00
I've just done the following Codility Peaks problem. The problem is as follows:
A non-empty zero-indexed array A consisting of N integers is given. A peak is an array element which is larger than its neighbors. More precisely, it is an index P such that 0 < P < N − 1, A[P − 1] < A[P] and A[P] > A[P + 1]. For example, the following array A:
A[0] = 1, A[1] = 2, A[2] = 3, A[3] = 4, A[4] = 3, A[5] = 4, A[6] = 1, A[7] = 2, A[8] = 3, A[9] = 4, A[10] = 6, A[11] = 2
has exactly three peaks: 3, 5, 10. We want to divide this array into blocks containing the same number of elements. More precisely, we want to choose a number K that will yield the following blocks:
A[0], A[1], ..., A[K − 1],
A[K], A[K + 1], ..., A[2K − 1],
...
A[N − K], A[N − K + 1], ..., A[N − 1].
What's more, every block should contain at least one peak. Notice that extreme elements of the blocks (for example A[K − 1] or A[K]) can also be peaks, but only if they have both neighbors (including one in an adjacent block). The goal is to find the maximum number of blocks into which the array A can be divided. Array A can be divided into blocks as follows:
one block (1, 2, 3, 4, 3, 4, 1, 2, 3, 4, 6, 2). This block contains three peaks.
two blocks (1, 2, 3, 4, 3, 4) and (1, 2, 3, 4, 6, 2). Every block has a peak.
three blocks (1, 2, 3, 4), (3, 4, 1, 2), (3, 4, 6, 2). Every block has a peak.
Notice in particular that the first block (1, 2, 3, 4) has a peak at A[3], because A[2] < A[3] > A[4], even though A[4] is in the adjacent block. However, array A cannot be divided into four blocks, (1, 2, 3), (4, 3, 4), (1, 2, 3) and (4, 6, 2), because the (1, 2, 3) blocks do not contain a peak. Notice in particular that the (4, 3, 4) block contains two peaks: A[3] and A[5]. The maximum number of blocks that array A can be divided into is three.
Write a function: class Solution { public int solution(int[] A); } that, given a non-empty zero-indexed array A consisting of N integers, returns the maximum number of blocks into which A can be divided. If A cannot be divided into some number of blocks, the function should return 0. For example, given:
A[0] = 1, A[1] = 2, A[2] = 3, A[3] = 4, A[4] = 3, A[5] = 4, A[6] = 1, A[7] = 2, A[8] = 3, A[9] = 4, A[10] = 6, A[11] = 2
the function should return 3, as explained above. Assume that:
N is an integer within the range [1..100,000]; each element of array A is an integer within the range [0..1,000,000,000].
Complexity:
expected worst-case time complexity is O(N*log(log(N)))
expected worst-case space complexity is O(N), beyond input storage (not counting the storage required for input arguments).
Elements of input arrays can be modified.
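The worked example in the statement can be checked with a small brute-force sketch. The helper names `find_peaks` and `blocks_all_have_peak` are illustrative assumptions, not part of the Codility task, and the code favors clarity over the required complexity.

```python
def find_peaks(a):
    """Return every index P with 0 < P < len(a) - 1 and a[P-1] < a[P] > a[P+1]."""
    return [p for p in range(1, len(a) - 1) if a[p - 1] < a[p] > a[p + 1]]


def blocks_all_have_peak(a, k):
    """True if len(a) is divisible by k and every block of k elements holds a peak."""
    n = len(a)
    if k <= 0 or n % k != 0:
        return False
    # A peak at index p belongs to block number p // k.
    blocks_with_peak = {p // k for p in find_peaks(a)}
    return len(blocks_with_peak) == n // k


example = [1, 2, 3, 4, 3, 4, 1, 2, 3, 4, 6, 2]
print(find_peaks(example))  # [3, 5, 10]
# Block counts that work for the example, smallest block size first.
print([len(example) // k for k in range(1, 13) if blocks_all_have_peak(example, k)])  # [3, 2, 1]
```

A block size of 3 fails here because the blocks (1, 2, 3) hold no peak, which matches the statement's explanation of why four blocks are impossible.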
My Question
So I solved this with what appears to me to be the brute-force solution: go through every group size from 1..N, and check whether every group has at least one peak. For the first 15 minutes I tried to figure out some more optimal way, since the required complexity is O(N*log(log(N))).
This is my "brute-force" code that passes all the tests, including the large ones, for a score of 100/100:
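    import java.util.ArrayList;

    class Solution {
        public int solution(int[] A) {
            int N = A.length;
            // Collect the indices of all peaks in increasing order.
            ArrayList<Integer> peaks = new ArrayList<Integer>();
            for (int i = 1; i < N - 1; i++) {
                if (A[i] > A[i - 1] && A[i] > A[i + 1]) peaks.add(i);
            }
            // Try block sizes from smallest to largest; the first that works
            // gives the maximum number of blocks.
            for (int size = 1; size <= N; size++) {
                if (N % size != 0) continue;
                int find = 0;            // index of the next block still missing a peak
                int groups = N / size;
                boolean ok = true;
                for (int peakIdx : peaks) {
                    if (peakIdx / size > find) {   // block 'find' was skipped: no peak in it
                        ok = false;
                        break;
                    }
                    if (peakIdx / size == find) find++;
                }
                if (find != groups) ok = false;
                if (ok) return groups;
            }
            return 0;
        }
    }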
My question is: how do I deduce that this is in fact O(N*log(log(N)))? It's not at all obvious to me, and I was surprised that it passes the test cases. I'm looking for even the simplest complexity proof sketch that would convince me of this runtime. I would assume that a log(log(N)) factor means some kind of reduction of a problem by a square root on each iteration, but I have no idea how this applies to my problem. Thanks a lot for any help.
4 Answers
You're completely right: to get the log log performance the problem needs to be reduced.
An O(n.log(log(n))) solution in Python is given below. Codility no longer tests 'performance' on this problem (!) but the Python solution scores 100% for accuracy.
As you've already surmised: the outer loop is O(n), since it tests whether each block size is a clean divisor. The inner loops, summed over all the block sizes that are divisors, must then cost O(n.log(log(n))) to give O(n.log(log(n))) overall.
We can get good inner loop performance because the inner loop only runs for the d(n) block sizes that divide n, where d(n) is the number of divisors of n. We can store a prefix sum of peaks-so-far, which uses the O(n) space allowed by the problem specification. Checking whether a peak has occurred in each 'group' is then an O(1) lookup operation using the group start and end indices.
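A minimal sketch of that prefix-sum lookup, assuming a `prefix` array with one extra trailing entry so that half-open ranges need no special case at the end of the array. The names `peak_prefix_counts` and `block_has_peak` are hypothetical, chosen for illustration only.

```python
def peak_prefix_counts(a):
    """prefix[i] = number of peaks among indices 0 .. i-1 (len(a) + 1 entries)."""
    n = len(a)
    prefix = [0] * (n + 1)
    for i in range(n):
        # The first and last elements can never be peaks.
        is_peak = 0 < i < n - 1 and a[i - 1] < a[i] > a[i + 1]
        prefix[i + 1] = prefix[i] + (1 if is_peak else 0)
    return prefix


def block_has_peak(prefix, start, end):
    """O(1) check for at least one peak in the half-open index range [start, end)."""
    return prefix[end] > prefix[start]


example = [1, 2, 3, 4, 3, 4, 1, 2, 3, 4, 6, 2]
prefix = peak_prefix_counts(example)
print(block_has_peak(prefix, 0, 4))  # True: the peak at index 3
print(block_has_peak(prefix, 0, 3))  # False: (1, 2, 3) has no peak
```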
Following this logic, when the candidate block size is 3 the loop needs to perform n / 3 peak checks. The complexity becomes a sum: n/a + n/b + ... + n/n where the denominators (a, b, ...) are the factors of n.
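Written out, and assuming every divisor of n is tested with one O(1) check per block, the total cost T(n) (a symbol introduced here for illustration) is the O(n) divisibility scan plus the sum above. Because d ↦ n/d maps the set of divisors of n onto itself, that sum is the divisor-sum function σ(n):

```latex
\[
T(n) \;=\; O(n) + \sum_{d \mid n} \frac{n}{d}
     \;=\; O(n) + \sum_{d \mid n} d
     \;=\; O(n) + \sigma(n)
\]
% Example for n = 12:
% 12/1 + 12/2 + 12/3 + 12/4 + 12/6 + 12/12 = 12 + 6 + 4 + 3 + 2 + 1 = 28 = \sigma(12)
```

An implementation that skips some divisors or stops at the first valid block size does less work than this, so the bound still applies.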
Short story: the sum n/a + n/b + ... + n/n over the factors of n equals σ(n), the sum of the divisors of n, and σ(n) is O(n.log(log(n))).
Longer version: If you've been doing the Codility Lessons you'll remember from Lesson 8: Prime and composite numbers that the harmonic sum 1 + 1/2 + ... + 1/n is O(log(n)). We've got a reduced set, because we're only looking at factor denominators. Lesson 9: Sieve of Eratosthenes shows that the sum of reciprocals of the primes up to n is O(log(log(n))) and claims that 'the proof is non-trivial'. In this case Wikipedia tells us that the sum of divisors σ(n) is bounded above by a constant times n.log(log(n)) (see Robin's inequality, about half way down the page; the unconditional form of the bound is Grönwall's theorem).
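To get a feel for the bound, the following sketch computes σ(n) by trial division and prints it next to n·ln(ln(n)) for the largest N the problem allows. The function name `divisor_sum` is an assumption for illustration, and the comparison is a numerical sanity check rather than a proof.

```python
import math


def divisor_sum(n):
    """sigma(n): the sum of all positive divisors of n, by trial division up to sqrt(n)."""
    total = 0
    d = 1
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:      # avoid counting a square root twice
                total += n // d
        d += 1
    return total


n = 100_000
# Number of block checks if every divisor is tested, versus n * ln(ln(n)).
print(divisor_sum(n), n * math.log(math.log(n)))
```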
Does that completely answer your question? Suggestions on how to improve my python code are also very welcome!
    def solution(data):
        length = len(data)
        # array ends can't be peaks, so length < 3 must return 0
        if length < 3:
            return 0
        peaks = [0] * length
        # compute a list of 'peaks to the left' in O(n) time
        for index in range(2, length):
            peaks[index] = peaks[index - 1]
            # check if there was a peak to the left, add it to the count
            if data[index - 1] > data[index - 2] and data[index - 1] > data[index]:
                peaks[index] += 1
        # candidate is the block size we're going to test
        for candidate in range(3, length + 1):
            # skip if not a factor
            if length % candidate != 0:
                continue
            # test at each point n / block
            valid = True
            index = candidate
            while index != length:
                # if no peak in this block, break
                if peaks[index] == peaks[index - candidate]:
                    valid = False
                    break
                index += candidate
            # one additional check since peaks[length] is outside of array
            if index == length and peaks[index - 1] == peaks[index - candidate]:
                valid = False
            if valid:
                # integer division, so the result is an int in Python 3 as well
                return length // candidate
        return 0
Credits: Major kudos to @tmyklebu for his SO answer which helped me a lot.
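The gist of the problem: split the sequence into c equal slices so that every slice contains at least one peak, where a peak is an element larger than both of its neighbors (the first and last elements of the original sequence cannot be peaks), and find the maximum c. //Codility's problem descriptions really are long-winded
- Use O(n) space to record the number of peaks seen so far, sum[], together with D, the largest index gap between two consecutive peaks.
- Finding the maximum slice count c means finding the minimum slice length k. Feasible values of k that give at least two slices lie in (D/2, min(D, n/2)]. An equal split first requires n % k == 0; then the sequence sum[k - 1], sum[2 * k - 1], sum[3 * k - 1], ..., sum[n - 1] has n/k terms. So the outer loop over k runs once per divisor of n in (D/2, D] (fewer than D/2 of them), and each inner feasibility check takes n/k operations (fewer than 2n/D). Step 2 is therefore also O(n). If no k in that range works, the smallest divisor of n above D does, because any D consecutive elements contain a peak.
- When coding, remember that computing D must also include the distance from the first peak to the start and from the last peak to the end, and do not mix up the meanings of K and c (capital letters as variable names make mistakes easy).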
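The gap D and the resulting candidate slice lengths can be sketched as follows. This is an illustrative Python sketch with hypothetical helper names (`max_peak_gap`, `candidate_lengths`), assuming the sentinels -1 and len(a) stand in for the two ends of the array as described in the notes above.

```python
def max_peak_gap(a):
    """Largest index gap D between consecutive peaks, with -1 and len(a) as sentinels.

    Returns None when the array has no peak at all.
    """
    n = len(a)
    last = -1          # sentinel: position "before" the array
    gap = 0
    found = False
    for i in range(1, n - 1):
        if a[i - 1] < a[i] > a[i + 1]:
            gap = max(gap, i - last)
            last = i
            found = True
    if not found:
        return None
    return max(gap, n - last)   # distance from the last peak to the end


def candidate_lengths(a):
    """Divisors k of len(a) with D/2 < k <= D; only these need an explicit check."""
    d = max_peak_gap(a)
    if d is None:
        return []
    n = len(a)
    return [k for k in range(d // 2 + 1, d + 1) if n % k == 0]


example = [1, 2, 3, 4, 3, 4, 1, 2, 3, 4, 6, 2]
print(max_peak_gap(example))       # 5: the gap between the peaks at 5 and 10
print(candidate_lengths(example))  # [3, 4]
```

For the example array only k = 3 and k = 4 need testing; k = 3 fails and k = 4 succeeds, giving three slices.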
Code
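    #include <algorithm>
    #include <vector>
    using namespace std;

    int solution(vector<int> &A) {
        int N = A.size();
        vector<int> npeaks(N + 1, 0);  // npeaks[i] = number of peaks strictly before index i
        int maxD = 0;                  // largest index gap D between consecutive peaks
        int last_peak = -1;            // sentinel: covers the distance from the start to the first peak
        for (int i = 1; i < N - 1; i++) {
            if (A[i] > A[i + 1] && A[i] > A[i - 1]) {
                npeaks[i + 1] = npeaks[i] + 1;
                maxD = max(maxD, i - last_peak);
                last_peak = i;
            } else {
                npeaks[i + 1] = npeaks[i];
            }
        }
        maxD = max(maxD, N - last_peak);  // distance from the last peak to the end
        npeaks[N] = npeaks[N - 1];
        if (npeaks[N] < 1) return 0;
        // maxD > N/2 does not force a single slice (K = N/2 can still work,
        // e.g. N = 12 with peaks at 1 and 8), so every candidate is tested.
        for (int K = maxD / 2; K <= maxD; K++) {  // K = slice length
            if (N % K == 0) {
                bool isvalid = true;
                int c = N / K;  // c = number of slices
                for (int i = 1; i <= c; i++) {
                    if (npeaks[i * K] - npeaks[(i - 1) * K] < 1) {
                        isvalid = false;
                        break;
                    }
                }
                if (isvalid) return c;
            }
        }
        // Any divisor K > maxD is valid: every maxD consecutive elements hold a peak.
        for (int K = maxD + 1;; K++) {
            if (N % K == 0) return N / K;  // return the slice count, not K
        }
    }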