CareerCup: How to find the median of 1 billion numbers across N distributed machines efficiently?
Source: Internet  Editor: 程序博客网  Date: 2024/05/15 07:26
How to find the median of 1 billion numbers across N distributed machines efficiently?
--------------------------------------------------------------------------------------
1) Each machine sorts its own elements.
Complexity: n log(n), where n is the number of elements on that machine.
Time: that of the slowest machine, since all machines sort in parallel.
2) Leader machine builds a min-heap of m elements (m being the number of machines).
Each heap node contains a number and the machine to which that number belongs.
3) To fill the heap, the leader machine asks each machine for its smallest element.
Complexity: m log(m)
4) Leader machine removes the smallest element from the heap (O(log(m)), since the heap must be restored after the removal) and asks the machine to which that number belonged for its next smallest number.
5) Insert that next smallest number into the heap, then repeat from step 4 until the kth smallest number has been removed.
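The steps above amount to a k-way merge driven by a min-heap. A minimal single-process sketch follows, assuming each machine can be modelled as an in-memory list and each "ask the machine for its next number" message as a call to `next()` on an iterator. The names `kth_smallest` and `lower_median` are hypothetical and chosen only for illustration.

```python
import heapq
from typing import Iterator, List, Tuple


def kth_smallest(machines: List[List[int]], k: int) -> int:
    """Return the k-th smallest value (1-based) across all machines."""
    total = sum(len(chunk) for chunk in machines)
    if not 1 <= k <= total:
        raise ValueError("k must be between 1 and the total element count")

    # Each machine sorts its own chunk. Here this happens sequentially;
    # on real machines the sorts would run in parallel.
    streams: List[Iterator[int]] = [iter(sorted(chunk)) for chunk in machines]

    # The leader asks every machine for its smallest element and builds
    # a min-heap of (value, machine_id) pairs.
    heap: List[Tuple[int, int]] = []
    for machine_id, stream in enumerate(streams):
        first = next(stream, None)  # None means the machine holds no data
        if first is not None:
            heap.append((first, machine_id))
    heapq.heapify(heap)

    # Remove the minimum k times. After each removal, ask the machine that
    # owned the removed value for its next smallest number and insert it.
    value = heap[0][0]
    for _ in range(k):
        value, machine_id = heapq.heappop(heap)
        following = next(streams[machine_id], None)
        if following is not None:
            heapq.heappush(heap, (following, machine_id))
    return value


def lower_median(machines: List[List[int]]) -> int:
    """Lower median: the ((n + 1) // 2)-th smallest of n values in total."""
    total = sum(len(chunk) for chunk in machines)
    return kth_smallest(machines, (total + 1) // 2)
```

For example, `lower_median([[5, 1, 9], [3, 7], [8, 2, 6, 4]])` merges the values 1 through 9 and stops at the fifth removal. In a real deployment each `next()` call would be a network round trip, which is where the message count discussed in this answer comes from.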
Total time complexity:
If h is the largest chunk of data held by any one machine: h log(h) for sorting.
If m is the number of machines:
m log(m) for building the heap.
If k is half of a billion numbers, the complexity of finding the kth element is:
k log(m)
Total messages passed:
About k (half a billion), one request per element removed from the heap.
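As a rough summary, the three terms above can be combined into a single expression. The value m = 100 below is an assumption chosen purely for illustration and does not come from the original question.

```latex
T_{\text{total}} = O(h \log h) + O(m \log m) + O(k \log m),
\qquad k = 5 \times 10^{8}

% Illustrative assumption: m = 100 machines
\log_2 100 \approx 6.64
\quad\Longrightarrow\quad
k \log_2 m \approx 5 \times 10^{8} \times 6.64 \approx 3.3 \times 10^{9}
```

Under that assumption the merge phase dominates: the leader performs on the order of a few billion heap comparisons and exchanges roughly half a billion messages, all sequentially.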
I am wondering if I could do the heap part in parallel.