Determining Big O Notation
TheCodeCube - IT Community
Title: Featured: Determining Big O Notation
Post by: KYA on September 12, 2009, 04:31:40 PM
This is a subject many people are afraid of, or simply don't get. At first, it seems to be mystical hocus-pocus, but today I'll show you a simple way to quickly get an estimate of the Big O of an algorithm/function. This is not the only way to determine Big O, nor do I make any claims of it being 100 percent accurate or effective. The idea here is to be able to look at loops, functions, and code in general, in a different light.
First, a definition:
Quote
In mathematics, computer science, and related fields, big O notation describes the limiting behavior of a function when the argument tends towards a particular value or infinity, usually in terms of simpler functions. Big O notation allows its users to simplify functions in order to concentrate on their growth rates: different functions with the same growth rate may be represented using the same O notation.
The wiki article (http://en.wikipedia.org/wiki/Big_O_notation) from which this is taken is an excellent reference if you need to quickly look up the Big O of common/popular algorithms. What it absolutely fails to do is explain how one determines the actual Big O [I will use this term interchangeably with upper bound or limit]. For those not mathematically inclined, the symbol-infested portions probably make their eyes glaze over. At the risk of repeating myself, this article is simply here to show another way, one I think is easier (i.e. you could have a solid grasp of the concept without ever taking Calculus).
Some Basic Rules:
1. Nested loops are multiplied together.
2. Sequential loops are added.
3. Only the largest term is kept, all others are dropped.
4. Constants are dropped.
5. Conditional checks are constant (i.e. O(1)).
That's it, really. I used the word loop, but the concept applies to conditional checks, full algorithms, etc., since a whole is the sum of its parts. I can see the worried look on your face; this would all be frivolous without some examples [see code comments]:
Code: (cpp)
//linear
for (int i = 0; i < n; i++) {
    cout << i << endl;
}
Here we iterate 'n' times. Since nothing else is going on inside the loop (other than constant-time printing), this algorithm is said to be O(n). The common bubble sort:
Code: (cpp)
//quadratic
for (int i = 0; i < n; i++) {
    for (int j = 0; j < n; j++) {
        //do swap stuff, constant time
    }
}
Each loop runs 'n' times. Since the inner loop is nested, the total work is n*n, thus it is O(n^2). Hardly efficient. We can make it a bit better by doing the following:
Code: (cpp)
//quadratic
for (int i = 0; i < n; i++) {
    for (int j = 0; j < i; j++) {
        //do swap stuff, constant time
    }
}
Outer loop is still 'n'. The inner loop now executes 'i' times, so the final pass does (n-1) iterations. Summing 0 + 1 + ... + (n-1) gives n(n-1)/2 total iterations. This is still bounded by O(n^2): expand it to (n^2 - n)/2 and rules 3 and 4 tell us to keep only the largest term and drop the constant factor.
An example of constant dropping:
Code: (cpp)
//linear
for (int i = 0; i < 2*n; i++) {
    cout << i << endl;
}
At first you might say that the upper bound is O(2n); however, we drop constants, so it becomes O(n). Mathematically they are the same: even though we iterate 2n times, the growth rate is still linear in 'n'.
An example of sequential loops:
Code: (cpp)
//linear
for (int i = 0; i < n; i++) {
    cout << i << endl;
}
//quadratic
for (int i = 0; i < n; i++) {
    for (int j = 0; j < i; j++) {
        //do constant time stuff
    }
}
You wouldn't write this exact example in a real implementation, but doing something similar is certainly in the realm of possibilities. In this case we add each loop's Big O: n + n^2. O(n^2 + n) is not an acceptable answer, since we must drop the lower-order term. The upper bound is O(n^2). Why? Because it has the largest growth rate (the limit, for the Calculus-inclined).
Finite loops are common as well. An example:
Code: (cpp)
for (int i = 0; i < n; i++) {
    for (int j = 0; j < 2; j++) {
        //do stuff
    }
}
Outer loop is 'n', inner loop is 2, thus we have 2n; dropping the constant gives us O(n).
In short, Big O is simply a way to measure the efficiency of an algorithm. The goal is constant or linear time, hence the variety of data structures and their implementations. Keep in mind that a "faster" structure or algorithm is not necessarily better. For example, see the classic hash table versus binary tree debate. While not 100% accurate, it is often said that a hash table is O(1) and is therefore better than a tree. From a discussion on the subject in a recent class I took:
Quote
Assuming that a hash-table is, in fact, O(1), that's not quite true. Being O(1) makes the hash-table superior to a tree for insertion and retrieval of objects. However, hash-tables have no sense of order based on value, so they fall short of trees for searching purposes (including things like "get maximum value").
That said, hash-tables aren't purely O(1). Poor choices in hash algorithm or table size, and issues like primary clustering, can make operations on hash-tables run in worse-than-constant time in reality.
The point is, saying "hash-tables are superior to trees" without some qualifications is ridiculous. But then, it doesn't take a genius to know that sweeping generalizations are often problematic.
The above is always something good to keep in mind when dealing with theoretical computer science concepts. Hopefully you found this both interesting and helpful. Happy coding!
Title: Re: Determining Big O Notation
Post by: KYA on September 13, 2009, 12:00:01 AM
I noticed I didn't provide a log(n) or nlog(n) example, arguably the toughest ones.
A quick way to see if a loop is log(n) is to look at how the counter changes in relation to the total number of elements.
Example:
Code: (cpp)
for (int i = 1; i < n; i *= 2) { //start at 1, not 0: doubling 0 gives 0 forever
    cout << i << endl;
}
Instead of simply incrementing, 'i' is doubled each pass, so it covers the range from 1 to n in about log2(n) steps rather than n. Thus the loop is O(log n). (One caveat: the counter must start at 1, since doubling from 0 never makes progress.)
An example of nested loops:
Code: (cpp)
for (int i = 0; i < n; i++) { //linear
    for (int j = 1; j < n; j *= 2) { //log(n); again, start at 1
        //do constant time stuff
    }
}
This example is n*log(n). (Remember that nested loops multiply their Big O's.)