AS3 Speed tests

来源:互联网 发布:合肥行知学校漕冲校区 编辑:程序博客网 时间:2024/05/21 10:51

AS3 Speed tests

This is a community maintained page of speed optimization techniques for use with ActionScript3. Originally started on RockOnFlash.com, the post just grew with new revisions and tests and yet new techniques from the community! Only made sense to put it on a wiki for everyone to maintain ;)

General Notes

(Morf) Many of these enhancements are particular to the type of variable declaration you use. So far, in testing Number vs. int, the greatest improvements are gained when using ints. In general (with exceptions), only moderate if any improvements are seen when using Number-declared variables, due to the limited accurate operations.

The following excellent generalizations come from Andre Michelle’s website (link below):

- Do not overload condition and statements as in AS2

It astounds me that this once-excellent technique for optimizing for speed no produces faster code. Overloading *should* produce faster code by reducing the number of compiled instructions and generating faster ones.

- Do not use Objects, if you know which properties will be finally involved.

- Cast instances, while reading from an Array

- Try to use Integers (int) for all possible cases

- Bitoperators are lightning fast

(Other things we what would like to see in AS3:

- Typed, length-fixed Arrays.

“…this would increase the speed…”

- A faster solution to read/write pixels on a bitmap.)

Optimization Links

Dennis Ippel - Some ActionScript 3.0 Optimizations and links

Andre Michelle - General Guidelines for AS3 Optimizations

Integer Addition & Increment/Decrement

Tests done to determine if “a += b” really is faster (as it should be) than “a = a + b” have been inconclusive so far. I’d still recommend using the shorthand if you’re accustomed to it, mostly because I’m hoping a new compiler will grant faster speeds for using it.

The good news is that Incrementing (adding 1) with the ++ operator is far faster than “a = a + 1” - in fast, using the ++ four times (in consecutive statements) is still about 50% faster than “a = a + 4”!!

Unfortunately, the same is not true with the decrement (–) operator. So far, tests on that have not shown any consistent speed increase over “a = a - 1”. (sigh)

Here’s the testing code:

import flash.utils.getTimer;

var time: Number = getTimer();
var sumStd: Number = 0;
var sumOpt: Number = 0;
var sumMpt: Number = 0;
var loops: Number;
var thresh: Number = 0.002;


function runStdTest():void {
time = getTimer();
for (var i:int = 0; i < 20000000; i++) {
var test:int = i;
test = test - 1;
}
sumStd += (getTimer() - time);
}


function runOptTest():void {
time = getTimer();
for (var i:int = 0; i < 20000000; i++) {
var test:int = i;
test--;
}
sumOpt += (getTimer() - time);
}


function runEmptyTest():void {
time = getTimer();
for (var i:int = 0; i < 20000000; i++) {
var test:int = i;
}
sumMpt += (getTimer() - time);
}



loops = 10;
for (var i:int = 0; i < loops; i++) {
runStdTest();
runOptTest();
runEmptyTest();
}

trace ("Standard vs. Optimization - Mean over", loops, "loops");
trace ("Standard Test: ", (sumStd /= loops));
trace();

Division

divide by 2

Now, you’ve probably heard the tip about “use multiplication instead of division when dividing by 2”, right?

faster:

var n:Number = value *.5;

slower:

var n:Number = value / 2;

run it with a test:

import flash.utils.getTimer;

var time:Number = getTimer();

function runDivisionTest():void
{
time = getTimer();
for(var i:int=0;i<10000000;i++)
{
var test:Number = i/2;
}
trace("DivisionTest: ", (getTimer()-time));
}

function runMultTest():void
{
time = getTimer();
for(var i:int=0;i<10000000;i++)
{
var test:Number = i*.5;
}
trace("MultTest: ", (getTimer()-time));
}
runDivisionTest();
runMultTest();

traces out:

DivisionTest:  162
MultTest: 110

Alright, so it’s not miles faster, but in a 3D engine like Papervision3D, this becomes the difference between a fast engine and a slow engine real quickly.

Well, there’s one still that’s even faster: Bitwise shift operations

Divide by 2 with a right shift:

trace(10 >> 1);
// traces:
5

Multiply by 2 with a left shift:

trace(10 << 1);
// traces:
20

Now, run the test against the Division and Multiplication tests above:

function runBitTest():void
{
time = getTimer();
for(var i:int=0;i<10000000;i++)
{
var test:int = i >> 1;
}
trace("BitTest: ", (getTimer()-time));
}
runBitTest();

traces out:

DivisionTest:  152
MultTest: 112
BitTest: 63

HOLY CRAP Batman - that’s nearly 1/2 the speed?? or should I say, 1*.5 the speed ;)

(Morf) Well, kind of - but you may not want to use the bitshift. When the BitTest code above uses the the same (slower) Number declaration for test, the results are slightly different (btw, the iterator i in each testing loop also now has the same declaration (int) in each code section =)):

Division by 2 - Mean over 20 loops
DivisionTest: 83.3
MultTest: 62.9
BitRightTest: 45.85
EmptyCodeTest: 29.1

I’ve also thrown in an empty loop timing for comparison - so a more accurate comparison can be made by subtracting that timing from each of the others. These gave:

Net timing results:
DivisionTest: 54.2
MultTest: 33.8
BitRightTest: 16.75

So both optimizations are still faster with Number declarations. But don’t use BitShifting for Number operations. That’s because the bit shift will round off the result to an integer - AS3 turns your original value into an integer before making the bit shift. Unless that’s what you’re after, use “0.5 *” instead.

Math.floor/Math.ceil

So, I was looking at some other stuff with uint and this seems to be a real gem when it comes to Math.floor and Math.ceil operations. I was reading through Flash’s help (I have to say, I really enjoy the new help - I live in there) on uint and realized that, like int, when you pass a number, it strips everything passed the decimal (like Math.floor). So, I thought, what the hey, let’s give er’ a speed test.

Sure enough, it was much faster to use uint(n) than Math.floor(n) - nearly 10x’s as fast

– UPDATE – After Paulius Uza’s comments about using int, I went back and added tests for int with the floor/ceil tests, and sure enough, they’re even faster than using uint. floor’s test wasn’t that drastic, but ceil’s was 1/2 the time of uint’s version

– /UPDATE –

fast:

var test:int = int(1.5);
//test equals 1

slow:

var test:Number = Math.floor(1.5);
//test equals 1

– /UPDATE by lilive –

But take care, this solution is only for positive integer:

var test1:int = int(-1.5);
//test1 equals -1
var test2:Number = Math.floor(-1.5);
//test2 equals -2

Now, I know what yer thinkin’: what about Math.ceil? +1? yes ;)

– /UPDATE by matthewwithanm – Note that this this is not a general replacement for ceil as it will give an incorrect result for integers. [ceil(1) == 1 but int(1) + 1 == 2]

fast:

var test:int = int(1.5)+1;
//test equals 2

slow:

var test:Number = Math.ceil(1.5);
//test equals 2

Time for a test:

function runFloorTest():void
{
time = getTimer();
for(var i:uint=0;i<10000000;i++)
{
var n:Number = 1.5;
var test:Number = Math.floor(n);
}
trace("FloorTest: ", (getTimer()-time));
}
function runUintFloorTest():void
{
time = getTimer();
for(var i:uint=0;i<10000000;i++)
{
var n:Number = 1.5;
var test:uint = uint(n);
}
trace("UintFloorTest: ", (getTimer()-time));
}
function runIntFloorTest():void
{
time = getTimer();
for(var i:uint=0;i<10000000;i++)
{
var n:Number = 1.5;
var test:int = int(n);
}
trace("IntFloorTest: ", (getTimer()-time));
}
function runCeilTest():void
{
time = getTimer();
for(var i:uint=0;i<10000000;i++)
{
var n:Number = 1.5;
var test:Number = Math.ceil(n);
}
trace("CeilTest: ", (getTimer()-time));
}
function runUintCeilTest():void
{
time = getTimer();
for(var i:uint=0;i<10000000;i++)
{
var n:Number = 1.5;
var test:uint = n == uint(n) ? n : uint(n)+1;
}
trace("UintCeilTest: ", (getTimer()-time));
}
function runIntCeilTest():void
{
time = getTimer();
for(var i:uint=0;i<10000000;i++)
{
var n:Number = 1.5;
var test:int = n == int(n) ? n : int(n)+1;
}
trace("IntCeilTest: ", (getTimer()-time));
}
function runBitShiftFloorTest():void
{
time = getTimer();
for(var i:uint=0;i<10000000;i++)
{
var n:Number = 1.5;
var test:Number = n >> 0;
}
trace("BitShiftFloorTest: ", (getTimer()-time));
}
runFloorTest();
runUintFloorTest();
runIntFloorTest();
runBitShiftFloorTest();
runUintCeilTest();
runIntCeilTest();

traces out:

FloorTest:  1733
UintFloorTest: 176
*IntFloorTest: 157
BitShiftFloorTest: 178
UintCeilTest: 650
*IntCeilTest: 384

Math.abs

I was thinking about another one that I use sometimes which was using *-1 instead of Math.abs on a number

faster:

var nn:Number = -23
var test:Number= nn < 0 ? nn * -1 : nn;

slower:

var nn:Number = -23
var test:Number = Math.abs(nn);

Lets test!

function runABSTest():void
{
time = getTimer();
for(var i:uint=0;i<10000000;i++)
{
var n:Number = -1.5;
var test:Number = Math.abs(n);
}
trace("ABSTest: ", (getTimer()-time));
}

function runABSMultTest():void
{
time = getTimer();
for(var i:uint=0;i<10000000;i++)
{
var n:Number = -1.5;
var test:Number = n < 0 ? n * -1 : n;
}
trace("ABSMultTest: ", (getTimer()-time));
}
runABSTest();
runABSMultTest();

traces out:

ABSTest:  1615
*ABSMultTest: 153

(Morf) So... Taking this a bit further, instead of multiplying by -1, what about just assigning the inverse? The code is updated so n (instead of test) receives the new Abs value:

function runAbsTest():void  {
time = getTimer();
for (var i:int = 0; i < 10000000; i++) {
var n:Number = -1.5;
n = Math.abs (n);
}
sumAbs += (getTimer() - time);
}

function runMm1Test():void {
time = getTimer();
for (var i:int = 0; i < 10000000; i++) {
var n:Number = -1.5;
n = n < 0 ? n * -1 : n;
}
sumMm1 += (getTimer() - time);
}

function runChsTest():void {
time = getTimer();
for (var i:int = 0; i < 10000000; i++) {
var n:Number = -1.5;
n = n < 0 ? -n : n;
}
sumChs += (getTimer() - time);
}

function runEmptyTest():void {
time = getTimer();
for (var i:int = 0; i < 10000000; i++) {
var n:Number = -1.5;
n = n;
}
sumMpt += (getTimer() - time);
}

And what if we just tested to see if n was less than zero before doing anything?

function runIfmTest():void  {
time = getTimer();
for (var i:int = 0; i < 10000000; i++) {
var n:Number = -1.5;
if (n < 0) n = -n;
}
sumIfm += (getTimer() - time);
}

Traced out:

Abs - Mean over 25 loops
AbsTest: 1572.4
MultByMinus1Test: 121.96
ChsTest: 95.76
IfMinusTest: 76
EmptyTest: 32.48

Net timing results:
AbsTest: 1539.92
MultByMinus1Test: 89.48
ChsTest: 63.28
IfMinusTest: 43.52

Bear in mind the IfMinusTest code assumes the inverse operation must take place every time (n is always < 0) - so the actual implementation would probably run even faster.

Math.sqrt

In response to a question posted on how to determine distance, someone said “sqrt is an expensive operation”. So I tried it out against the old Babylonian Method that was known (as recently as the late 20th century =)!) to be quite accurate once you’d run it through a few iterations:

import flash.utils.getTimer;

var time: Number = getTimer();
var sumSqrt: Number = 0;
var sumAppr: Number = 0;
var sumMpt: Number = 0;
var loops: Number;

var thresh: Number = 0.00001;
var a: Number = 0,
b: Number = 0,
c: Number = 0,
w: Number = 0;

var count: int = 0;

w = 147.29;
trace ("Actual value =", Math.sqrt (w));// This is what AS3 gives us.
trace ("Testing threshold:", thresh);

b = w * 0.25;
do {
count++;
c = w / b;
b = (b + c) * 0.5;
a = b - c;
if (a < 0) a = -a;
} while (a > thresh);// This tests the algorithm.

c = Math.sqrt (w);
trace ("Number of iterations for approximation:", count);
trace ("Approximation:", b);
trace ("Error in approximation:", b - Math.sqrt (w));
trace (" - - - -");

function runSqrtTest():void {
time = getTimer();
for (var i:int = 0; i < 1000000; i++) {
var test:Number = Math.sqrt (w);
}
sumSqrt += (getTimer() - time);
}


function runApprTest():void {
time = getTimer();
for (var i:int = 0; i < 1000000; i++) {
var b:Number = w * 0.25,
a:Number;
c:Number;
do {
c = w / b;
b = (b + c) * 0.5;
a = b - c;
if (a < 0) a = -a;
} while (a > thresh);
}
sumAppr += (getTimer() - time);
}


function runEmptyTest():void {
time = getTimer();
for (var i:int = 0; i < 1000000; i++) {
var n:Number = w;
}
sumMpt += (getTimer() - time);
}


loops = 20;
for (var i:int = 0; i < loops; i++) {
runSqrtTest();
runApprTest();
runEmptyTest();
}

trace ("Math.sqrt - Mean over", loops, "loops");
trace ("Math.sqrt() Test: ", (sumSqrt /= loops));
trace ("Approximation Test: ", (sumAppr /= loops));
trace ("Empty Test: ", (sumMpt /= loops));

trace ("Net timing results:");
trace ("Math.sqrt() Test:", (sumSqrt - sumMpt));
trace ("Approximation Test:", (sumAppr - sumMpt));

I could not believe the results:

Actual value = 12.136309158883519
Testing threshold: 0.00001
Number of iterations for approximation: 6
Approximation: 12.13630915888352
Error in approximation: 1.7763568394002505e-15
- - - -
Math.sqrt - Mean over 20 loops
Math.sqrt() Test: 172.05
Approximation Test: 134.8
Empty Test: 4.05
Net timing results:
Math.sqrt() Test: 168
Approximation Test: 130.75

Okay, what gives? Even my math co-processor on my old 286 (that I still do my taxes on) could do better than that! And when we boost the testing threshold up for one less iteration, we still get 10-digit accuracy, but with nearly a 1/3 speed boost:

Actual value = 12.136309158883519
Testing threshold: 0.002
Number of iterations for approximation: 5
Approximation: 12.136309166280597
Error in approximation: 7.397078505277932e-9
- - - -
Math.sqrt - Mean over 20 loops
Math.sqrt() Test: 167.3
Approximation Test: 114.95
Empty Test: 3.95
Net timing results:
Math.sqrt() Test: 163.35000000000002
Approximation Test: 111
Speed Increase: 32.047750229568415 %

I can’t help feeling like I’m missing something here :T

If I’m not, then this algorithm will be faster than Math.sqrt.

UPDATE by K2xL.com@gmail.com: Whoever posted this tip must’ve used an older version of Flash. I ran the Math.sqrt improvement test with Flash 9 v. 115 today on March 26th, 2008. The results were the opposite. I think Adobe optimized their Math.sqrt. Here are the results I got when running your code:

Actual value = 12.136309158883519
Testing threshold: 0.00001
Number of iterations for approximation: 6
Approximation: 12.13630915888352
Error in approximation: 1.7763568394002505e-15
- - - -
Math.sqrt - Mean over 20 loops
Math.sqrt() Test: 36.65
Approximation Test: 60.1
Empty Test: 26.8
Net timing results:
Math.sqrt() Test: 9.849999999999998
Approximation Test: 33.3

(Morf) UPDATE RESPONSE: That revision of Flash did speed a lot of things up, including Math.sqrt(); I suspect they streamlined calls in general (this process is by far the biggest culprit in this problem). I did go ahead and create faster code that is way way uglier. It is faster by about 20% under Flash 10... but only gives approx .0001 accuracy, which is plenty for Papervision3D applications. In any case, I don’t recommend its use as it will not offer any increase in speed unless used inline (not as any sort of function or method).

Here’s the package used for testing:

package
{
import flash.display.*;
import flash.text.*;
import flash.utils.getTimer;

public class Main extends Sprite
{
public function Main()
{
var time: Number = getTimer();
var sumSqrt: Number = 0;
var sumAppr: Number = 0;
var sumMpt: Number = 0;
var loops: Number;
var resultMsg:TextField = new TextField();
addChild (resultMsg);

var thresh: Number = 0.002;
var a: Number = 0,
b: Number = 0,
c: Number = 0,
ww: Number = 0,
www: Number = 0;

var count: int = 0;
var j: int = 0;
var jj: int = 0;

function runApprTest():void
{
time = getTimer();
for (var ii:int = 0; ii < 20; ii++) {
for (var i:int = 33000; i < 44000; i++) {
var sqNum:Number = i + 0.208;


//--Using the Babylonian Method for computing square roots efficiently requires making the initial fast
// approximation as accurate as possible, insuring that the number of iterations can be minimized while
// perserving a desired level of accuracy.
//
var apprxInit: int;
var apprxRoot: Number;
var initVal: Number;
var invVal: Boolean;

if (sqNum < 1) {
initVal = 1 / sqNum;
invVal = true;
} else {
initVal = sqNum;
invVal = false;
}
apprxInit = initVal;

//--Make first approximation of APPRXROOT: Set the starting approximation APPRXINIT to determine
// its binary degree; in this case, the location of its most significant bit to determine its length
// in bits.
// In AS3, the most efficient method for accomplishing this is by comparing it to integers equal to odd
// powers of 2, then shifting it right half the number of bits of its length. This unfortunately results
// in rather unseemly code. The final treatment is one iteration of the Babylonian Method. "Hey, Morf
// said it's fast and it works"
// Well, yes... but this MUST BE USED AS INLINE CODE FOR ANY SPEED OPTIMIZATION TO BE REALIZED.
//
if (apprxInit > 7) {
if (apprxInit < 32768) {
if (apprxInit < 128) {
if (apprxInit < 32) {
apprxInit >>= 2;
if (apprxInit < 4) apprxInit++;
} else {
apprxInit >>= 3;
}
} else {
if (apprxInit < 2048) {
if (apprxInit < 512) {
apprxInit >>= 4;
} else {
apprxInit >>= 5;
}
} else {
if (apprxInit < 8096) {
apprxInit >>= 6;
} else {
apprxInit >>= 7;
}
}
}
} else {
if (apprxInit < 8388608) {
if (apprxInit < 524288) {
if (apprxInit < 131072) {
apprxInit >>= 8;
} else {
apprxInit >>= 9;
}
} else {
if (apprxInit < 2097152) {
apprxInit >>= 10;
} else {
apprxInit >>= 11;
}
}
} else {
if (apprxInit < 134217728) {
if (apprxInit < 33554432) {
apprxInit >>= 12;
} else {
apprxInit >>= 13;
}
} else {
apprxInit >>= 14;//What are you doing with a number this big anyway? Not bothering with yet another test.
}
}
}
apprxRoot = (apprxInit + initVal / apprxInit) * 0.5;
} else if (apprxInit < 2) apprxRoot = initVal * 0.5 + 0.5
else apprxRoot = initVal * 0.25 + 1;
//
//-- First approximation will be within about 1% at twice the speed and may be sufficient for collision detection computations.

//-- Now perform as many additional approximations as desired; fourth iteration = same precision as Math.sqrt().
//
apprxRoot = (apprxRoot + initVal / apprxRoot) * 0.5;// babylonian iteration 2 -- about 20% faster.
//apprxRoot = (apprxRoot + initVal / apprxRoot) * 0.5;// babylonian iteration 3 -- seven-digit precision.
//apprxRoot = (apprxRoot + initVal / apprxRoot) * 0.5;// babylonian iteration 4 -- full precision.

if (invVal) apprxRoot = 1 / apprxRoot;

}
}
sumAppr += (getTimer() - time);
}


function runSqrtTest():void
{
time = getTimer();
for (var ii:int = 0; ii < 20; ii++) {
for (var i:int = 33000; i < 44000; i++) {
var w:Number = i + 0.208;
var test:Number = Math.sqrt (w);
}
}
sumSqrt += (getTimer() - time);
}


function runEmptyTest():void {
time = getTimer();
for (var ii:int = 0; ii < 20; ii++) {
for (var i:int = 33000; i < 44000; i++) {
var n:Number = i + 0.208;
}}
sumMpt += (getTimer() - time);
}


loops = 50;
for (var i:int = 0; i < loops; i++) {
runSqrtTest();
runApprTest();
runEmptyTest();
}

resultMsg.text = "Math.sqrt - Mean over " + loops + " loops"
+ "/n" + "Math.sqrt() Test: " + (sumSqrt /= loops)
+ "/n" + "Approximation Test: " + (sumAppr /= loops)
+ "/n" + "Empty Test: " + (sumMpt /= loops)
+ "/n" + "Net timing results:"
+ "/n" + "Math.sqrt() Test: " + (sumSqrt - sumMpt)
+ "/n" + "Approximation Test: " + (sumAppr - sumMpt)
+ "/n" + "Speed Increase: " + 100 * (sumSqrt - sumAppr) / (sumSqrt - sumMpt) + "% (anecdotal)";
resultMsg.autoSize = TextFieldAutoSize.LEFT;
}
}
}
原创粉丝点击