J2ME Benchmarking: A Review @ JDJ


It could be argued that the clock speed of a given processing platform enables you to estimate the execution time of a user application running on that platform. However, quoting figures such as MIPS (millions of instructions per second) is somewhat futile, since the execution of a specific number of instructions on one processor will not necessarily accomplish the same end result as that same number of instructions running on a different processor. It's the execution speed of a given set of instructions that's of greater concern when selecting an appropriate platform to run application code.

Clearly some platforms will be more proficient than others in this regard, though this is a difficult parameter to quantify since it's dependent to a large extent upon the application code in question. Benchmarking is the technique used to measure the speed at which a particular platform is able to execute code. Indeed, this is evident in the abundance of benchmarks available. Numerous examples of Java benchmarking are listed at http://www.epcc.ed.ac.uk/javagrande/links.html.

Benchmarks vary significantly in their complexity, but invariably they comprise a number of lines of code that, when executed on the platform being tested, generate a discrete value to use during its appraisal. This facilitates a comparison of execution speed with that of similar platforms. Typically there are three types of benchmarks, which have inherited titles in accordance with their origin:

User
Manufacturer
Industry

User benchmarks are, as the name suggests, created by any individual with an interest in the field. Countless examples are available, and characteristically they vary in quality; in the past, benchmarks of this type have been very influential.

Market incentives have driven the introduction of manufacturer benchmarks; invariably these are written to benefit the platform in question and so can be disregarded unless used to compare the relative performance of platforms offered by that particular vendor.

Finally, the financial significance of benchmarking has resulted in the development of industry benchmarks, which are usually considered to be of high integrity. Such benchmarks are defined by an independent organization, typically composed of a panel of industry specialists.

Why Write a Paper on Java Benchmarking?

Results are published for multiple benchmarks, and the primary issues can be clouded by hype; as a consequence, the selections available to the end user are somewhat overwhelming. The crucial point is how well your code performs on the chosen system, so the question is: How do you identify a benchmark that best models your application? An understanding of benchmarks is vital to enable the user to select an accurate measurement tool for the platform in question and not be misled by the results.

The purpose of this article is to educate device manufacturers, OEMs, and, more specifically, J2ME development engineers, while at the same time resolving any remaining anomalies in a discipline that's commonly misunderstood.

What Is a Benchmark?

Fundamentally, a benchmark should incorporate programs that, when invoked methodically, exhaustively exercise the platform being tested. Implicit in this process is the generation of a runtime figure corresponding to the execution speed of the platform.

Benchmarks can be simplistic, comprising a sequence of simple routines executed successively to check the platform's response to standard functions (e.g., method invocation). Typically, both the overall elapsed time and the time for each routine in isolation are considered; in the former case it's usual to assign a weighting coefficient to each routine that's indicative of its relevance in the more expansive context. Each routine should run for a reasonable amount of time, to ensure that performance statistics are not lost within start-up overheads.
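To make this concrete, the sketch below shows a minimal harness along these lines; it is not taken from any published suite. It warms a routine up with one untimed pass so start-up overheads are excluded, times repeated executions with System.currentTimeMillis(), and applies a weighting coefficient. The routine, iteration count, and weight are arbitrary assumptions chosen purely for illustration.

public class SimpleBenchmark {

    // A single benchmark routine; concrete tests implement run().
    interface Routine {
        void run();
    }

    // Times a routine, discarding one untimed warm-up pass so that class
    // loading and other start-up overheads do not distort the result.
    static long timeRoutine(Routine r, int iterations) {
        r.run();                                     // warm-up pass, not timed
        long start = System.currentTimeMillis();
        for (int i = 0; i < iterations; i++) {
            r.run();
        }
        long elapsed = System.currentTimeMillis() - start;
        return elapsed > 0 ? elapsed : 1;            // avoid a zero elapsed time
    }

    public static void main(String[] args) {
        Routine loop = new Routine() {               // trivial illustrative routine
            public void run() {
                int sum = 0;
                for (int i = 0; i < 100000; i++) {
                    sum += i;
                }
            }
        };
        long elapsed = timeRoutine(loop, 50);
        long rate = (50 * 1000L) / elapsed;          // iterations per second
        int weight = 3;                              // illustrative weighting coefficient
        System.out.println("rate: " + rate + "/s, weighted contribution: " + (weight * rate));
    }
}

An overall score would then be formed by combining the weighted contributions of every routine in the suite.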

Benchmarks can also be more substantive; for example, processor-intensive applications can check multithreading by running several other routines simultaneously to evaluate context switching.
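As a hedged illustration of such a test (again, not from any standard suite), the sketch below starts several CPU-bound threads and measures how long the platform takes to complete them all. The thread count and workload are arbitrary assumptions, and Thread.join() is assumed to be available (it is part of CLDC 1.1, though not CLDC 1.0).

public class ThreadedBenchmark {

    // Each worker performs a purely CPU-bound loop, so the measurement
    // reflects scheduling and context switching rather than I/O.
    static class Worker extends Thread {
        public void run() {
            int sum = 0;
            for (int i = 0; i < 1000000; i++) {
                sum += i;
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        int threadCount = 4;                         // illustrative thread count
        Worker[] workers = new Worker[threadCount];
        long start = System.currentTimeMillis();
        for (int i = 0; i < threadCount; i++) {
            workers[i] = new Worker();
            workers[i].start();
        }
        for (int i = 0; i < threadCount; i++) {
            workers[i].join();                       // wait for every worker to finish
        }
        System.out.println("elapsed: " + (System.currentTimeMillis() - start) + " ms");
    }
}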

Essentially, there's no substitute for running the user's own application code on the platform in question. However, while this argument is laudable, it's beyond reasonable expectation that the platform manufacturer can implement this. To facilitate an accurate appraisal, it's vital that any standard benchmark utilized by competing manufacturers mimic as closely as possible the way the platform will ultimately be used.

The Advantages and Limitations of Benchmarking

Industry benchmarks are useful for providing a general insight into the performance of a machine. Still, it's important not to rely on these benchmarks alone, since such a preoccupation distracts from the bigger picture. While they can be employed generally to compare different platforms efficiently, they have shortcomings when applied specifically. For example, one function may be heavily used in the application code when compared to another, or certain functions may run concurrently on a regular basis. There are inherent benefits in developing your own benchmark, as this facilitates the tailoring of routines to imitate the end application or to expose specific inadequacies in peripheral support. Manufacturers' benchmarks can be written to aid the cause of specific vendors and so can easily be tailored to mislead.

When considering more restrictive embedded environments, such as those used by J2ME-compliant devices, it becomes apparent that the application developer must consider the risks inherent in the hardware implementation of a virtual machine prior to making a purchasing decision.

Speed is a primary consideration when adopting a JVM within restricted environments; implementations of J2ME vary significantly in this respect, from JVMs that employ software interpretation, and JIT compilers that compile the bytecode to target machine code while the application is being executed, to native Java processors offering much greater performance.

Other factors to consider include the response time of the user interface, the implementation of the garbage collector, and memory issues, since consumer devices don't have access to the abundant resources available to desktop machines. While this may seem a tangential point as far as benchmarking is concerned, it's one worth making, since it's imperative that these areas in particular are comprehensively exercised. Subject to these caveats, benchmarking is a valuable technique that aids in the evaluation of processing platforms and, more specifically, J2ME platforms.

Java-Specific Benchmarks

As with other platforms, numerous Java benchmarks have appeared (see Figure 2).

CaffeineMark is a pertinent instance of a benchmark, since its results are among those most frequently cited by the Java community. On this basis we chose it as an example for further discussion.

CaffeineMark encompasses a series of nine tests of similar length designed to measure disparate aspects of a Java Virtual Machine's performance. The product of these scores is then used to generate an overall CaffeineMark. The tests are:

Loop: Employs a sort routine and sequence generation to quantify the compiler optimization of loops
Sieve: Utilizes the classic sieve of Eratosthenes to extract prime numbers from a sequence (a sketch in this spirit follows this list)
Logic: Establishes the speed at which decision-making instructions are executed
Method: Executes recursive function calls
Float: Simulates a 3D rotation of objects around a point
String: Executes various string-based operations
Graphics: Draws random rectangles and lines
Image: Draws a sequence of three graphics repeatedly
Dialog: Writes a set of values into labels and boxes on a form
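By way of illustration, the fragment below is a minimal sieve-of-Eratosthenes routine in the spirit of the Sieve test; it is not the actual CaffeineMark source, and the problem size is an arbitrary assumption.

public class SieveTest {

    // Counts the primes below 'limit' using the classic sieve of Eratosthenes.
    static int countPrimes(int limit) {
        boolean[] composite = new boolean[limit];
        int count = 0;
        for (int i = 2; i < limit; i++) {
            if (!composite[i]) {
                count++;
                for (int j = i + i; j < limit; j += i) {
                    composite[j] = true;             // mark multiples as non-prime
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        int primes = countPrimes(8192);              // illustrative problem size
        long elapsed = System.currentTimeMillis() - start;
        System.out.println(primes + " primes found in " + elapsed + " ms");
    }
}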

An embedded version of CaffeineMark is available that excludes the scores of the Graphics, Image, and Dialog tests from the overall score. Furthermore, CLDC doesn't support floating-point operations, so the "Float" test is ineffective in this context. This benchmark is regularly updated to account for vendor optimizations and continues to be a reasonably accurate predictor of performance for JVMs.

Bearing this in mind, alongside the high take-up of CaffeineMark in the industry, it's unfortunate that it's unsuitable for embedded environments such as J2ME. The cogency of this argument is based upon its inability to benchmark the interaction of Java subsystems, and its subsequent failure to imitate the typical real-world applications faced by such devices. More specifically, it doesn't take into account certain situations in which a platform may have to cope with a heavily used heap, the garbage collector running all the time, multiple threads, or intensive user interface activity.
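For instance, a minimal heap-stress routine of the kind omitted by CaffeineMark might look like the sketch below: it allocates many short-lived objects while keeping a few alive, so the garbage collector is exercised continuously. The allocation sizes and counts are illustrative assumptions only.

public class HeapStress {

    public static void main(String[] args) {
        Object[] survivors = new Object[64];         // a small set of longer-lived objects
        long start = System.currentTimeMillis();
        for (int i = 0; i < 20000; i++) {
            byte[] block = new byte[256];            // short-lived allocation, soon garbage
            survivors[i % survivors.length] = block; // a fraction of allocations survive longer
        }
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("allocation loop: " + elapsed + " ms ("
                + survivors.length + " survivors retained)");
    }
}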

To address some of these issues, representatives of leading companies in the field have recently formed a committee under the banner of the Embedded Microprocessor Benchmark Consortium (EEMBC) to discuss the introduction of an industry benchmark for J2ME devices.

What Is EEMBC?

EEMBC (www.eembc.org) is an independent industry benchmarking consortium that develops and certifies real-world benchmarks for embedded microprocessors; the consortium is established among manufacturers as a yardstick for benchmarking in this context. A principal concern of the committee is to produce dependable metrics, enabling system designers to evaluate the performance of competing devices and consequently select the most appropriate embedded processor for their needs. The industry-wide nature of such committees intrinsically helps to combat the practice, now wretchedly prevalent among some vendors, of striving to artificially improve their ratings via special compiler optimizations.

A subcommittee was recently formed under the umbrella of this organization to develop similar benchmarks for hardware-based virtual machines. Founding companies within the consortium include Vulcan Machines Ltd, ARM, Infineon, and TriMedia. Primarily, the committee aims to identify the limitations of existing Java benchmarks and to develop new ones in which "real-world" applications are afforded a higher priority than low-level functions.

An example benchmark conceived on this basis could be a Web browser. Since this is a very intensive end application in almost every respect, a figure relating to the proficiency of the device running low-level code in isolation wouldn't prove particularly representative of its functionality.

Consequently, the EEMBC consortium solution is expected to employ a series of applications reflecting typical real-world scenarios in which CDC- and CLDC-compliant devices can be employed. Further examples of such benchmarks include a generic game or organizer that exercises intensive garbage collection, scheduling, high memory usage, the user interface, and dynamic class loading. This way, system designers are able to evaluate potential devices for inclusion in their end application by the appraisal of a benchmark derived in an environment that's analogous to that application.

Other Considerations?

When applied prudently, benchmarks are an invaluable asset that aids in the selection of hardware to suit a particular application. However, they shouldn't be regarded as the sole criterion. It's imperative that J2ME-embedded system designers don't rely upon the use of benchmarks exclusively, since the issue is clouded by many other factors.

In the context of J2ME, systems extend beyond the virtual machine to its interaction with peripheral devices such as a memory interface; clearly such peripherals and the interfaces to them must be considered when measuring the time it takes to execute an application. In the case of memory, limitations will be imposed on a J2ME-optimized device; this raises numerous issues that may impact the performance of the device, for example, garbage collection.
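To illustrate how such memory effects can be observed alongside execution time, the sketch below uses the Runtime methods available in CLDC (gc(), freeMemory(), totalMemory()) around an arbitrary placeholder workload; the figures it reports are approximate, since the collector may run at any time.

public class MemoryCheck {

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        rt.gc();                                     // request a collection for a cleaner baseline
        long before = rt.freeMemory();
        long start = System.currentTimeMillis();

        // Illustrative placeholder workload: build a modest data structure.
        java.util.Vector v = new java.util.Vector();
        for (int i = 0; i < 2000; i++) {
            v.addElement(new int[16]);
        }

        long elapsed = System.currentTimeMillis() - start;
        long used = before - rt.freeMemory();        // approximate heap consumed by the workload
        System.out.println("time: " + elapsed + " ms, approx heap used: " + used
                + " of " + rt.totalMemory() + " bytes");
    }
}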

Also, implicitly, batteries are employed to power hardware that's compliant with the CLDC specification. Consequently, the power consumption of the virtual machine is of primary concern and, accordingly, the clock speed must be kept to a minimum. For example, it's pertinent here that while software accelerators may post acceptable benchmark scores, they may also, as a consequence of their reliance upon a host processor, consume excessive power compared to a processor that executes Java as its native language.

Another significant factor is the device upon which the virtual machine is implemented. The FPGA or ASIC process used will clearly affect the speed at which the processor runs, and variations in benchmark scores are a natural corollary of this. Furthermore, the silicon cost of the entire solution that's required to execute Java bytecode must be considered, particularly where embedded System-on-Chip implementations of the JVM are concerned. Similarly, the designer should be aware of fundamental issues such as the "quality" of the JVM in terms of compliance with the J2ME specification, reliability, licensing costs, and the reputation of the hardware vendor for technical support. All these factors must be considered in tandem with the benchmark score of the virtual machine prior to making a purchasing decision.

Conclusion

No benchmark can replace the actual user application. At the earliest possible stage in the design process, application developers must run their own code on the proposed hardware, since similar applications may post a significant disparity in performance on the same implementation of the virtual machine. However, since designers are often focused on using their time more productively, they frequently rely upon industry benchmarks for such data. While there's no panacea, industry benchmarks such as that proposed by EEMBC are a useful tool to aid in the evaluation of performance, provided you're aware of their limitations in a J2ME environment.

Resources

Coates, G. "Java Thick Clients with J2ME." Java Developer's Journal, Vol. 6, issue 6.
Coates, G. "JVMs for Embedded Environments." Java Developer's Journal, Vol. 6, issue 9.
Cataldo, A. (April 2001). "Java Accelerator Vendors Mull Improved Benchmark." Electronic Engineering Times.
