Living with Leaks


The first piece of advice I ever received about object-oriented programming is that you never get the interface right the first time. Over the years, I've come to appreciate that interface design is the most challenging aspect of objects. Selecting the correct level of abstraction that hides the complexity of the implementation (but provides adequate control of the relevant details) can be a daunting task. Everyone has different ideas regarding "adequate control" and "relevant details."

The Law of Leaky Abstractions

Ideally, classes should provide good performance with a conceptually clean interface. However, systems that are clear and general tend to have inadequate performance. Performance is often improved at the expense of generality. [Keppel93] As more of the underlying implementation is exposed, the "abstraction begins to leak," in the words of Joel Spolsky.

Spolsky provides many familiar examples of leaky abstractions to support his observation. One example he offers involves SQL, a data-retrieval language that is supposed to abstract away the underlying process of querying the database. A given query issued using SQL may be thousands of times slower than a logically equivalent query, depending upon how the WHERE clause is written. The fact that logically equivalent requests do not perform the same is indicative of a leaking abstraction. To correct the problem, you need to crack open the query plan analyzer and muck around in the gory details of the implementation. The principal abstraction of SQL disappears.

Spolsky has developed what he calls "The Law of Leaky Abstractions" for this common phenomenon. In short, the law says, "All non-trivial abstractions, to some degree, are leaky. Abstractions fail. Sometimes a little, sometimes a lot. There's leakage. Things go wrong. It happens all over the place when you have abstractions." [Spolsky02]

You Can't Have Leaks If You Don't Have Pipes

One way to handle leaky abstractions is to take the rather radical approach of dispensing with abstractions altogether. Consider the example of designing and implementing an operating system. There are some researchers in the field, such as Dawson Engler and M. Frans Kaashoek, who argue that operating systems fail to both properly abstract physical resources and provide adequate performance. In other words, operating systems either perform poorly or they leak.

But the failure to provide a performant operating system without Leaks does not stem from poor design or implementation. The proponents for the extermination of operating-system abstractions hold that it is fundamentally impossible to abstract resources in a way that is useful to all applications and to implement these abstractions in a way that is efficient across disparate needs. [Engler95] In particular, they believe that operating systems abstract too much in terms of virtual memory, context-switching, and interprocess communication, to the detriment of the applications that run on top of them. The way to fix the Leaks is to remove the abstractions.

Fortunately, most of us are not responsible for creating operating systems, and our goals are typically lower than a universal interface that is useful for all applications across all domains. We can hold onto our much-loved abstractions. And, in truth, even those who want to eliminate abstractions in the operating system actually favor them in general, just not in the OS. [Engler95] Still, the question of what to do about Leaks remains.

Some Joints are Leakier Than Others

A friend recently sent me a link to an extremely interesting paper by David Keppel, entitled "Managing Abstraction-Induced Complexity." It predates Spolsky's by nearly ten years and offers some nice insight into the strengths and weaknesses of five common interface models that attempt to balance conceptual cleanliness with performance (i.e., manage the Leaks). Keppel identifies the following models:

Fixed: The abstraction promises it will solve certain problems, but makes no promises about how it will do it.

Adaptive: The abstraction promises that it will figure out how to use a "good" implementation, even if differing client demands imply drastically different implementations of the interface.

Adjustable: The abstraction includes a meta-control "tuning knob," by which the client can pass usage hints to the interface.

Open: The abstraction includes a reflective mechanism, by which the client "injects" new code into the implementation. The injected code reimplements key details of the service so that they are optimized for the client.

Incomplete: The meta-control and primary interfaces are merged. The service provides only a part of the implementation. The client provides both tuning and missing parts of the service.

Examining Keppel's Continuum

Taken together, Keppel's models lie along a continuum. Each shares some of the strengths and weaknesses of its neighbor, yet is distinct. In terms of Leaks, they run from relatively watertight to a torrent. Let's look at each in turn.

Fixed Interface: Nearly Watertight

Fixed interfaces are generally easy to implement and use because they present the same abstraction to all clients. For example, let's assume a relatively simple class that archives a collection of objects. In the case of the Fixed abstraction, the class looks like this:

import java.util.Vector;

public class Archivist {
    public void archive(Vector objects) {
        // Archive a collection of objects
    }
}

The first thing to note is the simplicity of the class. There is only one method, archive, which takes a single parameter and returns void. It presents the same interface to all clients, and all clients use it in the same way. Second, the class performs the same implementation for all clients. There is no way to tweak or adjust performance. Third, provided the implementation performs within your expectations, there is little or no need for the extra control. You may need to know that the archive method serializes the objects, zips them up, and tucks the final file in a folder for later retrieval. However, you don't need to know this for the sake of performance. The performance of the implementation is what it is. You simply live with it. If designed correctly, Fixed interfaces are the least likely interfaces to leak. They are nearly watertight.
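As a rough sketch of what might sit behind such a Fixed interface, the class below serializes a collection, compresses it, and tucks the result away. Everything here is an assumption made for illustration: the name FixedArchivist, the use of List<String> in place of a raw Vector, the in-memory list standing in for a folder on disk, and the archiveCount helper, which exists only so the effect can be observed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPOutputStream;

// Hypothetical Fixed-model archivist: one entry point and no tuning of any kind.
class FixedArchivist {
    // Stand-in for "a folder for later retrieval"; kept in memory so the sketch is self-contained.
    private final List<byte[]> stored = new ArrayList<>();

    public void archive(List<String> objects) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            // Serialize the collection, compressing it as it is written.
            try (ObjectOutputStream out = new ObjectOutputStream(new GZIPOutputStream(bytes))) {
                out.writeObject(new ArrayList<>(objects));
            }
            stored.add(bytes.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Illustration-only helper; a strictly Fixed interface would not need it.
    public int archiveCount() {
        return stored.size();
    }
}
```

The client sees only archive. Whether the implementation serializes, compresses, or does something else entirely is invisible from the call site, which is the point of the model.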

Adaptive Interface: PGP (Pretty Good Plumbing)

Adaptive interfaces perform differently based upon how they are used by the client. Typically, the interface presents alternative methods that provide implicit information that may be used for tuning. This may entail breaking down the overall process into a series of methods that implement the various steps. Some of these steps may be optional, and bypassed as a way to increase performance.

import java.util.Vector;

public class Archivist {
    public Vector gather() {
        // Gather collection of objects to archive
    }

    public Vector order(Vector objects) {
        // Optionally order collection in some way
    }

    // Named pack because package is a reserved word in Java
    public Archive pack(Vector objects) {
        // Package the objects into Archive collection
    }

    public void store(Archive archive) {
        // Persist Archive collection
    }
}

The above class shows the archive process broken down into several steps. The order method is explicitly documented as optional. The client may decide that ordering is unimportant, and could possibly improve performance by not ordering the collection prior to packaging.
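From the client's side, the step-wise style might be used as sketched below. This is an illustration under assumptions, not the article's implementation: SteppedArchivist, the tiny Archive holder, the placeholder data returned by gather, and the lastStored helper are all invented here, and the packaging step is called pack because package is a reserved word in Java.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the Archive type returned by the packaging step.
final class Archive {
    final List<String> items;

    Archive(List<String> items) {
        this.items = Collections.unmodifiableList(new ArrayList<>(items));
    }
}

// Hypothetical step-wise archivist; each step of the process is a separate method.
class SteppedArchivist {
    private Archive lastStored;

    public List<String> gather() {
        // Placeholder data; a real service would collect the objects from somewhere.
        return new ArrayList<>(Arrays.asList("pear", "apple", "fig"));
    }

    // Optional step: returns a sorted copy and leaves the input untouched.
    public List<String> order(List<String> objects) {
        List<String> sorted = new ArrayList<>(objects);
        Collections.sort(sorted);
        return sorted;
    }

    public Archive pack(List<String> objects) {
        return new Archive(objects);
    }

    public void store(Archive archive) {
        lastStored = archive;
    }

    public Archive lastStored() {
        return lastStored;
    }
}

class SteppedArchivistDemo {
    // A client that does not care about ordering simply skips order().
    static Archive archiveQuickly(SteppedArchivist a) {
        Archive archive = a.pack(a.gather());
        a.store(archive);
        return archive;
    }

    // A client that wants ordering inserts the optional step.
    static Archive archiveOrdered(SteppedArchivist a) {
        Archive archive = a.pack(a.order(a.gather()));
        a.store(archive);
        return archive;
    }
}
```

Note that the client now has to know the correct sequence of calls, which is exactly the kind of detail a Fixed interface would have kept hidden.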

An even simpler way of offering the same control would be to offer an alternative to the primary method. In other words, the client could invoke archiveUnordered rather than archive from the class below.

import java.util.Vector;

public class Archivist {
    public void archive(Vector objects) {
        // Archive a collection of objects
    }

    public void archiveUnordered(Vector objects) {
        // Archive a collection of objects without ordering them
    }
}

The advantage to this style is that it continues to hide most of the steps involved in archiving the collection of objects, while providing the same performance tweak. The client doesn't have to worry about calling methods in the correct sequence, and can remain blissfully ignorant of most of the messy details.
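One plausible way to implement the two entry points is to have both delegate to a single private routine, so the hidden steps stay in one place. The sketch below assumes this structure; the names TwoWayArchivist, doArchive, and lastArchive are hypothetical, and sorting strings stands in for whatever "ordering" means in a real archive.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical archivist offering an alternative method as its only tuning mechanism.
class TwoWayArchivist {
    private List<String> lastArchive = new ArrayList<>();

    public void archive(List<String> objects) {
        doArchive(objects, true);
    }

    public void archiveUnordered(List<String> objects) {
        doArchive(objects, false);
    }

    // Shared implementation; the client never sees this method or its flag.
    private void doArchive(List<String> objects, boolean ordered) {
        List<String> copy = new ArrayList<>(objects);
        if (ordered) {
            Collections.sort(copy);
        }
        lastArchive = copy; // stands in for packaging and storing
    }

    // Illustration-only accessor so the effect of each call can be inspected.
    public List<String> lastArchive() {
        return lastArchive;
    }
}
```

The method name archiveUnordered is itself the leak: it tells the client that ordering happens by default and that it costs something.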

Adaptive services perform only as well as their ability to deduce the client's needs. As you can see from the examples above, the ability to correctly infer those needs is based more on the design of the interface itself than on clever algorithms embedded in the methods. Like the Fixed interface, the Adaptive interface does a reasonable job of containing Leaks. However, Adaptive interfaces usually provide alternative means of invoking the service. These alternatives reveal clues to the underlying implementation, and thus introduce some leakage.

Adjustable Interface: PGP 2.0 (Pretty Good Plumbing II)

The Adjustable model gets around the problems inherent in inferring client intent by allowing clients to set preferences directly. This is usually expressed in some form of meta-control. For example, assuming the order/unorder performance tweak used in the Adaptive interface, the Adjustable interface might look like this:

import java.util.Vector;

public class Archivist {
    private boolean ordered = true;

    public boolean isOrdered() { return ordered; }

    public void setOrdered(boolean ordered) { this.ordered = ordered; }

    public void archive(Vector objects) {
        // Archive a collection of objects
    }
}

The meta-control is expressed in the ordered flag and adjusted using the setOrdered method. There are two important things to note here. First, the client needs to be instructed in the use of the setOrdered method, which weakens the abstraction. Second, the available adjustments are expressed in terms of concrete attributes. If a better set of attributes is found in the future, the interface, as well as the client, will need to be updated. While this may have little to do with leakiness, it does expose another weakness inherent to the interface. Overall, the Adjustable model does a pretty good job of masking the details, although, like the Adaptive interface, some details still leak out.
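A minimal sketch of how archive might consult the flag follows. The class name AdjustableArchivist, the List<String> parameter, and the lastArchive accessor are assumptions added for illustration; only ordered, isOrdered, and setOrdered come from the class above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical Adjustable-model archivist: the client tunes behavior through a flag.
class AdjustableArchivist {
    private boolean ordered = true;
    private List<String> lastArchive = new ArrayList<>();

    public boolean isOrdered() { return ordered; }

    public void setOrdered(boolean ordered) { this.ordered = ordered; }

    public void archive(List<String> objects) {
        List<String> copy = new ArrayList<>(objects);
        if (ordered) {
            Collections.sort(copy); // skipped entirely when the client turns the knob off
        }
        lastArchive = copy; // stands in for packaging and storing
    }

    // Illustration-only accessor so the effect of the flag can be inspected.
    public List<String> lastArchive() {
        return lastArchive;
    }
}
```

A client that does not need ordering would call setOrdered(false) once before calling archive. In this sketch the setting is held on the instance, so it applies to every later call until it is changed again.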

Open Interface: Introduction to Pressure Valves

Open interfaces work by letting the client supply additional code in order to make the underlying "good enough" implementation even better. This interface is a good alternative in situations where it is too difficult to infer the client's needs or provide an adequate tuning mechanism. One example of an Open interface might be allowing the client to replace the default packaging method in the Archivist class with a user-defined implementation that outperforms the existing one.

import java.util.Vector;

public class Archivist {
    private iPackaging packaging;

    public void setPackaging(iPackaging packaging) { this.packaging = packaging; }

    public void archive(Vector objects) {
        // Archive a collection of objects
    }
}

The technique is similar to being able to specify the comparator in a Comparable class. It differs in that the Open interface has a default implementation in place, while the Comparable class has no default comparator. In a way, you can think of the interface as a pressure valve. Rather than force an abstraction, the Open interface accepts the situation and passes control of potential Leaks to the client. To paraphrase the old saying, "Better the leak you know, than the one you don't."
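The sketch below shows one way the injection could work, with a default packaging step that the client may replace. The article names the iPackaging type but does not define it, so its single pack method is an assumption, as are the class name OpenArchivist, the setter name setPackaging, and the choice to have archive return the packaged text so the result can be inspected.

```java
import java.util.List;

// Assumed shape of the iPackaging type; the article does not define its methods.
interface iPackaging {
    String pack(List<String> objects);
}

// Hypothetical Open-model archivist: works out of the box, but accepts client code.
class OpenArchivist {
    // Default "good enough" packaging: one object per line.
    private iPackaging packaging = objects -> String.join("\n", objects);

    // The client injects its own packaging code here.
    public void setPackaging(iPackaging packaging) {
        this.packaging = packaging;
    }

    // Returns the packaged form for illustration; a real service might persist it instead.
    public String archive(List<String> objects) {
        return packaging.pack(objects);
    }
}
```

A client that prefers a different format could call setPackaging(objects -> String.join(",", objects)) before archiving. The archivist keeps control of the overall process, while the replaced step runs code it did not write, which is why the Open model has to guard against errant client code.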

Incomplete Interface: Throw Open the Floodgate

The Incomplete interface is perhaps best thought of as a framework, or a tool set with which the client builds the implementation. The meta-control is typically merged with the primary interface. For instance, the Adjustable Archivist class could be modified to keep the basic adjustments for performance tuning, but leave the implementation of the archive method to the client. In other words, the ordered flag and its adjuster setOrdered are retained, but archive is marked abstract and requires that the client supply an implementation.

public abstract class Archivist {

private boolean ordered = true;

public boolean isOrdered() {return ordered;}

public void setOrdered(boolean ordered) {this.ordered = ordered;}

public abstract void archive(Vector objects);

}

Clearly, there is very little hiding of the implementation with this interface. In terms of Leaks, this is the veritable burst dam. The client makes use of the tuning mechanisms provided, but is otherwise responsible for supplying all the details. This is the leakiest of the interfaces.
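To make the client's burden concrete, here is a sketch of a subclass filling in the missing method. The abstract class is renamed IncompleteArchivist and takes a List<String> so the example compiles on its own; InMemoryArchivist and its contents accessor are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// The service side: tuning state only, with the real work left to the client.
abstract class IncompleteArchivist {
    private boolean ordered = true;

    public boolean isOrdered() { return ordered; }

    public void setOrdered(boolean ordered) { this.ordered = ordered; }

    public abstract void archive(List<String> objects);
}

// The client side: a hypothetical implementation that keeps everything in memory.
class InMemoryArchivist extends IncompleteArchivist {
    private final List<String> contents = new ArrayList<>();

    @Override
    public void archive(List<String> objects) {
        List<String> copy = new ArrayList<>(objects);
        // Honoring the inherited flag is now the client's job, not the service's.
        if (isOrdered()) {
            Collections.sort(copy);
        }
        contents.addAll(copy);
    }

    public List<String> contents() {
        return contents;
    }
}
```

Nothing in the base class enforces that archive respects the ordered flag. A subclass could ignore it entirely, which illustrates how little of the abstraction remains.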

Life Among the Leakers

It shouldn't take much effort for any of us to identify examples of these models in our own work.

Of the five, I tend towards Fixed and Adjustable. I feel a large part of my role as a developer is to hide details and, when necessary, give clients the simplest controls possible. If I'm responsible for the code working, I need a guarantee that only comes from owning the implementation. The Adaptive interface sounds cool, but also like a lot of work, especially on the design end. The Open model has the advantages of sharing some of the load with the client and managing the Leaks by purposely introducing them. However, the burden of responsibility increases dramatically, since you need to guard against errant client code. Finally, the Incomplete model is by far the one I am least likely to choose. While it offers the ultimate flexibility, it turns the client side into a torrent of implementation details, so much so that I begin to question the purpose of the abstraction.

As Spolsky says, all abstractions leak, some more than others. Leaks are an inevitable part of interfaces. Knowing which models contain and manage Leaks better than others helps us to at least live with them.

References

Engler, Dawson R., Kaashoek, M. Frans. "Exterminate All Operating System Abstractions." Proceedings of the Fifth Workshop on Hot Topics in Operating Systems (HotOS-V). 1995.

Keppel, David. "Managing Abstraction-Induced Complexity." University of Washington. 1993.

citeseer.nj.nec.com/keppel93managing.html

Spolsky, Joel. "The Law of Leaky Abstractions." Joel on Software. 2002.

www.joelonsoftware.com

Craig Castelaz is a full-time husband and father who programs, teaches, and writes in his second full-ti