Architecture of Mac OS X Audio


Figure 1-1  Mac OS X audio layers

At the lowest level of the Mac OS X audio stack is the driver that controls audio hardware. The driver is based on the I/O Kit’s Audio family, which provides much of the functionality and data structures needed by the driver. For example, the Audio family implements the basic timing mechanisms, provides the user-client objects that communicate with the upper layers, and maintains the sample and mix buffers (which hold audio data for the hardware and the hardware’s clients, respectively).

The basic role of the audio driver is to control the process that moves audio data between the hardware and the sample buffer. It is responsible for providing that sample data to the upper layers of the system when necessary, making any necessary format conversions in the process. In addition, an audio driver must make the necessary calls to audio hardware in response to format and control changes (for example, volume and mute).

Immediately above the driver and the I/O Kit’s Audio family—and just across the boundary between kernel and user space—is the Audio Hardware Abstraction Layer (HAL). The Audio HAL functions as the device interface for the I/O Kit Audio family and its drivers. For input streams, its job is to make the audio data it receives from drivers accessible to its clients. For output streams, its job is to take the audio data from its clients and pass it to a particular audio driver.

The Audio Units and Audio Toolbox frameworks are two other frameworks that provide specialized audio services. They are both built on top of the Audio HAL, which is implemented in the Core Audio framework.

MIDI System Services, which comprises two other frameworks, is not directly dependent on the Audio HAL. As its name suggests, MIDI System Services makes MIDI services available to applications and presents an API for creating MIDI drivers.

Finally, the ultimate clients of audio on Mac OS X—applications, frameworks, and other user processes—can directly access the Audio HAL or indirectly access it through one of the higher-level audio frameworks. They can also indirectly access the Audio HAL through the audio-related APIs of the application environments they belong to: Sound Manager in Carbon, NSSound in Cocoa, and the Java sound APIs.

The following sections examine each of these audio technologies of Mac OS X in more detail.

 

Audio HAL (Core Audio)

The Audio Hardware Abstraction Layer (HAL) is the layer of the Mac OS X audio system that acts as an intermediary between the I/O Kit drivers controlling audio hardware and the programs and frameworks in user space that are clients of the hardware. More specifically, the Audio HAL is the standardized device interface for the I/O Kit’s Audio family. It is implemented in the Core Audio framework (CoreAudio.framework) and presents both C-language and Java APIs. In the Audio HAL, all audio data is in 32-bit floating point format.

The API of the Audio HAL includes three main abstractions: audio hardware, audio device, and audio stream.

  • The audio hardware API gives clients access to audio entities that exist in the “global” space, such as the list of current devices and the default device (see the sketch following this list).

  • The audio device API enables clients to manage and query a specific audio device and the I/O engines that it contains. An audio device in the Audio HAL represents a single I/O cycle, a clock source based on it, and all the buffers that are synchronized to this cycle. The audio device methods permit a client to, among other things, start and stop audio streams, retrieve and translate the time, and get and set properties of the audio device.

  • The audio stream API enables a client to control and query an audio stream. Each audio device has one or more audio streams, which encapsulate the buffer of memory used for transferring audio data across the user/kernel boundary. They also specify the format of the audio data.
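As a rough illustration of the first two abstractions, the following sketch uses the classic C interface declared in <CoreAudio/AudioHardware.h> (these calls were later superseded by the AudioObject property APIs) to locate the default output device in the global space and then query a property of that device.

```cpp
// Sketch: query the "global" audio-hardware space for the default output
// device, then ask that device for its buffer-frame size. Uses the classic
// Audio HAL C API declared in <CoreAudio/AudioHardware.h>.
#include <CoreAudio/AudioHardware.h>
#include <cstdio>

int main()
{
    AudioDeviceID device = kAudioDeviceUnknown;
    UInt32 size = sizeof(device);

    // Audio hardware API: entities in the global space, such as the default device.
    OSStatus err = AudioHardwareGetProperty(kAudioHardwarePropertyDefaultOutputDevice,
                                            &size, &device);
    if (err != noErr || device == kAudioDeviceUnknown)
        return -1;

    // Audio device API: query a property of this particular device.
    UInt32 frameSize = 0;
    size = sizeof(frameSize);
    err = AudioDeviceGetProperty(device, 0, false,
                                 kAudioDevicePropertyBufferFrameSize,
                                 &size, &frameSize);
    if (err == noErr)
        printf("Default output device %u uses %u frames per I/O cycle\n",
               (unsigned)device, (unsigned)frameSize);
    return 0;
}
```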

The abstractions of audio device and audio stream loosely correspond to different I/O Kit Audio family objects in the kernel (see “The Audio Family”). For example, the entity referred to as “audio device” in the Audio HAL corresponds to a combination of an IOAudioDevice and IOAudioEngine in the kernel. For each IOAudioEngine the Audio HAL finds in the kernel, it generates an audio-device identifier. However, there is considerable overlap of role among the various Audio family and Audio HAL objects and entities.

A critical part of the APIs for audio hardware, devices, and streams involves audio properties and their associated notifications. These APIs allow clients to get and set properties of audio hardware. The “get” methods are synchronous, but the “set” methods work in an asynchronous manner that makes use of notifications. Clients of the Audio HAL implement “listener procs”—callback functions for properties associated with audio hardware, audio devices, or audio streams. When an audio driver changes a property of the hardware, either as a result of user manipulation of a physical control or in response to a “set” method, it sends notifications to interested Audio HAL clients. This results in the appropriate “listener procs” being called.
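A minimal sketch of this pattern, assuming a hypothetical client that wants to watch the volume of output channel 1 on some device (the names WatchVolume and VolumeListener are invented for the example):

```cpp
// Sketch: a "listener proc" that the Audio HAL calls back when a device
// property (here, volume on output channel 1) changes -- for example in
// response to the user moving a physical volume control.
#include <CoreAudio/AudioHardware.h>
#include <cstdio>

static OSStatus VolumeListener(AudioDeviceID device, UInt32 channel, Boolean isInput,
                               AudioDevicePropertyID property, void * /*clientData*/)
{
    Float32 volume = 0.0f;
    UInt32 size = sizeof(volume);
    // "Get" calls are synchronous; read the new value inside the notification.
    if (AudioDeviceGetProperty(device, channel, isInput, property, &size, &volume) == noErr)
        printf("Channel %u volume is now %f\n", (unsigned)channel, volume);
    return noErr;
}

static OSStatus WatchVolume(AudioDeviceID device)
{
    // Register interest; the HAL invokes VolumeListener whenever the driver
    // posts a change notification for this property.
    return AudioDeviceAddPropertyListener(device, 1, false,
                                          kAudioDevicePropertyVolumeScalar,
                                          VolumeListener, NULL);
}
```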

Just as important as the property APIs is the callback prototype (AudioDeviceIOProc) that the audio-device subset of the Audio HAL API defines for I/O management. Clients of the Audio HAL must implement a function or method conforming to this prototype to perform I/O transactions for a given device. Through this function, the Audio HAL presents all inputs and outputs simultaneously in an I/O cycle to the client for processing. In this function, a client of the Audio HAL must send audio data to the audio device (for output), or copy and process the audio data received from the audio device (for input).
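The sketch below shows what such a function might look like, along with the calls that register it and start the device’s I/O cycle (MyIOProc and StartIO are hypothetical names; AudioDeviceAddIOProc was replaced by AudioDeviceCreateIOProcID in later releases):

```cpp
// Sketch: an AudioDeviceIOProc. The HAL presents all of a device's input and
// output buffers for one I/O cycle in a single call; all data is 32-bit float.
#include <CoreAudio/AudioHardware.h>
#include <cstring>

static OSStatus MyIOProc(AudioDeviceID /*device*/,
                         const AudioTimeStamp * /*now*/,
                         const AudioBufferList *inputData,
                         const AudioTimeStamp * /*inputTime*/,
                         AudioBufferList *outputData,
                         const AudioTimeStamp * /*outputTime*/,
                         void * /*clientData*/)
{
    // Pass the first input stream straight through to the first output stream
    // (assuming matching formats); real code would mix or synthesize here.
    if (inputData->mNumberBuffers > 0 && outputData->mNumberBuffers > 0) {
        UInt32 bytes = outputData->mBuffers[0].mDataByteSize;
        if (inputData->mBuffers[0].mDataByteSize < bytes)
            bytes = inputData->mBuffers[0].mDataByteSize;
        memcpy(outputData->mBuffers[0].mData, inputData->mBuffers[0].mData, bytes);
    }
    return noErr;
}

static OSStatus StartIO(AudioDeviceID device)
{
    // Register the IOProc with the device, then start its I/O cycle.
    OSStatus err = AudioDeviceAddIOProc(device, MyIOProc, NULL);
    if (err == noErr)
        err = AudioDeviceStart(device, MyIOProc);
    return err;
}
```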

Secondary Audio Frameworks

Mac OS X has several frameworks other than the Core Audio framework that offer audio-related functionality to applications. Two of these frameworks—Audio Units and Audio Toolbox—are built directly on the Core Audio framework. MIDI System Services (consisting of the Core MIDI and Core MIDI Server frameworks) does not directly depend on the Core Audio framework, but is still a consumer of the services of the audio frameworks.

All of these secondary frameworks are implemented in the C language and present their public programming interfaces in C. Thus, any application or other program in any application environment can take advantage of their capabilities.

Audio Units

The Audio Units framework (AudioUnit.framework) provides support for generating, processing, receiving, and manipulating or transforming streams of audio data. This functionality is based on the notion of audio units.

Audio units are one form of a building block called a component. A component is a piece of code that provides a defined set of services to one or more clients. In the case of audio units, these clients can use audio unit components either singly or connected together to form an audio signal graph. To compose an audio signal graph, clients can use the AUGraph API in the Audio Toolbox framework—see “Audio Toolbox” for details.
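For illustration, the following sketch locates and opens one such component, Apple’s default output unit, using the Component Manager calls of this era (the wrapper function OpenDefaultOutputUnit is a hypothetical name):

```cpp
// Sketch: locating and opening one audio unit component -- here Apple's
// default output unit -- with the Component Manager API.
#include <AudioUnit/AudioUnit.h>
#include <CoreServices/CoreServices.h>

static OSStatus OpenDefaultOutputUnit(AudioUnit *outUnit)
{
    // Describe the component we want: an output-type unit, Apple's default output.
    ComponentDescription desc = {0};
    desc.componentType         = kAudioUnitType_Output;
    desc.componentSubType      = kAudioUnitSubType_DefaultOutput;
    desc.componentManufacturer = kAudioUnitManufacturer_Apple;

    // Search the registered components for a match and open an instance of it.
    Component comp = FindNextComponent(NULL, &desc);
    if (comp == NULL)
        return -1;   // no matching component registered
    return OpenAComponent(comp, outUnit);
}
```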

An audio unit can have one or more inputs and outputs. The inputs can accept either encoded audio data or MIDI data; the output is generally a buffer of audio data. Audio units follow a “pull” I/O model: when an audio unit is asked to produce output, it in turn pulls the data it needs from its inputs. An audio unit specifies the number and format of its inputs and outputs through its properties, and each output is itself a stream of an arbitrary number of interleaved audio channels derived from the audio unit’s inputs. Clients also manage the connections between units through properties.
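As a small example of managing connections through properties, the sketch below wires output 0 of one (already opened) unit into input 0 of another by setting the kAudioUnitProperty_MakeConnection property; ConnectUnits is a hypothetical helper:

```cpp
// Sketch: wiring two audio units together directly through properties
// (the AUGraph API described below does the same bookkeeping at a higher level).
#include <AudioUnit/AudioUnit.h>

static OSStatus ConnectUnits(AudioUnit source, AudioUnit dest)
{
    // Connections are themselves a property: tell the destination unit to pull
    // its input 0 from output 0 of the source unit.
    AudioUnitConnection conn;
    conn.sourceAudioUnit    = source;
    conn.sourceOutputNumber = 0;
    conn.destInputNumber    = 0;
    return AudioUnitSetProperty(dest, kAudioUnitProperty_MakeConnection,
                                kAudioUnitScope_Input, 0,
                                &conn, sizeof(conn));
}
```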

Examples of audio units are DSP processors (such as reverbs, filters, and mixers), format converters (for example, 16-bit integer to floating-point converters), interleavers and deinterleavers, and sample rate converters. In addition to defining the interface for custom audio units, the Audio Units framework includes a set of audio units supplied by Apple. One of these is the MusicDevice component, which presents an API targeted specifically toward software synthesis.

Audio Toolbox

The Audio Toolbox framework (AudioToolbox.framework) complements the Audio Units framework with two major abstractions: the AUGraph and the Music Player.

An AUGraph provides a complete description of an audio signal processing network. It is a programmatic entity that represents a set of audio units and the connections (input and output) among them. With the AUGraph APIs, you can construct arbitrary signal paths through which audio can be processed. An audio graph can enact real-time routing changes while audio is being processed, creating and breaking connections between audio units “on the fly.” An AUGraph also maintains its representation of the graph even when constituent audio units have not been instantiated.
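A minimal sketch of building such a graph, assuming a two-node network in which Apple’s DLS software synthesizer feeds the default output unit (the node-creation call shown, AUGraphNewNode, was renamed AUGraphAddNode in later releases):

```cpp
// Sketch: building and starting a two-node signal graph with the AUGraph API.
#include <AudioToolbox/AudioToolbox.h>
#include <AudioUnit/AudioUnit.h>

static OSStatus BuildSynthGraph(AUGraph *outGraph)
{
    AUGraph graph;
    AUNode synthNode, outputNode;

    OSStatus err = NewAUGraph(&graph);
    if (err != noErr) return err;

    // Describe and add the two nodes.
    ComponentDescription synthDesc = {0};
    synthDesc.componentType         = kAudioUnitType_MusicDevice;
    synthDesc.componentSubType      = kAudioUnitSubType_DLSSynth;
    synthDesc.componentManufacturer = kAudioUnitManufacturer_Apple;
    AUGraphNewNode(graph, &synthDesc, 0, NULL, &synthNode);

    ComponentDescription outDesc = {0};
    outDesc.componentType         = kAudioUnitType_Output;
    outDesc.componentSubType      = kAudioUnitSubType_DefaultOutput;
    outDesc.componentManufacturer = kAudioUnitManufacturer_Apple;
    AUGraphNewNode(graph, &outDesc, 0, NULL, &outputNode);

    // Route the synth's output 0 into the output unit's input 0, then open,
    // initialize, and start the whole graph.
    AUGraphConnectNodeInput(graph, synthNode, 0, outputNode, 0);
    AUGraphOpen(graph);
    AUGraphInitialize(graph);
    err = AUGraphStart(graph);

    *outGraph = graph;
    return err;
}
```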

The Music Player APIs use AUGraphs to provide the services of a sequencing toolbox that collects audio events into tracks, which can then be copied, pasted, and looped within a sequence. The APIs themselves consist of a number of related programmatic entities. A Music Player plays a Music Sequence, which can be created from a standard MIDI file. A Music Sequence contains an arbitrary number of tracks (Music Tracks), each of which contains timestamped audio events in ascending temporal order. A Music Sequence usually has an AUGraph associated with it, and a Music Track usually addresses its audio events to a specific audio unit within the graph. Events can be regular MIDI events, tempo events, or extended events.
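A hedged sketch of the basic playback flow, assuming the caller already has a CFURLRef for a standard MIDI file (MusicSequenceFileLoad is the file-loading call from later releases; earlier ones used the MusicSequenceLoadSMF variants, and PlayMIDIFile is a hypothetical wrapper):

```cpp
// Sketch: playing a standard MIDI file with the Music Player API.
#include <AudioToolbox/AudioToolbox.h>
#include <CoreFoundation/CoreFoundation.h>

static OSStatus PlayMIDIFile(CFURLRef midiFileURL)
{
    MusicSequence sequence;
    MusicPlayer player;

    OSStatus err = NewMusicSequence(&sequence);
    if (err != noErr) return err;
    err = NewMusicPlayer(&player);
    if (err != noErr) return err;

    // Build the sequence (and its tracks of timestamped events) from the file,
    // attach it to the player, and start playback.
    err = MusicSequenceFileLoad(sequence, midiFileURL, kMusicSequenceFile_MIDIType, 0);
    if (err != noErr) return err;
    MusicPlayerSetSequence(player, sequence);
    MusicPlayerPreroll(player);
    return MusicPlayerStart(player);
}
```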

The Audio Toolbox framework also includes APIs for converting audio data between different formats.
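For example, an AudioConverter can be created to translate 16-bit integer samples into the 32-bit float format used throughout the HAL; the helper below is only a sketch of setting up the two format descriptions:

```cpp
// Sketch: creating an AudioConverter that turns packed 16-bit integer PCM
// into packed 32-bit float PCM at the same sample rate and channel count.
#include <AudioToolbox/AudioToolbox.h>

static AudioConverterRef MakeInt16ToFloatConverter(Float64 sampleRate, UInt32 channels)
{
    AudioStreamBasicDescription src = {0}, dst = {0};

    src.mSampleRate       = sampleRate;
    src.mFormatID         = kAudioFormatLinearPCM;
    src.mFormatFlags      = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked |
                            kAudioFormatFlagsNativeEndian;
    src.mBitsPerChannel   = 16;
    src.mChannelsPerFrame = channels;
    src.mFramesPerPacket  = 1;
    src.mBytesPerFrame    = 2 * channels;
    src.mBytesPerPacket   = src.mBytesPerFrame;

    dst = src;
    dst.mFormatFlags    = kAudioFormatFlagIsFloat | kAudioFormatFlagIsPacked |
                          kAudioFormatFlagsNativeEndian;
    dst.mBitsPerChannel = 32;
    dst.mBytesPerFrame  = 4 * channels;
    dst.mBytesPerPacket = dst.mBytesPerFrame;

    AudioConverterRef converter = NULL;
    if (AudioConverterNew(&src, &dst, &converter) != noErr)
        return NULL;
    return converter;
}
```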

MIDI System Services

MIDI System Services is a technology that allows applications and MIDI devices to communicate with each other in a single, unified way. It comprises two frameworks: Core MIDI (CoreMIDI.framework) and Core MIDI Server (CoreMIDIServer.framework).

MIDI System Services gives user processes high-performance access to MIDI hardware. In a manner similar to the Audio HAL, MIDI System Services implements a plug-in interface that enables clients to communicate with a MIDI device driver.

Note:  MIDI device drivers are not I/O Kit drivers. The MIDI device driver model is based on the CFPlugIn architecture; a MIDI driver is typically a CFPlugIn bundle loaded from /System/Library/Extensions or /Library/Audio/MIDI Drivers.

For MIDI devices that cannot be directly addressed from a user-space device driver (for example, a MIDI interface built into a PCI card), you must split your driver into two parts: an I/O Kit device driver that matches against the device and a CFPlugIn bundle that manipulates the I/O Kit driver using a user client.

The details of implementing such a mechanism are beyond the scope of this document. For information on user clients, see Device-Interface Development.

 

Apple provides several default MIDI drivers for interfaces that comply with USB and FireWire MIDI interface standards. Using the Core MIDI Server framework, third-party MIDI manufacturers can create their own driver plug-ins to support additional device-specific features. A MIDI server can then load and manage those drivers.

Applications can communicate with MIDI drivers through the client-side APIs of the Core MIDI framework.
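A minimal sketch of such a client, assuming it simply wants to log incoming packets from every MIDI source currently published by the server (ListenToAllSources and MyReadProc are invented names):

```cpp
// Sketch: a user-space Core MIDI client that connects to every MIDI source
// in the system and logs incoming packets.
#include <CoreMIDI/CoreMIDI.h>
#include <cstdio>

static void MyReadProc(const MIDIPacketList *packets, void * /*readRefCon*/, void * /*srcRefCon*/)
{
    // Called on the MIDI client's high-priority thread for each batch of packets.
    const MIDIPacket *packet = &packets->packet[0];
    for (UInt32 i = 0; i < packets->numPackets; ++i) {
        printf("MIDI packet, %u bytes\n", (unsigned)packet->length);
        packet = MIDIPacketNext(packet);
    }
}

static OSStatus ListenToAllSources(void)
{
    MIDIClientRef client;
    MIDIPortRef inPort;

    OSStatus err = MIDIClientCreate(CFSTR("Listener"), NULL, NULL, &client);
    if (err != noErr) return err;
    err = MIDIInputPortCreate(client, CFSTR("Input"), MyReadProc, NULL, &inPort);
    if (err != noErr) return err;

    // Attach the port to every source endpoint currently in the system.
    ItemCount count = MIDIGetNumberOfSources();
    for (ItemCount i = 0; i < count; ++i)
        MIDIPortConnectSource(inPort, MIDIGetSource(i), NULL);
    return noErr;
}
```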

The Audio Family

The I/O Kit’s Audio family facilitates the creation of drivers for audio hardware. Drivers created through the Audio family can support any hardware on the system, including PCI, USB, and FireWire devices. Essentially, an I/O Kit audio driver transfers audio data between the hardware and the Audio HAL. It provides one or more sample buffers along with a process that moves data between the hardware and those sample buffers. Typically this is done with the audio hardware’s DMA engine.

Because the native format of audio data on Mac OS X is 32-bit floating point, the driver must provide routines to convert between the hardware format of the data in the sample buffer and 32-bit floating point. The sequence of steps that a driver follows depends on the direction of the stream. For example, when the driver is asked for a block of input audio data, it obtains that block from the sample buffer, converts it to the expected client format (32-bit floating point), and returns it. The family then passes that data to the Audio HAL through a user-client mechanism.
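Conceptually, the input-side conversion amounts to something like the following (a sketch only; in a real driver this work happens in its convertInputSamples routine and depends on the hardware’s sample format and buffer layout):

```cpp
// Sketch of the conversion step described above: turning 16-bit signed
// samples from the hardware's sample buffer into the 32-bit float data
// expected by the Audio HAL.
#include <MacTypes.h>

static void ConvertInt16ToFloat(const SInt16 *sampleBuf, Float32 *destBuf, UInt32 numSamples)
{
    for (UInt32 i = 0; i < numSamples; ++i) {
        // Map the full signed 16-bit range onto [-1.0, 1.0).
        destBuf[i] = (Float32)sampleBuf[i] / 32768.0f;
    }
}
```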

The interactions between the DMA engine, the driver, and the Audio HAL are based on the assumption that, in any one direction, the stream of audio data proceeds continuously at the same rate. The Audio family sets up several timers (based on regularly taken timestamps) to synchronize the actions of the agents involved in this transfer of data. These timing mechanisms ensure that the audio data is processed at maximum speed and with minimum latency.

Take again an input stream as an example. Shortly after the DMA engine writes sample frames to the driver’s sample buffer, the driver reads that data, converts it from the hardware’s integer format to 32-bit floating point, and writes the resulting frames to the mix buffer, from which they are passed on to the Audio HAL. Optionally, just before the DMA engine writes new frames to the same location in the sample buffer, an “erase head” zero-initializes the just-processed frames. (By default, however, the erase head runs only on output streams.)

For more on the sample buffer and the timer mechanisms used by the Audio family, see “The Audio I/O Model on Mac OS X.”

An I/O Kit audio driver consists of a number of objects, the most important of which are derived from the IOAudioDevice, IOAudioEngine, IOAudioStream, and IOAudioControl classes. These objects perform the following roles for the driver:

  • A single instance of a custom subclass of IOAudioDevice represents the audio device itself. The IOAudioDevice subclass is the root object of a complete audio driver. It is responsible for mapping all hardware resources from the service-provider’s nub and for controlling all access to the hardware (handled automatically through a provided command gate). An IOAudioDevice object manages one or more IOAudioEngine objects.

  • An audio driver must contain one or more instances of a custom subclass of IOAudioEngine. This custom subclass manages each audio I/O engine associated with the audio device. Its job is to control the process that transfers data between the hardware and a sample buffer. Typically the I/O process is implemented as a hardware DMA engine (although it doesn’t have to be). The sample buffer must be implemented as a ring buffer so that when the I/O process of a running IOAudioEngine reaches the end of the buffer, it wraps back around to the beginning and keeps going. (A skeleton of these driver classes appears after this list.)

    An IOAudioEngine object is also responsible for starting and stopping the engine, and for taking a timestamp each time the sample buffer wraps around to the beginning. It contains one or more IOAudioStream objects and can contain any number of IOAudioControl objects.

    All sample buffers within a single IOAudioEngine must be the same size and must run at the same sampling rate. If you need to handle more than one buffer size or sampling rate, you must use more than one IOAudioEngine.

  • An instance of IOAudioStream represents a sample buffer, the associated mix buffer, and the direction of the stream. The IOAudioStream object also contains a representation of the current format of the sample buffer as well as a list of allowed formats for that buffer.

  • An instance of IOAudioControl represents any controllable attribute of an audio device, such as volume or mute.
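The declaration skeleton below summarizes how these objects map onto code. The class names are hypothetical, the method bodies are omitted, and only the most common IOAudioEngine override points are shown:

```cpp
// Skeleton of the driver objects described above (hypothetical class names;
// all hardware-specific work omitted).
#include <IOKit/audio/IOAudioDevice.h>
#include <IOKit/audio/IOAudioEngine.h>
#include <IOKit/audio/IOAudioStream.h>

class com_example_AudioEngine : public IOAudioEngine
{
    OSDeclareDefaultStructors(com_example_AudioEngine)

public:
    // Allocate the ring sample buffer, create an IOAudioStream for each
    // direction, register the allowed formats, and set the sample rate.
    virtual bool initHardware(IOService *provider);

    // Start and stop the I/O process (typically the hardware DMA engine).
    virtual IOReturn performAudioEngineStart();
    virtual IOReturn performAudioEngineStop();

    // Report how far the engine has gotten through the ring buffer; the family
    // uses this, plus the timestamp taken at each wrap, for its timing.
    virtual UInt32 getCurrentSampleFrame();

    // Format conversions between the float mix buffer and the hardware's
    // sample buffer, for output and input respectively.
    virtual IOReturn clipOutputSamples(const void *mixBuf, void *sampleBuf,
                                       UInt32 firstSampleFrame, UInt32 numSampleFrames,
                                       const IOAudioStreamFormat *streamFormat,
                                       IOAudioStream *audioStream);
    virtual IOReturn convertInputSamples(const void *sampleBuf, void *destBuf,
                                         UInt32 firstSampleFrame, UInt32 numSampleFrames,
                                         const IOAudioStreamFormat *streamFormat,
                                         IOAudioStream *audioStream);
};

class com_example_AudioDevice : public IOAudioDevice
{
    OSDeclareDefaultStructors(com_example_AudioDevice)

public:
    // Map the hardware, then create, configure, and activate the engine.
    virtual bool initHardware(IOService *provider);
};
```

In its initHardware method, the IOAudioDevice subclass would typically create and configure an instance of the engine class and pass it to activateAudioEngine, which registers the engine with the Audio family and, through the user-client objects described below, makes it visible to the Audio HAL.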

An I/O Kit audio driver uses two user-client objects to communicate with the Audio HAL layer. The Audio HAL communicates with the IOAudioEngine and IOAudioControl objects through the IOAudioEngineUserClient and IOAudioControlUserClient objects, respectively. The Audio family creates these objects as they are needed. The IOAudioEngineUserClient class provides the main linkage to an IOAudioEngine subclass; it allows the Audio HAL to control the IOAudioEngine, and it enables the engine to pass notifications of changes back to the Audio HAL. For each IOAudioControl object in the driver, an IOAudioControlUserClient object passes notifications of value changes to the Audio HAL.

For more detailed information on the classes and general architecture of the Audio family, see the chapter “Audio Family Design.”

 
