Digital Audio, Standards, and Chips

By Henk Muller, Principal Technologist, XMOS

Digital audio provides a means for both professionals and consumers to record, modify, and play high-quality audio. USB-Audio-1.0, FireWire, and CobraNet are popular choices. But they’re showing shortcomings in terms of available bandwidth, limited synchronization, and lack of support on desktops and laptops.

Two new standards are overtaking existing standards: USB Audio-2.0 and the AVB audio/video bridging standard based on IEEE 802. USB Audio-2.0 allows many more channels than Audio-1.0. It also has robust time-synchronization mechanisms. AVB is the IEEE audio-over-Ethernet standard. Unlike CobraNet, AVB isn’t licensed and is an open standard. It’s integrated with open time-synchronization and bandwidth-allocation protocols.

USB Audio-2.0 and AVB both aim to cover all segments of the audio market. Given that every PC has several USB-2.0 ports, USB Audio-2.0 is a likely candidate for the consumer and prosumer markets. For example, USB speakers and microphones could all be built around the USB Audio-2.0 standard. For a more complex system, such as a USB 7.1 surround-sound system, Blu-ray players and/or televisions would need to be equipped with a USB-2.0 port. But USB cables are restricted in length, so USB Audio-2.0 is unlikely to be used to deliver audio over long distances.

Because AVB is based on Ethernet rather than USB, it’s appropriate for longer-distance, networked distribution of audio in both pro-audio applications and multi-room consumer installation. It also is suitable for applications that don’t have a typical host-device interaction model, such as the automotive market. The Ethernet cable can simultaneously carry data (Internet, GPS) and audio between a variety of devices.

All modern protocols are based around clock-recovery algorithms. The audio clock is carried either with the audio stream or by a separate time-exchange protocol. One clock in the network is denoted the master clock. Other devices follow this clock by recovering it locally. Depending on the protocol, a device recovers either just the audio clock or both an audio clock and a global common reference time.
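As a rough illustration of what "locally recovering the master clock" can mean, the sketch below estimates how fast the master clock runs relative to the local one from pairs of timestamps. It is not the algorithm of USB Audio-2.0 or AVB. The class name, the smoothing gain, and the nanosecond timestamp format are all assumptions made for the example.

```python
class ClockRecovery:
    """Tracks the ratio of master-clock rate to local-clock rate.

    Hypothetical sketch: real endpoints use protocol-specific feedback
    or time-exchange messages and usually drive a PLL or a resampler.
    """

    def __init__(self, smoothing=0.1):
        self.smoothing = smoothing  # assumed filter gain, 0 < smoothing <= 1
        self.ratio = 1.0            # start by assuming both clocks agree
        self._last = None           # previous (master_ns, local_ns) pair

    def update(self, master_ns, local_ns):
        """Feed one master timestamp and the local time it was observed at."""
        if self._last is not None:
            d_master = master_ns - self._last[0]
            d_local = local_ns - self._last[1]
            if d_local > 0:
                measured = d_master / d_local
                # Exponential smoothing hides short-term jitter.
                self.ratio += self.smoothing * (measured - self.ratio)
        self._last = (master_ns, local_ns)
        return self.ratio

    def local_ticks_per_sample(self, nominal_ticks):
        """Local ticks per sample period so that output follows the master."""
        return nominal_ticks / self.ratio
```

If the local oscillator runs slightly slow, the ratio settles just above 1.0 and each sample period is shortened in local ticks by the same proportion. A larger smoothing gain tracks changes faster but passes more jitter through to the recovered clock.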

For the new protocols to be adopted, they must satisfy two criteria: low latency and bit-perfect operation. Low latency can be achieved if the digital device can avoid buffering data by processing each sample of data in a predictable and deterministic manner. The only buffer that’s required is the one that smooths out bursts of traffic (due to, for example, network switches). It also must hold a small amount of data to cope with short- and medium-term fluctuations between the master and locally recovered clock. The minimal fundamental latency on USB Audio-2.0 and AVB endpoints is around 250 µs (to store two frames of data). Typically, operating-system (OS) drivers and application programs add several milliseconds of delay.
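The 250 µs figure can be checked with simple arithmetic. The sketch below assumes a frame period of 125 µs, which is what "two frames" in 250 µs implies. The function names are made up for the example.

```python
def buffer_latency_us(frames_buffered, frame_period_us=125):
    """Latency contributed by holding a number of whole frames.

    frame_period_us=125 is an assumption inferred from the text
    (two frames of data take about 250 microseconds).
    """
    return frames_buffered * frame_period_us


def samples_per_channel(latency_us, sample_rate_hz):
    """How many samples per channel fit in the given latency."""
    return latency_us * sample_rate_hz / 1_000_000


latency = buffer_latency_us(2)                 # 250 microseconds
depth = samples_per_channel(latency, 96_000)   # 24.0 samples per channel
```

At a 96-kHz sample rate, the two-frame buffer holds only 24 samples per channel, which shows how small the endpoint's share is next to the several milliseconds typically added by drivers and applications.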

Bit-perfect operation would seem an easy target. Just make sure your protocol is implemented correctly! Audio protocols have many corner cases, however, and need to be timed to perfection in order to guarantee zero sample loss. In particular, there’s no retransmission or other form of backup in case a bit is corrupted or data is missing from a data stream. In addition, bit errors on the physical network (caused by electrical interference) will translate directly into an audible bit error in the signal. In this respect, the uptake of AVB could be seriously harmed if vendors push for Gigabit Ethernet while customers still have old Category-5 wiring that isn’t up to Gigabit standards.

Both interfaces support a variety of optional extras that can be implemented in the digital domain, such as sample-rate conversion, mixing, and equalization. In order to customize their products, manufacturers of digital-audio devices will look at spare processing capacity (DSP capability) to implement those features. The addition of these features cannot interfere with the timing of other parts, as that may compromise bit-perfect operation.

Chip vendors have chosen a variety of methods for implementing these new audio standards: application-specific standard products (ASSPs), field-programmable gate arrays (FPGAs) configured using a hardware description language (HDL), or processors programmed in a high-level language. Typically, ASSPs implement a specific device and hence only a subset of the protocol. If this subset matches the requirements, an ASSP provides an out-of-the-box solution. However, it’s difficult to add extensions to the subset of the protocol, follow changes and improvements to standards, or differentiate a design. An example ASSP-based solution is offered by C-Media. A single chip offers an out-of-the-box, two-channel synchronous USB-Audio-2.0-to-I2S terminal.

Programmable solutions, such as FPGAs or processors, can be upgraded when designing future products or even field-upgraded to add functionality. For example, Xilinx and Broadcom offer an FPGA implementation of AVB that can be modified once the IP has been purchased. XMOS offers a processor that can implement the I/O, signal processing, and USB or Ethernet stacks—all using a high-level language. As such, reference designs and software can be downloaded (at no cost). The software can be modified at will in order to customize the design or implement extra functionality.

These are exciting times for digital audio. New standards are taking over, and both device vendors and OS designers are working hard to make systems compliant and hence interoperable. (Apple, for example, has natively supported USB Audio-2.0 since Mac OS X 10.6.) The new standards will support bit-perfect transmission of many 24-bit channels at 96 kHz, or even 32-bit channels at 192 kHz. All of these protocols provide clock-recovery schemes and are designed to provide a low-latency solution.
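To give a feel for what "many channels" means in bandwidth terms, the sketch below computes the raw sample payload for a given channel count, sample width, and sample rate. It ignores all protocol overhead, and the 100-Mbit/s budget in the example is a hypothetical number, not a figure from either standard.

```python
def payload_bits_per_second(channels, bits_per_sample, sample_rate_hz):
    """Raw audio payload only; packet headers and framing are ignored."""
    return channels * bits_per_sample * sample_rate_hz


def max_channels(budget_bits_per_second, bits_per_sample, sample_rate_hz):
    """Whole channels that fit in an assumed bandwidth budget."""
    return budget_bits_per_second // (bits_per_sample * sample_rate_hz)


# Eight 24-bit channels at 96 kHz need 18,432,000 bits per second.
eight_channel_rate = payload_bits_per_second(8, 24, 96_000)

# Hypothetical 100-Mbit/s budget, chosen only for illustration.
fits_24_96 = max_channels(100_000_000, 24, 96_000)     # 43 channels
fits_32_192 = max_channels(100_000_000, 32, 192_000)   # 16 channels
```

Moving from 24-bit samples at 96 kHz to 32-bit samples at 192 kHz multiplies the per-channel payload by about 2.7, so the same budget carries well under half as many channels.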

Henk Muller is currently the principal technologist at XMOS Ltd. In that role, he has been involved in the design and implementation of audio and other real-time protocols. Previously, Muller worked in academia for 20 years in computer architecture, compilers, and ubiquitous computing. He holds a doctorate from the University of Amsterdam.