On-Chip Debug

New technologies yield insight into complex CPUs

By Craig A. Haller

Many modern processors have a new feature: On-Chip Debug (OCD). Although different processor families implement OCD differently, they share a common core feature set. Understanding those core features and the tools available for accessing them will help designers get the most out of this unique methodology.

For many years, the principal method for debug was the ROM monitor, a stripped-down version of today’s “board support package.” The monitor consisted of software that resided on the target board to implement the basic debug functions of read memory, write memory, and the like. But ROM monitors presented a classic chicken-and-egg dilemma: how to debug the debugger? The standard approach to debugging the monitor was the “crash and burn” method: write code, burn it into memory, try it, and watch where it failed.

Many of today’s processors have eliminated the need for these older methods by incorporating OCD into the processor itself. OCD is designed to be used without taking any resources from the processor or target design and is an elegant solution for today’s complex processors.


The first requirement for OCD is a communication channel between the debugging tool and the software. This channel needs to link directly into the CPU, as using an on-board UART or other peripheral would be invasive, would consume board resources, and would require the sharing of resources between debugging and normal board operation.

The most common channel used today is the Joint Test Action Group (JTAG) port defined by IEEE Standard 1149.1. The JTAG standard provides five lines (TCK, TMS, TDI, TDO, and the optional TRST) for bi-directional communication with the interior of an IC, and the port can be daisy-chained among various devices to share the lines (see Figure 1).

Figure 1: On-chip debug typically uses a JTAG port as its communications channel into the CPU, allowing access to internal registers and trace data without taking up applications resources.
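The daisy-chain arrangement can be pictured with a small simulation. The sketch below is a toy model, not a description of any particular part: it treats each device on the chain as a shift register of some length, with the output of one device feeding the input of the next. The class and function names are hypothetical. The one-bit registers stand for devices placed in the BYPASS mode that IEEE 1149.1 defines, so that data intended for another device passes through with a single clock of delay.

```python
from collections import deque


class Device:
    """Toy model of one chip on a JTAG chain: a fixed-length shift register."""

    def __init__(self, name, length):
        self.name = name
        # bits[0] is the newest bit, bits[-1] is the next bit to leave on TDO.
        self.bits = deque([0] * length, maxlen=length)

    def shift(self, tdi):
        """Clock one bit in at TDI and return the bit that leaves at TDO."""
        tdo = self.bits[-1]
        self.bits.appendleft(tdi)  # maxlen discards the oldest bit
        return tdo


def shift_chain(devices, tdi_bits):
    """Shift bits through every device in order; return what appears at TDO."""
    out = []
    for bit in tdi_bits:
        for dev in devices:
            bit = dev.shift(bit)  # this device's TDO is the next one's TDI
        out.append(bit)
    return out


# Assumed board: two devices in BYPASS (1 bit each) around a 4-bit register.
chain = [Device("flash", 1), Device("cpu", 4), Device("fpga", 1)]
pattern = [1, 0, 1, 1]
tdo = shift_chain(chain, pattern + [0] * 6)
```

In this model the pattern reappears at the end of the chain after six clocks, the sum of the register lengths, which is why host software has to know what is on the chain before it can address the CPU.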

However, the use of JTAG, or any other communication channel, does not define the ability or configuration of the CPU’s OCD. These are purely electrical standards for defining a communication channel, while the function of an OCD depends on the types of commands it accepts and the responses it provides.

For instance, an OCD interface may have more than data signals; it may have control lines such as RESET or BREAK, as well. Status lines that describe the state of the processor may also be present. Additionally, some processors provide the OCD with code execution trace data, typically encoded, to allow further debug. It is these features, not the channel, that establish the OCD’s capability.

In most instances, the processor must be in a special debug mode before the designer can use the OCD. In this mode, the processor stops executing user code and instead enters an alternative state that varies with the CPU. Typically, clocks are stopped during debug, although some processors keep the watchdog clock alive, which can cause unexpected behavior if the engineer is unaware of it (see Figure 2).

Figure 2: A debug session using on-chip debug is transparent because the debugger is unaware of connections hidden in the low-level drivers.

OCD implementations

Historically, the Motorola CPU16 and CPU32 families were among the first to implement OCD. Motorola took the tack of writing a ROM monitor into the microcode of the processor itself. The monitor and its interface, in combination, were named Background Debug Mode (BDM). Today, the higher-end Motorola PowerPC processors use JTAG for a communications channel. The JTAG channel does not talk to an internal monitor, however. Instead, it uses a long scan chain (in other words, a shift register) that can be used to monitor or alter the state of many of the CPU’s internal devices.

The JTAG scan chain was designed to be used for chip testing, but has been adapted for software and hardware debug by including the CPU’s registers, program counter, and other control elements as part of the chain. There’s a downside, however. The resulting chain may be very long, upwards of tens of thousands of bits, so that setting up and executing a basic debug operation may be slow.
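A rough calculation shows why chain length matters. The sketch below assumes that every bit of the chain must be clocked through on each scan and that one bit moves per test-clock cycle; the chain length, clock rate, and number of scans per operation are illustrative assumptions, not figures for any specific processor.

```python
def scan_time_seconds(chain_bits, tck_hz, scans=1):
    """Estimate the time to shift a full scan chain `scans` times.

    Assumes one bit is shifted per TCK cycle and ignores protocol
    overhead such as TAP state changes and host-side latency.
    """
    if chain_bits <= 0 or tck_hz <= 0 or scans <= 0:
        raise ValueError("chain_bits, tck_hz and scans must be positive")
    return scans * chain_bits / tck_hz


# Assumed values: a 20,000-bit chain clocked at 10 MHz.
one_scan = scan_time_seconds(20_000, 10_000_000)
# Assumed: a single-step needs four full scans (read state, modify, run, read).
one_step = scan_time_seconds(20_000, 10_000_000, scans=4)
```

With these assumed numbers a single scan takes 2 ms and a four-scan operation takes 8 ms, before any host overhead. That is negligible for one step but adds up quickly when a debugger refreshes several windows after every stop.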

A different approach is available from Intel, which has added OCD to the Intel XScale® microarchitecture. Intel has chosen to implement an OCD that is similar to a ROM monitor. A debugger tool downloads a monitor-type image into a special-purpose instruction cache on the CPU: a two-kilobyte mini instruction cache that is physically separate from the CPU’s 32-kilobyte main instruction cache. From there, the JTAG lines are used much like a UART, as a simple communication channel that carries, for example, commands to the monitor. There are no particular specifications on the debug monitor in Intel’s OCD; each debugger tool provides its own implementation.

Some Intel processors have additional features in their OCDs. Because the Intel XScale technology is based on the ARM core processor, for instance, the ARM Embedded Trace Macrocell (ETM) module is available on some parts. The ETM is a real-time trace module that monitors the path that the CPU takes through code execution, and provides an encoded data block that describes the path. By understanding the encoding and having access to the actual source code, a debugger can trace all the steps the processor has taken.
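The principle behind reconstructing an execution path from compact trace data can be shown with a toy example. The sketch below does not use the real ETM packet format; it assumes, purely for illustration, that the trace records only whether each branch was taken, that every instruction is four bytes long, and that the debugger holds a copy of the program. All names and the instruction encoding are hypothetical.

```python
def reconstruct_path(program, start, branch_outcomes, max_steps=1000):
    """Rebuild the list of executed addresses from branch outcomes.

    program maps address -> ("op",), ("branch", target) or ("halt",).
    branch_outcomes is an iterable of booleans, one per branch executed.
    """
    path = []
    pc = start
    outcomes = iter(branch_outcomes)
    for _ in range(max_steps):
        path.append(pc)
        instr = program[pc]
        kind = instr[0]
        if kind == "halt":
            break
        if kind == "branch":
            taken = next(outcomes, None)
            if taken is None:
                break  # trace data exhausted: nothing more is known
            pc = instr[1] if taken else pc + 4
        else:
            pc += 4  # sequential instruction, no trace data needed
    return path


# Assumed program: a two-instruction loop followed by a halt.
program = {
    0x00: ("op",),
    0x04: ("branch", 0x00),
    0x08: ("halt",),
}
path = reconstruct_path(program, 0x00, [True, False])
```

Two bits of trace are enough here to recover five executed addresses, which is why an encoded trace is useless without the matching program image.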

To access this trace data or any other OCD information, the debugger tool must exchange commands and data with the monitor in the mini instruction cache. Technically, the code that is downloaded to the mini instruction cache is not a monitor, but a debug exception handler. The debug mode forces the processor to redirect code execution to the cache when it encounters a debug exception. These exceptions include instruction and data breakpoints, software breakpoints, external debug breaks, exception vector traps, and a trace-buffer-full break.

Two instruction breakpoint registers and two data breakpoint registers are available in Intel’s OCD. These allow the debugger to set up a break before execution of a target instruction and/or a break after a specific memory access has been performed. Software breakpoints are implemented via a breakpoint instruction.
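Because the hardware registers are so few, host software has to ration them. The sketch below shows one plausible policy, which is an assumption rather than the behavior of any named debugger: use an instruction breakpoint register while one is free, then fall back to a software breakpoint if the address is in writable memory. The class and method names are hypothetical.

```python
class BreakpointManager:
    """Toy allocator for a small, fixed set of breakpoint registers."""

    def __init__(self, n_instr=2, n_data=2):
        self.instr_slots = [None] * n_instr
        self.data_slots = [None] * n_data
        self.software = set()  # addresses patched with a breakpoint instruction

    @staticmethod
    def _claim(slots, address):
        """Store address in the first free slot; return its index or None."""
        for index, used in enumerate(slots):
            if used is None:
                slots[index] = address
                return index
        return None

    def set_code_breakpoint(self, address, in_ram=True):
        slot = self._claim(self.instr_slots, address)
        if slot is not None:
            return ("hardware", slot)
        if in_ram:
            # A software breakpoint overwrites an instruction, so the
            # target address must be writable.
            self.software.add(address)
            return ("software", None)
        raise RuntimeError("no free instruction breakpoint register")

    def set_data_breakpoint(self, address):
        slot = self._claim(self.data_slots, address)
        if slot is None:
            raise RuntimeError("no free data breakpoint register")
        return ("hardware", slot)


bp = BreakpointManager()
first = bp.set_code_breakpoint(0x1000)
second = bp.set_code_breakpoint(0x1004)
third = bp.set_code_breakpoint(0x2000)  # registers exhausted, address in RAM
```

Code running from flash or ROM cannot be patched, so in this model a third breakpoint there fails outright. That is the practical cost of having only two instruction breakpoint registers.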

All of the standard debug functions are handled by the exception handler, which implements the reading and writing of registers and memory as well as the stepping and resuming of code execution. Using the JTAG channel as a simple serial communications line, the handler can also interact with programmers, performance monitors, and various host debuggers.
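The handler's job can be sketched as a simple command loop. Since each debugger tool supplies its own handler, the command names and tuple encoding below are hypothetical, and the registers and memory are simulated with a dictionary and a byte array rather than real hardware.

```python
def handle_debug_exception(commands, registers, memory):
    """Serve host commands until "resume"; return the replies sent back.

    commands  -- iterable of tuples such as ("read_reg", "pc")
    registers -- dict of register name -> value (simulated CPU state)
    memory    -- bytearray standing in for target memory
    """
    replies = []
    for op, *args in commands:
        if op == "read_reg":
            replies.append(registers[args[0]])
        elif op == "write_reg":
            name, value = args
            registers[name] = value
        elif op == "read_mem":
            address, count = args
            replies.append(bytes(memory[address:address + count]))
        elif op == "write_mem":
            address, data = args
            memory[address:address + len(data)] = data
        elif op == "resume":
            break  # leave the handler and return to user code
        else:
            raise ValueError(f"unknown command: {op}")
    return replies


regs = {"pc": 0x100, "r0": 0}
mem = bytearray(16)
session = [
    ("write_mem", 4, b"\xde\xad"),
    ("read_mem", 4, 2),
    ("write_reg", "r0", 7),
    ("read_reg", "r0"),
    ("resume",),
    ("read_reg", "pc"),  # never reached: the handler has already exited
]
replies = handle_debug_exception(session, regs, mem)
```

Single-stepping fits the same pattern: in this model it would be a "resume" issued after arming a breakpoint on the next instruction, so the handler is re-entered one instruction later.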

Host software needed

OCD does not use any resources external to the processor chip itself. It is, therefore, an ideal method of debug for initial testing of a processor board. A prototype may arrive with errors in the interface to memory or I/O devices that would prevent access to a monitor-type debugger. Using the on-chip debug features allows you to test and verify these circuits even when such errors are present.

The on-chip debug function of modern processors is useless, however, without host software in a debugging tool to take advantage of it. Several types of applications are available to enable the complete design, debug, test, and programming of the target system. When choosing a debugger to interface with the OCD, look beyond the classic issues of cost and support for features such as hardware breakpoint support, trace, and real-time displays that can simplify the debug effort.

Once the board itself is verified, the debugger also supports the porting of a board support package, the designing and implementing of basic interrupts, and system initialization. Several manufacturers have debuggers designed specifically for bringing up hardware and basic systems. OCD Commander from Macraigor Systems, LLC (Brookline, MA) and visionCLICK from Wind River Systems, Inc. (Alameda, CA) are two examples.

Once a developer has gotten past hardware test and board initialization, the challenge is to debug the application software. Many of today’s full-featured software debugger tools support the use of OCD as an integral part of the application debugging tool kit. Code|Lab Debug from Mentor Graphics Corp. (Wilsonville, OR) is an example of such a debugger. All of the classic debug windows are available in the Mentor tool, including source and disassembly, registers, memory, watchpoints, and so on. Advanced features include hardware breakpoints, trace, scripting, multi-task debugging, and profiling.

Similar debuggers are available from various other vendors, and many offer awareness for popular real-time operating systems. There are also versions of the GNU toolset available, at no cost, for those who are fans of the Free Software Foundation (www.gnu.org). This toolset includes a full GUI debugger, C and C++ compiler, linker, and everything needed to start development immediately.

Uses beyond debug

Beyond board test and application development, OCD can be used for other things as well. Flash programming is one such popular use that solves a problem with this type of memory. Today’s flash memory chips are being sold in sophisticated packages with leads that are too finely pitched (if they have leads at all) for stand-alone programmers. Furthermore, the chips need to be attached to the board during the initial production, so that they end up being mounted without being programmed (see Figure 3).

Figure 3: Some on-chip debug host tools also offer the ability to perform in-circuit programming of flash memory.

Because an on-board programmer most likely uses the CPU and system memory for some of its interactions, those subsystems, at a minimum, must be working and debugged before the memory can be programmed. OCD provides a means for verifying the processor connections to memory, and can then be used to program the flash memory in-circuit once those connections have been verified. This feature is frequently available as a stand-alone application, as well as being included in several manufacturers’ debuggers.

Further applications

A final use for OCD is not debug at all, but production line testing. For the cost of a simple header, full test, initialization, calibration, programming, and burn-in may all be performed as the final product is being assembled. Test suites are available to execute all forms of memory tests and CPU tests. Input and output devices may be tested to the extent that a test fixture will allow. Calibration of any devices on the board can be automated and followed by application programming and burn-in cycles. Any resource on the board accessible to the CPU is available via the OCD connection. Some OCD connections will also allow for IEEE boundary scan testing, a hardware-level test that may or may not be required.
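One memory test commonly run in such a suite is the walking-ones data bus test, which writes a single set bit in each position and reads it back. The sketch below assumes the host tool exposes word-wide read and write functions for target memory; those callbacks, and the simulated RAM used to exercise them, are hypothetical stand-ins for accesses made through the OCD connection.

```python
def walking_ones(write, read, address, width=32):
    """Return the data-bit positions that fail at `address` (empty if none)."""
    bad_bits = []
    for bit in range(width):
        pattern = 1 << bit
        write(address, pattern)
        if read(address) != pattern:
            bad_bits.append(bit)
    return bad_bits


class SimulatedRam:
    """Stand-in for target memory, optionally with data lines stuck low."""

    def __init__(self, stuck_low_mask=0):
        self.cells = {}
        self.stuck_low_mask = stuck_low_mask

    def write(self, address, value):
        # Bits in the mask never reach the memory cell.
        self.cells[address] = value & ~self.stuck_low_mask

    def read(self, address):
        return self.cells.get(address, 0)


good_board = SimulatedRam()
bad_board = SimulatedRam(stuck_low_mask=1 << 5)  # assumed fault on data bit 5
good_result = walking_ones(good_board.write, good_board.read, 0x2000)
bad_result = walking_ones(bad_board.write, bad_board.read, 0x2000)
```

The test needs only one working memory location, so it can run before anything else on the board is trusted, and the returned bit positions point directly at the suspect data lines.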

Given the complexity of today’s processors and the systems in which they are embedded, on-chip debug capability is clearly vital to the success of a design. Debugging can start with a minimal prototype; all that is needed is a working CPU, and subsystems can be brought on line one at a time until the entire system is running. Once the design is complete and the application debugged, object code can be programmed into flash memory and production can commence. Check out OCD; it’ll be worth your time to learn about it and, more importantly, to implement it.

Craig A. Haller is chief engineer at Macraigor Systems LLC in Brookline, MA. He has been working with the design and implementation of OCD and OCD tools for 15 years. Macraigor is an Affiliate Member of the Intel® Communications Alliance.