Debugging Intel® XScale® Technology-Based Platforms

By Steve Blair

Intel XScale microarchitecture-based processors are instruction-set compatible with ARM® processors, but they interact with JTAG-based debuggers very differently. When debugging code running on an Intel XScale technology-based processor target, it is important to understand the unusual features of these processors. In some cases these features will affect how you write and debug code.

One of the first things to understand when debugging an Intel XScale technology-based processor is that its on-chip debug handler is not micro-coded. When a JTAG-based debugger takes control of an ARM processor, it places that processor into its Debug Mode, which is implemented in hardware in the processor and is always available. This is not true of Intel XScale technology-based processors. Before the processor can be controlled through its JTAG port, the debugger must first install a debugging kernel (the Intel XScale core Debug Handler) on the processor.

The Intel XScale core Debug Handler is code that provides operations such as program start/stop, single-stepping, breakpoints, and debugger access to registers and memory. The Intel XScale technology-based processor contains a special cache (the mini instruction cache) specifically designed to contain the Debug Handler. Until the Debug Handler is loaded into this cache, the in-circuit emulator will not be able to control the processor. The code can either be installed by target firmware or downloaded to the processor through the JTAG port.

The Intel XScale core Debug Handler consumes memory addresses, but not physical memory in the system. The handler physically resides within the mini instruction cache, but it may be mapped anywhere within the processor’s memory space (see Figure 1).

Figure 1: The debug mode in the Intel XScale technology-based processor maps a mini instruction cache onto system memory at a user-defined base address and locks the reset vector into cache.

Mini-cache overlays memory

The debug kernel consumes 2KB of memory space at the address that you specify as the Handler Base Address. Because instruction fetches from this memory range will always result in a cache hit, no system code in this memory range will be accessible. The debug kernel effectively overrides it. The memory space is still accessible as data space, however, because the Intel XScale technology-based processor uses separate instruction and data caches.
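The overlay can be pictured with a small model. The sketch below is only an illustration of the behavior described above, not emulator or processor code: the function name, the return labels, and the example base address are all hypothetical.

```python
HANDLER_SIZE = 2 * 1024  # the 2KB window described in the article


def access_source(address, handler_base, is_instruction_fetch):
    """Say which store services an access in this simplified model."""
    in_window = handler_base <= address < handler_base + HANDLER_SIZE
    if is_instruction_fetch and in_window:
        return "mini instruction cache"  # always hits: Debug Handler code
    return "system memory"  # data accesses, and anything outside the window


HANDLER_BASE = 0x00100000  # hypothetical Handler Base Address

for label, addr, is_fetch in [
    ("fetch inside window", HANDLER_BASE + 0x10, True),
    ("data read inside window", HANDLER_BASE + 0x10, False),
    ("fetch just past window", HANDLER_BASE + HANDLER_SIZE, True),
]:
    print(f"{label}: {access_source(addr, HANDLER_BASE, is_fetch)}")
```

The practical consequence is that any target code linked into the chosen 2KB range can never be fetched while the Debug Handler is resident, so the base address should point at a range the application does not execute from.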

During its initialization, the debug handler will save the original reset vector value, modify the reset vector to point to the Debug Handler, and then jump to the original reset vector to launch your target software. Subsequent resets will again cause the Debug Handler to execute, and it will again chain to the original reset vector after it has completed its housekeeping. In most cases, the developer does not need to be aware that this is taking place.
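The chaining sequence can be sketched as a toy model. It makes a deliberate simplification: vector entries are treated as plain target addresses, whereas real ARM vectors hold instructions. The class name, method names, and addresses are hypothetical.

```python
class ResetChainModel:
    """Toy model of the Debug Handler taking over the reset vector."""

    def __init__(self, handler_entry):
        self.handler_entry = handler_entry
        self.original_reset = None

    def install(self, vectors):
        self.original_reset = vectors["reset"]  # save the original value
        vectors["reset"] = self.handler_entry   # redirect resets to the handler

    def reset(self, vectors):
        """Return the entry points visited, in order, after a reset."""
        visited = [vectors["reset"]]            # the Debug Handler runs first
        visited.append(self.original_reset)     # then it chains to target code
        return visited


vectors = {"reset": 0x00008000}        # hypothetical target entry point
chain = ResetChainModel(0x00100000)    # hypothetical handler entry point
chain.install(vectors)
print([hex(a) for a in chain.reset(vectors)])
```

Every later reset follows the same two-step path, which is why the target software normally cannot tell that the handler ran first.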

The Intel XScale technology-based processor supports two modes, placing the exception handler vectors in low memory or high memory as configured by the target software. The modes place the reset vector either at 0 (zero) or at FFFF0000h. In either mode, the Debug Handler takes over the reset vector by locking it in cache. While the Intel XScale technology-based processor provides a 32-way instruction cache, it also provides two extra, lockable cache lines for the Debug Handler to use. The Debug Handler uses one of these cache lines to lock address 0 (zero) in cache and the other to lock address FFFF0000h in cache.
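For reference, the vector addresses in the two modes can be tabulated with a few lines of code. The offsets are the standard ARM exception vector offsets; the function and constant names are hypothetical and serve only this illustration.

```python
LOW_VECTOR_BASE = 0x00000000
HIGH_VECTOR_BASE = 0xFFFF0000

# Standard ARM exception vector offsets from the table base.
VECTOR_OFFSETS = {
    "reset": 0x00,
    "undefined instruction": 0x04,
    "software interrupt": 0x08,
    "prefetch abort": 0x0C,
    "data abort": 0x10,
    "reserved": 0x14,
    "IRQ": 0x18,
    "FIQ": 0x1C,
}


def vector_addresses(base):
    """Return the address of each exception vector for a given table base."""
    return {name: base + offset for name, offset in VECTOR_OFFSETS.items()}


for base in (LOW_VECTOR_BASE, HIGH_VECTOR_BASE):
    for name, address in vector_addresses(base).items():
        print(f"{name:>21}: {address:08X}h")
```

Because the Debug Handler locks a line at both bases, it keeps control of the reset vector whichever mode the target software selects.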

As soon as the emulator detects that power has been applied to the processor, it will load the Debug Handler into the processor’s mini instruction cache and lock it at the user’s chosen address. It then executes the Debug Handler. At this point, American Arium’s debugger software, SourcePoint, will display target status showing the Intel XScale core-based target is stopped at the reset vector.

This is not quite accurate, however. The processor is, in fact, now executing code: the Debug Handler. When the user issues a “Go” or “Step” command, the Intel XScale core Debug Handler temporarily transfers control to the target software. The fact that the Intel XScale core Debug Handler is involved during operations such as Go, Stop, and Single-Step is completely transparent to the user; it simply appears that the developer has control of the program.

Locked vectors cause concern

The fact that the exception handler vectors are locked in cache can have ramifications for the target software if the developer is using the Memory Management Unit (MMU) of the Intel XScale technology-based processor. In debug mode, fetches from the vector table will always be served from cache. The processor will not generate any memory cycles, so the MMU will not see a memory access. This means that a scheme that attempts to switch vector tables by manipulating the MMU will not work while the Debug Handler is resident. Once a cache line is locked at address 0 (zero) or FFFF0000h, the entire 32-byte exception handler vector table becomes locked in cache.
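A toy model makes the effect easier to see. In the sketch below a fetch is served from a locked line before any page mapping is consulted, which is the behavior described above. The class, its methods, the labels, and the 4KB page size are assumptions made for the illustration; only the 32-byte line size comes from the article.

```python
LINE_SIZE = 32    # cache line size given in the article
PAGE_SIZE = 4096  # assumed page size for this toy model


class VectorFetchModel:
    def __init__(self):
        self.locked_lines = {}  # line base address -> contents label
        self.page_map = {}      # virtual page base -> backing memory label

    def lock_line(self, address, contents):
        self.locked_lines[address & ~(LINE_SIZE - 1)] = contents

    def map_page(self, address, backing):
        self.page_map[address & ~(PAGE_SIZE - 1)] = backing

    def fetch(self, address):
        line = address & ~(LINE_SIZE - 1)
        if line in self.locked_lines:
            return self.locked_lines[line]  # cache hit: no memory cycle
        return self.page_map.get(address & ~(PAGE_SIZE - 1), "unmapped")


model = VectorFetchModel()
model.map_page(0x0, "vector table A")
model.lock_line(0x0, "locked vectors")
model.map_page(0x0, "vector table B")  # target software remaps the page

print(model.fetch(0x18))  # IRQ vector: inside the locked 32-byte line
print(model.fetch(0x20))  # first address past the locked line
```

In this model the remap is visible everywhere on the page except the 32 bytes that hold the vectors, which is exactly where a vector-switching scheme needs it.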

This can also cause problems if the target software attempts to dynamically modify exception handler vectors at run-time. Because the exception handler vectors are loaded and locked at initialization, run-time attempts to modify the vectors will result in a mismatch between the vectors in memory and the vectors in the mini instruction cache. The Arium emulator works around this problem by trapping all attempts to modify exception handler addresses. The Debug Handler can respond to that trap by executing a routine that examines memory and branches to the application’s handler. It can also, if needed, update its own table by unlocking the appropriate cache line, updating the affected address, and relocking the cache line. At the cost of some significant overhead during target initialization, this process keeps the exception handler addresses properly updated.
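The mismatch and the unlock, update, relock sequence can be sketched as follows. This is a simplified model of one vector, not the Arium implementation; the class and method names and the handler addresses are hypothetical.

```python
class VectorShadow:
    """One exception vector: its value in memory and its locked cached copy."""

    def __init__(self, initial_handler):
        self.memory = initial_handler  # vector value in system memory
        self.cached = initial_handler  # copy locked in the cache
        self.locked = True

    def target_write(self, new_handler):
        self.memory = new_handler      # memory changes; the locked copy does not

    def in_sync(self):
        return self.memory == self.cached

    def resync(self):
        self.locked = False            # 1. unlock the cache line
        self.cached = self.memory      # 2. update the affected address
        self.locked = True             # 3. relock the cache line


irq = VectorShadow(0x00008000)         # hypothetical original IRQ handler
irq.target_write(0x00009000)           # run-time modification by target code
print("in sync after write:", irq.in_sync())
irq.resync()
print("in sync after resync:", irq.in_sync())
```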

The Debug Handler’s strategy for managing the table of exception handler addresses, however, is based on the concept that target software will fill the table for all handlers early in the initialization process and never modify them again. To reflect that, the emulator receives a Vector Determination Count during configuration. Each attempt to modify the vectors is logged, and when the count is exceeded, further attempts by the target software to modify the vector table will be logged but not executed.
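One way to read this policy is sketched below. It assumes that attempts up to and including the configured count are applied and that later attempts are only logged; the exact boundary, along with the class and method names, is an assumption for illustration rather than documented emulator behavior.

```python
class VectorUpdatePolicy:
    def __init__(self, determination_count):
        self.determination_count = determination_count
        self.log = []      # every attempted vector write
        self.applied = []  # writes actually carried through

    def on_vector_write(self, vector, new_handler):
        """Log the attempt; apply it only while the count is not exceeded."""
        self.log.append((vector, new_handler))
        if len(self.log) <= self.determination_count:
            self.applied.append((vector, new_handler))
            return True
        return False


policy = VectorUpdatePolicy(determination_count=2)
results = [
    policy.on_vector_write("IRQ", 0x00009000),
    policy.on_vector_write("FIQ", 0x00009100),
    policy.on_vector_write("IRQ", 0x0000A000),  # exceeds the count
]
print(results)
print(len(policy.log), "logged,", len(policy.applied), "applied")
```

Under this reading, target software that installs all of its handlers during early initialization works as expected, while software that swaps handlers later would need a larger count.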

Because debug code in the Intel XScale technology-based processor effectively replaces a section of physical memory with the mini instruction cache, unexpected behaviors can occur when the processor is in debug mode. By designing software with this and other special Intel XScale core features in mind, developers can avoid such problems and successfully develop and debug their firmware.

Steve Blair is vice president of engineering at American Arium (Tustin, CA). With more than 25 years’ experience, Blair has held engineering positions with Computer Automation, Emulex Corp., AST Research, and Phoenix Technologies. He has co-authored two popular software texts, Advanced Programmer’s Guide to EGA/VGA and Advanced Programmer’s Guide to SuperVGAs. Blair was also a contributing author to Dare to be Excellent: Case Studies of Software Engineering Practices That Work, edited by Alka Jarvis.