Dual-Core Performance and COM Express Form a Winning Combination

By Jennifer Zickel, Product Marketing Manager - Com E, RadiSys

With the rising popularity of dual-core processors, embedded-systems designers are gaining much-needed computing horsepower at attractive price points. That horsepower helps them address performance-hungry applications, such as imaging, gaming, and test and measurement systems. A key requirement for maximizing performance on multiple cores, however, is the use of dual-channel memory. In dual-channel-memory configurations, processor performance gains of more than 50% are observed for general-purpose computing tasks, and gains of nearly 30% are seen for imaging, graphics-processing, and data-acquisition applications. While dual-channel memory is a standard feature on most boards, it isn't always feasible on small modular form factors like COM Express modules.

The advantages of a dual-core architecture cannot stand on their own. Unlocking its performance potential depends on one simple requirement: enough memory bandwidth to keep both cores from idling. A dual-channel memory architecture meets that requirement. Dual-channel memory capability has been a feature of many chip sets for the past several years, and both channels are commonly populated on standard-size boards. In the modular-form-factor market, however, dual-channel memory is very unusual. After all, two SODIMM sockets are extremely difficult to fit on a 95-x-125-mm board.
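As a rough illustration of the bandwidth at stake, the theoretical peak of a memory channel can be estimated from its transfer rate and bus width. The sketch below assumes a 64-bit data bus per channel and DDR2-667 memory (roughly 667 million transfers per second), the speed grade cited later in this article for the RadiSys modules. The function name is illustrative only, and sustained bandwidth in a real system will be lower than these peak figures.

```python
# Theoretical peak memory bandwidth: a back-of-the-envelope sketch.
# Assumptions (not from the article): 64-bit data bus per channel and
# 667 million transfers per second for DDR2-667.

DDR2_667_TRANSFERS_PER_S = 667_000_000


def peak_bandwidth_bytes_per_s(transfers_per_s, bus_width_bits=64, channels=1):
    """Return the theoretical peak bandwidth in bytes per second."""
    bytes_per_transfer = bus_width_bits // 8
    return transfers_per_s * bytes_per_transfer * channels


single = peak_bandwidth_bytes_per_s(DDR2_667_TRANSFERS_PER_S)
dual = peak_bandwidth_bytes_per_s(DDR2_667_TRANSFERS_PER_S, channels=2)

print(f"single channel: {single / 1e9:.1f} GB/s")  # about 5.3 GB/s
print(f"dual channel:   {dual / 1e9:.1f} GB/s")    # about 10.7 GB/s
```

Under these assumptions, populating the second channel doubles the peak figure. That doubling is the headroom needed to keep two cores fed.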

Off-chip memory access is key to central-processing-unit (CPU) performance gains. Without a dual-channel architecture, the true potential of dual-core performance isn't fully realized on small-form-factor boards. A dual-channel memory architecture increases memory-access performance by allowing a simultaneous read and write, two simultaneous reads, or two simultaneous writes.

Simultaneous read-write, read-read, or write-write accesses reduce memory-access congestion and, consequently, idle CPU cycles. Simultaneous reads are of special importance to single-instruction, multiple-data (SIMD) operations, in which the same operation is performed on very large sets of operand pairs. Such operations are very common in digital signal processing and graphics rendering, which rely heavily on linear algebraic operations such as inner products and matrix-matrix or matrix-vector multiplications. Simultaneous read-write accesses speed up overall graphics performance by giving the graphics processor faster access to CPU data through the memory subsystem.
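To make the operand-pair access pattern concrete, here is a minimal sketch of an inner product computed in fixed-width steps, the way a SIMD unit would process it. The width `LANES = 4` and the function name are hypothetical choices for illustration. Plain Python does not execute real SIMD instructions, so this only models the access pattern: every step needs one block from each operand array, which is where two simultaneous reads pay off.

```python
# Inner product in SIMD-style steps (access-pattern model only).
# LANES is a hypothetical vector width, not a figure from the article.

LANES = 4


def simd_style_dot(a, b):
    """Return (inner product, number of operand block loads)."""
    if len(a) != len(b):
        raise ValueError("operand vectors must have the same length")
    total = 0.0
    loads = 0
    for i in range(0, len(a), LANES):
        block_a = a[i:i + LANES]  # read stream 1
        block_b = b[i:i + LANES]  # read stream 2
        loads += 2                # each step needs both operand blocks
        total += sum(x * y for x, y in zip(block_a, block_b))
    return total, loads


result, loads = simd_style_dot([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
                               [1.0] * 8)
print(result, loads)  # 36.0 4
```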

A dual-channel memory controller can interleave addresses between the two channels, switching channels after each cache line (64-byte boundary). A second read or write request, queued up behind the first for an address on the opposite channel, can have its data transfer completed independently of the first request. Two consecutive cache lines can therefore be requested and retrieved simultaneously, as they're on opposite channels.
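The interleaving described above can be sketched as a simple address-to-channel mapping. This assumes the simplest possible scheme, in which the cache-line index alternates between channel 0 and channel 1. Real memory controllers may use a different or more elaborate mapping, and the function names here are illustrative.

```python
# Cache-line interleaving across two memory channels: a minimal sketch.
# Assumes a plain alternating scheme; actual controllers may differ.

CACHE_LINE_BYTES = 64  # interleave granularity given in the text
NUM_CHANNELS = 2


def channel_for_address(addr):
    """Return the channel (0 or 1) that serves a physical byte address."""
    return (addr // CACHE_LINE_BYTES) % NUM_CHANNELS


def channels_for_range(start, length):
    """Return the channel of each cache line touched by [start, start + length)."""
    if length <= 0:
        return []
    first_line = start // CACHE_LINE_BYTES
    last_line = (start + length - 1) // CACHE_LINE_BYTES
    return [line % NUM_CHANNELS for line in range(first_line, last_line + 1)]


# A 256-byte sequential read touches four cache lines on alternating channels,
# so each consecutive pair of lines can be fetched in parallel.
print(channels_for_range(0, 256))  # [0, 1, 0, 1]
```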

The advantage of a dual-channel memory architecture is evident from benchmark results. The streamed benchmark is representative of the memory accesses occurring in medical and other imaging applications, while the MCS benchmark is representative of general computational tasks. Both show a marked improvement in CPU performance with a dual-channel memory configuration over a single-channel one: nearly 30% for the streamed benchmark and over 50% for the MCS benchmark.

Managing Hardware and Software

The dual-core, dual-channel hardware-performance gains experienced in a COM Express module can be multiplied by intelligent software use. With the introduction of Intel® Virtualization Technology (Intel® VT), designers can now mix and match multiple operating systems (OSs) on a single device. They can also consolidate multiple processor boards into a single COM Express solution. For example, a typical high-end medical-imaging implementation often has separate processor boards: one for the high-reliability, high-performance, real-time instrumentation control and another for the less critical, general-purpose computational, display, and networking functions. With dual-core processors, these boards can now be consolidated into a single COM Express module. A high-performance real-time operating system (RTOS), such as Microware OS-9, can run on one core in parallel with a general-purpose OS like Windows or Linux running on the other core. As a result, designers cut costs, simplify design, and lower power consumption.

Virtualization brings other benefits. A crash on one core doesn't affect the operation of the other, so system reliability improves. If security is a concern, virtualization ensures that the software running on one core can be isolated from software running on the other core. A trusted, long-life RTOS like OS-9 can be used to run secure and critical applications on one core, while less critical or less secure applications run on a general-purpose OS, such as Linux, on the other core. Designers no longer need to retest and requalify critical applications with each of the frequent Linux or Windows updates.

To target a variety of applications, the RadiSys COM Express modules bring together the power of dual-core processors and dual-channel memory in a 125-x-95-mm form factor. The architecture is fully compliant with the PICMG COM Express module specification and is coupled with standard, dual-channel SODIMM memory support. The result is an impressive memory-expansion capacity of 2 GBytes of 667-MHz DDR2 SODIMM memory per channel, or a total of 4 GBytes. Designers who adopt the COM Express standard for their applications are seeing a number of benefits, including:

  • Performance scalability - Modules with higher CPU, chip-set, and I/O performance can replace older modules without a redesign of carrier boards.
  • Exact form-fit-function - Carrier boards are easily designed to fit a particular form factor, allowing product differentiation and better, more functional and user-friendly designs, such as portables and handhelds.
  • Long product life - Plug-compatible modules replace those with end-of-life parts without the need to change the carrier board or the thermal, mechanical, and software design.
  • Lower development cost and risk - By removing the need to design high-speed CPU systems, product-development risks are lowered and costs are reduced. It isn't necessary to upgrade expensive CAD tools and expand engineering staffs.
  • Focus on core competency - Because the host processor subsystem is generic, not much competitive advantage is gained by allocating expensive resources for its design. Instead, those resources can be focused on subsystems, which are non-generic and add true market differentiation and customer value.
  • Faster time to market and time to revenue - The reduced design cost and development risk result in faster time to market and—consequently—faster time to revenue.
  • Innovation and service - Access to an open industry standard spurs innovation in the supplier base while securing greater levels of customer service.

The combination of dual-core and dual-channel memory technology enables an emerging breed of COM Express modules. These modules provide designers of high-performance medical imaging, industrial imaging, gaming, and test and measurement equipment with unparalleled performance in the smallest available package. Dual-core processors bring unprecedented performance gains to modular embedded systems through two processor cores paired with large, independent, dynamically sizable L2 caches, support for SSE2 and SSE3 instructions, execution of up to four operations per instruction, out-of-order memory access, and support for virtual machines. Dual-channel memory maximizes these performance gains by doubling memory bandwidth, with two reads, two writes, or a read and a write performed independently of each other.

Jennifer Zickel is a Product Manager for the COM Express product line at RadiSys. She has 20 years of experience in product line management, product marketing, and product engineering roles for component-, board-, and system-level products, ranging from the latest technology to older legacy products. In addition to her RadiSys experience, she has worked for Intel, Texas Instruments, and Lattice Semiconductor.