Will InfiniBand Become the King of AdvancedTCA Fabrics?

To select the best communication architectural fabric, tradeoffs must be performed among Advanced Switching, InfiniBand, Rapid I/O, and others.

By Joe McDevitt

The Advanced Telecom Computing Architecture (AdvancedTCA) is an inflection point--the point at which telecommunications companies can design new products without costly engineering-redesign efforts. It's an inflection point for profits as well as development and business style. Some argue that AdvancedTCA--a series of industry-standard specifications for the next generation of carrier-grade communications equipment--will cause a lack of creativity and differentiation. In the modern digital world, differentiation and features are often software-driven. Standardized hardware should do little to prevent these developments. In fact, AdvancedTCA should enable telecommunications solution providers to focus solely on differentiation and not be bothered with the development of non-differentiating hardware.

To assist in some hardware differentiation, AdvancedTCA is a fabric-agnostic architecture. One of the fabrics defined early for inclusion in AdvancedTCA was InfiniBand, via the AdvancedTCA 3.2 sub-specification. InfiniBand was the result of merging two fabric technologies: Future I/O and Next Generation I/O. From the outset, it was meant to be a fabric that would connect CPUs and provide them with their I/O needs. InfiniBand was designed to replace all datacenter I/O standards of that time, including PCI, Fibre Channel, and Ethernet. With these goals, InfiniBand would seem to be the perfect fabric for AdvancedTCA. After all, CPU and I/O devices have become modularized parts of a chassis.

AdvancedTCA systems are beginning to move toward production. The majority of these are most likely AdvancedTCA 3.1, or Ethernet-fabric-based, systems. There are limitations with this kind of deployment. Although others propose Advanced Switching (AS) as the "King of All Fabrics," there may not be such a king in AdvancedTCA. Time to market, bandwidth, latency, and even system-level costs are some areas in which InfiniBand may be a surprisingly good fit for AdvancedTCA.

Some target telecommunications engineers are outright hostile to the usage of AdvancedTCA in their companies' applications. The chief reason for that hostility is bandwidth. The engineers report that 1-Gb/s or 2-x-1-Gb/s Ethernet is slower than their current proprietary solutions. They also argue that 10 Gigabit Ethernet isn't the timely solution that they need.

On the InfiniBand side, one x4 port can provide 10 Gb/s of bandwidth (see Figure 1). This amount would fully utilize one AdvancedTCA channel, thereby providing the timely bandwidth demanded. Other fabrics have plans for these bandwidth levels. AdvancedTCA 3.1 Ethernet via Option 9 and AdvancedTCA 3.4 Advanced Switching via Option 3 can match this in terms of raw performance. The difference is the solution's timing. With InfiniBand, the full bandwidth of a simple FR4 AdvancedTCA backplane is realized today, without any limitations on node boards or switches. Both the Ethernet and Advanced Switching solutions suffer from the timing aspect: proper AdvancedTCA-centric node and switch solutions aren't yet available.
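As a sanity check on the figures above, the raw and effective data rates of an InfiniBand link can be computed from its lane count. This is a minimal sketch assuming standard 1X SDR signaling of 2.5 Gb/s per lane with 8b/10b line coding; the function name is illustrative, not from any vendor API.

```python
# Back-of-the-envelope check of the x4 InfiniBand bandwidth figure.
# Assumes 1X SDR lanes at 2.5 Gb/s signaling with 8b/10b encoding.
LANE_SIGNALING_GBPS = 2.5     # per-lane signaling rate (SDR)
ENCODING_EFFICIENCY = 8 / 10  # 8b/10b line-coding overhead

def link_bandwidth(lanes: int) -> tuple[float, float]:
    """Return (raw signaling rate, effective data rate) in Gb/s."""
    raw = lanes * LANE_SIGNALING_GBPS
    return raw, raw * ENCODING_EFFICIENCY

raw, effective = link_bandwidth(4)
print(f"x4 link: {raw:.0f} Gb/s signaling, {effective:.0f} Gb/s data")
# → x4 link: 10 Gb/s signaling, 8 Gb/s data
```

The 10-Gb/s figure quoted for a x4 port is thus the signaling rate; the payload-carrying rate after line coding is somewhat lower, which is worth keeping in mind when comparing against other fabrics' headline numbers.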

Another key factor that affects performance is the time that it takes a single packet to leave the host and reach the destination. This factor is known as packet latency. InfiniBand offers high bandwidth with low latency. Packet construction is an area in which InfiniBand excels, thereby reducing latency. The InfiniBand host channel adapter (HCA) builds packets directly, without host support. The HCA doesn't carry the overhead of Ethernet, which relies on a complex software protocol stack to ensure reliable delivery.

A well-designed packet switch also is required to achieve low latency (see Figure 2). InfiniBand satisfies this need with a switch that can cut a packet through from input port to output port. It simply looks at the first 8 bytes of the header before the packet has fully arrived. Other switch architectures buffer the entire packet before routing, resulting in increased latency. Although such cut-through routing can mean forwarding a corrupted packet, it doesn't cause a fabric or application failure. The packet will be checked at its destination.
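The latency advantage of cut-through routing can be illustrated with a back-of-the-envelope model: a store-and-forward switch adds a full packet-serialization delay per hop, while a cut-through switch waits only for the routing header. The link rate and packet size below are illustrative assumptions, not figures from the article.

```python
# Illustrates why cut-through forwarding beats store-and-forward
# for large packets. Figures are illustrative assumptions.

def store_and_forward_latency(packet_bits: float, link_bps: float) -> float:
    """Switch buffers the whole packet before forwarding it."""
    return packet_bits / link_bps  # full serialization delay per hop

def cut_through_latency(header_bits: float, link_bps: float) -> float:
    """Switch forwards after reading only the routing header."""
    return header_bits / link_bps

LINK_BPS = 8e9      # assumed 8-Gb/s effective data rate
PACKET = 2048 * 8   # a 2-KB packet, in bits
HEADER = 8 * 8      # the 8-byte header the text mentions, in bits

saf = store_and_forward_latency(PACKET, LINK_BPS)
ct = cut_through_latency(HEADER, LINK_BPS)
print(f"store-and-forward adds {saf * 1e9:.0f} ns per hop")  # → 2048 ns
print(f"cut-through adds {ct * 1e9:.0f} ns per hop")         # → 8 ns
```

The gap widens with packet size and with hop count, since the serialization penalty of a store-and-forward switch is paid at every stage of the fabric.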

A lot of discussion revolves around the cost of the fabric. Interesting shell games occur when this subject is discussed. Ethernet does have a clear message about per-port hardware cost being lower than other fabrics. But per-port charges cover only part of the solution and are therefore just part of the cost. To approach even half the bandwidth of InfiniBand, 1-Gigabit Ethernet needs multiple ports. Alternatively, some fabrics rely on future technology that's not available in the scale and port counts that are needed for AdvancedTCA. This future cost cannot be known. It can't be extrapolated or inferred from "sister" technologies. Yet a real-world example based on existing AdvancedTCA nodes and switches--an "apples-to-apples" comparison of system cost--is possible.

Say a fully loaded InfiniBand AdvancedTCA 3.2 Option 1 system (two switches and 12 CPU nodes) is compared to an equally loaded Ethernet AdvancedTCA 3.1 Option 1 system in the exact same configuration. The total system cost of the InfiniBand system will be only about 2% greater. This minor increase comes with a 10X increase in fabric bandwidth and a similar improvement in fabric latency. Looking at CPU utilization and packet-construction offloading, a conservative estimate is that 30% of the available CPU cycles go to packet construction in this real-world Ethernet system. A more liberal estimate of 10% of the CPU cycles for packet construction on the InfiniBand side translates into a savings of otherwise wasted CPU cycles. That savings returns in the form of more power for the applications running on these CPUs.
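The CPU-cycle argument above can be put in concrete terms: if packet construction consumes roughly 30% of cycles under Ethernet and roughly 10% with InfiniBand's HCA offload, the difference is capacity handed back to applications. This sketch simply applies the article's own estimates across the 12-node configuration; it is an arithmetic illustration, not a measurement.

```python
# Rough model of the CPU-cycle savings described above.
# Percentages and node count are the article's estimates.
ETH_PACKET_OVERHEAD = 0.30  # conservative estimate for Ethernet
IB_PACKET_OVERHEAD = 0.10   # liberal estimate for InfiniBand
CPU_NODES = 12              # AdvancedTCA 3.2 Option 1 configuration

freed_fraction = ETH_PACKET_OVERHEAD - IB_PACKET_OVERHEAD
print(f"Each node regains {freed_fraction:.0%} of its cycles")
print(f"Across {CPU_NODES} nodes, that is roughly "
      f"{freed_fraction * CPU_NODES:.1f} CPUs' worth of extra capacity")
```

Under these assumptions, the fully loaded shelf recovers the equivalent of more than two whole processors of application capacity, which is the "more power for applications" the text refers to.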

InfiniBand allows processors to do what they were purchased to do by running applications and not simply munging packets. Although link aggregation and offload engines increase the likelihood of matching performance and limiting CPU usage, these performance enhancements belie the apparent cost-per-port benefit of Ethernet. Additionally, the silicon cost of InfiniBand switches is admittedly higher--mainly because of market-size differences.

One major difference is that the InfiniBand management software is open source. The ease and general familiarity demanded by customers on the Ethernet side requires a long, complicated laundry list of RFC support. Frequently, these management costs aren't included when raw per-port hardware costing is used. As a result, Ethernet's per-port lead doesn't translate into a system-level cost advantage.

The result is "The Great Fabric Shell Game." Instead of finding the ball under the shell, we're asked to forget the cost additions of offload engines and multiple-port aggregation. We must argue away performance limitations, buy into wild speculation on future enhancements, and ignore the development costs of Ethernet management software. These efforts work to show Ethernet as the low-cost fabric of choice. Per-port cost savings is a partial solution. One must therefore look toward total system cost and the performance gained. Today, InfiniBand is the clear winner when the timely system-level price for a given performance is taken into account.

To predict lower cost, Advanced Switching experts point to the economies of scale derived from the reuse of PCIExpress's physical- and link-layer technologies. While there's value in reusing existing technology, it's clear that these economies won't be achieved. AS products will carry much higher prices than volume PC-based PCIExpress products like motherboard chip sets and add-in cards. In addition, AS devices will sell in much lower volume, and they will have much higher performance, reliability, and support designed into them. As a result, they will command a higher price.

Advanced Switching and PCIExpress are different technologies that will be deployed on different classes of devices. PCIExpress isn't at all suited for system-level AdvancedTCA construction via AdvancedTCA's redundant links; AS is required instead. AS means management software, whereas PCIExpress requires none. AS therefore changes the market, node architecture, and chip design while adding cost. PCIExpress, meanwhile, is a great motherboard technology for desktop systems. The ATC5232, for example, is enhanced by the PCIExpress support in Intel's E7520 chip set.

PCIExpress also enhances InfiniBand by allowing reduced BOM costs via memory-free solutions. The memory-free option uses the bandwidth that the PCIExpress architecture makes available to the InfiniBand chip. That architecture is native to the E7520. Although PCIExpress is a great technology, the AS "economies-of-scale" argument is simply not valid. On AdvancedTCA, some blocking will occur in early deployments of AS because of current AS port counts. This factor negatively impacts any claim of performance equivalence to InfiniBand.

Ultimately, there may be no king of AdvancedTCA fabrics. Ethernet, Advanced Switching, InfiniBand, Rapid I/O, and others can coexist in AdvancedTCA and give added points of differentiation. Pitting one fabric against another is fun, and the heated discussions that arise are interesting. But AdvancedTCA is designed to its core as a fabric-agnostic architecture. For some AdvancedTCA applications, the best solution will certainly be InfiniBand. As engineers and technologists, however, it's our job to study all of the tradeoffs and set aside any politics. We can then ensure that the best solution is chosen for the problem at hand.

Joe McDevitt, Vice President of Technical Development, Diversified Technology Inc.