Use Virtualization to Simplify Network Surveillance

With application-specific programs implemented on virtual machines, developers can focus on packet processing and fight cyber-terrorism.

By Kevin Graves

No matter which side of the network you’re on, deep packet inspection is a key ingredient for network security applications. When most people think of network security, however, they think only of perimeter security devices like firewalls and VPNs. People should also consider network surveillance, which serves a growing need to fight cyber-terrorism and network abuse or misuse. Surveillance systems are commonly located inside the network rather than at its edge. Even so, perimeter security and surveillance systems have a lot in common, such as the need to monitor network traffic on the fly and look for information that matches particular criteria.

Normally, there are two fundamental parts of surveillance systems. Here, the first will be called User Identification (UI). The UI part is analogous to the connection-establishment function in session-oriented schemes. That function is used for things like the call setup provided by the signaling plane in telephony, such as SIP in voice-over-IP (VoIP). The UI is used to resolve some user identifier (e.g., a username, hostname, or call-id) to a specific IP flow (e.g., a 5-tuple). The second part of the surveillance system is then enabled. For this discussion, that part will be called the Packet Traffic (PT) analyzer. Continuing with the analogy used above, the PT would represent the actual RTP traffic in a VoIP call.
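The hand-off between the two parts can be pictured with a minimal Python sketch. It is an illustration only: the `FiveTuple` and `UserIdentification` names, the addresses, and the user identifier are all hypothetical, and a real UI component would populate the table by parsing signaling traffic such as SIP rather than by direct calls.

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    """One IP flow: addresses, ports, and the IANA protocol number."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int  # e.g. 17 for UDP, 6 for TCP


class UserIdentification:
    """Hypothetical UI table mapping a user identifier to its known flows."""

    def __init__(self):
        self._flows_by_user = {}

    def bind(self, user_id, flow):
        # Record a flow learned from session-establishment traffic.
        self._flows_by_user.setdefault(user_id, set()).add(flow)

    def resolve(self, user_id):
        # Return every flow currently associated with the identifier.
        return frozenset(self._flows_by_user.get(user_id, ()))


ui = UserIdentification()
rtp_flow = FiveTuple("10.0.0.5", "10.0.0.9", 16384, 16386, 17)
ui.bind("alice@example.com", rtp_flow)
flows_to_monitor = ui.resolve("alice@example.com")
```

In this sketch, `flows_to_monitor` is what the UI part would hand to the PT analyzer, which then watches only packets belonging to those flows.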

It’s interesting to look at some of the complexities in both the UI and PT components of surveillance systems. The UI component is very protocol-aware. It needs to support a variety of “session-establishment” protocols, such as SIP, DHCP, FTP, and H.323. Because some of these protocols are stateful, the UI must maintain state tables for each user. Additionally, it must be capable of inspecting the packet contents. Performance also can be a key issue if, for example, the surveillance system is located near a signaling gateway or other concentration point. Similarly, the PT component needs to monitor packet flows, detect malicious activity, keep statistics, etc. at wire speeds based on the results of the User Identification phase.

Networking-equipment vendors are rising to the challenge by providing equipment with the performance to analyze packets of data on the fly at wire speed. In general, the industry approach has been to utilize programmable-processor-based systems (as opposed to purely ASIC- or FPGA-based designs). Programmability maximizes the flexibility necessary to keep pace with the rate of change. Network system designs typically incorporate the following:

  • High degrees of parallelism either through multi-core general-purpose processors or network processors
  • Fixed-function accelerators, such as encryption/decryption and hash units and TCAMs
  • Various memory types--each with potentially different access models and timings
  • Various buses/interconnects to other silicon (again, each with different access models and timings)
  • Interaction with other system elements or “planes”

But all of this comes at the cost of daunting software-development complexity. After all, meeting the performance goals has typically required that a new and complex computing architecture be used within the products. A good example is a relatively new class of processor called the network processor (NP).

A typical NP-based board is shown in Figure 1. NPs are optimized for packet processing because they generally incorporate a high degree of parallelism through pipelining or superscalar processor arrangements, multi-threading, multiple memory types, and dedicated-function accelerators (e.g., hash units and CRC generators). An example of an NP family that serves a wide range of applications is the Intel® Internet Exchange Processor (IXP) family. Among the key benefits of using these Intel® processors are:

1. Full programmability, giving innovators a clean slate to do creative things with virtualization
2. A family of products and technologies that offers scalability and interoperability to maximize system cost effectiveness and flexibility
3. An emerging ecosystem of board makers and software providers

Network-processor components offer huge benefits to system developers. Yet the potential impact of using highly flexible, highly configurable processor architectures in this complex architectural environment is increased development and life-cycle software costs. Such costs are derived from the steep learning curve and lengthened development, debug, and test phases of the program. There are other drawbacks too. For example, the lengthened software-development cycle often results in a functional prototype not being available until late in the project cycle. As a result, integration with other system components and overall system-performance modeling are both delayed. Another effect of the high software complexity is that designers are often hesitant to modify or enhance working designs due to the high risk of change and the lengthy debug cycle. This negates one of the strongest benefits of NP-based designs (i.e., flexibility).

The solution to managing the complexity of network-processing software development is virtualization--specifically, to abstract the underlying NP hardware by creating an application-specific programming model implemented as a virtual machine (VM). Implementing a VM in software atop the NP provides the programmer with an architecture-independent environment. This environment offers the obvious potential to be completely portable and scalable. Other significant benefits include offering superior robustness by building in logic to perform bounds checking, null pointer/handle checking, and other exception handling. Lastly, a VM approach allows for advanced capabilities like dynamic compilation, which can improve application power and flexibility and enable new classes of applications.

Figure 2 shows how a virtual-machine hardware/software environment comes together. Instead of writing in machine code, the programmer uses a high-level language to develop applications by sequencing high-level packet-oriented instructions like those in the diagram. After compilation, the code runs on a virtual machine that has been optimized for packet-processing functions.
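To make the robustness argument concrete, here is a toy Python interpreter in which every packet read is checked for a null handle and for bounds before it touches the data. It is a sketch under stated assumptions: the opcodes `LOAD_U16` and `DROP_IF_EQ`, the `VMFault` exception, and the `run` function are invented for illustration and are not taken from Figure 2 or from any real product.

```python
class VMFault(Exception):
    """Raised by the toy VM instead of reading outside the packet."""


def load_u16(packet, offset):
    # Null-handle check, then bounds check, before any access.
    if packet is None:
        raise VMFault("null packet handle")
    if offset < 0 or offset + 2 > len(packet):
        raise VMFault(f"2-byte read at offset {offset} is out of bounds")
    return int.from_bytes(packet[offset:offset + 2], "big")


def run(program, packet):
    """Execute (opcode, operand) pairs and return a verdict string."""
    acc = 0
    for opcode, operand in program:
        if opcode == "LOAD_U16":        # hypothetical instruction
            acc = load_u16(packet, operand)
        elif opcode == "DROP_IF_EQ":    # hypothetical instruction
            if acc == operand:
                return "drop"
        else:
            raise VMFault(f"unknown opcode {opcode!r}")
    return "forward"


# Drop ARP frames: the EtherType sits at byte offset 12 of an Ethernet frame.
program = [("LOAD_U16", 12), ("DROP_IF_EQ", 0x0806)]
arp_frame = bytes(12) + b"\x08\x06" + bytes(28)
ipv4_frame = bytes(12) + b"\x08\x00" + bytes(20)
```

Running `program` against a truncated 4-byte frame raises `VMFault` rather than reading past the end of the buffer, which is the kind of built-in exception handling the VM approach can provide once for every application.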

A virtual-machine-based network system design, which uses a packet-processing language that supports basic traffic-monitoring functions, can be put together to demonstrate results in a matter of weeks. In contrast, a machine-coded implementation will typically require months to make functional. The fact that basic functionality can be demonstrated very quickly using the VM approach means that more development time is available for adding features and tuning the system for higher performance.

For example, the Packet Traffic portion of a network surveillance system must monitor specified packet flows and deal with complexities like flow lookups, packet classification, and payload scanning. In somewhat simplified terms, the software engineer must program the processor to first classify the packet to determine if it’s potentially a packet of interest. If so, the processor must then perform a flow/session lookup to determine if the packet is part of a session that’s already being monitored. This lookup is often based on the 5-tuple of IP and layer 4 header fields, or on a 6-tuple that adds an optional VLAN identifier. It needs to be performed on a table that’s capable of tracking the maximum number of supported concurrent connections (often hundreds of thousands or millions).

Due to the size of the table, hashing or other search algorithms are usually used rather than linearly searching the entire table. If the packet isn’t part of a monitored session, the packet might need to undergo further classification by inspecting the protocol headers and payload scanning. Payload scanning or pattern matching is extremely complex. Optimally, it involves state-of-the-art pattern-matching algorithms, which are optimized to take advantage of the target hardware. Typically, expressing the above logic using assembly-language code would require thousands of lines. It would involve coding the packet logic described above as well as dealing with the intricacies of the underlying hardware, such as partitioning and synchronizing the function across parallel processors.
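The hashed flow lookup described above can be sketched in a few lines of Python. This is an illustration rather than an NP implementation: the `FlowTable` class, the bucket count, and the use of CRC-32 as the hash are assumptions chosen for brevity, and the sketch handles only IPv4 5-tuples with no VLAN identifier.

```python
import ipaddress
import struct
import zlib

NUM_BUCKETS = 1 << 16  # assumed size; real tables are sized for peak flow count


def flow_key(src_ip, dst_ip, src_port, dst_port, protocol):
    """Pack an IPv4 5-tuple into a fixed 13-byte key."""
    return struct.pack(
        "!4s4sHHB",
        ipaddress.IPv4Address(src_ip).packed,
        ipaddress.IPv4Address(dst_ip).packed,
        src_port, dst_port, protocol,
    )


class FlowTable:
    """Chained hash table: hash the key to a bucket, then compare keys."""

    def __init__(self, num_buckets=NUM_BUCKETS):
        self._buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key):
        return self._buckets[zlib.crc32(key) % len(self._buckets)]

    def add(self, key, state):
        bucket = self._bucket(key)
        for i, (existing, _) in enumerate(bucket):
            if existing == key:
                bucket[i] = (key, state)  # replace state of a known flow
                return
        bucket.append((key, state))

    def lookup(self, key):
        # Only one bucket is searched, never the whole table.
        for existing, state in self._bucket(key):
            if existing == key:
                return state
        return None


table = FlowTable()
monitored = flow_key("10.0.0.5", "10.0.0.9", 16384, 16386, 17)
table.add(monitored, {"packets": 0})
```

A packet whose key returns `None` from `lookup` is not part of a monitored session and would go on to further classification. On real hardware, the hash would typically come from a dedicated hash unit, and the buckets would have to be shared safely across parallel processing engines.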

Implementing this in C wouldn’t be much easier because of the lack of any type of parallel operating system or library support. In addition, one must take into account the need to interact with the underlying NP at the “hardware” level. Developing and debugging the code from scratch could easily take months. Even if reference code is already available, integrating and testing the resultant application would certainly take weeks. If changes or upgrades are required in the future, the changed packet logic will have to be debugged and tested along with the re-partitioning and re-mapping to the underlying hardware.

A better approach is to use a high-level programming model that provides a robust set of built-in algorithms, such as a high-performance n-tuple table lookup (required for flow lookups). Similarly, a standard way to scan packets against a database of patterns or signatures would decrease the effort of programming and debugging the application by more than an order of magnitude. Because the programming model could be tailored to a specific class of applications (e.g., packet processing), the model also could include a very focused set of data types, control mechanisms, and built-in values. All of them would be useful in dealing with the inspection of packets/streams, which would further abstract the programming task.
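As an illustration of what a built-in signature scan might hide from the programmer, the Python sketch below uses the well-known Aho-Corasick algorithm to match a payload against many patterns in a single pass. The choice of algorithm, the `SignatureScanner` class, and the sample signatures are assumptions made for this example, and patterns are assumed to be non-empty. Nothing here describes how any particular product implements scanning.

```python
from collections import deque


class SignatureScanner:
    """Aho-Corasick automaton over bytes; one pass finds every signature."""

    def __init__(self, signatures):
        self._goto = [{}]   # per-state transitions: byte -> next state
        self._fail = [0]    # per-state failure link
        self._out = [[]]    # per-state names of signatures ending here
        for name, pattern in signatures.items():
            self._insert(name, pattern)
        self._build_failure_links()

    def _insert(self, name, pattern):
        state = 0
        for byte in pattern:
            nxt = self._goto[state].get(byte)
            if nxt is None:
                nxt = len(self._goto)
                self._goto.append({})
                self._fail.append(0)
                self._out.append([])
                self._goto[state][byte] = nxt
            state = nxt
        self._out[state].append(name)

    def _build_failure_links(self):
        # Breadth-first, so shallower states are complete before deeper ones.
        queue = deque(self._goto[0].values())
        while queue:
            state = queue.popleft()
            for byte, nxt in self._goto[state].items():
                queue.append(nxt)
                fallback = self._fail[state]
                while fallback and byte not in self._goto[fallback]:
                    fallback = self._fail[fallback]
                self._fail[nxt] = self._goto[fallback].get(byte, 0)
                self._out[nxt] = self._out[nxt] + self._out[self._fail[nxt]]

    def scan(self, payload):
        """Return (end_offset, signature_name) for every match."""
        state, hits = 0, []
        for offset, byte in enumerate(payload):
            while state and byte not in self._goto[state]:
                state = self._fail[state]
            state = self._goto[state].get(byte, 0)
            hits.extend((offset, name) for name in self._out[state])
        return hits


scanner = SignatureScanner({"cmd": b"cmd.exe", "exe": b".exe",
                            "passwd": b"/etc/passwd"})
hits = scanner.scan(b"GET /cmd.exe HTTP/1.0")
```

The application code reduces to the last two statements. Everything above them is the kind of logic that a packet-oriented programming model could supply once, already tuned for the target hardware.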

An example of this type of virtualization is IP Fabrics’ packet-processing language (PPL). PPL is “packet-centric,” meaning the primary features and benefits center around the primitives and complex algorithms that process IP packets. (It should be noted that PPL is actually very good at processing entire layer 2 frames from the layer 2 headers up through layer 7 content.) In the PPL programming model, the fundamental object is the packet, for which the PPL virtual machine provides a rich set of packet-handling functions. These functions include header insertion/stripping, connection/session tracking, content inspection, and rate monitoring.

In addition, the PPL software has a number of lower-level built-in functions, such as automatically calculating key packet state values (e.g., the offset and length of the various headers and whether or not the packet is part of a fragment). It also allows common packet fields to be accessed symbolically. But the most important aspect of the IP Fabrics approach is probably that it allows users to express their logic in this very high-level manner and then automatically maps this logic onto a parallel, complex network processor. In doing so, it yields a very high-performance implementation.
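The packet state values mentioned above (header offsets, header lengths, and fragment status) can be derived as in the following Python sketch. It is not PPL and does not reflect PPL syntax. The `PacketState` and `derive_state` names are hypothetical, and the sketch assumes an untagged Ethernet II frame carrying IPv4.

```python
import struct
from dataclasses import dataclass

ETH_HEADER_LEN = 14
ETHERTYPE_IPV4 = 0x0800


@dataclass(frozen=True)
class PacketState:
    """Values computed once per packet and then read by name."""
    l3_offset: int
    l3_length: int
    l4_offset: int
    protocol: int
    is_fragment: bool


def derive_state(frame: bytes) -> PacketState:
    if len(frame) < ETH_HEADER_LEN + 20:
        raise ValueError("frame too short for Ethernet + IPv4")
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_IPV4:
        raise ValueError("this sketch handles untagged IPv4 only")
    l3 = ETH_HEADER_LEN
    ihl = (frame[l3] & 0x0F) * 4          # IPv4 header length in bytes
    if ihl < 20 or len(frame) < l3 + ihl:
        raise ValueError("invalid IPv4 header length")
    (flags_frag,) = struct.unpack_from("!H", frame, l3 + 6)
    more_fragments = bool(flags_frag & 0x2000)
    fragment_offset = flags_frag & 0x1FFF
    return PacketState(
        l3_offset=l3,
        l3_length=ihl,
        l4_offset=l3 + ihl,
        protocol=frame[l3 + 9],
        is_fragment=more_fragments or fragment_offset != 0,
    )


# Example frame: 20-byte IPv4 header, TCP, "more fragments" flag set.
ip_header = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0x2000, 64, 6, 0,
                        bytes([10, 0, 0, 5]), bytes([10, 0, 0, 9]))
frame = bytes(12) + struct.pack("!H", ETHERTYPE_IPV4) + ip_header + bytes(20)
state = derive_state(frame)
```

With these values precomputed, application logic can refer to `state.l4_offset` or `state.is_fragment` rather than repeating the header arithmetic at every use.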

Skeptics of using high-level languages to implement high-performance networking applications will say that software abstraction comes at a price--most often, a real or perceived performance penalty when compared to ideal performance. By focusing on a particular application area, however, the VM implementation can be constructed to use highly optimized, best-of-breed algorithms and state machines that reside in that application-specific domain (e.g., packet processing). In fact, the overall VM architecture can be optimized to the domain-specific processing and data-flow characteristics (e.g., through pipelining and parallelism, as in the underlying architecture used by PPL).

The hand-tuned functional aspects of the VM architecture will often outperform those written by general-purpose NP application designers. Lastly, one could argue that the final system performance will be greater when all of the system components are available as early in the product cycle as possible, thereby allowing more time to analyze and optimize in real-world conditions.

Using virtualization technology, such as creating a virtual machine with a high-level, application-specific language, can greatly cut design complexity while shortening time to market for network system developers. It also can enable those developers to bring better products to market sooner. Using the aforementioned approach (an application-specific programming model implemented as a VM) allows the application developer to focus his or her attention on packet-processing logic rather than the underlying NP architecture.

Kevin Graves, IP Fabrics’ CTO, has over 20 years of experience in packet networks, signaling, and media gateways. Prior to IP Fabrics, he was Senior Director of Engineering of the Telecommunications Division of RadiSys. Before RadiSys, Graves had roles in engineering and marketing management in communications adapters at IBM. He has a BS in Computer Science from Pennsylvania State University.