Software Programmers Face Multicore Challenges

By John Blyler, Editor-in-Chief

As multicore technology moves into embedded systems, software developers face tool shortcomings, legacy code preservation, and scalability challenges.

Max Domeika, an embedded tools consultant in the Software and Services Group at Intel, explained that one of the biggest challenges facing embedded-systems software developers is the growing number and variety of multicore processors. “There are so many different cores with varying capabilities, even within a given architectural family of processors, not to mention virtual enhancements like hyperthreading, which enables a single-core processor to look like two cores to the operating system.”

Hyperthreading builds on a processor’s out-of-order scheduling, in which incoming instructions are identified early enough to be executed in parallel. From a programmer’s viewpoint, it looks like two different processors, yet the two logical cores share many of the same resources, such as execution units and memory caches. Although this approach gives the programmer more options, it isn’t the same as having two distinct cores, as in a true multicore environment.

Why are embedded designers upgrading to multicore technology? Many have found that multiple processor cores reduce system latency, or delay. “If you have latency-sensitive applications, even something as simple as a user interface, you can spawn threads with reduced latency,” notes Domeika. Threads, collections of related code segments, are at the heart of the multithreading paradigm, which lets programmers design software applications whose threads execute concurrently (see the Figure).

Figure: Threading helps designers focus on the work to be done, not on how to manage threads (thread control).

Scalability is one of the biggest obstacles faced by embedded developers who want, or need, to use multicore systems. “You may ship with a two-core architecture, but the next revision of the product may have four cores. This means that you want to create your software in such a manner that it scales with as little effort as possible,” explains Domeika. Many developers have found out the hard way that a system hard-coded for a two-core architecture must be completely rewritten to run on a four-core system. This translates into additional product cost and longer time to market. To accommodate the next product revision, a programmer has to examine every routine and figure out how to repartition the work across a larger number of cores.

At the heart of the scalability question is the balance between the different types of available parallelism. Besides multithreading, another means of parallel execution is single-instruction-multiple-data (SIMD) processing. Intel’s SIMD support is embodied in its Streaming SIMD Extensions (SSE) instructions. Another example of a SIMD instruction set is AltiVec, used by the PowerPC community, namely IBM and Freescale.

“In many ways, it is still left up to the programmer to decide how best to balance the amount and type of parallelism to be used. Programmers must think about what happens at the thread level for a shared-memory machine, at the process level, and also at the SIMD vector level. This is a very hard problem,” notes Domeika. Scalability and parallel implementation issues reinforce the need for better development tools and environments. One way to address these needs is through standards and automation. In the multicore space, several companies, including Intel, have worked on a standard called OpenMP, which enables concurrency on shared-memory machines.

One challenge with using OpenMP in the past was balancing the parallelization spread across multiple cores with the parallelization available through SIMD instruction sets. To address this challenge, the Intel® Compiler recently added support for automatic vectorization inside OpenMP pragmas, which are used for compiler control. Automatic vectorization is a technique that transforms a series of sequential operations into operations performed in parallel. Hence, automatic vectorization analyzes the code to take advantage of SIMD instructions while working within the OpenMP environment to balance the different forms of parallelism.

All of these advances are great. For many embedded software developers, however, the main issue is how to preserve their legacy embedded C code while upgrading to multicore technology. “Most customers that I talk with are interested in new technologies like multicore but are more interested in preserving their business, which means their 10 million lines of legacy C code,” explains Domeika. “That is why I’m one of the co-chairs, along with David Stewart, CEO of CriticalBlue, of the Multicore Programming Practices (MPP) group. The charter of the group is to collect the best-known methods of programming practices using today’s technology. For embedded developers, that means C/C++ and libraries, mainly POSIX threads.”

Such cooperation among embedded software developers will be critical for the continued growth of multicore technology. As in the desktop world, embedded software is the last—but certainly not the least—component necessary for a successful system.


John Blyler is the Editorial Director of Extension Media, which publishes Chip Design and Embedded Intel magazines, plus over 36 EECatalog resource catalogs in vertical market areas. He has co-authored several books on technology (Wiley and Elsevier). John has over 23 years of hardware/software systems engineering experience in the electronics industry. He remains an affiliate professor of systems engineering at Portland State University.