iApplianceWeb.com

EE Times Network

Netcentric View:

Mobile devices, multimedia NoCs, and Steve Martin's "LA Story"

By Bernard Cole
iApplianceWeb
(01/23/06, 1:54 AM GMT)

During the recent Consumer Electronics Show, I talked to many chip and software vendors whose designs are being used in some of the feature-packed, multimedia-laden mobile and embedded devices introduced there. Listening to them, I was constantly reminded of a scene from Steve Martin’s LA Story.

Martin, playing a Los Angeles TV weatherman, gets in his car in the morning to go to work. But instead of getting on the freeway system - clogged to a standstill with traffic beyond its capacity - he travels to work by way of an elaborate system of shortcuts: through a neighbor’s yard, down an alley, across an empty lot, through a car wash, zigzagging through a parking lot, and so on.

It was that or face driving on a freeway system that was designed fifty years ago and unable to handle today’s volume of traffic. The scene was hilarious because it was an accurate reflection of the real world, and Martin’s solution wasn’t that much of an exaggeration of the extremes to which L.A. drivers sometimes resort.

Embedded SoC traffic jams
In many embedded applications in mobile devices and portable electronics systems, developers and builders of the silicon are in the same situation. Builders of MP3 players, video recorders, mobile TV devices and all-in-one mobile phones with video capability are driven by the need to deliver multimedia content over high-bandwidth wired and wireless Internet connections at higher and higher data rates. But they are having problems with outdated shared bus architectures that simply can’t handle the increased traffic loads.

The use of multi-core CPUs in such designs only partially addresses such problems, and in other ways exacerbates them, because to move the data around the chip it has been necessary to depend on a shared-bus “freeway” system that is decades old and inadequate for present and future needs.
 

Of course, there are new freeway systems, such as networks-on-chip and on-chip point-to-point, packet-based, serial switched fabric linkages, similar in concept to InfiniBand, PCI Express and RapidIO at the board-to-board and system-to-system level. Many of these chip-level alternatives, and the problems they raise, are described in an excellent recent book, “Networks On Chips,” by Giovanni De Micheli and Luca Benini.

There are at least two problems I see with most such topologies for the vendors of the devices that use these multimedia-optimized SoCs. First, there are so many of them. How do you make a choice? How do you assess their compatibility with existing “freeway” designs? Second, there are the numerous software development issues. These are also covered extensively in the De Micheli/Benini book.

After reading in their book about all the complicated software problems ahead, I have come to the conclusion that even if we agree on a common next-generation freeway system for on-chip traffic, the software problems alone will prevent its widespread adoption for many years. Consider the amount of time it is taking for the industry to develop a common set of standards for multicore software development. So far I hear a lot of talk, with minimal action taken.

Making do with work-arounds
It should come as no surprise that, faced with such challenges, not a few current licensees of core processors - including those from ARM, MIPS, Power, and PowerPC - are taking a page from the script for Steve Martin’s movie. They’re making do with what they have, using current shared bus topologies where appropriate, replacing them where they can, or finding work-arounds and shortcuts that get around the traffic jams when they can’t.

For example, Atmel’s most recent ARM926EJ-S-based microcontroller - designed for what it calls human interface applications with loads of graphics, audio and video - takes the work-around approach to the extreme. It eliminates the data traffic bottlenecks that often occur on the ARM architecture’s traditional AMBA bus and achieves on-chip data transfer rates of up to 41.6 Gbps.

No less innovative in its work-around strategy is Digi, with the bus work-around it uses in its NetSilicon NS9360, deployed in several dedicated I/O devices it has built for cellular gateways, WiFi device servers, and wireless video appliances. Similar to the approach taken by Atmel, Digi sticks with the existing AMBA AHB shared bus topology but greatly modifies the peripheral DMA structure. The chip even incorporates mechanisms that let the developer modify specific registers to control directly, in software, how much bandwidth is allocated.
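The idea of setting bandwidth shares from software can be illustrated with a small model. The sketch below is not the NS9360's actual register interface, which is not described here. The `weights` table stands in for hypothetical per-master weight registers, the master names are made up, and the arbiter is a generic smooth weighted round-robin chosen only for illustration.

```python
from collections import Counter


def arbitrate(weights, cycles):
    """Return the bus master granted in each cycle.

    `weights` plays the role of hypothetical software-writable registers:
    a master with weight 3 gets three times the grants of one with weight 1.
    The scheme is smooth weighted round-robin, an assumption made for
    illustration, not a description of any vendor's arbiter.
    """
    total = sum(weights.values())
    credit = {master: 0 for master in weights}
    grants = []
    for _ in range(cycles):
        # Every master earns credit in proportion to its weight.
        for master, weight in weights.items():
            credit[master] += weight
        # The master with the most credit wins this cycle and pays for it.
        winner = max(credit, key=credit.get)
        credit[winner] -= total
        grants.append(winner)
    return grants


# Hypothetical settings: video DMA gets 3 shares, audio DMA gets 1.
grants = arbitrate({"video_dma": 3, "audio_dma": 1}, cycles=8)
share = Counter(grants)
```

Changing the numbers in the table changes each master's share of bus cycles without touching the arbiter itself, which is the kind of software control the paragraph above describes.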

They are not alone. For example, Faraday Technology has opted for a QoS-aware non-blocking crossbar switch to get the intra-chip data flow bandwidth it needed, as well as a smart DMA engine of its own design. PortalPlayer also uses a crossbar switch of its own design as an alternative to AMBA, and NXP uses a modified bus architecture, retaining AMBA for deterministic control and processing tasks and adding a data-flow-optimized bus of its own design that handles media-rich operations. Other companies have opted for the approach that Cirrus Logic has taken. Direct and simple, it just puts two AMBA buses on the chip and separates data flows such that each processing element gets as much bandwidth as possible.
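To see why a crossbar relieves the jam, here is a toy comparison under idealized assumptions. The port names and transfer lengths are invented for illustration, and the crossbar figure is a best case that assumes an ideal scheduler able to interleave transfers cycle by cycle. It does not model any of the vendors' actual switches.

```python
from collections import defaultdict


def shared_bus_cycles(transfers):
    """A single shared bus carries one transfer at a time, so lengths add up."""
    return sum(length for _src, _dst, length in transfers)


def crossbar_cycles(transfers):
    """Best case for an ideal non-blocking crossbar.

    Transfers between different port pairs overlap; only transfers that
    share a source or a destination port must take turns. The busiest
    port therefore sets the finishing time.
    """
    load = defaultdict(int)
    for src, dst, length in transfers:
        load[("src", src)] += length
        load[("dst", dst)] += length
    return max(load.values(), default=0)


# Hypothetical traffic: (source, destination, length in bus cycles).
transfers = [
    ("cpu", "sdram", 100),
    ("dma", "lcd", 80),
    ("dsp", "sram", 60),
    ("dma", "sdram", 40),
]
bus_time = shared_bus_cycles(transfers)   # everything queues on one bus
xbar_time = crossbar_cycles(transfers)    # limited only by the busiest port
```

In this made-up workload the shared bus needs the sum of all transfers, while the crossbar is limited only by traffic converging on the SDRAM port. Splitting traffic across two buses, as in the Cirrus Logic approach, falls somewhere between the two.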

Others, such as Texas Instruments with its OMAP, NXP with the Nexperia, and Toshiba et al. in the Cell architecture, have opted for a shared memory approach, on top of which they layer various message-passing mechanisms, based as much as possible on existing standards, such as Open MPI.
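Layering message passing over shared memory can be as simple as a mailbox that two cores agree on. The following is a minimal sketch, assuming a single sender and a single receiver; the `Mailbox` class is hypothetical, a Python list stands in for the shared memory region, and nothing here reflects the OMAP, Nexperia, Cell or Open MPI interfaces.

```python
class Mailbox:
    """Fixed-size ring buffer for one sender and one receiver."""

    def __init__(self, slots):
        # One extra slot stays empty so "full" and "empty" look different.
        self.buf = [None] * (slots + 1)
        self.head = 0  # next slot to read, advanced only by the receiver
        self.tail = 0  # next slot to write, advanced only by the sender

    def send(self, msg):
        """Store a message; return False if the mailbox is full."""
        nxt = (self.tail + 1) % len(self.buf)
        if nxt == self.head:
            return False
        self.buf[self.tail] = msg
        self.tail = nxt
        return True

    def receive(self):
        """Return the oldest message, or None if the mailbox is empty."""
        if self.head == self.tail:
            return None
        msg = self.buf[self.head]
        self.head = (self.head + 1) % len(self.buf)
        return msg


box = Mailbox(slots=2)
sent = [box.send("frame-0"), box.send("frame-1"), box.send("frame-2")]
first = box.receive()
```

Because each index is written by only one side, the two cores never have to update the same word. On real hardware this would additionally need memory barriers and cache management, which the sketch leaves out.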

There are still a lot of questions that occur to me as the industry makes the shift to this new architectural paradigm. How long can such workarounds be effective? Are there any commonalities between the various new NoC bus topologies that a developer can look to, to at least minimize the cost of converting from the existing shared bus methods?

Do any of the new NoC alternatives incorporate features that make this translation easier? Can the software solutions being considered to solve various programming and debug issues with current homogeneous symmetric multi-cores be extended to operate effectively in this much more complex, heterogeneous and asymmetric multicore environment that NoCs represent?

What do you think? What approaches are you pursuing now? And in the future? What is the best way to make the transition? The Steve Martin approach will only work for so long.

Bernard Cole is editor in chief and site leader for iApplianceWeb and site editor on Embedded.com as well as an independent editorial services consultant working with high technology companies. He welcomes your feedback. Call him at or send an email to .
 


 



Copyright © 2004 Appliance-Lab