Wireless multimedia pushes SoCs to multiprocessor designs

By Avner Goren, Texas Instruments Inc.

Cellular phones are becoming handheld entertainment centers that simultaneously operate as sophisticated wide-pipe wireless telephones. Users carry a cell phone for its wireless features, but once it's in their pockets or purses, they expect it to function as a PDA, an MP3 player, a digital camera, a camcorder, a video player and a gaming console.

Developing such a multimedia appliance presents enormous technical challenges, especially in quality-of-service, responsiveness and battery life. Ultimately, the solutions will lie in multiple processing engines highly integrated into system-on-chip technology.

Consider three user scenarios:

- A consumer is using her wireless telephone as an MP3 player over a headset, but she's also playing a videogame. The music and the game sounds must mix together so that they occur simultaneously without clicks or static.
- Another consumer is viewing a movie stored on a flash card inserted in the telephone when his mother calls. Of course, he expects his wireless phone to let him know that the call is coming in and to identify the caller.
- A third user is on a videoconference, but can't afford to ignore the tornado alert coming in over the Internet. A text message on the screen should bring him the news update without causing video or audio interruptions.

Providing quality-of-service at the levels these consumers expect requires multiple processing engines operating concurrently. A single-processor configuration, even with multimedia extensions, probably will not be able to manage these kinds of simultaneous dynamic workloads in real-time, because it uses a sequential, rather than parallel, multitasking approach. To meet real-time requirements of simultaneous multimedia tasks and user interface events, a single processor must constantly switch among tasks, which will lead to significant operating system overhead.
Eventually, this task switching will result in lost frames, audio clicks, video flickers or visual artifacts.

Workload scenario

For example, consider two processors running a workload scenario composed of a control task, a user interface task and a multimedia task. Processor A, which uses a single processing engine, context switches among all three tasks, introducing overhead. Processor B uses two processing engines (an ARM and a DSP) and offloads the multimedia task to the DSP. Since only the control and user interface tasks are left on the ARM processor, the context-switching overhead is significantly reduced. Even if Processor A runs at a much higher clock speed, the result, from the end user's point of view, is inferior.

Going a step further in multiprocessing, Texas Instruments Inc.'s Omap 1611 integrates an ARM926, a TI 55x DSP and a set of dedicated hardware accelerators for video, Java and security. To optimize cost-effectiveness, the ARM core and the DSP core share external memory via a traffic controller.

High levels of software integration are, of course, essential to the seamless performance of two or more processing engines operating in parallel. A software bridge can recognize required tasks and assign them either to the most appropriate processor or, in some cases, to the one that is not already engaged. It can also switch off processing engines that are not in use and bring them up only when needed.

This kind of multiprocessing also helps deliver the responsiveness users expect. Users are accustomed to waiting while MP3 music downloads from the Internet, but when they want to listen to a downloaded file, they expect to select a song, control the volume and pause or switch to another application in real-time. The moment a consumer touches a button, she wants something to happen. In a wireless multimedia appliance, such responsiveness is not as easy to achieve as it may sound.
The reason is that command-and-control functions, the user interface and signal processing (all active while playing the MP3 song) inherently require different kinds of data processing. The user interface is a high-interrupt activity, while signal processing requires continuous and extremely repetitious execution of mathematically complex operations.

Recent experience shows that, no matter how rich the feature set, consumers will not settle for a wireless multimedia appliance that provides significantly less talk time or standby time than they have become accustomed to in today's second-generation telephones. Japanese carrier Docomo learned this firsthand when it introduced third-generation phones. Docomo's first 3G phones garnered substantially less market share than anticipated. Now that the carrier is offering re-engineered phones with standby battery life above 200 hours and an industrial design closer to Japan's existing 2G phones, the market is accepting the 3G models.

It may seem counterintuitive, but multiple processing engines, in general, consume less power and provide longer battery life than can be obtained with a single-processing-core design. Taking advantage of the varying capabilities of different kinds of processing cores (RISC, DSP and hardware accelerators) makes it possible to map an algorithm to the optimal engine from a performance and power consumption point of view and to switch units on or off as they are needed, further conserving battery life.

The DSP uses complex instructions that allow it to perform several mathematical operations in a single clock tick. RISC architectures and instruction sets generally permit only one operation per cycle. Consequently, the DSP needs far fewer cycles to process MP3 music or streaming video than a RISC chip would. Further, DSP cores are supported by internal memory rather than caches. For tight DSP loops, the internal SRAM provides two major advantages: deterministic execution and power savings.
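
The mapping argument above can be made concrete with a rough sketch. The Python below is an illustration only, not TI's software bridge: the `Engine` type, the `pick_engine` helper, the clock rates, the power figures and the cycle counts are all hypothetical assumptions, chosen to show how fewer cycles on a suitable engine translate into less energy per task when the engine can be switched off afterward.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Engine:
    """A processing engine described by made-up, illustrative figures."""
    name: str
    clock_hz: float        # hypothetical clock rate
    active_power_w: float  # hypothetical power drawn while running


def energy_joules(engine, cycles):
    """Energy to run `cycles` cycles, assuming the engine is off otherwise."""
    seconds = cycles / engine.clock_hz
    return seconds * engine.active_power_w


def pick_engine(cycles_by_engine, engines, deadline_s):
    """Return the name of the engine that meets the deadline with the
    least energy, or None if no engine is fast enough."""
    best_name, best_energy = None, float("inf")
    for engine in engines:
        cycles = cycles_by_engine[engine.name]
        if cycles / engine.clock_hz > deadline_s:
            continue  # misses the real-time deadline
        energy = energy_joules(engine, cycles)
        if energy < best_energy:
            best_name, best_energy = engine.name, energy
    return best_name


# Assumption: both engines run at the same clock and power, so the
# difference comes purely from the number of cycles each one needs.
RISC = Engine("risc", clock_hz=200e6, active_power_w=0.10)
DSP = Engine("dsp", clock_hz=200e6, active_power_w=0.10)
ENGINES = [RISC, DSP]

# Assumption: one audio frame costs 500,000 multiply-accumulates; the DSP
# retires one per cycle, while the RISC core spends four cycles on each.
MACS_PER_FRAME = 500_000
audio_decode = {"risc": 4 * MACS_PER_FRAME, "dsp": 1 * MACS_PER_FRAME}

# Assumption: interrupt-heavy user interface code is cheaper on the RISC core.
ui_event = {"risc": 20_000, "dsp": 60_000}

print(pick_engine(audio_decode, ENGINES, deadline_s=0.026))  # dsp
print(pick_engine(ui_event, ENGINES, deadline_s=0.010))      # risc
```

With these assumed figures the audio frame goes to the DSP and the user interface event goes to the RISC core. A real design would replace the made-up numbers with measured cycle counts and power data for the actual cores.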
The RISC core performs high-interrupt command-and-control functions, such as user interface and video display management, much more efficiently than a DSP. And it uses less power than a DSP for those operations, especially when they need not occur in real-time.

Hardware accelerators boost both performance and power efficiency, but at the cost of reduced flexibility and upgradability. Texas Instruments has added these accelerators to its Omap devices to address specific and dedicated acceleration tasks such as Java, security and video (discrete cosine transform, inverse DCT, motion estimation and pixel interpolation).

With multiple processing engines at the core of their systems, designers can satisfy consumer needs today and draw a road map to the wireless multimedia appliances of tomorrow.

Avner Goren works in the worldwide wireless-terminals business unit at Texas Instruments Inc. (Dallas).