ARM tips faster Mali GPU and interconnect for 2.5GHz Cortex-A15
Nov 10, 2010, by Eric Brown, from the LinuxDevices Archive
ARM announced a new Mali graphics processing unit (GPU) and related interconnect IP, designed to work with its recently announced 2.5GHz Cortex-A15 MPCore processor design. The Mali-T604 delivers up to five times the performance of current Mali GPUs, and the CoreLink 400 interconnect offers high-speed cache sharing between the Cortex-A15 and the Mali-T604, says the company.
ARM made the announcements today at its ARM Technology Conference in Santa Clara, Calif., where Linaro earlier announced its first release of Cortex-A optimized Linux source code.
The news comes two months after ARM announced the licensing availability of its next-generation Cortex-A15 MPCore design, aimed at mobile-ready processors that will run at up to 2.5GHz (see further below for more details). The new IP helps fill in the pieces needed to make the core framework for a full-featured Cortex-A15 system-on-chip (SoC).
Mali-T604
The Mali-T604 is claimed to offer five times the speed of ARM's current GPUs. This would appear to refer to its previous top-of-the-line multicore Mali-400 MP 2D/3D graphics processor, which is claimed to deliver up to 800M pixels/sec and 45M triangles/sec. (You can do the math, but until actual chips and benchmarks arrive, we'll wait and see.)
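For readers who do want to do the math, a minimal sketch follows. It assumes the "five times" claim is measured against the quoted Mali-400 MP peak figures and applies equally to pixel and triangle throughput; neither assumption is confirmed by ARM, and the variable names are illustrative only.

```python
# Peak figures claimed for the Mali-400 MP, as quoted in the article.
MALI_400_MP_PIXELS_PER_SEC = 800e6
MALI_400_MP_TRIANGLES_PER_SEC = 45e6

# "Up to five times" per ARM. Applying it uniformly to both metrics
# is an assumption of this sketch, not an ARM statement.
CLAIMED_SPEEDUP = 5


def implied_peak(baseline, speedup=CLAIMED_SPEEDUP):
    """Scale a baseline throughput figure by the claimed speedup."""
    return baseline * speedup


implied_pixels = implied_peak(MALI_400_MP_PIXELS_PER_SEC)
implied_triangles = implied_peak(MALI_400_MP_TRIANGLES_PER_SEC)

print(f"Implied pixel rate:    {implied_pixels / 1e9:.1f}G pixels/sec")
print(f"Implied triangle rate: {implied_triangles / 1e6:.0f}M triangles/sec")
```

Under those assumptions the implied ceiling would be roughly 4G pixels/sec and 225M triangles/sec, which is a best-case reading of a marketing claim and not a benchmark result.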
Mali-T604 architecture
Aimed primarily at mobile phones, tablets, DTVs, and automotive infotainment devices, the Mali-T604 supports "General Purpose computing on GPU" (GPGPU) applications, said by ARM to be useful for enhanced augmented reality and gesture recognition.
The fourth-generation Mali-T604 introduces a tri-pipe graphics architecture, and integrates "patented techniques" for reducing memory bandwidth consumption by up to 30 percent, claims the company. The GPU also extends API support to include full-profile Khronos OpenCL and Microsoft DirectX.
The Mali-T604 is said to be the first member of a new family of ARM GPUs based on the "Midgard" architecture. Midgard devices will share a common software driver, minimizing software upgrade costs for future implementations, says ARM.
The Mali-T604 IP is currently available for licensing by lead partners, says ARM. Samsung will be the first ARM partner to gain access to the Mali-T604, presumably for an updated version of its Hummingbird SoCs.
Stated Lance Howarth, EVP and general manager, Media Processing Division, ARM, "The tri-pipe architecture in the Mali-T604 provides both market leading compute functionality and high-performance graphics without compromise, enabling unequalled user experiences in energy-efficient consumer electronic devices."
Anti-Imagination?
The Mali-T604 announcement deals the final blow to the once-strong partner relationship between ARM and its fellow U.K. technology success Imagination Technologies, says a story today on Hexus.channel by Scott Bicheno. Imagination, which makes GPU IP such as its recent PowerVR SGX544, has long provided the 3D graphics oomph for ARM Cortex designs.
The relationship has been strained since ARM acquired Norwegian graphics IP company Falanx Microsystems in 2006, bringing the company the Mali GPU design, says the story. "Instantly it [ARM] became arguably Imagination's biggest competitor," writes Bicheno. "It should be remembered that a major shareholder and licensee of Imagination is now Intel."
CoreLink 400
ARM's new CoreLink 400 series interconnect is designed to sit between the Mali-T604 and the Cortex-A15 CPU, says ARM. The CoreLink 400 system IP "enables designers to resolve the critical issues of coherency, virtualization, latency and power management to ensure each processor is able to share memory resources and maximize overall system performance," says the company.
The key component in the CoreLink 400 series is the CoreLink CCI-400 Cache Coherent Interconnect, which enables the efficient sharing of the Cortex-A15 cache data with the Mali-T604 GPU, thereby maximizing throughput, says ARM.
The CCI-400 is touted for reducing software overhead associated with cache maintenance. In addition, it minimizes latency and power consumption by reducing off-chip memory transactions, claims the company.
Another CoreLink 400 component is a new NIC-400 Network Interconnect. The NIC-400 is claimed to be fully configurable and non-blocking, while offering lower latency and lower power consumption.
The NIC-400 supports Quality of Service (QoS) features, and introduces a "Virtual Networks" capability that is said to help increase bandwidth while minimizing CPU latency. Its "Thin Links" feature, meanwhile, reduces wiring congestion, says ARM.
A third CoreLink series technology is the CoreLink DMC-400 Dynamic Memory Controller, claimed to provide a high-performance, multi-channel interface for low-power DDR2 (LPDDR2) and DDR3 memory systems. Combined with the NIC-400 and CCI-400, the DMC-400 improves QoS via an enhanced priority-driven scheduler, claims ARM.
Finally, a CoreLink MMU-400 Memory Management Unit provides hardware-accelerated memory translation "for other virtualized masters," says ARM. The MMU-400 specifically targets virtualization functionality embedded in the Cortex-A15 processor, which is said to be the first ARM processor to feature full hardware-assisted virtualization.
Stated Michael Dimelow, marketing director, processor division, ARM, "We realise that building complex, many-core multimedia-rich compute sub-systems with the associated low latency, non-blocking memory sub-systems is challenging. The good news is that the new CoreLink 400 series products provide hardware assistance in just the right places to really improve consistency and portability."
Cortex-A15 background
The Cortex-A15 targets 32nm and 28nm fabrication processes at speeds of up to 2.5GHz. The processor design is touted as offering enhanced virtualization support, 1TB memory access, plus five times the performance of current smartphone processors — all with similar power consumption. The technology is designed for devices ranging from smartphones to servers, says the company.
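To put the 1TB figure in context, the short sketch below works out how many address bits are needed to reach a given amount of memory. It assumes binary units (1TB taken as 2^40 bytes); the 40-bit result is inferred from that figure and is not stated in the article, and the function name is illustrative only.

```python
GIB = 1 << 30  # 2**30 bytes
TIB = 1 << 40  # 2**40 bytes (binary reading of "1TB", an assumption)


def address_bits_for(capacity_bytes):
    """Smallest number of address bits that can index every byte."""
    return (capacity_bytes - 1).bit_length()


bits_for_4gb = address_bits_for(4 * GIB)  # reach of a 32-bit address
bits_for_1tb = address_bits_for(TIB)      # reach quoted for the Cortex-A15

print(f"4GB needs {bits_for_4gb} address bits")
print(f"1TB needs {bits_for_1tb} address bits")
print(f"1TB is {TIB // (4 * GIB)} times the 4GB reachable with 32 bits")
```

In other words, eight additional address bits beyond the 32 used by earlier ARM application processors would be enough to account for the quoted 1TB.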
Cortex-A15 block diagram
Up to eight instructions can be issued per cycle, and the processor takes less than 10 microseconds to move into standby or wake up again, claims ARM. Floating point and NEON instruction set performance for signal processing and multimedia have also been improved, says the company.
Compared to the Cortex-A9, the Cortex-A15 is said to add more efficient hardware support for operating system (OS) virtualization, soft-error recovery, larger memory addressability, and system coherency.
While the Cortex-A15 is capable of running at 2.5GHz, ARM usage profiles suggest that most manufacturers of smartphones and other mobile devices will want to clock the processor at between 1GHz and 1.5GHz in single- or dual-core configurations for an optimal performance/power consumption trade-off.
The optimal suggested clock rate jumps to between 1GHz and 2GHz for home entertainment devices, and from 1.5GHz to 2.5GHz for embedded "Home and Web 2.0" servers, as well as wireless basestations.
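The suggested ranges can be summarized as a small lookup, sketched below. The clock figures are transcribed from ARM's usage profiles as reported here; the category labels and function names are hypothetical and chosen only for this illustration.

```python
# Suggested Cortex-A15 clock ranges in GHz, as reported in the article.
# The dictionary keys are hypothetical labels, not ARM terminology.
SUGGESTED_CLOCK_GHZ = {
    "mobile": (1.0, 1.5),
    "home_entertainment": (1.0, 2.0),
    "server_or_basestation": (1.5, 2.5),
}


def within_suggested_range(device_class, clock_ghz):
    """True if clock_ghz falls inside the suggested range for the class."""
    low, high = SUGGESTED_CLOCK_GHZ[device_class]
    return low <= clock_ghz <= high


def classes_for_clock(clock_ghz):
    """List the device classes whose suggested range covers clock_ghz."""
    return sorted(
        name
        for name in SUGGESTED_CLOCK_GHZ
        if within_suggested_range(name, clock_ghz)
    )


print(classes_for_clock(1.5))
print(classes_for_clock(2.5))
```

The ranges overlap, so a 1.5GHz part would fit all three profiles, while only the server and basestation profile extends to the full 2.5GHz.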
Availability
The first Cortex-A15 devices are expected to arrive in 2012. The ARM CoreLink 400 series and Mali-T604 designs are available for licensing today to key partners, says ARM. More information on the Mali-T604 may be found here. More information on the CoreLink 400 may be found here.
All the new ARM systems IP is being shown at the ARM TechCon through Nov. 11 at the Santa Clara Convention Center.
The Hexus.channel story on the ARM/Imagination rift may be found here.
This article was originally published on LinuxDevices.com and has been donated to the open source community by QuinStreet Inc. Please visit LinuxToday.com for up-to-date news and articles about Linux and open source.