32-core processor module claimed to rival 128-core SoC

Jul 27, 2010, by Eric Brown, from the LinuxDevices Archive

NetLogic Microsystems announced a multi-core “solution” using its MIPS-based XLP processor architecture, and says it will soon introduce nine new XLP SoCs. Aimed at high-end networking applications, the Linux-ready XLP8128S integrates four eight-core XLP832 SoCs clocked at up to 2GHz and offers over 160 programmable processing engines, for up to 160Gbps throughput and 240 million packets-per-second (Mpps) processing, says NetLogic.

Whereas the NLX321103A announced earlier this month offers a single eight-core XLP832 system-on-chip (SoC), with 128 NL11k processing engines and 196 NETL7 Intelligent Fabric for Automata (IFA) engines, the XLP8128S "solution" combines four XLP832 processors with assorted engines for greater control plane processing capabilities, as well as significant dataplane processing support, says NetLogic.

The XLP8128S supports "intelligent application performance" for next-generation 3G/4G mobile wireless infrastructure, enterprise, storage, security, metro Ethernet, edge and core infrastructure network applications, says the company.

The XLP832's eight 64-bit MIPS EC4400 cores each combine four-way multithreading with a four-issue superscalar engine and out-of-order execution, so the SoC provides 32 highly independent threads (eight cores times four threads each). NetLogic calls these independently addressable threads "NXCPUs." Whereas the previously announced NLX321103A solution offers 32 NXCPUs, the four-chip XLP8128S solution supplies 128 NXCPUs, says the company.
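The thread counts quoted above follow from simple multiplication. The short sketch below reproduces them from the article's figures; the constant and function names are illustrative only and do not come from NetLogic.

```python
# Figures taken from the article; the names are hypothetical.
CORES_PER_SOC = 8        # EC4400 cores in one XLP832
THREADS_PER_CORE = 4     # four-way multithreading
SOCS_PER_SOLUTION = 4    # XLP832 chips in the XLP8128S


def hardware_threads(socs: int) -> int:
    """Return the number of hardware threads across `socs` XLP832 chips."""
    return socs * CORES_PER_SOC * THREADS_PER_CORE


single_chip = hardware_threads(1)
four_chip = hardware_threads(SOCS_PER_SOLUTION)
print(single_chip, four_chip)  # prints "32 128"
```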

XLP832 block diagram

The solution offers processing power equivalent to that of a 128-core MIPS64 SoC thanks to the new EC4400 cores, claims the company. This is because the operating system and application software see each NXCPU thread as another core, explained a NetLogic spokesperson in an email.
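The way an operating system counts hardware threads as CPUs can be observed on any multithreaded machine. The sketch below uses only Python's standard `os` module; `usable_cpus` is a hypothetical helper name, and nothing here has been run on XLP hardware.

```python
import os


def usable_cpus() -> int:
    """Count the logical CPUs this process is allowed to run on."""
    if hasattr(os, "sched_getaffinity"):   # available on Linux
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1             # fallback; cpu_count() may return None


# os.cpu_count() counts logical CPUs, so each hardware thread appears as a CPU.
print("logical CPUs reported by the OS:", os.cpu_count())
print("logical CPUs usable by this process:", usable_cpus())
```

If NetLogic's description holds, a Linux system running in SMP mode across all four chips would report 128 logical CPUs here.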

These claims appear to be confirmed, at least in theory, by a Microprocessor Report article published yesterday, which also says that NetLogic plans to sample nine new XLP processors this fall (see further below).

Interconnect makes four SoCs look like one

The XLP8128S solution is unique in that it offers scalability across all four SoCs, "with full cache and memory coherency," the NetLogic spokesperson further explained. The scalability is enabled via an Inter-chip Coherency Interface (ICI) that allows software applications to "seamlessly" run in Symmetric Multi Processing (SMP) or Asymmetric Multi Processing (AMP) modes, says the company.

The ICI helps to produce 240Mpps of intelligent application processing performance, making the XLP8128S the industry's highest-performance multi-core communications processor for intelligent Layer 4-7 network, services, and application processing, claims NetLogic.
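NetLogic does not say how the 240Mpps figure was derived, but it sits close to the line rate for minimum-size Ethernet frames at the quoted 160Gbps. The sketch below is a back-of-the-envelope check using standard Ethernet framing overheads; treating the figure as an Ethernet line-rate number is an assumption, and the function name is hypothetical.

```python
MIN_FRAME_BYTES = 64        # minimum Ethernet frame, including FCS
PREAMBLE_SFD_BYTES = 8      # preamble plus start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12   # minimum gap between frames, in byte times


def line_rate_pps(link_bps: float, frame_bytes: int = MIN_FRAME_BYTES) -> float:
    """Packets per second a link can carry at a given frame size."""
    wire_bits = (frame_bytes + PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES) * 8
    return link_bps / wire_bits


mpps = line_rate_pps(160e9) / 1e6
print(f"{mpps:.1f} Mpps")  # prints "238.1 Mpps"
```

Under these assumptions, 160Gbps of 64-byte frames works out to roughly 238Mpps, which is in line with the "240Mpps" NetLogic quotes.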

Building on the ICI interconnect, the XLP8128S solution features NetLogic's high-speed, low-latency Enhanced Fast Messaging Network, which streamlines communication among the solution's 128 NXCPUs, says the company. This feature is said to support billions of in-flight messages and packet descriptors between all on-chip elements.

As with the NLX321103A, the XLP8128S would appear to combine the XLP832 with NetLogic networking processors, such as its NETL7 processor. Yet, this time around, the company does not mention particular processing components.

The company does say that the solution offers over 160 fully-autonomous processing engines, up from 128 on the NLX321103A. These are said to offload network functions, including:

  • 160Gbps fully-autonomous security acceleration engine, supporting networking, wireless, and storage encryption/decryption/authentication protocols
  • 160Gbps network acceleration engines for ingress/egress packet parsing and management
  • 512Gbps RAID-5/RAID-6 acceleration
  • 40Gbps compression/decompression
  • Packet ordering
  • Storage de-duplication acceleration
  • TCP segmentation offload
  • IEEE 1588 hardware time stamping

The XLP8128S provides a tri-level cache architecture with over 50MB of fully coherent on-chip cache, says NetLogic. This is claimed to deliver 40 Terabits/sec (Tbps) of on-chip memory bandwidth.

In addition, the XLP8128S incorporates 16 channels of 72-bit DDR3 memory interconnects yielding an aggregate of over 1600Gbps of off-chip memory bandwidth, says the company. The XLP8128S is also said to integrate high-speed interfaces including Interlaken, XAUI, SGMII, PCI-Express, and USB 2.0.
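The off-chip bandwidth figure can be sanity-checked with simple arithmetic. A 72-bit DDR3 channel conventionally carries 64 data bits plus 8 ECC bits; the article does not give the memory speed, so the 1600 MT/s (DDR3-1600) rate below is an assumption chosen because it reproduces the "over 1600Gbps" claim. The function name is hypothetical.

```python
def ddr_bandwidth_gbps(channels: int, data_bits: int, mega_transfers: float) -> float:
    """Peak bandwidth in Gbps: channels x data bits per transfer x MT/s."""
    return channels * data_bits * mega_transfers * 1e6 / 1e9


# 16 channels, 64 data bits each (ECC bits excluded), assumed DDR3-1600.
total = ddr_bandwidth_gbps(channels=16, data_bits=64, mega_transfers=1600)
# Assuming the 16 channels are split evenly across the four chips.
per_chip = ddr_bandwidth_gbps(channels=4, data_bits=64, mega_transfers=1600)
print(f"aggregate: {total:.1f} Gbps, per chip: {per_chip:.1f} Gbps")
```

With these inputs the aggregate comes to about 1638Gbps, or roughly 410Gbps per XLP832.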

As with the NLX321103A, and the XLP SoCs in general, the XLP8128S is supported by a Linux-based software development kit (SDK), says NetLogic. The SDK contains reference and production-ready software components, says the company.

Microprocessor Report talks up XLP

As noted, NetLogic's claims that its four-SoC, 32-core solution is the equivalent of a 128-core processor appear to be theoretically confirmed by a Microprocessor Report feature on XLP that was published yesterday.

NetLogic is moving from 90nm to 40nm fabrication for XLP production next year, skipping the 65nm stage, says Microprocessor Report writer Tom Halfhill. (Note that this does not appear to match up precisely with NetLogic's estimate that the XLP8128S solution will ship in the third quarter of this year.)

This transition will push the high-end models to 2GHz, surpassing Cavium's 1.5GHz, MIPS-based Octeon II chips, which will soon be offered in recently announced 32-core CN68xx/67xx versions, says the report. It will also nearly match Freescale's new PowerPC-based 2.2GHz P5020 QorIQ processors, writes Halfhill. Per-clock performance, meanwhile, nearly matches that of Intel's Nehalem CPU, while running much cooler, he adds.

Thanks to its four-way multithreading and four-issue superscalar pipelines, the XLP cores enjoy instruction throughput advantages over those of Cavium, Freescale, and (for now) Intel at any clock speed, writes Halfhill. The report notes that Intel has never implemented more than two threads per CPU using its Hyper-Threading technology, although the upcoming "Larrabee" processors will boost that to four threads like the XLP.

"We believe the top-end members of the XLP and Octeon II families will offer similar performance, regardless of the difference in CPU count," writes Halfhill. "Even if they don't, NetLogic's glueless interchip interface lets customers link two to four chips together, delivering far more aggregate performance than Cavium can muster with a similar programming model."

NetLogic preps nine new XLP SoCs

According to the Microprocessor Report, NetLogic will start sampling nine new XLP processors this fall to join its current, top-of-the-line XLP832 SoC. These will include single-core XLP104, XLP204, and XLP304 processors, dual-core XLP208, XLP308, and XLP408 SoCs, plus the quad-core XLP316 and XLP416.

Meanwhile, the company will introduce a second eight-core SoC in addition to the XLP832, called the XLP432, writes Halfhill. This appears to be similar to the XLP832, but without the chip-to-chip interconnect that would enable a product like the XLP8128S solution.
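The lineup described above can be tabulated in a few lines. The sketch below uses the core counts given in the article and assumes every XLP model runs four threads per core, which the article states for XLP cores in general but not for each part individually. The observation that the last two digits of each model number equal its thread count is drawn only from these ten names and is not a naming rule confirmed by NetLogic.

```python
THREADS_PER_CORE = 4  # four-way multithreading, assumed for every model

# Core counts as listed in the article (via Microprocessor Report).
NEW_MODELS = {
    "XLP104": 1, "XLP204": 1, "XLP304": 1,
    "XLP208": 2, "XLP308": 2, "XLP408": 2,
    "XLP316": 4, "XLP416": 4,
    "XLP432": 8,
}
EXISTING_MODELS = {"XLP832": 8}

for model, cores in {**NEW_MODELS, **EXISTING_MODELS}.items():
    threads = cores * THREADS_PER_CORE
    suffix = int(model[-2:])  # last two digits of the model number
    print(f"{model}: {cores} core(s), {threads} threads, "
          f"suffix matches thread count: {suffix == threads}")
```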

Stated Behrooz Abdi, executive vice president and general manager at NetLogic Microsystems, "Our ability to scale to 128 NXCPUs with full cache and memory coherency to deliver 160Gbps throughput and over 240Mpps of application performance is unprecedented in the industry, and enables a new class of equipment for our customers."

Stated Linley Gwennap, principal analyst at The Linley Group, "No competing product even approaches this level of performance. The XLP processor offers numerous architectural advantages that enhance performance, scalability, and power efficiency."


The XLP8128S solution will sample in the third quarter of 2010, says NetLogic Microsystems, adding that it is currently "engaging with early adopters of this technology." More information on NetLogic's XLP processor, but not, so far, on the XLP8128S solution or the new XLP models, may be found here.

The Microprocessor Report article mentioned in this story may be found here, although the full report requires a subscription.

This article was originally published on LinuxDevices and has been donated to the open source community by QuinStreet Inc.
