One FPGA SoC to rule them all

June 19, 2015

Altera's latest release, the Stratix 10 FPGA, boasts double the performance of the previous-generation Stratix V and other impressive features.

Alongside the recent announcement of Intel Corporation’s $16.7 billion acquisition of Altera, a deal that demonstrates both Intel’s confidence in Altera’s innovative technology and its enthusiasm to throw its full weight behind pushing that technology throughout the embedded marketplace, comes an announcement that is exciting all on its own: the revolutionary Stratix 10 FPGA SoC.

The Stratix 10 joins the Generation 10 range alongside the Max 10, a 55 nm high-volume, low-cost pure FPGA, and the Arria 10, the sole 20 nm SoC FPGA in the embedded space. It uses the latest 14 nm Tri-Gate process technology. Altera’s Chris Balough describes it as the “King of FPGAs”, and it’s difficult to disagree. Altera bravely promises an average of double the performance of the previous-generation Stratix V; in simulations, raw core fabric performance has improved by more than a factor of five. Variable-precision DSP blocks with hard floating-point capability offer up to 10 teraflops of processing capability. The largest member of the family boasts 5.5 million logic elements in a monolithic fabric, achieved with the help of Intel’s Embedded Multi-Die Interconnect Bridge (EMIB) packaging innovation; that count is a factor of five more than even the closest monolithic competitor product. All of this is paired with a quad-core ARM Cortex-A53 processor, clocked at an impressive 1.5 GHz and, for the first time, using a 64-bit architecture.

Of course, whilst these figures are impressive, the embedded industry moved away long ago from purely focussing on processing performance. Real estate and power requirements, alongside (as always) unit cost, are just as critical for today’s design engineers. If the performance increase the Stratix 10 offers isn’t directly required, its gains in performance per watt and per square millimetre mean a smaller die package can be used to “achieve more in a smaller space, and more power efficiently”. For example, halving the bus width from 1024 to 512 bits while doubling the clock speed offers a power saving of between 40 and 70 percent overall. Data centre examples in which a single Stratix 10 replaces five Stratix V FPGAs have been demonstrated; though this is probably an extreme case, it demonstrates the concept simply.
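
To see why a narrower bus need not cost any bandwidth, here is a minimal sketch in Python. The bus widths come from the example above; the 250 MHz and 500 MHz clock figures are hypothetical, chosen only to show a doubling, and the function name is illustrative. The sketch covers raw throughput only and makes no attempt to model the quoted power saving.

```python
def throughput_gbps(bus_width_bits: int, clock_mhz: float) -> float:
    """Raw datapath throughput in Gbit/s: bits per cycle times cycles per second."""
    return bus_width_bits * clock_mhz / 1000.0


# Clock frequencies below are assumptions for illustration, not Altera figures.
wide_gbps = throughput_gbps(1024, 250.0)    # wide bus, slower clock
narrow_gbps = throughput_gbps(512, 500.0)   # half the width, double the clock

print(f"1024-bit bus at 250 MHz: {wide_gbps} Gbit/s")
print(f" 512-bit bus at 500 MHz: {narrow_gbps} Gbit/s")
```

Both configurations move the same number of bits per second, so the narrower datapath gives up no bandwidth while needing half as many wires and the logic that goes with them.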

Altera’s HyperFlex architecture elects not to increase performance by the “obvious” method of widening buses; this approach would demand higher power dissipation and larger die size. Instead, it addresses routing delay by adding “hyper-registers”. By placing hyper-registers in all routing segments (ten times as many as there are adaptive logic module (ALM) registers), the critical path delay is slashed by an average of 50 percent. Because maximum clock speed is inversely proportional to that critical path delay, clock speed effectively doubles, with less than 1 percent effect on die size and power consumption. Simulations of nearly 100 existing designs demonstrate at least a factor of two performance increase.
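
The relationship between path delay and clock speed can be sketched in a few lines of Python. This is a deliberately simplified model, not Altera’s timing analysis: the segment delays are hypothetical, register setup and clock-to-output times are ignored, and the function names are illustrative. A register placed part-way along a routing path splits it into shorter register-to-register hops, and the longest hop sets the maximum clock frequency.

```python
def critical_path_ns(segment_delays_ns, register_after):
    """Longest register-to-register delay along a chain of routing segments.

    register_after holds the indices of segments that are followed by a register.
    """
    longest = 0.0
    current = 0.0
    for index, delay in enumerate(segment_delays_ns):
        current += delay
        if index in register_after:
            longest = max(longest, current)
            current = 0.0
    return max(longest, current)


def fmax_mhz(path_ns):
    """Maximum clock frequency in MHz for a given critical path in ns."""
    return 1000.0 / path_ns


# Hypothetical path of four routing segments, 1 ns each.
segments = [1.0, 1.0, 1.0, 1.0]

unregistered_ns = critical_path_ns(segments, set())  # no register on the path
retimed_ns = critical_path_ns(segments, {1})         # register after the second segment

print(f"no register: {unregistered_ns} ns -> {fmax_mhz(unregistered_ns)} MHz")
print(f"one register: {retimed_ns} ns -> {fmax_mhz(retimed_ns)} MHz")
```

Halving the longest hop doubles the achievable clock frequency in this model. In a real design the extra register also adds a cycle of latency on that path, which the tools have to account for.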

Whilst historically a primary concern only for military applications, security is high on everyone’s agenda these days, and the Stratix 10 satisfies even the most stringent requirements. Intrinsic ID’s Physically Unclonable Function (PUF) technology allows each die to be identified individually, and this information is used to help prevent unauthorised access or cloning. Additionally, with both sector-based and multi-factor encryption, multiple keys can be employed for the first time.

No matter how good the FPGA hardware, without the supporting software suite it’s never a solution. This aspect is less sexy and thus often falls under the radar, but it is critically important. Altera recognises this to such a degree that it sanctioned a complete overhaul of the Quartus II suite to truly unlock the power of the Stratix 10, with a HyperAware design flow now optimised for HyperFlex and promising a fraction of the compile times, both of which accelerate time to market.

What many are wondering following the Intel acquisition is, of course, where this leaves the ARM core implemented within the Stratix 10. Intel recognises the importance of the ARM core, as demonstrated in its press release detailing that acquisition.

Rory Dear, European Editor/Technical Contributor