AgileX In-Depth: Intel's Attempt to Make FPGAs More Interesting and Accessible

Intel announced its next-generation 10nm AgileX FPGA, previously known by the Falcon Mesa codename, at its Data-Centric Innovation Day. AgileX comes brimming with next-gen tech, like support for PCIe 5.0, DDR5, HBM, Optane DC Persistent Memory DIMMs, and memory coherency with Xeons.

Altera was one of the first, and one of the few, high-profile customers that Intel announced for its Custom Foundry. Over time there was also talk that Altera would be the first to use Intel’s 2.5D EMIB (Embedded Multi-Die Interconnect Bridge) packaging technology, which Intel touts as a cheaper and superior alternative to the interposer for chiplet mix-and-match strategies.

Credit: Intel

Intel liked Altera so much that it acquired the company for $16.7 billion, closing the deal in late 2015. For background, Altera designed its previous-gen Stratix 10 FPGAs for Intel’s 14nm node, succeeding the 20nm Arria generation. Those eventually launched in the second half of 2017.

After the acquisition, the Altera design team was still working on Stratix 10, so Intel set up a second, parallel development team that began work on the 10nm generation. We will cover the whole breadth of technology that makes up this product, but it is worth noting that this is the first FPGA conceived and designed with Altera as an integrated part of Intel. Intel is eager to emphasize this point, as the integration allowed it to design this new family from the ground up, from architecture development and process/packaging co-optimization to I/O and software.

Intel has a very broad portfolio of data-centric IP, and FPGAs have followed a trend of integrating more and more of that IP. Deeper IP integration was also one of the messages Xilinx (Intel's primary FPGA competitor) put behind its 7nm Versal (formerly Project Everest) FPGA. Xilinx intends to launch its competing FPGA in 2019.

AgileX: FPGA for the Data-Centric World

As a broad overview, Intel focuses on three main goals for AgileX: high-performance compute capabilities, any-to-any integration, and software. In short, AgileX is based on the Hyperflex 2 architecture built on the 10nm process and delivers 20 TFLOPS of single-precision performance. Intel is also using EMIB to connect any chiplet tile and any process node to the FPGA. That allows the company to tune and highly personalize the FPGA to the needs of each customer. Finally, Intel wants to make FPGAs accessible to all software developers with its OneAPI programming interface.


Compute: Hyperflex 2

The 10nm AgileX FPGA also incorporates memory coherency with Xeon processors and includes several high-bandwidth memory and interconnect capabilities.

First and foremost, AgileX will provide a welcome boost in performance. Altera introduced the Hyperflex architecture in Stratix 10. The idea behind Hyperflex is to weave bypassable hyper-registers throughout the core fabric: in every routing segment, on all block inputs, and throughout the FPGA's interconnect. Hyperflex has more than 10x as many hyper-registers as ALM registers, which allows any path between logic cells to be registered.

This enables a new approach to timing, pipelining, and optimization that eliminates critical paths caused by routing delays, and it unlocks additional software optimization opportunities. This architecture, combined with the 14nm process, helped Stratix 10 achieve 4x higher logic density and a 2x increase in clock speed (or 70% lower power).
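To make the pipelining argument concrete, here is a minimal toy model of a single path through the fabric. It is only a sketch: the hop delays are made-up values, and the functions `fmax_mhz` and `split_path` are hypothetical helpers for illustration, not anything from Intel's tools. The model assumes the clock period is set by the slowest register-to-register segment, so enabling registers along a long route shortens that segment.

```python
# Toy timing model. All delays are hypothetical, illustrative values.

def fmax_mhz(segment_delays_ns):
    """Max clock frequency (MHz) when the slowest register-to-register
    segment sets the clock period."""
    return 1000.0 / max(segment_delays_ns)

def split_path(hop_delays_ns, register_after):
    """Split a chain of routing/logic hops into segments by enabling
    a register after each hop index listed in register_after."""
    segments, current = [], 0.0
    for i, delay in enumerate(hop_delays_ns):
        current += delay
        if i in register_after:
            segments.append(current)
            current = 0.0
    if current > 0.0:
        segments.append(current)
    return segments

hops = [0.5, 0.75, 0.25, 0.5]  # hypothetical hop delays in ns

unregistered = split_path(hops, register_after=set())
pipelined = split_path(hops, register_after={0, 1, 2})

print(fmax_mhz(unregistered))  # one 2.0 ns segment
print(fmax_mhz(pipelined))     # slowest segment is now 0.75 ns
```

In this toy case the fully pipelined path clocks much faster, at the cost of extra cycles of latency, which is the usual pipelining trade-off.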

Intel isn't releasing Hyperflex 2 architecture specifics but says that AgileX will be up to 40% faster than Stratix 10 or consume 40% less power. Intel has also doubled its DSP capabilities, providing 20 TFLOPS of single-precision or 40 TFLOPS of half-precision performance. Intel says AgileX is the only FPGA with hardened FP16 and Bfloat16 capabilities.


On the memory side, which Intel says is becoming more and more important, AgileX adds support for DDR5, HBM3, and Optane DC Persistent Memory DIMMs. This covers the entire memory spectrum, from low-latency devices to SSDs.

To expand the interconnect beyond the confines of the chip, AgileX is compatible with both PCIe 4.0 and PCIe 5.0, though PCIe 5.0 support will come at a later date. These faster interfaces double and quadruple the data rate of PCIe 3.0, respectively. Intel also looks to extend its transceiver leadership with 112G transceivers. Intel tells us that it is currently the only FPGA vendor shipping 58G PAM-4 transceivers, while Xilinx is still on 28G.
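The doubling and quadrupling follows from the per-lane transfer rates in the PCIe specifications: 8 GT/s for PCIe 3.0, 16 GT/s for PCIe 4.0, and 32 GT/s for PCIe 5.0, all with 128b/130b encoding. As a rough sketch, the arithmetic looks like this. The x16 link width is an assumption chosen for illustration, and the figures ignore packet and protocol overhead.

```python
# Per-lane raw transfer rates in GT/s, per the PCIe specifications.
GT_PER_S = {"PCIe 3.0": 8, "PCIe 4.0": 16, "PCIe 5.0": 32}
ENCODING_EFFICIENCY = 128 / 130  # 128b/130b line coding, used by all three

def bandwidth_gb_per_s(gen, lanes=16):
    """Approximate one-direction bandwidth in GB/s.
    Ignores packet and protocol overhead; x16 is an assumed link width."""
    return GT_PER_S[gen] * ENCODING_EFFICIENCY * lanes / 8

for gen in GT_PER_S:
    ratio = GT_PER_S[gen] / GT_PER_S["PCIe 3.0"]
    print(gen, round(bandwidth_gb_per_s(gen), 2), "GB/s,", ratio, "x PCIe 3.0")
```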


FPGA for AI

For AI, Intel touts its "AI Plus" as a key value proposition. This means an FPGA can also handle data ingest and preprocessing before it runs the actual neural network. Intel also tweaked the DSP blocks to enhance AgileX's suitability for AI workloads. Seeking to retain its leading floating-point performance, Intel added DSP hardware to achieve 40 TFLOPS of FP16 performance and added hardened Bfloat16 support, like the Cooper Lake, Ice Lake, and Nervana Spring Crest processors that are also coming to market this year. Intel says AgileX is the only FPGA with hardened FP16 and Bfloat16 support.
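For readers unfamiliar with the format, Bfloat16 keeps the sign bit and all 8 exponent bits of a 32-bit float but only 7 mantissa bits, so it is effectively the top half of an FP32 value. The sketch below shows that relationship in software. It uses simple truncation for clarity, which is an assumption on our part: real hardware may round to nearest instead, and the helper names are hypothetical.

```python
import struct

def float32_to_bfloat16_bits(x):
    """Return the 16-bit bfloat16 pattern for x by truncating its
    float32 encoding (the simplest conversion; no rounding)."""
    (bits32,) = struct.unpack(">I", struct.pack(">f", x))
    return bits32 >> 16

def bfloat16_bits_to_float(bits16):
    """Expand a bfloat16 bit pattern back to a Python float by
    zero-filling the 16 dropped mantissa bits."""
    (value,) = struct.unpack(">f", struct.pack(">I", bits16 << 16))
    return value

for x in (1.0, 3.14159, 1e30):
    bits = float32_to_bfloat16_bits(x)
    print(x, hex(bits), bfloat16_bits_to_float(bits))
```

The format keeps the full FP32 exponent range, so a large value like 1e30 still fits, while precision drops to roughly two or three decimal digits (3.14159 comes back as 3.140625).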


On the integer side of things, Intel added support for lower-precision formats from INT8 down to INT2 and said AgileX will deliver up to 92 TOPS of INT8 performance. All this is supported on the software side by Intel's OpenVINO toolkit and OneAPI.
