# Solution Brief

Test and Measurement Memory, HBM, HBM2e



# Intel Agilex® 7 FPGAs M-Series Overcome Memory Bandwidth Bottlenecks

Intel expands the Intel Agilex® 7 FPGA product offering with M-Series devices equipped with high fabric densities, in-package HBM2e memory, and DDR5 interfaces for high-memory bandwidth applications



### **Executive Summary**

Data traffic in networking and the data center is increasing rapidly. Interconnect speeds have outpaced memory bandwidth improvements, creating memory bottlenecks. Enhancements to memory bandwidth and compute efficiency are needed to alleviate these bottlenecks.

Intel Agilex® 7 FPGAs M-Series combine high fabric densities and flexible memory options such as HBM2e, LPDDR5, DDR5, and DDR4 memory support to provide the right balance of capacity, power efficiency, and performance for memory-driven workloads. These features make M-Series devices optimal for high-memory bandwidth applications such as cryptocurrency mining, next-generation network firewall, 800GbE testers, and 8K video processing.

#### **Network/Communications**

Networking applications such as Next-Gen Firewall (NGFW) require high-performance data paths, deep memory buffers, and high-bandwidth connectivity. These physical requirements, combined with increased software processing functions, have pushed non-FPGA-based deployments past their functional limits. M-Series devices satisfy these hardware requirements with a robust memory hierarchy consisting of DDR5 and HBM2e, a hardened memory network-on-chip (NoC) for high-speed data movement, and support for the latest Ethernet and PCIe connectivity standards. Programmable logic based on Intel 7 technology is available to offload compute-intensive workloads that traditionally run in software on the CPU. These functions include high-performance deep packet inspection, complex access control list processing, and transport layer security (TLS). FPGAs can also implement multi-gigabit ternary content-addressable memory (TCAM) for n-tuple access control lists and complex search functions.
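The ternary matching that a TCAM performs for access control lists can be illustrated in software. The sketch below is a minimal model under stated assumptions, not an M-Series implementation: the `TernaryRule` type, the `tcam_lookup` function, the 16-bit key layout, and the rule values are all hypothetical. A hardware TCAM compares every entry in parallel, whereas this model scans the rules in priority order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TernaryRule:
    """One hypothetical TCAM entry: a value, a care mask, and an action."""
    value: int   # bit pattern to compare against
    mask: int    # 1 = bit must match, 0 = don't care
    action: str


def tcam_lookup(rules, key, default="default"):
    """Return the action of the first (highest-priority) matching rule."""
    for rule in rules:
        # Only the bits selected by the mask take part in the comparison.
        if (key & rule.mask) == (rule.value & rule.mask):
            return rule.action
    return default


# Toy 16-bit key (assumed layout): upper 8 bits = source ID, lower 8 bits = port.
rules = [
    TernaryRule(value=0x0A50, mask=0xFFFF, action="deny"),   # exact match
    TernaryRule(value=0x0A00, mask=0xFF00, action="allow"),  # any port from source 0x0A
]

print(tcam_lookup(rules, 0x0A50))  # deny
print(tcam_lookup(rules, 0x0A16))  # allow
print(tcam_lookup(rules, 0x0B50))  # default
```

Rule order encodes priority here, which is why the exact-match entry is listed before the wildcard entry.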

Authors

Laura Leon, Product Marketing Engineer, Intel Corporation

M-Series devices can be deployed in both virtualized and non-virtualized NGFW networks. For a virtualized network, a commercial off-the-shelf server with a SmartNIC or infrastructure processing unit (IPU) with an M-Series device can replace physical appliance-based hardware. For a non-virtualized implementation, the NGFW may be implemented on a purpose-built card with the M-Series device. In both cases, the architectural benefits of the M-Series device apply.



**Figure 1.** Block diagram of a next-generation firewall implemented using M-Series devices

#### **Broadcast**

Currently, 4K UHD is the standard for television, cameras, and video streaming. 8K UHD is on the horizon, and broadcast infrastructure must expand with components and systems capable of supporting 8K resolutions. With a pixel resolution of 7680 x 4320, 8K represents a 4X increase in pixels compared to 4K, requiring 4X higher memory and data processing bandwidth. Applications such as high-end 8K cameras add further complexity, with metadata added to each pixel for high dynamic range (HDR), sensor data, and color correction, to name a few. Moving the large volume of image sensor data to the processing device requires high-bandwidth connectivity using protocols such as 100G Ethernet, 12G Serial Digital Interface (SDI), PCIe, DisplayPort, or HDMI, as does displaying or storing the processed data.
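The 4X figure follows directly from the pixel counts. The short sketch below works through that arithmetic and estimates an uncompressed stream rate. The frame rate (60 fps) and pixel depth (30 bits per pixel) are illustrative assumptions and do not come from this brief, and the function names are hypothetical.

```python
def pixels(width, height):
    """Number of pixels in one frame."""
    return width * height


def raw_video_gbps(width, height, fps, bits_per_pixel):
    """Uncompressed video rate in Gb/s (no blanking, metadata, or overhead)."""
    return pixels(width, height) * fps * bits_per_pixel / 1e9


uhd_4k = pixels(3840, 2160)   # 8,294,400 pixels
uhd_8k = pixels(7680, 4320)   # 33,177,600 pixels
print(uhd_8k / uhd_4k)        # 4.0

# Assumed parameters for illustration only: 60 fps, 30 bits per pixel.
print(round(raw_video_gbps(3840, 2160, 60, 30), 2))  # about 14.93 Gb/s
print(round(raw_video_gbps(7680, 4320, 60, 30), 2))  # about 59.72 Gb/s
```

Any per-pixel metadata would add to these raw figures, and each processing stage that writes and re-reads a frame multiplies the memory traffic accordingly.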

M-Series devices are capable of 1TB/s of memory bandwidth with DDR5 and HBM2e for 8K data processing, with high-bandwidth data movement handled by the integrated transceivers. Intel Agilex 7 FPGAs also bring reconfigurability to modern broadcast applications, allowing manufacturers to rapidly add new features and evolve to new standards, speeding time to market and enabling products to stay in the market longer.



**Figure 2.** Block diagram of an 8K video application using M-Series devices

#### **Test and Measurement**

Ethernet speeds are growing exponentially, driving up the need for Ethernet testing. While 400GbE is still rolling out, 800GbE is already under development. Customers are demanding testers capable of supporting a variety of speeds and protocols. A high-performance 800GbE tester requires:

- High-bandwidth memory support for the high volume of data feeding into the tester
- High-performance fabric to process at clock rates up to 800 MHz
- Sufficient transceivers to implement multiple 400G or 800G port configurations

M-Series devices meet all three performance markers, with support for external DDR4, DDR5, and LPDDR5 memories for packet buffering, fabric capable of implementing 800GbE with a 1,024-bit datapath at ~800 MHz instead of 2,048 bits at ~400 MHz, and up to 72 transceivers to customize any combination of 400/800G port configurations. Additionally, the narrower, faster datapath reduces device utilization, which lowers dynamic power, and the high clock rate may allow the design to fit into a smaller device. M-Series devices enable 800GbE testing, which in turn will accelerate the design and development of next-generation devices.
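The trade-off between datapath width and clock rate can be checked with simple arithmetic: raw throughput is the bus width multiplied by the clock frequency. The sketch below uses the nominal figures quoted above (the brief gives them as approximate) and ignores protocol overhead. The function name is hypothetical.

```python
def datapath_gbps(width_bits, clock_mhz):
    """Raw datapath throughput in Gb/s: bits per cycle times cycles per second."""
    return width_bits * clock_mhz / 1000


narrow_fast = datapath_gbps(1024, 800)   # 1,024 bits at 800 MHz
wide_slow = datapath_gbps(2048, 400)     # 2,048 bits at 400 MHz

print(narrow_fast)  # 819.2
print(wide_slow)    # 819.2

# Both options clear the 800 Gb/s line rate, but the narrow option
# uses half as many datapath bits in the fabric.
print(narrow_fast >= 800 and wide_slow >= 800)  # True
```

Because both configurations deliver the same raw throughput, halving the width is what frees logic and routing resources.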



**Figure 3.** Block diagram of an 800GbE tester

#### **Cloud**

High-memory bandwidth is a requirement for common artificial intelligence (AI), network processing, data analytics, and cryptocurrency mining applications. Taking a deeper dive into cryptocurrency mining, many cryptocurrency algorithms use a memory-hard loop to achieve ASIC resistance and increase decentralization. Memory-hard algorithms usually use a loop, where memory contents from one location are hashed with those from another many times. This loop is essentially a memory bandwidth test. Engineers designing with FPGAs to mine these coins need the absolute highest usable memory bandwidth with enough capacity to meet the requirements of the algorithm being mined. This memory can run to several GB in size, too big to fit in FPGA embedded memory blocks. M-Series devices meet these needs with the in-package HBM2e memory.
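The memory-hard loop described above can be sketched as a toy function. This is an illustrative model only and does not correspond to any specific coin's algorithm: the function name, the use of SHA-256, and the buffer and round sizes are all assumptions chosen to keep the example small.

```python
import hashlib


def toy_memory_hard_hash(seed: bytes, num_blocks: int = 1024, rounds: int = 4096) -> bytes:
    """Toy memory-hard function: fill a buffer, then mix with data-dependent reads."""
    # Fill phase: each 32-byte block is the hash of the previous one.
    blocks = []
    state = hashlib.sha256(seed).digest()
    for _ in range(num_blocks):
        blocks.append(state)
        state = hashlib.sha256(state).digest()

    # Mix phase: the next read address depends on the current state, so the
    # whole buffer must stay resident and every round costs a memory access.
    for _ in range(rounds):
        index = int.from_bytes(state[:4], "little") % num_blocks
        state = hashlib.sha256(state + blocks[index]).digest()
    return state


digest = toy_memory_hard_hash(b"example seed")
print(digest.hex())
```

In a real algorithm the buffer runs to gigabytes rather than kilobytes, so the unpredictable reads in the mix phase are limited by memory bandwidth rather than by hashing logic.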

In cryptocurrency mining, the costs associated with electricity and removing waste heat can be significant, so high power efficiency translates directly to profitability. The in-package HBM2e memory of M-Series devices provides massive bandwidth with lower power consumption compared to discrete solutions. For many memory-hard algorithms, logic resources in the FPGA are not the constraint. Mining efficiency can be increased by loading non-memory-hard mining algorithms into the free FPGA fabric while the primary miner runs with the high-bandwidth memory (HBM2e) and a smaller amount of logic resources, switching the secondary algorithms in and out as profitability changes.



**Figure 4.** Memory-hard algorithm implemented using the HBM2e, while running a secondary algorithm on the FPGA fabric

#### **Conclusion**

The amount of data being generated has skyrocketed, creating a demand for devices capable of handling the deluge. Data transfer speeds are improving at a much faster rate than memory bandwidth, creating memory bottlenecks and leaving applications requiring more memory than is available. Memory bandwidth and compute efficiency need to be upgraded to avoid these bottlenecks.

M-Series devices offer a variety of unique features that address the increased memory bandwidth and compute efficiency demands. M-Series devices support high-performance memory protocols including HBM2e, LPDDR5, DDR5, and DDR4 memory, enabling an extensive memory hierarchy to address various system requirements. A hard memory network-on-chip ensures efficient memory transactions between the HBM2e memory and the logic fabric. As the first FPGAs built on the Intel 7 process, M-Series devices have high logic densities and over 2X higher fabric performance per watt compared to competing 7 nm FPGAs. These features, combined with support for PCIe 5.0, 116G transceivers, and hardened floating-point digital signal processing (DSP) in an HBM2e-enabled FPGA, make M-Series devices the optimal solution for high-memory bandwidth applications.

#### **Learn more**

- Read the Intel Agilex 7 FPGA M-Series White Paper
- Explore Intel Agilex 7 FPGA M-Series



Intel technologies may require enabled hardware, software or service activation.

No product or component can be absolutely secure.

Your costs and results may vary.

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.

© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.