Hardware emulation and FPGA-based prototyping, born in the mid-1980s from the pioneering application of nascent Field-Programmable Gate Arrays (FPGAs) to prototype pre-silicon designs, emerged as novel verification tools, disrupting the dominance of software-based simulation.
Initially, hardware emulation growth was constrained by complex implementations requiring specialized expertise. However, persistent engineering innovation aimed at easing bring-up and debug, encompassing both software and architectural refinements, propelled the technology forward. From the mid-1990s to the mid-2000s, hardware emulation matured into a critical validation method for complex processor and graphics designs. This era saw the rise of in-circuit emulation (ICE), which deploys the design-under-test (DUT) within its intended physical target system, a methodology closely related to FPGA prototyping.
The convergence of hardware emulation and FPGA prototyping into hardware-assisted verification (HAV) platforms marked the next phase. HAV leveraged emulation for fast bring-up of large-scale designs and comprehensive debugging of synchronous systems, where a step in one block triggers coordinated progression across the entire design. Meanwhile, FPGA prototyping prioritized execution speed, enabling complex asynchronous operations that closely mirror the final design. This approach enhanced performance but required a longer bring-up period. At the same time, advancements in partitioning tools enabled the development of large-scale prototypes, surpassing the billion-gate threshold.
Although HAV experienced a period of relative stagnation until the mid-2010s, a dramatic surge in adoption beginning around 2018, reflected in rapid growth in total revenues, signaled a renaissance. HAV is by far the largest component of the overall verification revenue shown below.
Chart 1: EDA Verification Revenue Growth (Source: ESDA)
The primary catalyst for the growth of HAV lies in the escalating complexity of both the hardware and software components of modern systems-on-chip (SoCs). As hardware designs grow in size, the volume of cycles necessary for comprehensive verification swells proportionally. Simultaneously, the increasing density of software compels more testing to ensure functionality and performance. By plotting design complexity on the x-axis and software complexity, expressed as the number of verification cycles, on the y-axis, the resulting area represents the total verification effort. This area has been expanding rapidly in both dimensions, underscoring the growing demand for HAV solutions.
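To make the "area" intuition concrete, here is a minimal back-of-the-envelope sketch in Python; the gate count and cycle count below are illustrative assumptions, not measured figures.

```python
# The "area" model of verification effort: hardware complexity (x-axis)
# multiplied by software-driven verification cycles (y-axis).
# Both figures below are hypothetical, for illustration only.

design_gates = 10e9          # x-axis: a 10-billion-gate SoC (assumed)
verification_cycles = 1e15   # y-axis: cycles of real workloads to run (assumed)

effort = design_gates * verification_cycles
print(f"Total verification effort: {effort:.1e} gate-cycles")  # 1.0e+25

# Doubling either axis doubles the area; doubling both quadruples it,
# which is why total verification effort has been expanding so rapidly.
```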
Another, equally important, catalyst for growth lies in the fundamental nature of new AI hardware. Unlike traditional Central Processing Units (CPUs), which excel at executing software programs, AI accelerators are based on specialized architectures designed for the massively parallel computing necessary to process machine learning algorithms.
The essential difference between CPUs and specialized AI accelerators stems from their distinct design philosophies and target workloads. CPUs, the workhorses of general-purpose computing, are optimized for sequential instruction processing. They are particularly adept at executing software applications where tasks are mainly performed step-by-step. While modern CPUs incorporate multiple cores for parallel execution, their inherent architecture is geared towards serial operations, limiting their ability to scale for modern AI. Leading-edge server CPUs, such as AMD's EPYC series, exemplify this design, reaching core counts of up to 192, still a constraint when faced with the needs of AI.
In contrast, AI accelerators, including Graphics Processing Units (GPUs), Tensor Processing Units (TPUs), and custom Application-Specific Integrated Circuits (ASICs), are architected for the execution of AI algorithms, such as Large Language Models (LLMs), that demand the simultaneous processing of hundreds of billions of parameters. Their core strength lies in their ability to perform numerous calculations simultaneously, particularly matrix multiplications and convolutions, which are the building blocks of neural network computations. These operations involve processing vast amounts of data in parallel, a task where CPUs struggle.
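The contrast can be illustrated with a small NumPy sketch: the same matrix multiplication computed element by element, CPU style, and then as a single data-parallel operation of the kind accelerators spread across thousands of cores. Sizes and timings are illustrative only.

```python
import time
import numpy as np

# Matrix multiplication, the building block of neural network computation.
n = 512
A = np.random.rand(n, n).astype(np.float32)
B = np.random.rand(n, n).astype(np.float32)

# Sequential, CPU-style evaluation: one output element at a time.
t0 = time.perf_counter()
C_serial = np.empty((n, n), dtype=np.float32)
for i in range(n):
    for j in range(n):
        C_serial[i, j] = np.dot(A[i, :], B[:, j])
serial_s = time.perf_counter() - t0

# The same computation expressed as one data-parallel operation, the form
# GPUs, TPUs, and AI ASICs execute across thousands of cores at once.
t0 = time.perf_counter()
C_parallel = A @ B
parallel_s = time.perf_counter() - t0

assert np.allclose(C_serial, C_parallel, rtol=1e-3)
print(f"element by element: {serial_s:.3f}s, single parallel op: {parallel_s:.4f}s")
```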
NVIDIA's best-in-class Blackwell GB202 GPU, for instance, showcases this paradigm shift, boasting 24,576 compute cores. However, the true power of AI accelerators emerges from their ability to be integrated into large-scale systems. Unlike CPUs, where scaling is limited by software and architecture, AI accelerators can be interconnected using high-bandwidth technologies, enabling them to function as a unified processing powerhouse. NVIDIA's GB200 NVL72 integrates 72 Blackwell GPUs and 36 Grace CPUs, effectively pushing the total number of AI cores within a single system to well over a million. This massive parallelism is essential for running the most complex AI models, where the ability to process vast datasets and perform intricate calculations concurrently is paramount.
Despite the architectural differences, both CPUs and AI accelerators necessitate rigorous pre-silicon validation through software workload execution. This process involves running industry benchmark programs, end-user applications, and, for AI accelerators, LLMs. Additionally, coherency testing among the cores broadens the scope of validation. In the case of RISC-V, verifying ISA extensions further adds to the complexity of CPU pre-silicon validation. Given the need for multiple iterative executions of software workloads per day, comprehensive testing calls for quadrillions of verification cycles.
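A rough calculation shows why quadrillions of cycles are out of reach for simulation alone. The effective speeds below are order-of-magnitude assumptions for each engine class, not vendor specifications, and real campaigns are split across many parallel workloads and machines.

```python
# Wall-clock time to execute a quadrillion-cycle validation campaign on
# different verification engines. Speeds are assumed orders of magnitude.

SECONDS_PER_DAY = 86_400
total_cycles = 1e15  # "quadrillions of verification cycles"

engine_speed_hz = {
    "software simulation": 1e2,  # ~hundreds of cycles/s on a full SoC (assumed)
    "hardware emulation":  1e6,  # ~1 MHz effective throughput (assumed)
    "FPGA prototyping":    1e7,  # ~10 MHz effective throughput (assumed)
}

for engine, hz in engine_speed_hz.items():
    days = total_cycles / hz / SECONDS_PER_DAY
    print(f"{engine:>20}: {days:,.0f} days")

# Simulation lands at roughly 300,000 years; only the hardware engines,
# run in parallel across workloads and machines, make the campaign feasible.
```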
A unique challenge in validating AI processors concerns the dual operational modes of AI models, namely, training and inference. Training requires processing enormous amounts of data to tune the network parameters embedded within multi-layer deep neural networks. This computationally intensive phase, which can run for weeks or even months, demands throughput on the order of many PFLOP/s and memory bandwidth reaching TB/s. Inference, while less demanding in raw computation due to smaller datasets, prioritizes low latency to deliver real-time results within stringent energy and cost constraints.
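The weight of the training phase can be sized with the widely cited rule of thumb that training takes roughly 6 FLOPs per parameter per token. The model size, token count, and sustained cluster throughput below are assumptions chosen purely for illustration.

```python
# Back-of-the-envelope training vs. inference compute, using the common
# approximation: training FLOPs ~= 6 * parameters * training tokens.
# All inputs are illustrative assumptions.

params = 70e9             # a 70B-parameter LLM (assumed)
tokens = 2e12             # 2 trillion training tokens (assumed)
train_flops = 6 * params * tokens               # ~8.4e23 FLOPs

sustained_pflops = 100                          # sustained cluster PFLOP/s (assumed)
seconds = train_flops / (sustained_pflops * 1e15)
print(f"Training: {train_flops:.1e} FLOPs, ~{seconds / 86_400:.0f} days")  # ~97 days

# Inference needs only ~2 FLOPs per parameter per generated token, but
# each token must arrive within a tight latency and energy budget.
print(f"Inference: {2 * params:.1e} FLOPs per token")
```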
The scale of these operations is staggering. Validating AI hardware accelerators and complex compute clusters, where coherency is critical, often necessitates quadrillions of verification cycles to ensure performance, energy efficiency, accuracy, and reliability. This sheer volume of testing far exceeds the capabilities of software-driven simulation tools, which struggle to keep pace with the complexity and parallelism of AI workloads.
Figure 2: Factors driving the quadrillion cycle challenge
Sources: AWS, Synopsys, Baya Systems
The complexity of these AI models, coupled with the sheer scale of the workloads, demands the deployment of HAV platforms.
Hardware emulators and FPGA prototypes provide a cycle-accurate, high-speed environment for executing these workloads, allowing engineers to identify and rectify design flaws before committing to silicon.
The reasons for HAV's unrelenting growth are clear: the AI revolution has created a verification bottleneck that software simulation alone cannot address. The requirement for speed, capacity, and scalability in verifying complex AI hardware via real-world workloads has shifted the roles played by software simulation and HAV platforms. While software simulation remains the essential choice for IP, block, and sub-system design verification, indispensable for its unparalleled debug capabilities, HAV platforms have emerged as a cornerstone for full-system and multi-die/chiplet, i.e., system-of-systems (SoS), validation, enabling pre-silicon validation at scale.
The growth of HAV adoption will continue to be shaped by the key trends that have driven its prominence. First, the ever-growing complexity of AI models and the emergence of novel architectures, particularly those featuring rapidly evolving memory and communication interfaces, will necessitate ever more powerful and scalable verification tools. Second, the rising emphasis on energy efficiency and security in AI systems will drive the deployment of HAV platforms for precise peak and average power analysis, as well as for comprehensive security validation within real-world workloads. Third, the integration of AI into a broad range of applications, from autonomous vehicles to healthcare, will further underscore the need for domain-specific verification methodologies.
To address these evolving demands, HAV platforms must embrace several key innovations. AI-powered analysis tools will enhance bug detection and root cause analysis, significantly improving verification efficiency. Cloud-based emulation and prototyping will provide scalability and accessibility, enabling more adaptable design workflows. Standardized verification methodologies and benchmarks will facilitate industry collaboration and interoperability. Given the diverse verification setups across computing, AI training, edge AI, automotive, consumer electronics, and wired/wireless applications, modular verification will be critical for managing mega-designs surpassing 60 billion gates and complex system-of-systems configurations. Furthermore, the convergence of emulation and prototyping will create a more seamless and integrated verification flow.
In conclusion, the renaissance of hardware-assisted verification platforms is a direct consequence of the AI revolution and the increasing complexity and cost of software development. The unique demands of AI hardware accelerators, characterized by massive parallelism, high computational workloads, and stringent performance and power requirements, have elevated hardware-assisted verification (HAV) to a pivotal role in the design process.
As AI continues to advance and permeate every aspect of modern technology, the role of HAV in ensuring performance, power efficiency, accuracy, reliability, and security will only become more critical. By embracing innovation and adapting to the evolving landscape of AI, HAV platforms will remain indispensable in the development of next-generation hardware.