
With the launch of the Big Navi graphics cards, AMD has finally returned to the high-end GPU space with a bang. While the RDNA 2 design is largely similar to RDNA 1 in terms of the compute and graphics pipelines, there are some changes that have allowed the inclusion of the Infinity Cache and the high boost clocks. In this post, we'll talk about how the Navi GPUs are different from the traditional Vega and Polaris parts, which were powered by the GCN architecture.
AMD's GCN architecture powered Radeon graphics cards for almost a decade. Although the design had its strengths, such as a powerful Compute Engine, hardware schedulers, and unified memory, it wasn't very efficient for gaming. Hardware utilization was quite poor compared to contemporary NVIDIA parts, scaling dropped sharply after the first 11 CUs per shader engine, and overall, using more than 64 CUs per GPU wasn't feasible. As a result, despite featuring a powerful compute architecture, AMD's GCN GPUs (Vega) repeatedly lost to NVIDIA's high-end gaming products, all the while drawing significantly higher power.
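As a rough sketch of where that 64-CU ceiling comes from, here is some back-of-the-envelope arithmetic using the Radeon RX Vega 64 as the example part. The four-shader-engine layout and 64-ALU Compute Unit are the commonly published GCN figures, but the worked example itself is an illustration added here rather than something from the original text.

```python
# Back-of-the-envelope arithmetic for the GCN scaling ceiling described above.
# Assumptions (illustrative, not from this article): four shader engines per
# GCN GPU and 64 stream processors per Compute Unit; the Radeon RX Vega 64 is
# used as the example part.

SHADER_ENGINES = 4      # GCN topped out at four shader engines per GPU
CUS_PER_ENGINE = 16     # Vega 64 packs 16 CUs into each engine -- well past the
                        # ~11-CU point where per-engine scaling reportedly tails off
SPS_PER_CU = 64         # stream processors (ALUs) per GCN Compute Unit

total_cus = SHADER_ENGINES * CUS_PER_ENGINE   # 64 -> the practical GCN limit
total_sps = total_cus * SPS_PER_CU            # 4096, as on the RX Vega 64

print(f"Compute Units:     {total_cus}")
print(f"Stream processors: {total_sps}")
```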
RDNA is the GPU architecture, and Navi is the codename of the graphics processors built using it. Similarly, GCN was the architecture, while Vega and Polaris are codenames. The 1st Gen RDNA architecture powering the Navi 10 and Navi 14 GPUs (Radeon RX 5500 XT, 5600 XT, and 5700/XT) is based on the same building blocks as GCN: a vector processor with a few dedicated scalar units for address calculation and control flow, and separate compute and graphics pipelines running asynchronously. ALUs called stream processors provide the computational power, while the Command Processor (along with the ACEs) handles workload scheduling across the Compute Units. The core difference is that RDNA reorganizes the fundamental components of GCN for higher IPC, lower latency, and better efficiency. That's what Navi is all about: it does a lot more with notably less hardware!

AMD GCN: Powerful but Underutilized

AMD's GCN graphics architecture packed 64 ALUs (stream processors) per Compute Unit and worked on 64-item wavefronts. These ALUs were divided into four SIMDs (Single Instruction, Multiple Data), each packing 16 ALUs (SPs). It's true that the scheduler could only issue a new wave group every four cycles, but at any given time each Compute Unit was also working on four 64-item waves, not just one. Like Bulldozer, the aim was to maximize parallelization. At the same time, GCN wasn't an out-of-order architecture: the instructions within a wavefront were still executed in order. The difference was that the CU's SIMDs could switch to any of the four available waves.

The reason this wasn't very effective is that most games use shorter work queues, so only one or two out of the four wavefronts were saturated per execution cycle. As a result, competing NVIDIA GPUs with similar shader counts were much faster thanks to their super-scalar architecture, taking only one to two cycles to execute these shorter dispatches. The AMD counterparts, on the other hand, had to wait four cycles for the next issue despite having room for additional wavefronts.

Each vector unit can perform the same instruction on multiple data sets, and the vector scheduling works on the assumption that there will always be an instruction to execute across multiple items. If only one or two data sets are available, the rest of the slots sit idle for that cycle.
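To put numbers on that underutilization, here is a toy Python model of a single GCN Compute Unit under the simplified view used in this section: four SIMD16 vector units, 64-item wavefronts, one wave issuing per SIMD, and no memory stalls, scalar work, or co-issue. It is an illustration of the lane-occupancy argument, not AMD's actual scheduler.

```python
# Toy model of one GCN Compute Unit: four 16-lane SIMDs executing 64-item
# wavefronts. Each wavefront is pinned to one SIMD and is pushed through
# 16 work-items per cycle, so one instruction for a full wave takes 4 cycles.
# This deliberately ignores memory latency, the scalar unit, and co-issue.

SIMDS_PER_CU = 4       # four SIMD16 vector units per Compute Unit
LANES_PER_SIMD = 16    # 16 ALUs (stream processors) per SIMD
WAVE_SIZE = 64         # GCN wavefront width in work-items
CYCLES_PER_WAVE = WAVE_SIZE // LANES_PER_SIMD   # = 4 cycles per instruction


def cu_utilization(resident_waves: int) -> float:
    """Fraction of ALU slots doing useful work over one 4-cycle issue window."""
    waves = min(resident_waves, SIMDS_PER_CU)          # at most one wave per SIMD here
    busy_slots = waves * WAVE_SIZE                     # lane-cycles actually used
    total_slots = SIMDS_PER_CU * LANES_PER_SIMD * CYCLES_PER_WAVE
    return busy_slots / total_slots


for waves in range(1, SIMDS_PER_CU + 1):
    print(f"{waves} wavefront(s) resident -> {cu_utilization(waves):.0%} ALU utilization")
# 1 -> 25%, 2 -> 50%, 3 -> 75%, 4 -> 100%
```

With only one or two waves resident, the model sits at 25-50% ALU utilization, which is the gap the summary below describes.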
To sum up, like many other SIMD designs, a GCN Compute Unit worked on four wavefronts at a time and took four cycles to execute them. In an ideal world, this would mean that the effective time taken for one wave is just one cycle. But because of the way SIMD works, this wasn't the case, and the CUs were often left underutilized.

AMD RDNA: Dual Compute Architecture and Wave32