The GPU is, without a doubt, one of the most important and most interesting components currently used to give “life” to a computer. The acronym stands for “graphics processing unit”, three words that indicate, simply and clearly, that it is the element that handles the graphics tasks of the system.
However, it is important to take into account that the concept of GPU that we handle today is not exactly the same as that which prevailed, for example, in the nineties, and in turn, the concept of that time does not fit with the one used, for example, in the eighties. This has an explanation, and it is that the architecture of the graphic processing unit, and its way of working, have been changing to adapt to the evolution of the sector.
In the 1970s and 1980s, 3D graphics were so prohibitive in terms of performance that almost all graphics adapters of the time were limited to 2D. Techniques such as “blitting” and the drawing of “sprites” were combined with different effects to give shape to works that, in some cases, reached significant complexity. Remember, for example, the wonders that the Neo Geo could create despite being limited to 2D.
GPU evolution: from sprite to triangle
In the days of 2D graphics, aspects like sprite size, color depth, and resolution were among the most important when it came to differentiating the performance that a particular GPU could offer. 2D acceleration boomed in the early 1990s, but it was the leap to 3D that really marked a huge transition in the concept of the GPU.
I’m sure our older readers will remember what the adoption of 3D graphics meant. The concept of the GPU acquired a new meaning: we went from talking about “sprites” to focusing on triangles per second. It was also important to assess other aspects, such as texturing, lighting, texture filtering, Z-buffering, and the amount of memory available, but the most important thing was that jump from the “sprite” to the triangle.
3D graphics have always used predominantly polygonal shapes, but the triangle has been one of the most popular, so much so that for a long time, the raw performance of a GPU was measured, in large part, by the total number of triangles per second it was capable of handling. Voodoo 2, from 3DFX, was capable of moving 3 million triangles per second filtered, mapped, Z-buffered, alpha blended, fog-enabled, and textured. If we talk about flat triangles, the figure was much higher.
From there, the GPU embarked on an unstoppable evolution that left us with wonderful moments for any technology lover. Among the most interesting and most important, we can highlight the debut of the NVIDIA GeForce 256, which marked the arrival of the first GPU with hardware transformation and lighting support; the release of the GeForce 3 series, which featured programmable shaders; the introduction of the ATI Radeon 9700, which greatly improved pixel shading; and the GeForce 8000 series, which introduced unified shading engines, popularly known as shaders, and marked the transition from separate pixel and vertex shader units to a single pool of general-purpose shaders, alongside support for geometry shaders.
If we look at the present, the latest great evolution of the GPU has been driven by specialization, marked by an important advance in the coexistence of traditional rasterization with artificial intelligence and ray tracing.
With the Volta architecture, NVIDIA introduced a whole new element to the GPU: tensor cores, which are dedicated to accelerating tasks associated with artificial intelligence. With Turing came RT cores, specialized in ray tracing. AMD has taken a similar approach with the Radeon RX 6000 series (RDNA 2 architecture) but has only implemented specialized ray tracing hardware.
This small historical review has been necessary to have a starting point that allows us to understand, in a simple way, the point in which we are currently, and the enormous advances that we have achieved in the world of the GPU. Both NVIDIA and AMD have significant achievements behind them, and both have played key roles in the industry that have allowed us to reach a point that a few years ago, would have been unthinkable.
What is a GPU and why is it important? A deeper look
We can define the GPU as a specialized processor that carries out everything related to the graphics workload that a system must process. Unlike the CPU, which performs general-purpose tasks, the GPU specializes in the tasks needed to create graphic elements, both in 2D and 3D, and is capable of performing a very high number of floating-point operations per second.
A processor or CPU lacks that level of specialization and cannot handle graphics workloads with anything close to the same efficiency. The workload that a GPU must face is made up of very important tasks, which range from the most basic visual representation, such as the desktop of your PC, to the execution of advanced 3D graphics in next-generation games, by way of video decoding and acceleration in different formats, color processing, and post-processing functions associated with images, videos, and other multimedia content.
As we can see, the role of the GPU is very important, and although its way of working presents similarities when we compare it with the CPU, in the end, we must be clear that we are dealing with a very different component. We are going to dwell on this topic for a moment, as it will help us dig deeper into the GPU and the way it works.
A processor performs general-purpose operations. It does this by using its different cores, storing the instructions and data it needs in its cache, and saving resolved operations to RAM, allowing it to re-access those results when needed without having to reprocess them. A high-performance processor today can have eight cores.
A GPU adopts a similar foundation. It receives data and instructions from the CPU, and it has its own memory to store certain elements (geometry, textures, shaders, etc.), which it can access when needed without having to process them again. Its work, however, is focused on graphics tasks, and to face them it uses a huge degree of parallelism.
As we said, a current high-performance CPU can have eight cores. Those cores allow it to run eight different processes in parallel. A GPU such as the RTX 3080, by contrast, has 8,704 shaders, or CUDA cores. In other words, it has thousands of small cores, which makes it possible to parallelize large and complex workloads and carry them out more efficiently.
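To put numbers on that comparison, here is a minimal sketch in Python. It assumes a deliberately idealized model: one work item per pixel of a 1080p frame, and every core processing exactly one item per pass. The function name `passes_needed` and the choice of a 1080p frame are ours, purely for illustration. Real CPU cores are far faster individually than a single shader, so this shows only how the work divides, not how long each chip would take.

```python
import math

def passes_needed(work_items: int, cores: int) -> int:
    """Idealized number of lockstep passes needed when each core
    handles exactly one work item per pass (an assumption)."""
    return math.ceil(work_items / cores)

# One work item per pixel of a 1920x1080 frame (illustrative choice).
pixels = 1920 * 1080

cpu_passes = passes_needed(pixels, 8)      # eight CPU cores
gpu_passes = passes_needed(pixels, 8704)   # RTX 3080 shader count

print(cpu_passes)  # 259200
print(gpu_passes)  # 239
```

The same frame that takes hundreds of thousands of passes across eight cores fits into a few hundred passes across thousands of small ones, which is why workloads made of many independent, identical operations suit the GPU so well.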
This simple comparison helps us understand why a GPU has such high potential in floating-point operations, and why it is able to offer a high level of performance in sectors that go beyond graphic design, rendering, and gaming, such as scientific research, inference, deep learning, and data analysis. These are sectors whose workloads, by their very nature, are solved much better when they are parallelized across thousands of small cores.
In summary, a GPU is a graphics processing unit with a high capacity for parallelization, capable of processing graphics and of converting information and data into elements visible to the user, but also of carrying out tasks that require performing a large number of operations in parallel. A GPU is much more than a semiconductor for creating graphics: it is silicon with enormous potential that can deliver a great deal in different sectors, as we have seen.
The GPU is the graphics engine of any computer. Without it, we couldn’t do something as simple as representing the Windows 10 desktop, for example, and we couldn’t run 3D games or enjoy high-resolution multimedia content either. However, it is important to keep in mind that a GPU only develops its full potential in applications and tasks that are prepared to take advantage of GPU acceleration, that is, to parallelize at very high levels.
What elements make up a GPU?
Going into a detailed description down to the millimeter would take us a long time, and would make this article too long and complicated, which I obviously do not want. For this reason, I am going to adopt a similar approach to the one we saw in our special dedicated to explaining what a processor is, why it is important, and what elements make it up.
We already know what a GPU is, and we’re clear about why it’s important, so we’re ready to look at its most important parts. As I said at the beginning of the article, the structure of the GPU has changed enormously over time. In fact, relatively recently we experienced another very important evolution, motivated by the introduction of specialized hardware for ray tracing and artificial intelligence.
In spite of everything, there are a series of common elements that both NVIDIA GPUs and AMD GPUs use, and that are essential for a GPU to develop its work. We are going to focus on these so that you have a complete and realistic vision of the “ins and outs” without the need to go into unnecessary complexities.
- SM units or Compute Units: NVIDIA uses the former, AMD the latter. Each unit integrates a set of shader engines, texture units, raster units, and, depending on the architecture, specialized cores for ray tracing and artificial intelligence. They are the base of any GPU.
- Shader engines: Also known as shaders. They carry out the workload associated with tasks as important as transforming the geometry and computing color, shading, and other effects (lighting, fog, reflections, etc.). With the arrival of Turing, the architecture used in the RTX 20 series, NVIDIA added support for mesh shaders (part of DirectX 12 Ultimate), and AMD did the same with RDNA 2.
- Raster units: better known as ROPs. They carry out the final steps needed so that an image, whose geometry was originally expressed in vector form, ends up as a perfectly ordered set of pixels written to the frame buffer, from which they will be transmitted to the screen. They also carry out important operations associated, for example, with blending and with anti-aliasing techniques (MSAA, for example).
- Texturing units: We also know them as TMUs. These are in charge of applying texture mapping to the geometry, that is, of “dressing” or putting “skin” on the polygons that make up the different elements of the game.
- RT cores: Dedicated to speeding up ray tracing workloads. These cores were introduced in 2018, and focus on BVH traversal, ray-triangle intersections, and bounding-box intersections.
- Tensor cores: These cores are also relatively new. They are found only on NVIDIA graphics cards based on the Volta, Turing, and Ampere architectures. They specialize in workloads focused on inference, artificial intelligence, and deep learning. In gaming, they handle the workload of applying DLSS 2.0.
- Memory bus: The memory bus determines the width of the path available for the GPU to communicate with the graphics memory. At the same memory speed, a wider bus is better because it increases the bandwidth, but it is more expensive, and the final bandwidth also depends heavily on the speed of the graphics memory used.
- Graphics memory: It is not integrated into the GPU, but it plays such an important role that we wanted to include it. Its function is, in essence, very similar to that of RAM with respect to the CPU. The GPU uses the graphics memory to save certain graphic elements and data that it has already processed, so that it can refer to them when needed without having to process them again, that is, without having to repeat work cycles. Textures, geometry, shaders, and other elements are stored in graphics memory. An insufficient amount of graphics memory can limit the performance of a GPU, and its working frequency also plays a role, since slower graphics memory offers lower access (read and write) speeds.
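The relationship between bus width and memory speed described in the last two items can be sketched with the usual formula for peak theoretical bandwidth: the bus width in bytes multiplied by the per-pin data rate. The function name `memory_bandwidth_gb_s` is hypothetical. The RTX 3080 figures (320-bit bus, 19 Gbps memory) are taken from public specifications rather than from this article, and the other two configurations are made up for illustration.

```python
def memory_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak theoretical bandwidth in GB/s: bytes moved per transfer
    (bus width / 8) times the per-pin data rate in Gbps."""
    return bus_width_bits / 8 * data_rate_gbps

# Assumed public specs for the RTX 3080: 320-bit bus, 19 Gbps memory.
rtx_3080 = memory_bandwidth_gb_s(320, 19.0)

# Two hypothetical configurations: a wider bus with slower memory
# versus a narrower bus with faster memory.
wide_slow = memory_bandwidth_gb_s(256, 8.0)
narrow_fast = memory_bandwidth_gb_s(192, 14.0)

print(rtx_3080)     # 760.0
print(wide_slow)    # 256.0
print(narrow_fast)  # 336.0
```

Note that the narrower bus comes out ahead in the second comparison: bus width only tells half the story, and it always has to be read together with the speed of the memory attached to it.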
Other things to keep in mind
The GPU is a graphics processing unit that has, as we have seen, highly specialized hardware, which allows it to offer very high performance when working with any type of graphic element (geometry, textures, shading, and even ray tracing). However, its performance and efficiency vary depending on the architecture and manufacturing process it uses.
This means that a GPU with a greater number of shaders is not necessarily better than one with fewer. For example, the GTX 780 Ti has a whopping 2,880 shaders, but it uses an outdated architecture (Kepler, released in 2012) and is built on the 28 nm process, so it performs worse than a GTX 1060 (based on Pascal, released in 2016), which has 1,280 shaders, and it is also less efficient than the latter, which is manufactured on the 16 nm process.
With the launch of each new architecture, both NVIDIA and AMD have introduced significant performance improvements that have not always been linked to an increase in the maximum number of shaders, although it is true that, with the arrival of Ampere, things have changed notably. There is an explanation, however: NVIDIA has doubled the shading engines, as we explained at the time in a dedicated article.
On the other hand, we must not forget that specialized hardware is having an increasing impact. A very simple example makes this clear: the GTX 1080 Ti is more powerful than the RTX 2060 in raw terms, but the latter manages to outperform it in certain scenarios, thanks to the fact that it has RT cores and tensor cores. In Death Stranding, the RTX 2060 offers a result very similar to that of the GTX 1080 Ti when running the game in 4K, thanks to DLSS 2.0, and the GTX 1080 Ti is unable to run ray-traced games smoothly, which the RTX 2060 can do.
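A rough sketch shows why shader count alone is misleading. It uses the common rule of thumb for peak FP32 throughput: shaders times two operations per clock (one fused multiply-add) times the clock speed. The boost clocks and TDP figures below are assumptions taken from public reference specifications, not from this article, and the `GpuSpec` class is purely illustrative.

```python
from dataclasses import dataclass

@dataclass
class GpuSpec:
    name: str
    shaders: int
    boost_mhz: int   # assumed reference boost clock
    tdp_watts: int   # assumed reference TDP

    def peak_tflops(self) -> float:
        # Two FP32 operations per shader per clock (one fused multiply-add).
        return self.shaders * 2 * self.boost_mhz / 1e6

    def gflops_per_watt(self) -> float:
        return self.peak_tflops() * 1000 / self.tdp_watts

gtx_780_ti = GpuSpec("GTX 780 Ti", 2880, 928, 250)
gtx_1060 = GpuSpec("GTX 1060", 1280, 1708, 120)

for gpu in (gtx_780_ti, gtx_1060):
    print(f"{gpu.name}: {gpu.peak_tflops():.2f} TFLOPS, "
          f"{gpu.gflops_per_watt():.1f} GFLOPS/W")
```

Under these assumptions, the much higher clock of the GTX 1060 closes most of the gap that the shader count suggests, and its throughput per watt is clearly higher. The GTX 780 Ti still comes out ahead on paper, though, so the rest of the real-world difference has to come from architectural improvements that a formula like this cannot capture.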
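The reason DLSS helps so much at 4K comes down to simple arithmetic: the GPU renders far fewer pixels internally and the tensor cores reconstruct the final image. The sketch below assumes a per-axis scale of 0.5, which corresponds to what NVIDIA calls Performance mode. The article does not state which mode was used in the Death Stranding comparison, so treat that value, and the helper name `internal_resolution`, as illustrative assumptions.

```python
def internal_resolution(width: int, height: int, axis_scale: float) -> tuple[int, int]:
    """Resolution actually rendered before upscaling, given a per-axis scale."""
    return round(width * axis_scale), round(height * axis_scale)

output_w, output_h = 3840, 2160  # 4K output
axis_scale = 0.5                 # assumed: DLSS Performance mode

render_w, render_h = internal_resolution(output_w, output_h, axis_scale)

native_pixels = output_w * output_h
rendered_pixels = render_w * render_h
reduction = native_pixels / rendered_pixels

print(render_w, render_h)  # 1920 1080
print(reduction)           # 4.0
```

With this setting the shaders only have to produce a quarter of the pixels of a native 4K frame, which goes a long way toward explaining how a card with less raw power can keep up.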
The GPU maintains its essence, but as we have been able to see in this article, its evolution has been so great that its specialization has become even more marked, and it has extended to unthinkable levels. It will be interesting to see what the future holds and to find out, above all, if we are finally able to make the leap to an MCM-type design.