It takes only a cursory glance at the increasing graphical fidelity offered by games to see the advances in GPU computing power over the past decade. To truly appreciate just how vast this improvement has been, however, one needs to look beyond visuals and examine the raw computing ability offered by GPUs.
To illustrate this, consider the following example: in 2002, the Radeon 9700 Pro offered 31.2 GFLOPS; five years later, the Radeon HD 2900 XT offered 473.6 GFLOPS; and by 2012, the Radeon HD 7970 GHz Edition was capable of 4301 GFLOPS, an increase of roughly 13,700% over the 9700 Pro.
Though this change can reasonably be attributed to Moore’s Law and the continued decline of the $:GFLOP ratio, it is important to note that CPUs have not kept pace with the exponential growth of GPUs’ computing power. Between 2002’s Pentium 4 “Northwood” processor and 2012’s Core i7-3970X processor, computing ability rose by “just” 2600% or so, from 12.24 GFLOPS to 336 GFLOPS.
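The growth figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only the GFLOPS numbers given in the text and measures both increases from the 2002 part to the 2012 part; the function and constant names are illustrative and not taken from AMD.

```python
def percent_increase(old: float, new: float) -> float:
    """Return the percentage increase going from old to new."""
    return (new - old) / old * 100.0


# Peak GFLOPS figures as quoted in the article.
GPU_2002 = 31.2    # Radeon 9700 Pro
GPU_2012 = 4301.0  # Radeon HD 7970 GHz Edition
CPU_2002 = 12.24   # Pentium 4 "Northwood"
CPU_2012 = 336.0   # Core i7-3970X

gpu_growth = percent_increase(GPU_2002, GPU_2012)
cpu_growth = percent_increase(CPU_2002, CPU_2012)

print(f"GPU, 2002 to 2012: {gpu_growth:,.0f}% increase")
print(f"CPU, 2002 to 2012: {cpu_growth:,.0f}% increase")
```

Both results land close to the rounded percentages in the text, with the GPU figure measured against the Radeon 9700 Pro.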
The significance of these comparisons is twofold: firstly, current-generation GPUs boast well in excess of 10 times the computing ability of current-generation CPUs, and secondly, the vast majority of applications and computing tasks do not take advantage of the processing power offered by GPUs.
AMD aims to address this with its heterogeneous Uniform Memory Access (hUMA) technology, which builds upon HSA, the “intelligent computing architecture” utilized in the company’s APUs that “enables CPU, GPU and other processors to work in harmony on a single piece of silicon by seamlessly moving the right tasks to the best suited processing element”.
The hardware coherency provided by hUMA brings three key features to the table.
- Coherent Memory: Ensures that CPU and GPU caches both see an up-to-date view of the data
- Pageable Memory: Allows the GPU to seamlessly access virtual memory addresses that are not (yet) present in physical memory
- Entire Memory Space: Both CPU and GPU can access and allocate any location in the system’s virtual memory space.
AMD demonstrates the technology's functionality with the following example. Without hUMA, the CPU must first explicitly copy data to GPU memory, the GPU completes the computation, and then the CPU must explicitly copy the result back to CPU memory in order for it to be read. With hUMA, the CPU can simply pass a pointer to the GPU, which completes the computation and produces a result that the CPU can read directly without any copying required.
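The difference between the two flows can be sketched in ordinary Python. This is only an analogy that runs entirely on the CPU: `gpu_kernel`, `run_without_huma` and `run_with_huma` are hypothetical names invented for illustration, not part of any AMD or HSA API, and a separate `array` object stands in for dedicated GPU memory.

```python
from array import array


def gpu_kernel(buf):
    """Stand-in for a GPU computation: doubles every element in place."""
    for i in range(len(buf)):
        buf[i] *= 2.0


def run_without_huma(host_data):
    """Separate memories: explicit copy in, compute, explicit copy out."""
    copies = 0
    device_buf = array("d", host_data)  # copy 1: CPU memory -> "GPU memory"
    copies += 1
    gpu_kernel(device_buf)
    result = array("d", device_buf)     # copy 2: "GPU memory" -> CPU memory
    copies += 1
    return result, copies


def run_with_huma(host_data):
    """Shared memory: hand over a reference and read the result in place."""
    gpu_kernel(host_data)               # no copy, same buffer on both sides
    return host_data, 0


if __name__ == "__main__":
    data = array("d", [1.0, 2.0, 3.0])
    result, copies = run_without_huma(data)
    print("without hUMA:", list(result), "copies:", copies)

    shared = array("d", [1.0, 2.0, 3.0])
    result, copies = run_with_huma(shared)
    print("with hUMA:   ", list(result), "copies:", copies)
```

In the first flow the caller gets back a new buffer after two full copies; in the second the result is the very buffer that was passed in, which is the pointer-passing behaviour described above.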
In addition to the "Top 10 Reasons" noted in the above slide, AMD cites an additional six benefits that its hUMA technology brings to both developers and consumers:
- Ease and simplicity of programming through single, standard computing environments
- Support for mainstream programming languages including Python, C++ and Java
- Lower development costs as the more efficient architecture enables fewer people to do the same work
- Better experiences through "radically different user experiences"
- More performance from the same form factor
- Longer battery life without sacrificing performance
Through both the HSA Foundation and partnerships with companies such as ARM, Qualcomm, Samsung and Texas Instruments, AMD can already claim broad support from major industry players. The company is evidently confident enough in its HSA / hUMA technology that it boldly predicts that HSA-based devices have the potential to constitute two-thirds of the 2.1 billion connected devices it expects to see by 2016.
Further information on this technology will be revealed during APU '13, AMD's Developer Summit, which takes place in San Jose between November 11 and November 14, 2013, and will offer "14 different tracks with over 140 individual presentations".