Nvidia GeForce GT 1030 2GB Review

Once upon a time, Nvidia’s GeForce GT 730 was the lead-off hitter for our list of Best Gaming Graphics Cards. It sold for right around $70 and offered 384 CUDA cores. As AMD’s Radeon R7 300s started dropping below $100, though, their superior performance made a slightly higher price tag worth paying. For a while there, our least expensive recommendation started somewhere between $90 and $100.

Now, after more than a year of dominance at the high end with its Pascal-based 10-series cards, Nvidia is ready to challenge AMD’s entry-level position with GeForce GT 1030.

Gigabyte sent over its GeForce GT 1030 Low Profile 2G to represent Nvidia’s latest addition. The card ships with a full-sized slot bracket in place, but it includes a half-height bracket for slim enclosures as well. Although our sample is actively cooled, Gigabyte also sells a passive model sporting the same clock rates. Low-profile and passively cooled? Yup.

GeForce GT 1030’s TDP is a mere 30W, so we can already guess that power consumption, thermals, and acoustics will be some of this board’s advantages over the competition. But can it keep up in our benchmark suite? After all, that’s what determines whether the GT 1030 succeeds GT 730 in our list of gaming graphics cards.

Meet GP108

GeForce GT 1030 utilizes an all-new graphics processor called GP108, composed of 1.8 billion transistors. It’s a teeny thing at just 70mm², thanks to the same 14nm FinFET process used to manufacture GP107. Compare that to GeForce GT 730’s GK208 chip with 1.02 billion transistors in an 84mm² die. Or how about the GeForce GTX 750 Ti, which we’re making the GT 1030 battle in today’s benchmarks? That card’s GM107 GPU has a transistor count similar to GP108’s, but in a 148mm² die, owing to its 28nm manufacturing process.
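To put those die sizes in perspective, here is a quick sketch of average transistor density for the two chips whose counts are quoted above. It assumes nothing beyond the numbers in the text; the function and dictionary names are ours, and an average like this ignores how much of each die is logic versus cache or I/O.

```python
# Transistor counts and die areas exactly as quoted in the text above.
CHIPS = {
    "GP108 (GT 1030)": {"transistors": 1.80e9, "area_mm2": 70},
    "GK208 (GT 730)": {"transistors": 1.02e9, "area_mm2": 84},
}


def density_millions_per_mm2(transistors, area_mm2):
    """Average density in millions of transistors per square millimeter."""
    return transistors / area_mm2 / 1e6


DENSITY = {
    name: density_millions_per_mm2(**spec) for name, spec in CHIPS.items()
}

for name, value in DENSITY.items():
    print(f"{name}: {value:.1f} M transistors/mm^2")

# How many times denser the 14nm part is than the 28nm part.
DENSITY_RATIO = DENSITY["GP108 (GT 1030)"] / DENSITY["GK208 (GT 730)"]
print(f"GP108 is about {DENSITY_RATIO:.1f}x as dense as GK208")
```

By this rough measure, GP108 packs a little more than twice as many transistors into each square millimeter as GK208 does.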

Here’s the thing, though: whereas GeForce GTX 750 Ti employs five Streaming Multiprocessors, GT 1030 comes equipped with three. Given 128 CUDA cores per SM/SMM in the Pascal and Maxwell architectures, that’s 384 cores for GT 1030 and 640 for GTX 750 Ti. Both designs also expose eight texture units per SM, totaling 24 on GeForce GT 1030, while GTX 750 Ti gets 40. The two GPUs each feature a pair of ROP partitions, good for up to 16 32-bit integer pixels per clock. However, those partitions are aligned with 256KB slices of L2 cache on GP108 and 1MB slices of L2 on GM107. That means GeForce GT 1030 includes 512KB of L2 in total, a big reduction from GTX 750 Ti’s 2MB. And whereas GeForce GTX 750 Ti utilizes two 64-bit memory controllers, GT 1030’s specs break the memory bus down into a pair of 32-bit controllers, adding up to a 64-bit interface. That’s a lot of lost resources for a ~4% difference in transistor count.
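The totals in that paragraph all fall out of a few per-unit figures. As a minimal sketch, assuming only the numbers quoted above (the `GpuLayout` class and its field names are our own illustration, not anything from Nvidia), the bookkeeping looks like this:

```python
from dataclasses import dataclass

# Per-SM resources shared by Pascal and Maxwell, as quoted in the text.
CORES_PER_SM = 128
TEXTURE_UNITS_PER_SM = 8


@dataclass(frozen=True)
class GpuLayout:
    """Hypothetical container for the figures quoted in the review."""
    name: str
    sm_count: int
    l2_slices: int
    l2_slice_kb: int
    memory_controllers: int
    controller_width_bits: int

    @property
    def cuda_cores(self):
        return self.sm_count * CORES_PER_SM

    @property
    def texture_units(self):
        return self.sm_count * TEXTURE_UNITS_PER_SM

    @property
    def l2_total_kb(self):
        return self.l2_slices * self.l2_slice_kb

    @property
    def bus_width_bits(self):
        return self.memory_controllers * self.controller_width_bits


GP108 = GpuLayout("GP108", sm_count=3, l2_slices=2, l2_slice_kb=256,
                  memory_controllers=2, controller_width_bits=32)
GM107 = GpuLayout("GM107", sm_count=5, l2_slices=2, l2_slice_kb=1024,
                  memory_controllers=2, controller_width_bits=64)

for gpu in (GP108, GM107):
    print(f"{gpu.name}: {gpu.cuda_cores} cores, {gpu.texture_units} texture "
          f"units, {gpu.l2_total_kb}KB L2, {gpu.bus_width_bits}-bit bus")
```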

Nvidia goes a long way toward overcoming those deficits in GT 1030 with higher clock rates. Our sample employs a 1227 MHz base frequency and a typical GPU Boost rating of 1468 MHz. In contrast, GeForce GTX 750 Ti starts at 1020 MHz and boosts just slightly to 1085 MHz. Of course, a 64-bit aggregate memory bus limits GT 1030’s peak bandwidth to 48 GB/s using 6 Gb/s GDDR5; GTX 750 Ti’s wider interface facilitates up to 86.4 GB/s.
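Peak memory bandwidth is simply bus width multiplied by the per-pin data rate, divided by eight to convert bits to bytes. The sketch below applies that rule to the figures above. Note that the text never states GTX 750 Ti’s data rate, so the script backs it out of the quoted 86.4 GB/s and a 128-bit bus (two 64-bit controllers) rather than assuming it; the function names are ours.

```python
def peak_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Theoretical peak bandwidth in GB/s for a given bus and per-pin rate."""
    return bus_width_bits * data_rate_gbps / 8


def implied_data_rate_gbps(bandwidth_gbs, bus_width_bits):
    """Per-pin data rate (Gb/s) implied by a quoted bandwidth and bus width."""
    return bandwidth_gbs * 8 / bus_width_bits


# GeForce GT 1030: 64-bit bus with 6 Gb/s GDDR5, both quoted in the text.
GT1030_BANDWIDTH = peak_bandwidth_gbs(64, 6)

# GeForce GTX 750 Ti: only the 86.4 GB/s result and the 128-bit bus are
# given, so the data rate here is inferred, not quoted.
GTX750TI_DATA_RATE = implied_data_rate_gbps(86.4, 128)

print(f"GT 1030 peak bandwidth: {GT1030_BANDWIDTH:.1f} GB/s")
print(f"GTX 750 Ti implied data rate: {GTX750TI_DATA_RATE:.1f} Gb/s")
```

GT 1030 comes out at the quoted 48 GB/s, and GTX 750 Ti’s figure implies memory running at about 5.4 Gb/s.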

GeForce GT 730: The Real Competition

But remember that GTX 750 Ti launched as a $150 card, and even now sells for $100 and up. We’re only fascinated by the comparison to GT 1030 because of their similar transistor counts. In reality, GeForce GT 1030 is the spiritual successor of GT 730 since Nvidia never created a 900-series product below the $160 GTX 950. Spec-wise, the Kepler-class GeForce GT 730 is a much closer match with 384 CUDA cores and 16 texture units across two SMXes, eight ROPs, 512KB of L2 cache, and a 64-bit interface. Also, GT 730’s 38W TDP lands a lot closer to GT 1030’s 30W specification. GeForce GTX 750 Ti is a 60W card, and we haven’t seen evidence that it can be passively cooled in a low-profile form factor.

So why is GP108 so much more complex a GPU than GK208? For starters, the two chips come from different architectures, so they’re organized differently.

GP108 utilizes a single Graphics Processing Cluster with three Streaming Multiprocessors. Again, each SM includes 128 CUDA cores, eight texture units, 48KB of L1/texture cache, and 96KB of shared address space. Meanwhile, GK208 employs two larger SMXes, each with 192 CUDA cores, eight functional texture units, 64KB of shared memory and L1 cache, and a separate texture cache. GP108 also sports 16 ROP units to GK208’s eight.

In the end, GP108 offers a much higher pixel fill rate than GK208 (19.8 GP/s vs. 7.2 GP/s). Its texture rate is much greater, too (29.8 GT/s vs. 14.4 GT/s). Further, Nvidia says that the work it did to enable Pascal’s aggressive clock rates and proper asynchronous compute support via dynamic load balancing added to the transistor budget. GeForce GT 1030 uses a complete GP108 processor—there are no disabled resources waiting to be switched on. It’s just a much denser GPU than GK208.
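Fill-rate figures like these typically follow a back-of-the-envelope rule: the number of units multiplied by the clock rate. The sketch below assumes that rule and plugs in the ROP and texture-unit counts and the GT 1030 clocks mentioned earlier in this review. The text doesn’t say which clock its published numbers assume, so both base and boost are shown, and GK208’s clock is backed out of the quoted rates rather than assumed. The helper names are ours.

```python
def rate_giga_per_s(units, clock_mhz):
    """Theoretical rate in giga-operations/s: units x clock (MHz) / 1000."""
    return units * clock_mhz / 1000


def implied_clock_mhz(rate_giga, units):
    """Clock (MHz) implied by a quoted rate and a unit count."""
    return rate_giga * 1000 / units


# GP108: 16 ROPs and 24 texture units, at the sample's base and boost clocks.
GT1030_RATES = {
    label: {
        "pixel_gps": rate_giga_per_s(16, mhz),
        "texture_gts": rate_giga_per_s(24, mhz),
    }
    for label, mhz in (("base", 1227), ("boost", 1468))
}

# GK208: eight ROPs and 16 texture units; clock inferred from quoted rates.
GK208_CLOCK_FROM_PIXEL = implied_clock_mhz(7.2, 8)
GK208_CLOCK_FROM_TEXTURE = implied_clock_mhz(14.4, 16)

for label, rates in GT1030_RATES.items():
    print(f"GT 1030 @ {label}: {rates['pixel_gps']:.1f} GP/s, "
          f"{rates['texture_gts']:.1f} GT/s")
print(f"GK208 implied clock: {GK208_CLOCK_FROM_PIXEL:.0f} MHz")
```

At the base clock, the estimate works out to roughly 19.6 GP/s and 29.4 GT/s, close to the figures quoted above; at the typical boost clock, it climbs to roughly 23.5 GP/s and 35.2 GT/s. Both of GK208’s quoted rates imply a clock of about 900 MHz.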

The competition from AMD lands somewhere between GM107 and GK208. Its Radeon RX 550 is a little more expensive (~$85) and slightly more power-hungry (50W). We’ve seen low-profile and “single-slot” versions, but not both. Nothing with passive cooling, either. On the other hand, you get 512 Stream processors, 32 texture units, and 16 ROPs in a 2.2 billion-transistor Polaris 12 GPU. That translates to a pixel fill rate of 17.6 GP/s and a texturing rate of 35.2 GT/s. Faster 7 Gb/s GDDR5 modules on a wider 128-bit memory bus give AMD a 133% theoretical bandwidth advantage, too.
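It’s easy to mix up “times as much” and “percent more” when comparing bandwidth, so here is the arithmetic spelled out. This sketch uses the RX 550’s 128-bit bus and 7 Gb/s memory from the paragraph above and GT 1030’s 48 GB/s from earlier in the review; the variable names are ours.

```python
def peak_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Theoretical peak bandwidth in GB/s for a given bus and per-pin rate."""
    return bus_width_bits * data_rate_gbps / 8


RX550_BANDWIDTH = peak_bandwidth_gbs(128, 7)   # 128-bit bus, 7 Gb/s GDDR5
GT1030_BANDWIDTH = 48.0                        # GB/s, quoted earlier

# "Times as much" versus "percent more": two ways to state the same gap.
BANDWIDTH_RATIO = RX550_BANDWIDTH / GT1030_BANDWIDTH
PERCENT_MORE = (BANDWIDTH_RATIO - 1) * 100

print(f"RX 550 peak bandwidth: {RX550_BANDWIDTH:.0f} GB/s")
print(f"That is {BANDWIDTH_RATIO:.2f}x GT 1030, or {PERCENT_MORE:.0f}% more")
```

The RX 550 lands at 112 GB/s, which is about 2.33 times GT 1030’s peak, or roughly 133% more.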

And yet, Nvidia tells us its GeForce GT 1030 should trade blows with AMD’s pricier solution. If that turns out to be true, it’d be quite an achievement for a smaller and simpler graphics card able to fit into PCs that might not accommodate a Radeon RX 550.
