
GK104: The Chip And Architecture

GeForce GTX 680 2 GB Review: Kepler Sends Tahiti On Vacation
GeForce GTX 680’s Vital Signs

Once we strip the card of its cooling hardware, we’re left with a bare board.

The GK104 GPU sits at its center, composed of 3.54 billion transistors on a 294 mm² die. Again, Nvidia lands between two of AMD’s current-gen offerings: the Tahiti GPU contains 4.31 billion transistors in a 365 mm² die, while Pitcairn is composed of 2.8 billion transistors in a 212 mm² die.

Knowing that GK104 is manufactured on TSMC’s 28 nm node, the GPU’s supply will almost certainly be tight as multiple customers vie for the foundry’s limited output. With that said, the company tells us to expect some availability at launch, with another wave of boards to arrive in early April.

Now, let’s have a look at the GPU’s specifications compared to GeForce GTX 580 and AMD’s Radeon HD 7950 and 7970:


|  | GeForce GTX 680 | Radeon HD 7950 | Radeon HD 7970 | GeForce GTX 580 |
| --- | --- | --- | --- | --- |
| Shaders | 1536 | 1792 | 2048 | 512 |
| Texture Units | 128 | 112 | 128 | 64 |
| Full Color ROPs | 32 | 32 | 32 | 48 |
| Graphics Clock | 1006 MHz | 800 MHz | 925 MHz | 772 MHz |
| Texture Fillrate | 128.8 Gtex/s | 89.6 Gtex/s | 118.4 Gtex/s | 49.4 Gtex/s |
| Memory Clock | 1502 MHz | 1250 MHz | 1375 MHz | 1002 MHz |
| Memory Bus | 256-bit | 384-bit | 384-bit | 384-bit |
| Memory Bandwidth | 192.3 GB/s | 240 GB/s | 264 GB/s | 192.4 GB/s |
| Graphics RAM | 2 GB GDDR5 | 3 GB GDDR5 | 3 GB GDDR5 | 1.5 GB GDDR5 |
| Die Size | 294 mm² | 365 mm² | 365 mm² | 520 mm² |
| Transistors (Billion) | 3.54 | 4.31 | 4.31 | 3.0 |
| Process Technology | 28 nm | 28 nm | 28 nm | 40 nm |
| Power Connectors | 2 x 6-pin | 2 x 6-pin | 1 x 8-pin, 1 x 6-pin | 1 x 8-pin, 1 x 6-pin |
| Maximum Power | 195 W | 200 W | 250 W | 244 W |
| Price (Street) | £450 | £350 | £450 | ~£315 |
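As a quick sanity check on the table, the fillrate figures are just texture units multiplied by core clock. This is plain arithmetic on the table’s own numbers, not anything from Nvidia or AMD:

```python
# Texture fillrate (Gtex/s) = texture units x core clock (GHz),
# using the unit counts and clocks from the table above.
cards = {
    "GeForce GTX 680": (128, 1.006),
    "Radeon HD 7950":  (112, 0.800),
    "Radeon HD 7970":  (128, 0.925),
    "GeForce GTX 580": (64,  0.772),
}

for name, (units, clock_ghz) in cards.items():
    fillrate = units * clock_ghz
    print(f"{name}: {fillrate:.1f} Gtex/s")
```

The results match the table row for row, which is a useful habit whenever a spec sheet looks suspicious.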


The GK104 GPU is broken down into four Graphics Processing Clusters (GPCs), each of which contains two Streaming Multiprocessors (formerly referred to as SMs, and now called SMXs).

While there’s undoubtedly a lot of depth we could go into on the evolution from the original GF100 design to the GK104 architecture debuting today, the easiest way to characterize this new chip is in reference to the GF104 processor that first powered GeForce GTX 460.

In comparison to GF104's SM, the GK104 SMX features twice as many warp schedulers (from two up to four), dispatch units (from four up to eight), and texture units (from eight up to 16) per multiprocessor, along with a register file that’s twice as large. It also sports four times as many CUDA cores: GF104 included 48 shaders per SM; GK104 ups that to 192 shaders in each SMX.

GK104 SMX (Left) Versus GF104 SM (Right)
| Per SM | GK104 | GF104 | Ratio |
| --- | --- | --- | --- |
| CUDA Cores | 192 | 48 | 4x |
| Special Function Units | 32 | 8 | 4x |
| Load/Store | 32 | 16 | 2x |
| Texture Units | 16 | 8 | 2x |
| Warp Schedulers | 4 | 2 | 2x |
| Geometry Engines | 1 | 1 | 1x |


Why quadruple the number of CUDA cores and double the other resources? Kepler’s shaders run at the processor’s frequency (1:1). Previous-generation architectures (everything since G80, that is) operated the shaders two times faster than the core (2:1). Thus, doubling shader throughput at a given clock rate requires four times as many cores running at half-speed. 
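The arithmetic behind that trade is easy to sketch. Under a simplified model (assuming each CUDA core retires one instruction per shader clock), the per-core-clock throughput works out like this:

```python
# Simplified model: peak shader instructions per *core* clock for one
# multiprocessor = cores x (shader clock / core clock).
def ops_per_core_clock(cores, shader_clock_ratio):
    return cores * shader_clock_ratio

gf104_sm  = ops_per_core_clock(48, 2)   # Fermi-era: shaders run at 2x core clock
gk104_smx = ops_per_core_clock(192, 1)  # Kepler: shaders run at 1x core clock

print(gk104_smx / gf104_sm)  # 2.0 -> double the per-clock shader throughput
```

Four times the cores at half the relative shader clock nets out to twice the work per core clock, which is exactly the doubling Nvidia was after.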

The question then becomes: Why on earth would Nvidia throttle back its shader clock in the first place? It’s all about the delicate balance of performance, power, and die space, baby. Fermi allowed Nvidia’s architects to optimize for area. Fewer cores take up less space, after all. But running them twice as fast required much higher clock power. Kepler, on the other hand, is tuned for efficiency. Halving the shader clock slashes power consumption. However, comparable performance necessitates two times as many data paths. The result is that Kepler trades off die size for some reduction in power on the logic side, and more significant savings from clocking.

Additional die area and power are cut by eliminating some of the multi-ported hardware structures used to help schedule warps and moving that work to software. By minimizing the amount of power and area consumed by control logic, more space is freed up for doing useful work versus Fermi.

Nvidia claims that the underlying changes made to its SMX architecture result in a theoretical doubling of performance per watt compared to Kepler’s predecessor—and this is actually something we’ll be testing today.

Alright. We have eight of these SMXs, each with 192 CUDA cores, adding up to 1536. Sixteen texture units per SMX yield 128 in a fully-enabled GK104 processor. And one geometry engine per SMX gives us a total of eight. Now, wait a minute—GF104 had eight PolyMorph engines, too. So Nvidia multiplied all of these other resources out, but left its primitive performance alone? Not quite.
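The chip-level totals in the preceding paragraph fall straight out of the per-SMX figures:

```python
# Fully enabled GK104: four GPCs, each containing two SMXs.
SMX_COUNT = 4 * 2

print(SMX_COUNT * 192)  # CUDA cores: 1536
print(SMX_COUNT * 16)   # texture units: 128
print(SMX_COUNT * 1)    # geometry (PolyMorph) engines: 8
```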

To begin, Nvidia claims that each PolyMorph engine is redesigned to deliver close to two times the per-clock performance of Fermi’s fixed-function geometry logic. This improvement is primarily observable in synthetic metrics, which developers have (rightly) argued should represent the future of their efforts, exploiting significantly more geometry than today’s titles to augment realism. But if you’re playing something available right now, like HAWX 2, how can you expect Kepler to behave?

In absolute terms, GeForce GTX 680 easily outperforms the Radeon HD 7970 and 7950, regardless of whether tessellation is used or not. However, Nvidia’s new card takes a 31% performance hit when you toggle tessellation on in the game’s settings. In comparison, the Tahiti-based Radeons take a roughly 16% ding. Now, there’s a lot more going on in a game than just tessellation to affect overall performance. But it’s still interesting to note that, in today’s titles, the advantage Nvidia claims doesn’t necessarily pan out. 

In order to support the theoretically-higher throughput of its geometry block, Nvidia also doubles the number of raster engines compared to GF104, striking a 1:1 ratio with the ROP partitions.

Like GF104, each of GK104’s ROPs outputs eight 32-bit integer pixels per clock, adding up to 32. The two GPUs also share a 256-bit aggregate memory bus. Where they differ is maximum memory bandwidth.

Perhaps more true to its mid-range price point, GeForce GTX 460 included up to 1 GB of GDDR5 running at 900 MHz, yielding 115.2 GB/s. GeForce GTX 680 is being groomed to succeed GeForce GTX 580, though, which utilized 1002 MHz GDDR5 on a 384-bit aggregate bus to serve up 192.4 GB/s of throughput. Thus, GeForce GTX 680 comes armed with 2 GB of GDDR5 operating at 1502 MHz, resulting in a very similar 192.26 GB/s figure.
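Those bandwidth figures follow from GDDR5’s quad-pumped interface: four transfers per base clock, times the bus width in bytes. A quick back-of-the-envelope check, using only the clocks and bus widths quoted above:

```python
# GDDR5 transfers data four times per base clock (quad-pumped).
# Bandwidth (GB/s) = base clock x 4 transfers x bus width in bytes.
def bandwidth_gbs(base_clock_mhz, bus_bits):
    return base_clock_mhz * 1e6 * 4 * (bus_bits / 8) / 1e9

print(bandwidth_gbs(900, 256))   # GeForce GTX 460: 115.2 GB/s
print(bandwidth_gbs(1002, 384))  # GeForce GTX 580: ~192.4 GB/s
print(bandwidth_gbs(1502, 256))  # GeForce GTX 680: ~192.3 GB/s
```

The takeaway: GTX 680’s faster memory almost exactly compensates for its narrower 256-bit bus, landing within a rounding error of GTX 580’s throughput.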

This thread is closed for comments
    graemevermeulen , 22 March 2012 21:07
    Wow. Knew it would be a beast, and for such a good price too. Well done Nvidia!
    LLL , 22 March 2012 21:41
    well done NV! cant wait NV's 660 serie.
    jemm , 23 March 2012 03:41
    It is a monster!
    silicondoc_85 , 23 March 2012 03:44
    I love it, the GTX 680 wins and in page after page our red c angel tells us it's not meaningful as all cards can do.
    He even calls the GTX590 single card king defunct !
    The bias is so bad, and stinks to high heaven, but the review is blind as a bat to his red fanboy bloviating he would have us believe.
    I'm sure he will 100% excuse himself as he has before claiming rabid red fans complain so he's doing a good job.
    The bias is beyond disgusting anyway.
    As we will see the soon Crysis 1 and Metro 2033 will be the favorite games of all. All the rest will be described as " capable of being run on any card offering from either competitor so a dead match washout".
    This is the very definition of the red raging fanboy. When they lose in every fashion and form, suddenly everything is a tie and all cards are good, as they are all capable.
    You will never see the reverse done for Nvidia, it does not happen here, nor almost anywhere else.
    The real truth is more than apparent.
    The 680 is about 20% faster across the board, costs less, lower thermals and wattage, smaller core, with an immensely larger feature set that now the raging reds will have extreme trouble lying about over and over again as they have for years claiming "they don't care about any of it and it all sucks".
    Worse yet for them the 7970 loses in triple monitor gaming as well, the big fat ram lie is kicked to the curb as it should have been YEARS AGO.
    The GTX590 still has the single card crown these low life liars still cannot admit as we see it in so many of these reviews - but for the little lying amd fan, "it was not sufficiently dethroned" and for this reviewer as it shows up on top or under the 680 over and over again, it is "defunkct" - THAT'S RIGHT CHRIS IT IS ACCORDING TO YOU DEFUNKCT EVEN AS YOU INCLUDE IT IN YOUR BENCHMARKS.
    ---
    thanks for the thuosands of endless lies in red fanboy favor. thanks so much !
    I'm sure you all have nvidia cards in your own personal systems a well ! I know you do that proves you aren't a lying sack.
    way to go...
    silicondoc_85 , 23 March 2012 03:54
    I can't even read this CRAP without getting furious.

    " The GeForce GTX 680 takes second place in Skyrim at 1680x1050 and 1920x1080. But across-the-board performance is so good that the victory isn’t entirely meaningful.

    (second means the top card GTX dual 590 wins, while the 680 smokes the lame 7970)

    Frame rates slow down just enough at 2560x1600 that the Radeon HD 6990 sneaks past Nvidia’s new card.

    ( but here cannot bring himself to state that once again the GTX590 BEATS THE 6990 AND TAKES FIRST PLACE - ALWAYS WORDED IN A GIANT BIAS FOR THE RED CARDS FAVOR - ALWAYS WITHOUT EXCEPTION)

    Again, though, neither the 6990 nor the GTX 590 are even for sale anymore, so their significance is largely symbolic. Frankly, I’m glad to see them go."

    ( Finally after blowing it, putting down the 680 as second place, then noting only the 6990 in name "beating the 680" and never mentioning the gtx590 beating the 6990, chris the biased perp dismisses the 6990 loss and the 590 win and proclaims he's glad to "see them go" - since of course his purposes for biased fanboyism have been served)

    I mean you HAVE TO BE BLIND NOT TO SEE IT ON PAGE AFTER PAGE, AND CHRIS ANGELI IS DEFINITELY BLIND.
    Anonymous , 23 March 2012 03:55
    @silicondoc_85
    Might be my fault cause of a fever, but after reading your post twice I still don't get what's your point.

    Apart from that, this card gives some nice insight into what Kepler has to offer.
    I'm still waiting to see other models before making any decisions.
    Marsas , 23 March 2012 04:39
    Wow, I was just waiting for Nvidia cards to show up so AMD would lower the price of their high end cards (7950 and 7970) and I could buy one, but after this, I don't know how much the price should drop to make up for performance difference against this card.
    silicondoc_85 , 23 March 2012 04:55
    More sick red bias of unbelievable proportion on power efficiency.

    " We set the GTX 580 as 100%, and the rest of the results speak for themselves. "

    (He tries to get away with saying nothing here, given the 172% GTX680 massive, massive win, but can't do it, so spinning is absolutely required below)

    The Radeon HD 7970 and 7950 both do deliver more performance per watt of power used compared to GeForce GTX 580—and by a significant amount.

    ( First he must brag up his favored brand, slamming down on the Nvidia card)

    But GeForce GTX 680 is like, way up there.

    ( Finally he brings himself to say it - it was very difficult, but the Nvidia win here is so enormous, he had to first try to say nothing, then brag up amd cards against nvidia, then finally in a tiny sentence, quickly mention the 680 being up there. Immediately afterwards, as usual the red fan has to discount and deny this massive accomplishment as this site and red fans and he has been pushing amd power efficiency wins down everyone's throat for the last two years solid. Now you shall see, it won't matter...)

    As a gamer, do you care about this?

    ( As a red fanboy, this is now "unimportant" when Nvidia scores the massive win, so Chris tells you you really shouldn't care, for the 1st time EVER when it comes to power efficency - when RED LOSES BAd and NVIDIA wins )

    Not nearly as much as absolute performance, we imagine.

    ( THE BIAS reeks again, the Nvidia card the the best absolute performance too, but since AMD can't claim a win on power, we switch up... and omit the dual Nvidia win neing mentioned directly - we go on further pretending that didn't happen, so that we can completely dismiss power/performance this time around, with Chris Angelini's massive red fan bias)

    And I personally doubt I’d ever pay more for a card specifically because it gave me better performance/watt.

    ( Here Chris declares the years of power/perf pushing what they should have been for the last two years here, not a paid for feature - but it was pushed as the reason why AMD cards must be chosen - worse yet Chris pretends you have to pay for it here, WHEN THE WINNING CARD THE NVIDIA GTX680 COSTS LESS AND IS MOST EFFICIENT... as Chris' fanboy minds twists against Nvidia, his analysis twists sideways as well, clearly completely losing tracks of the facts he just discovered in testing, that Nvidia is faster, CHEAPER, AND MORE POWER EFFICIENT ...)

    But with AMD and Nvidia both talking about their efficiency this generation, thanks to 28 nm manufacturing and new architectural decisions, the exercise is still interesting.

    ( LMAO - the last sentence "declars the tie! " after Nvidia's smashing win - the conclusion - they both talk about it, and "it's interesting"... ANOTHER GIGANTIC BIAS IN THE LOSER AMD'S FAVOR )
    ----
    I don't mind a slip here or there, but when entire sections are mindlessly biased in favor of AMD, over and over and over again... IT REALLY PISSES ME OFF !

    How about the truth next time Chris ? How about forcing yourself to remove your gigantic amd bias in your thoughts and typing... ? Here I WILL FIX IT
    --------------------------

    The GeForce GTX 680 won the power efficiency test by an enormous margin,as you see above. Nvidia’s and AMD’s respective new architectures diverge greatly here as Nvidia is faster and more power efficient, everything we've been promoting about AMD's card for years.
    The tables have more than completey turned.

    The Radeon HD 7970 and 7950 both do deliver more performance per watt of power used compared to AMD's prior generation, but they cannot come close to the GeForce GTX 680 which is so good we have to make the same recommendation on it we have for years for the AMD cards that were no where near this good.

    As gamers, we've cared about for years on this site, often placing it as the purchase decision above absolute performance.
    And I personally promoted amd cards specifically because they gave better performance/watt.

    With Nvidia now talking about this like AMD has for years, we can't just suddenly dismiss this as we'd like to and tell you it means nothing.

    (although they/Chris did of course in the real article)

    jakjawagon , 23 March 2012 05:23
    Quote:
    anything below 60 Hz has to still be a multiple of 60


    This would be impossible and makes no sense. I think you meant 'factor'. /pedant
    Anonymous , 23 March 2012 14:53
    Is there any reason why there are no eyeinfinity tests? You already said that nivida now supports 3ways + screens so lets see it. I want to see Battlefield 3 across 3 displays with ultra settings...

    Also gamers who can afford this level of hardware don't want to know about resolutions of 1680x1050...
    santfu , 24 March 2012 08:39
    I'm not sure that silicondoc is feeling very well. The only fanboyism that i see is from silicondoc.

    I agree with Ashleyh for top end cards now 1680 x 1050 is a pointless test. For the cost of these cards you get some very nice 1920 x 1080 monitors or 3 not so nice ones.
    dizzy_davidh , 24 March 2012 11:47
    silicondoc_85 wrote: (first comment, quoted in full above)

    I have to agree. whenever there is an nVidia review at TomsHardware it's test results never come out as good as they should be, being either close to the manufacturers own or my personal results.

    As for the 590 being slower fps-wise in games like BF3 than the 680, nVidia themselves have results that show that a 590 will beat a 680 so again your results are crap!

    I can only think that your test setup is flawed in some way or you simply have no idea what you are talking about (I suspect the latter).
    SSri , 23 April 2012 22:55
    GTX 680 comes out extremely poor in computing performance scoring almost just a third of GTX 580's! Is there any pun intended in that card? I can't believe that a top of the line card can be so poor in computation..It is pretty fishy...