
Nvidia Volta Megathread

Seems Volta is not really discussed much here, besides being mentioned as a Vega competitor. Time to get some info on the board?

https://www.extremetech.com/extreme/249106-nvidia-goes-ai-new-volta-architecture

This article gets me thinking that Volta is not just a GPU; there is an AI and general-computing side as well. It may sound far-fetched, but it may have repercussions for us. If AI is now becoming a mainstream thing in business and special applications, we might see consumer applications some five years down the road, especially considering that players like NVidia are willing to open source their code. I mean, some of you might end up using your Volta GPUs for AI on your rigs, e.g. things like truly learning AI in games. But I think AI will have a broader application than just gaming, even on the local PC. Not to mention there are already some applications that use GPU computing power for special purposes that are not AI.

Long-term, there is a lot of hype around AI currently, and it certainly looks like self-driving vehicles are just around the corner. It reminds me of the 1960s, when human-level AI was supposed to be just around the corner. But it's quite possible that in the first half of the 21st century we'll see a number of mainstream AI applications, and it's going to be one of the next big technological waves. (We're still waiting for a real biotech wave too.)
Reply to varis
  1. Nvidia is definitely going all out with AI, and they are not going to give up that market completely to ASICs; that's why they put their own Tensor Cores inside the GPU. Many people look at Nvidia as a graphics company only, and so expect that once Nvidia secures its lead in GPU performance they will take it easy and charge high prices for very minimal performance improvements, like Intel. But Nvidia really has no time to rest easy, because they are also competing with companies like Google in the field of AI, for example. They were able to make their architecture more power efficient with Kepler, but that was not enough, because in mobile (where they compete with Tegra) more power-efficient architectures exist. While their venture into smartphones and tablets did not go as planned, they still benefited a lot from their experience dealing with mobile. That's why, even though Nvidia is already ahead of AMD in GPUs, they still have to work harder.
    Reply to renz496
  2. AMD and NVidia have announced their press conferences at Computex 2017: NVidia's is tomorrow (Tuesday) and AMD's comes on Wednesday.

    https://techdrake.com/2017/05/nvidia-and-amd-announced-computex-2017-press-conferences/

    Overview of what's coming in the conf:

    http://www.trustedreviews.com/news/computex-2017-news-launches-nvidia-intel-amd-asus-microsoft-announcements

    Some reporters even seem to think NVidia will launch Volta at Computex. I'd hope for a GTX 20 series at best...

    Liveblog:

    http://www.anandtech.com/show/11457/asus-computex-2017-press-conference-live-blog
    Reply to varis
  3. Honestly, I don't think we're going to see anything Volta-related today. Meanwhile, board partners are gearing up to showcase some of their extreme 1080 Ti cards for Computex. That effort would be wasted if Nvidia suddenly released the 1080 Ti's successor today.

    https://videocardz.com/69791/msi-to-unveil-geforce-gtx-1080-ti-lightning-z-at-computex
    https://videocardz.com/newz/evga-teases-upcoming-products
    https://videocardz.com/newz/colorful-igame-gtx-1080-ti-neptune-pictured
    Reply to renz496
  4. Sane thinking. So far Volta has been discussed, but it was all about AI and tensor cores.

    Lighter gaming laptops with GeForce are coming, though :)
    Reply to varis
  5. The main star on the gaming side of the presentation was no doubt the new thin gaming laptops. JHH even showed an actual product on stage. That's a way to show off that you can get that kind of performance in a laptop right now, whereas there isn't anything similar from the "competitor".
    Reply to renz496
  6. I think there are 2 main takeaways for the "Volta generation" from nVidia, and they won't be pure "GPU" enhancements. There will be some of that as well, but my take is nVLink for consumers and hardware schedulers from nVidia. Those two are really interesting, to me at least, even if "brute performance" doesn't go up two-fold.

    Cheers!
    Reply to Yuka
  7. The 1080 Ti just released; we won't see its successor for quite some time, probably a year and a half. A side-grade 1170/1180 may show up, but big Volta won't be around for a long time.
    Reply to nikoli707
  8. So what exactly does AI incorporation into the Volta architecture mean for gaming? I mean, it's badass, but I'm having issues trying to imagine the impact it will make on the PC gaming scene.
    Reply to techbard
  9. techbard said:
    So what exactly does AI incorporation into the Volta architecture mean for gaming? I mean, it's badass, but I'm having issues trying to imagine the impact it will make on the PC gaming scene.


    Probably none, just like it was with Pascal. Realistically, FP16 can still somehow benefit gaming (this is what AMD is trying to do with Vega), but those tensor cores? Honestly, I don't know. They are pretty much fixed-function units designed for AI-related tasks. Then again, GV100 is designed to be a compute monster: simultaneous FP32 and INT32 operation, massive FP64 (a 1:2 ratio to FP32), massive FP16, and the new tensor cores that are optimized specifically for AI work. At this point we just don't know whether those tensor cores are "mandatory" in every Volta design or specific to GV100 only.
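
    For a sense of scale, here is a rough sketch of what those rates work out to, using the publicly quoted GV100/Tesla V100 figures (5120 FP32 cores, 640 tensor cores, roughly 1455 MHz boost); treat the numbers as approximate:

    ```python
    # Rough theoretical peak throughput for GV100 (Tesla V100),
    # using publicly quoted figures; clocks and totals are approximate.
    CUDA_CORES = 5120        # FP32 lanes
    TENSOR_CORES = 640
    BOOST_GHZ = 1.455        # approximate boost clock

    fp32 = 2 * CUDA_CORES * BOOST_GHZ / 1000   # FMA = 2 ops per clock
    fp64 = fp32 / 2                            # 1:2 FP64 rate
    fp16 = fp32 * 2                            # packed FP16, double rate
    # each tensor core does a 4x4x4 FMA per clock = 64 MACs = 128 ops
    tensor = TENSOR_CORES * 128 * BOOST_GHZ / 1000

    print(f"FP32  : ~{fp32:.1f} TFLOPS")    # ~14.9
    print(f"FP64  : ~{fp64:.1f} TFLOPS")    # ~7.5
    print(f"FP16  : ~{fp16:.1f} TFLOPS")    # ~29.8
    print(f"Tensor: ~{tensor:.0f} TFLOPS (mixed-precision)")  # ~119
    ```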
    Reply to renz496
  10. So Nvidia is indeed researching their own version of Infinity Fabric for their future GPUs?

    http://techreport.com/news/32189/nvidia-explores-ways-of-cramming-many-gpus-onto-one-package
    Reply to renz496
  11. In terms of the market, Nvidia has established a generational lead over AMD (the latest AMD gaming cards are only as fast as Nvidia's second-best, two-year-old cards). With the release of Vega, I'm looking for Nvidia to make some Volta news toward the holiday buying season. Not that they will have a new product to sell, but they can insert some doubt into anyone looking to purchase the current generation. I think they will make maintaining that lead an important strategy.
    Reply to 17seconds
  12. Well, they already said: "hey, look, we'll park Volta release for a few extra months, due to... reasons; but hey! there's plenty of Pascal stuff still!".

    Cheers!
    Reply to Yuka
  13. Well, if we look at Nvidia's past releases, it's more realistic to expect Volta in Q2 2018, Q1 next year at the earliest. For sure there will be no gaming Volta this year, since Nvidia already said so during their latest earnings call.
    Reply to renz496
  14. renz496 said:
    Well, if we look at Nvidia's past releases, it's more realistic to expect Volta in Q2 2018, Q1 next year at the earliest. For sure there will be no gaming Volta this year, since Nvidia already said so during their latest earnings call.


    Oh, yes. I meant that in the gaming-plebeian context. I don't believe they'll have delays for the "pro" stuff.

    Cheers!
    Reply to Yuka
  15. Yuka said:
    Oh, yes. I meant that in the gaming-plebeian context. I don't believe they'll have delays for the "pro" stuff.


    Actually, I did not expect even the "pro" Volta to be ready until late 2018, or Q1 2018 at the earliest. But the competition in the AI space is heating up, and Nvidia needs to make sure the GPU isn't outdone by chips built specifically for AI, like Google's TPU. Those who pre-ordered the DGX-1 before Nvidia officially announced the Tesla V100 are being upgraded to Tesla V100 at no additional charge. It seems Nvidia wants to replace the Tesla P100 completely, and that is very fast even by Nvidia's standards; just look at how long Nvidia kept GK110 as their compute flagship before replacing it with P100.
    Reply to renz496
  16. I knew you guys would help me resurrect this thread.
    Reply to 17seconds
  17. Nothing much to discuss on Volta right now, especially on the gaming side of things. But on the compute side there are probably some new enhancements that will make Volta a monster in mining...
    Reply to renz496
  18. Uhm... I'll go on a tangent here, because I don't really know the inner workings of AI development, nor is there abundant information on implementation and design for it.

    From the holistic design perspective, Artificial Intelligence has 2 main aspects to it: Process and Analysis.

    For Process, I'm putting in the same bag the "simulation" or adaptation of a thought process into a machine language (Lisp or whatever language is actually used), where you can indeed tailor ASIC-style hardware without much complexity in the process-to-metal translation.

    For Analysis, I'm putting in the same bag all the "computation" behind each singular point of decision and the intake of the information given (massively parallel is an understatement here). This is a huge bag, even bigger than Process, but straightforward: you can only process as much as you can calculate (input vs. output, to an effect), so the effective decision nodes will be a reflection of the amount of data you can process in time (kind of obvious, right?). Simple in concept, a nightmare in actual implementation (those trees!), since coordinating the decisions into a single point of action is quite the challenge, depending on implementation.

    So, what I'm trying to get at here is the point of nVidia being competitive in this AI landscape.

    Much like you can use general-purpose CPUs to do whatever you want, you can use GPUs for "general purpose" training and designing of AIs, with some degree of "openness" in approach. A specific design should come after you have chosen a certain way of building your AI process, I would imagine, and the rest is just growing the capabilities of the analysis you want. Since the computational needs will, at the end of the day, be tailored for mass production (specific products with "AI"-fed operations), you would choose a specialized method for mass deployment, but the interim would always be "general"; or at least that seems the feasible way. All of that to say, I do think nVidia has a chance to grab a lot of this market as long as they improve the analysis side as much as they can. Hence nVLink and all the I/O improvements they can make to "feed the beast" will become a deciding factor in their success, in my opinion, obviously. Some of that can trickle down to consumers, I would imagine, but I don't really see how the general-purpose calculation route will help "push more triangles" for games. In the end, it seems nVidia is better off making specific AI accelerators like Google does, and if Volta is that first step, I won't expect huge leaps in the gaming arena, TBH.
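
    As a toy illustration of why the "Analysis" side maps so well onto GPUs: most of the work in training or running a model boils down to batched multiply-accumulate (matrix math), which is exactly what shader and tensor cores chew through. The sizes and numbers below are arbitrary, just to show the shape of the work:

    ```python
    # Toy sketch: one dense layer's forward pass is just a matrix multiply.
    # Sizes are arbitrary; this is illustrative, not any framework's actual code.
    import numpy as np

    batch, n_in, n_out = 1024, 512, 256
    x = np.random.randn(batch, n_in).astype(np.float16)    # FP16 activations
    w = np.random.randn(n_in, n_out).astype(np.float16)    # FP16 weights

    y = x @ w                                  # massively parallel MACs
    flops = 2 * batch * n_in * n_out           # multiply + add per element
    print(f"One layer, one batch: ~{flops/1e6:.0f} MFLOPs -> output {y.shape}")
    # Multiply by thousands of layers/steps and the question becomes how fast
    # you can do FP16 multiply-accumulates and feed them data (hence NVLink).
    ```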

    Cheers!
    Reply to Yuka
  19. It all depends on what kind of improvements Nvidia comes up with for gaming Volta. Right now I suspect Nvidia will increase Volta's gaming performance using a method similar to how they gained more performance with Maxwell, even though Maxwell was built on the same 28nm process as Kepler: an architecture rework. While Volta will be built on 12nm, from what I heard that is just a fancy name for a more optimized 16nm process. GV100 is a lot bigger and faster than GP100, but power-consumption-wise both are about the same.
    Reply to renz496
  20. Even if the full version of GV102 comes in with 4096 cores, I'm more interested in what clock and power numbers it hits. Right now, I'm thinking the GV102 needs to hit 2.5 GHz out of the box and reach 3 GHz when overclocked. It needs to do both things on air with a similar power budget to GP102. A large shortfall on any one of these targets would be a disappointment to me. They also need GDDR6 to deliver its target performance on day one. Greedy of me, I know. Maybe unrealistic. I hope not.
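
    For perspective, a quick sketch of the theoretical FP32 throughput those numbers would imply; the 4096 cores and 2.5-3.0 GHz clocks are purely the speculative targets above, not announced specs:

    ```python
    # Hypothetical peak FP32 throughput for a speculated "GV102".
    # These are wish-list numbers from the post above, not real specs.
    cores = 4096
    for clock_ghz in (2.5, 3.0):
        tflops = 2 * cores * clock_ghz / 1000     # FMA = 2 ops per clock
        print(f"{cores} cores @ {clock_ghz} GHz -> ~{tflops:.1f} TFLOPS FP32")
    # For comparison, a stock GTX 1080 Ti (3584 cores @ ~1.58 GHz) is ~11.3 TFLOPS.
    ```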
    Reply to manleysteele
  21. manleysteele said:
    Even if the full version of GV102 comes in with 4096 cores, I'm more interested in what clock and power numbers it hits. Right now, I'm thinking the GV102 needs to hit 2.5 GHz out of the box and reach 3 GHz when overclocked. It needs to do both things on air with a similar power budget to GP102. A large shortfall on any one of these targets would be a disappointment to me. They also need GDDR6 to deliver its target performance on day one. Greedy of me, I know. Maybe unrealistic. I hope not.


    Rather than increasing the clock, I think it's better for Nvidia to improve the IPC of their architecture instead. Increasing the clock further might not increase performance that much; we're already seeing diminishing returns with Pascal. As for what memory they use, I don't care much as long as they hit the performance target.
    Reply to renz496
  22. renz496 said:
    Rather than increasing the clock, I think it's better for Nvidia to improve the IPC of their architecture instead. Increasing the clock further might not increase performance that much; we're already seeing diminishing returns with Pascal. As for what memory they use, I don't care much as long as they hit the performance target.


    Diminishing returns from Pascal? You're kidding me, right? Pascal buries Maxwell at the same core count, and that burying is almost all due to clock. Plus it reaches those higher clocks using less power.
    Reply to manleysteele
  23. manleysteele said:
    Diminishing returns from Pascal? You're kidding me, right? Pascal buries Maxwell at the same core count, and that burying is almost all due to clock. Plus it reaches those higher clocks using less power.


    IPC-wise, Maxwell and Pascal are identical. True, Pascal can be clocked higher, but we have to remember that architecture-wise Maxwell and gaming Pascal are almost outright identical. When you increase the core clock, the performance gain will not be linear; there is a point where raising the clock nets a much smaller performance increase. Heat can also be a problem at high frequency. But if you think Pascal is amazing, then Volta will probably be even more frightening, lol.
    Reply to renz496
  24. renz496 said:
    IPC-wise, Maxwell and Pascal are identical. True, Pascal can be clocked higher, but we have to remember that architecture-wise Maxwell and gaming Pascal are almost outright identical. When you increase the core clock, the performance gain will not be linear; there is a point where raising the clock nets a much smaller performance increase. Heat can also be a problem at high frequency. But if you think Pascal is amazing, then Volta will probably be even more frightening, lol.


    Your first sentence confirmed my point nicely. On your second assertion, perhaps, but at what level of clock does this occur? We're not there yet. Heat is an ongoing problem, even at present frequencies and densities. I do expect gaming Volta to make some tradeoffs between power, clock, and core count. Still, I also expect a good bump in the full performance metric, both in the middle and on the high end.
    Reply to manleysteele
  25. Quote:
    but at what level of clock does this occur?


    Maybe somewhere above the 2.1 GHz mark? That's probably why Nvidia decided to "lock" Pascal at pretty much a 2.0-2.1 GHz limit right now. They could probably work around it to clock higher, but would the effort be worth it? Just look at AMD's Vega: AMD spent 3.9 billion transistors just so GCN could be clocked much higher. They finally managed it, but their power consumption also went crazy.
    Reply to renz496
  26. renz496 said:
    IPC-wise, Maxwell and Pascal are identical. True, Pascal can be clocked higher, but we have to remember that architecture-wise Maxwell and gaming Pascal are almost outright identical. When you increase the core clock, the performance gain will not be linear; there is a point where raising the clock nets a much smaller performance increase. Heat can also be a problem at high frequency. But if you think Pascal is amazing, then Volta will probably be even more frightening, lol.


    IPC-wise, Maxwell is considerably faster than Pascal: a 980 Ti @ 1600 MHz matches a 1070 @ 2200 MHz, and Kingpin's 2000 MHz+ 980 Ti world record at the time was something like 27000-ish in Fire Strike GPU score, which is close to a stock 1080 Ti. All that doesn't really mean much, though; we just want 20-40% real-world gaming performance increases at the same power draw. I don't really care what the clock rate or memory config is, just give me performance.
    Reply to nikoli707
  27. nikoli707 said:
    IPC-wise, Maxwell is considerably faster than Pascal: a 980 Ti @ 1600 MHz matches a 1070 @ 2200 MHz, and Kingpin's 2000 MHz+ 980 Ti world record at the time was something like 27000-ish in Fire Strike GPU score, which is close to a stock 1080 Ti. All that doesn't really mean much, though; we just want 20-40% real-world gaming performance increases at the same power draw. I don't really care what the clock rate or memory config is, just give me performance.


    A 980 Ti has 2816 cores. A 1070 has 1920 cores. I agree that performance/watt is what we are looking for. The comparison you are looking for is the 2048-core GTX 980 vs. the 1920-core GTX 1070. Still not exactly equivalent, but life is unfair.
    Reply to manleysteele
  28. Most often people only look at the core clock but overlook the core count. At the architecture level, Maxwell and Pascal are structured almost identically, so IPC-wise they should be identical (per AnandTech's assumption). Just look at the GTX 980 vs. the GTX 1060: the GTX 1060 needs a much higher clock to reach GTX 980 performance, but the 1060 also has far fewer CUDA cores than the 980 (1280 vs. 2048).
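
    A quick cores x clock comparison illustrates the point; the boost clocks below are ballpark stock figures, and this ignores memory bandwidth, ROPs, and everything else, so treat it as a rough proxy only:

    ```python
    # Rough "shader throughput" proxy: CUDA cores x clock (GHz).
    # Boost clocks are approximate stock values; memory, ROPs, etc. are ignored.
    cards = {
        "GTX 980 Ti": (2816, 1.60),   # the 1600 MHz example above
        "GTX 1070":   (1920, 2.20),   # the 2200 MHz example above
        "GTX 980":    (2048, 1.22),   # ~stock boost
        "GTX 1060":   (1280, 1.71),   # ~stock boost
    }
    for name, (cores, clock) in cards.items():
        print(f"{name}: {cores * clock:.0f} core-GHz")
    # 980 Ti @ 1.6 GHz ~ 4506 vs 1070 @ 2.2 GHz ~ 4224, and 980 ~ 2499 vs
    # 1060 ~ 2189: similar ballparks, which is the point being made here,
    # i.e. roughly similar per-core IPC, with Pascal trading core count for clock.
    ```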
    Reply to renz496
  29. I have one problem with looking at individual "execution units" (or shader arrangements) and taking them as part of the "efficiency" equation.

    That is similar to taking the ALUs, AGUs, FPUs, and other internal execution units from a CPU and using them as an effective measure of "efficiency" in a CPU design. I know you *can* do it, but it's hard to justify in context. When you count them, you're leaving out other effective parts of the GPU that need to be included, such as memory controllers and... I can't think of another, haha.

    It's simpler to just analyze GPUs as a monolithic entity, much like we see a CPU core. GPUs don't have a divide (yet, they might actually go "MCM" shortly).

    Cheers!
    Reply to Yuka
  30. manleysteele said:
    A 980 Ti has 2816 cores. A 1070 has 1920 cores. I agree that performance/watt is what we are looking for. The comparison you are looking for is the 2048-core GTX 980 vs. the 1920-core GTX 1070. Still not exactly equivalent, but life is unfair.


    Another way to look at it is that the 980 Ti has 8 billion transistors, while the 1070 has 10% fewer, at 7.2 billion.
    Reply to TMTOWTSAC
  31. World’s Largest Server Companies Announce NVIDIA Volta Systems Supercharged for AI

    https://www.hpcwire.com/off-the-wire/worlds-largest-server-companies-announce-nvidia-volta-systems-supercharged-ai/
    Reply to Hellfire13
  32. Nvidia is being very aggressive to maintain the lead they have in AI. They relied on GK110 as their top compute card for four years; now GP100 has only been on the market for a year and they are already replacing it with GV100 (though Nvidia is still selling even GK110-based accelerators).
    Reply to renz496
  33. There's one caveat, though. From what I'm gathering, nVidia is pushing to lead the conversation on AI work in terms of technology. All the IP they are showing and discussing is proprietary, with zero hints of FOSS. This smells like a rehash of the CUDA strategy. I'm not saying it's bad or good, but that is what it smells like to my nose, heh.

    Cheers!
    Reply to Yuka
  34. Because for over a decade it has been working very well for them. Open source might be the best option in an ideal world, but the thing with open source is that everyone has a say in shaping the API/software. That's good, in that the software will work on all hardware, but sometimes it can hold things back from moving forward. I think that's why AMD was very reluctant to give IHVs direct access to their Mantle API before: they would give the base spec to the Khronos Group, but they did not want Nvidia or Intel to have any hand in developing Mantle itself.
    Reply to renz496
  35. Fair point.

    That is the good side of having a proprietary API in use: tailored performance. Open-source APIs have the "fits all" and "backwards compatibility" problems, and those two are neither easy nor simple to approach.

    Hence why I don't consider it a bad thing inherently. It's just... I get itchy when I have to deal with proprietary stuff, haha.

    Cheers!
    Reply to Yuka
  36. I think Nvidia has quite openly said they do not like doing all the work while others just get a free ride. That's why they go the more proprietary route and, if possible, charge money for it. Consumers might not like this, but it's probably why Nvidia is in a much better position than AMD financially.

    Personally, I would like Nvidia to adopt more open standards (like adaptive sync) instead of seeing them as competitors to their own solutions. Though if Nvidia did support adaptive sync, it might end up hurting AMD more. To me, despite the complaints many people have about Nvidia not supporting adaptive sync, it gives AMD some breathing room; right now, supporting adaptive sync is becoming one of AMD's advantages. Just look at the recent pricing with Vega. It would probably have been much less of a mess for AMD if Nvidia had decided to capitalize on their lead by keeping the 1080's original price and charging more for the 1080 Ti.
    Reply to renz496
  37. It's just an approach they have that I (and a lot more people) don't really share nor like.

    FOSS, or free-to-use software, does not mean people are stealing things from other people when they contribute. The whole point is that "everybody wins". The take nVidia uses is to justify their own angle, but at least they're honest about it: they don't like sharing because they feel like they lose instead. That is fine; they're not hiding it either. They're a company, after all. This model brings them the most profit; otherwise they'd use another business model.

    And I can give you a lot of examples where FOSS is just as good as the private/closed version. And in terms of vulnerabilities, FOSS is better hands down.

    BUT! This is a very dense and diverse topic that is too tangential to this thread, haha.

    In regards to nVidia having success, that is thanks to their engineers first and marketing team second, haha. AMD derping is also helping them a lot.

    Cheers!
    Reply to Yuka