PC crashing into a random solid color when under graphical load.

I'm aware there are similar problems around this forum such as this, but none of them helped me. Also my current priority is identifying the cause of the problem and I think the info I'll provide will make it easy.

Copying this from a post I've made on reddit:

Operating System
Windows 10

Computer Specs (PSU, GPU, CPU, RAM, Motherboard)
PSU: Corsair RM 850
GPU: Gigabyte NV98TG1 (980 Ti)
CPU: i5-4690K, OC'd to 4.5 GHz (I've disabled the OC)
RAM: 2x 8 GB HyperX
Motherboard: Gigabyte Z97-HD3

Speccy Link
http://speccy.piriform.com/results/niph08ZOsMCGBNrq4HEA0i3

Description of problem
Hi. My computer freezes to a random solid color while gaming. The reset button restarts the computer, but I don't get the display back until I power it off and on. It had been like that for the past week: it only happened once every 2 days, and I was able to keep gaming for the rest of that day. I ran a stress test of sorts by leaving DCS 2.0 open at the highest settings (99% GPU usage) for 60 minutes. No crash. So I thought it was something temporary.

Yesterday I started EVE Online (which is not so GPU-heavy). It crashed immediately upon graphics initialization. I reset my computer, booted into Arch Linux, downloaded Unigine and began an OpenGL test. Crashed immediately. I booted back into Windows and started DCS 2.0. Crashed immediately.

Right now, I'm only continuing the tests on Unigine Heaven. What I've tried so far and the consequences:




When this issue began
1-2 weeks ago

Recurring issue
Yes

Date of purchase
2 years ago

Under Warranty
Yes (I should be, but I'm not actually sure)

Cause/Steps to recreate the issue
Start the computer up, start a game, experience a crash.

What I've tried so far to resolve the issue
Reply to Ex Nihilo
  1. Your problem is complex. You need to isolation-test the main parts of your computer; the most likely suspects are the motherboard, RAM, and video card. If you have no spare parts for that, go on to testing the GPU. I hope your GPU has good cooling; remember that overheating can cause problems with it. Try another GPU, and try running less graphically demanding programs. Or use the built-in VGA port of the motherboard. Also consider cleaning the PC and checking the temperatures of your processor and motherboard. Have you also checked for driver updates?
    Reply to combinebasic
  2. You mention the GPU usage is maxed, but you don't mention anything about the memory usage on the GPU, or the temps the card is reaching.
    Try blowing out the fans on the 980 to increase airflow, and monitor temps and voltage draw for the card (I use HWMonitor) to look for anomalies like spikes, high RPMs at idle, or the fans not working at all.
    You might have a bad memory module on the GPU that is only being accessed during certain tests and not others.

    Your theory of the PSU starting to fail is also very possible, although I wouldn't expect an 850 W Corsair to fail under a "normal" PC load.
    Also make sure all the power cords are secure in the wall sockets/power strips/etc. A loose cord or outlet under load could cause power issues as well.
    Reply to Jesse_20
  3. Driver update has been done, as I stated above. Temperatures are well within acceptable limits. I once used the CPU for graphics rendering and tried stress testing. It didn't crash, but I highly doubt that counts as proper isolation testing, since the computer barely draws any power from the PSU without my 980 Ti plugged in. As I also stated above, I tried running hashcat (an MD5 cracker that uses only the GPU) as a test. It didn't crash.

    I'm not sure what you mean by "the other GPU". I only have one and, unfortunately, I don't have a spare PSU or RAM either :/

    UPDATE: I ran Windows' memtest. No errors there.
    Reply to Ex Nihilo
  4. What I mean is: use another GPU, RAM, or PSU with the same specs. The cause of the problem is possibly in one of these parts, so you need to remove each suspected part and replace it with another one you have. If you don't have such parts, it's better to bring your PC to a service center that has the parts needed to resolve your problem. Remember, if you don't have the resources to fix the problem, you may cause more damage to your PC.
    Reply to combinebasic
  5. The temps look normal. The fans are working as they should (idle below the threshold temperature, rising smoothly under load as expected).

    You're right that I've supplied no solid data about the GPU. The thing is, Afterburner fails to save any logs in case of an abrupt shutdown/freeze. I've tried eyeballing the values, but the crash happens immediately upon running Unigine Heaven or any game, so I can't make any sense of it.

    I've also reseated every cable that goes from the PSU to the motherboard and other peripherals.

    I'd be more than happy if you could recommend a way to test the memory modules. I've tried memtestCL without any errors, if that's of any significance.
    Reply to Ex Nihilo
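One way around losing logs to a hard freeze, as described in the post above, is to force every sample to disk before taking the next one. The following is only a minimal sketch under stated assumptions: `log_samples` and `read_nvidia_smi` are illustrative names that do not appear in this thread, and the `nvidia-smi` query assumes an NVIDIA driver with that tool on the PATH. It reports power draw rather than voltage.

```python
import os
import subprocess
import time

# Fields requested from nvidia-smi (assumes the NVIDIA driver tools are installed).
QUERY = "timestamp,temperature.gpu,power.draw,memory.used"


def read_nvidia_smi():
    """Return one CSV line of GPU readings from nvidia-smi."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def log_samples(path, sampler, count, interval_s=0.2):
    """Append `count` samples to `path`, syncing each one to disk immediately.

    Flushing and fsync-ing after every line means the last sample taken
    before a freeze should still be in the file after the power cycle.
    """
    with open(path, "a") as log:
        for _ in range(count):
            log.write(sampler() + "\n")
            log.flush()             # push Python's buffer to the OS
            os.fsync(log.fileno())  # ask the OS to write it to the disk
            time.sleep(interval_s)
```

Start it in a second terminal with something like `log_samples("gpu.log", read_nvidia_smi, count=3000)`, launch the benchmark, and read the tail of `gpu.log` after rebooting.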
  6. I might bring my gfx card and PSU to a friend's house, if he's available, and test them in his case. As I've said, no spare parts here. I'm just looking for in-house ways to identify the problem, if that's possible.
    Reply to Ex Nihilo
  7. OK, try that, but remember to install it in a system with the same specs. If it displays, try running the program that mainly causes the problem. If you get the same problem, identify which part is at fault, then replace it with a new one.

    I'm logging out for now because it's bedtime here in the Philippines. Update me tomorrow with the results, or other people here may attend to your problem.
    Reply to combinebasic
  8. memtestCL will do the trick, but not if you run it at default settings. The default only tests the first 128 MB of GPU RAM, so you need to override it on the command line to cover your total memory, with something like:
    memtestcl 4096 5
    which would test 4 GB of VRAM with 5 test loops.

    Just add 1024 for each GB of RAM on the GPU.

    I was referring more to the actual power cables from the wall to the PC, not the interior connections. As your PSU tries to draw more power from the wall, loose outlets or plugs can result in arcing, which causes unstable power surges. Make sure these are all tight. You might also want to try a new PSU-to-wall power cord if you have a spare.
    Reply to Jesse_20
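The "1024 per GB" rule from the post above can be sketched as a small helper that builds the command line. This is only an illustration: `memtest_args` is a made-up name, and the positional form (size in MB, then loop count) is taken from the post rather than checked against the tool's documentation.

```python
def memtest_args(vram_gb, loops=5, binary="memtestcl"):
    """Build an argument list that asks the tester to cover the whole card.

    The size argument is in MB: 1024 for each GB of VRAM, as in the post.
    """
    if vram_gb <= 0 or loops <= 0:
        raise ValueError("vram_gb and loops must be positive")
    size_mb = int(vram_gb * 1024)
    return [binary, str(size_mb), str(loops)]


# For the 6 GB card discussed in this thread:
print(" ".join(memtest_args(6)))  # memtestcl 6144 5
```

In practice the tester may not be able to claim every last MB, since the driver and desktop usually hold some VRAM, so a slightly smaller size may be needed.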
  9. No extra cord at home, but I've tried changing the socket (plugged directly into the wall instead of into the extender). No avail.

    Also, I ran memtestG80 (memtestCL refused to do anything higher than 128 for some reason) and I got some errors! At 6144 I had 4 million or so errors, and the same at 5000 MB. So I tried starting from 500 and increasing in 500 MB increments. It froze again with a white screen at 3000 MB.

    I'll test a few times and then report again.

    Update: I've actually found out that 4 billion errors might be a bug in memtestG80. I'll try some other options.
    Reply to Ex Nihilo
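The 500 MB stepping described in the post above can be written down as a schedule, plus a midpoint probe for narrowing the gap between the last size that passed and the first size that froze. This is a rough sketch with made-up helper names (`step_sizes`, `next_probe`), and it assumes the freeze depends on the amount of memory tested rather than on the sudden load, which the thread leaves open.

```python
def step_sizes(start_mb, step_mb, max_mb):
    """Return the test sizes in MB, always ending at the full card size."""
    if step_mb <= 0:
        raise ValueError("step_mb must be positive")
    sizes = list(range(start_mb, max_mb + 1, step_mb))
    if not sizes or sizes[-1] != max_mb:
        sizes.append(max_mb)
    return sizes


def next_probe(last_ok_mb, first_bad_mb):
    """Return the midpoint to test next, or None once the gap is closed."""
    if first_bad_mb - last_ok_mb <= 1:
        return None
    return (last_ok_mb + first_bad_mb) // 2


# The sweep from the post: 500 MB steps on a 6144 MB card.
print(step_sizes(500, 500, 6144))
# If 2500 MB passed and 3000 MB froze, try halfway next.
print(next_probe(2500, 3000))  # 2750
```

If repeated runs freeze at different sizes, that points away from a single bad memory block and toward a load-dependent fault.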
  10. Update again: memtestG80 reliably freezes the system, like Unigine/gaming does, when a 2000 MB test is run (the card is 6 GB). I'm still not sure whether memtest introduces a high load suddenly, or whether it's actually the memory block.

    Yet another update: I've just noticed a loud, coil-whine-like sound coming from the GPU area. Contrary to usual coil whine, this one happens while the card is idling, and it sounds like a tiny drill rather than a whine.
    Reply to Ex Nihilo
  11. Hi Ex Nihilo, did you repair your PC? From what you report here, the GPU is still the suspect. If there is an unusual sound, check the GPU's cooling system for anything abnormal. Try cleaning it or, as I said, use another compatible GPU.
    Reply to combinebasic
  12. Hi. I still couldn't find a friend with an adequate PSU to plug in and stress test my graphics card. The cooling system is not rattling; it's obviously an electrically induced buzz (it happens even when the fan is not running, and it changes frequency depending on the number of pixels drawn). Also, I highly doubt cleaning will help, since the temperatures are well within accepted limits.

    Just to clarify, the crash used to happen after a long gaming session 2 weeks ago, but now it IMMEDIATELY crashes when I start a 3D game or Unigine Heaven.

    I'm suspecting these:
    A gfx card memory block went bad
    A gfx voltage controller or something like that went bad
    Motherboard
    PSU

    I'm looking for a way to fill my 6 GB of VRAM without actually putting heavy load on the card (without performing a full-fledged memory stress test). If it fails, I'll be sure it's a memory problem.
    Reply to Ex Nihilo
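The idea in the post above, filling VRAM gradually without a heavy compute load, could be sketched as a chunked allocation loop. Everything here is hypothetical: `fill_in_chunks` is an illustrative name, and `alloc` stands in for whatever GPU array library is available (it should allocate the given number of MB on the card, write a pattern into it, and return a handle). The sketch only shows the control flow and runs with any callable.

```python
def fill_in_chunks(total_mb, chunk_mb, alloc, on_progress=print):
    """Allocate `total_mb` in `chunk_mb` pieces, keeping every piece alive.

    `alloc(size_mb)` is a caller-supplied (hypothetical) function that
    allocates and writes to GPU memory. Reporting progress after each chunk
    shows roughly how far the fill got if the machine freezes part-way.
    """
    if total_mb <= 0 or chunk_mb <= 0:
        raise ValueError("total_mb and chunk_mb must be positive")
    held = []  # keep references so earlier chunks are not freed
    filled = 0
    while filled < total_mb:
        size = min(chunk_mb, total_mb - filled)
        held.append(alloc(size))
        filled += size
        on_progress(filled)
    return held


# Dry run with a stand-in allocator that uses ordinary host memory.
chunks = fill_in_chunks(10, 4, alloc=lambda mb: bytearray(mb))
```

Note that allocation alone may not touch every memory cell, which is why the stand-in for `alloc` is expected to write to the memory as well.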
  13. Hi guys. I brought my gfx card to my friend's house and tried it in his case. It crashed his computer too, 2 seconds into Unigine Heaven.
    Guess I'll need to find my card's invoice and then RMA it. I hope they don't claim it's a user error or something like that.
    Reply to Ex Nihilo
  14. Glad you were able to track it down. GL w RMA.
    Reply to Jesse_20
  15. Thanks, man. From what I've read on the internet, I think I'll need that. Gigabyte hasn't even bothered to reply to my e-mail so far.
    Reply to Ex Nihilo
  16. Hi guys, my card just came back from the RMA. It's been repaired. The report says the culprit was 2 busted resistors on the PCB.
    Reply to Ex Nihilo
  17. That's good, you've now verified the problem. The mobo is the problem; maybe there is also damage to the caps.
    Reply to combinebasic
  18. Can you explain? I said the card returned from RMA with repairs to two busted resistors. How did you come to the conclusion that the mobo is the culprit? Or did you mean that busted resistors on peripherals are usually signs of a bad mobo?
    Reply to Ex Nihilo