For the last few weeks (and slowly getting worse over time), I have been getting application crashes/closes, weird seemingly memory-related errors, a couple of BSoDs, and CRC errors from the Nvidia driver installer.
All browsers will crash at random. Sometimes immediately on open for 10+ tries, sometimes they will manage to stay open for a bit. Sometimes just a tab will crash, other times the entire application will close suddenly with no error. Sometimes with an error. Other applications will silently close at random. Sometimes they will white screen. Sometimes they will black screen. When trying to update to the newest Nvidia drivers, I constantly get 7zip CRC errors during extraction.
Everyone says it's a file or download issue, which I would normally agree with given the CRC error. But extracting on a different computer works fine, and even if I download any of the last 10 or so Nvidia drivers, they all give CRC errors, sometimes at different points in the extraction but usually around 24%.
I know a lot of you are already thinking you know what's wrong, but let me continue first and show you what I have eliminated as a possibility. It is possible there are a few issues going on. Some of those issues may already be fixed. Some of them could very well have been file corruption caused by the main issue. Please see the troubleshooting section below. It should also be noted that this is a custom system less than a year old.
Computer Specs
- CPU - 13th Gen Intel i9-13900K
- PSU - Corsair HX1500i 1500W
- GPU - Nvidia 4090 founders edition
- RAM - CORSAIR Vengeance 128GB (4 x 32GB) DDR5 5600 (PC5 44800)
- MOBO - ASUS ROG Maximus Z790 Hero
- HDD - WD_BLACK 1TB SN850X NVMe (Main OS - Direct on motherboard & heatsinked)
- COOL - Noctua NH-D15 CPU cooler
- OS - Windows 11 Pro (Installed from OEM disk)
Less important specs just for information
- 2 additional (for storage) WD_BLACK 2TB SN850X NVMe (Direct on motherboard & heatsinked)
- Noctua NT-H1 3.5g, Pro-Grade Thermal Compound
- No other fancy gadgets, RGB, etc.
- No normally connected (outside of troubleshooting) spinning or solid state HDD
- No CD or DVD drives connected (I use a USB DVD drive when needed)
- No WiFi used. Only use onboard LAN.
- No other PCIE addon cards except the primary GPU
- Only the basic peripherals connected at the moment. (Mouse and keyboard)
- Single monitor connected with DisplayPort.
Troubleshooting steps taken
- Removed the 4090 and tried a known working card. Apps still crashed, still CRC error with drivers
- Ran lots of different time consuming RAM tests from different vendors (3rd party, mobo, Microsoft, etc) with lots of mixed results, but generally came back error free.
- Still decided to buy identical, brand new RAM, to 100% replace old and also tried different combinations of slots on the motherboard to rule out RAM slot issue. No change. Still crashing.
- I have not swapped the PSU, but I don't see any reason to. The system is not under load, and nothing points to a power issue. I am only mentioning this to make clear that I skipped it as a potential source of the problem.
- Also did not remove or consider any of the storage NVMe as potential problems. They do not tie into the system in any other way. They don't have any apps installed on them. They are just file storage and relatively empty at the moment.
- I updated the BIOS of the motherboard to the latest as I forgot to do that when I first installed everything.
- I am not doing any sort of overclocking. I did enable XMP at first, but at present I am using the default motherboard UEFI configuration.
- I completely deleted the OS off the primary 1TB NVMe and connected an older 90GB SATA solid state disk and installed fresh from OEM disk Windows 10, but I ran into driver issues where none of the motherboard's driver software msi packages would open. Also parts of the start menu were missing strangely. No start button or clock. Just blank start menu bar at bottom. This was on a completely fresh Windows 10. Like 5 mins old.
- I wiped the 90GB drive and tried again, this time with Windows 11. Again I installed fresh from the OEM disk. The OS installed successfully. I am currently using that Windows 11 installation on this temporary 90GB SSD to type this StackExchange post. The browsers seem stable so far. This includes Chrome, Edge, and Firefox. All of them are stable. Discord also opens without crashing. I have not tried installing other programs yet, partly because I need to get the Nvidia drivers installed first.
- I put the 4090 back in the system and it is working just the same as the temporary card (Nvidia 1050). Still unable to install ANY Nvidia driver, even on a brand new operating system. I constantly get "7zip: CRC error". Even if I download the driver on a different computer (where I also test it to confirm it downloaded correctly and works), once I bring it over to the broken system, I get the CRC error.
- I just bought an identical 1TB NVMe, which will be here in a few days, as I was thinking maybe something was wrong with the OS drive. Maybe the Nvidia driver was trying to uncompress onto the disk (maybe TEMP space) and getting errors, and maybe, over time, the OS accumulated file corruption leading to lots of errors/instability/crashes with other apps? Considering I am now on a completely isolated SATA SSD with a fresh operating system, I no longer believe this to be the case.
- I tried changing the OS temp directories to point to a known working hard drive space (I tried to point them to one of the storage NVMe). No changes.
At this point I am at a loss as to what's wrong. There are still two things I have not ruled out. I have not tested the CPU, but if it were faulty, I feel like I would be getting a lot more BSoDs and a lot more weird and unexplainable behavior. I have also not replaced the motherboard yet, for reasons I assume are obvious. It's time consuming, and it's the last thing I want to test because I would have to pull everything out. Plus the system is otherwise stable.
As I was typing the paragraph above, Firefox crashed (first time on this new install)! Again, this is on the brand new fresh OS, on a different drive. It's got to be something wrong with the motherboard. Maybe the interface between the RAM and the rest of the system? I'm actually kind of glad in some ways that it crashed again. I feel I can eliminate more variables now. Not the NVMe and not the OS. At least not the previous OS. It could still be something with Windows 11 compatibility and my hardware.
Still not sure if the driver 7zip CRC error is related to the bigger issue or separate. Seems related though since the driver install works on other computers. Last note... I currently do not have 7zip installed on this system. Nvidia must package part of the 7zip software in their driver installers.
I just tried to install my first application other than browser and the install fails with a file corruption error which is oddly similar to the Nvidia failure. Despite this being a failure, I am actually happy to see this error as it gives me more info as to what might be going on.
I just downloaded the same file in two locations. Once on the broken computer and once on a known working computer that I am using as a backup. I then ran an MD5 check so we can put to rest all the stupid CRC comments everyone on other forums are making saying I am having download issues and that I need to use a different browser... The file is downloading just fine and is identical on both systems! When the file starts to uncompress on the broken system, that's when errors start. I am still leaning towards motherboard issues related to memory, or CPU issues. Remember, the RAM is 1 day old. I just replaced what I thought was potentially broken RAM.
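For anyone who wants to repeat that comparison, here is a minimal Python sketch of the checksum step. It assumes Python is installed on both machines, and `file_digest` is a hypothetical helper name for illustration, not necessarily the tool used for the check above. Run it against the same download on each computer and compare the printed strings.

```python
import hashlib


def file_digest(path, algorithm="md5", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in chunks.

    Chunked reading keeps memory use low even for large driver installers.
    `algorithm` can be any name hashlib supports, e.g. "md5" or "sha256".
    """
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()
```

Calling `print(file_digest("driver.exe"))` on both systems (the file name is a placeholder) should print the same digest if the download is intact. Matching digests only show that the bytes on disk are identical. They say nothing about whether the machine can decompress them correctly.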
Also, here is the 7zip CRC error that the Nvidia install gives.
I paid to send my $600 motherboard back to ASUS, following their horrible and not at all reassuring RMA process. It clearly says that for motherboard issues they would send a new board. I waited weeks and finally got a package in the mail just now from FedEx. They sent my exact same motherboard back to me with a sheet of paper that says they can't replicate the issue and that I should try updating the BIOS.
In anger over my motherboard coming back to me, I impulse purchased a brand new, inexpensive 13th gen i5 CPU, as this was the very last thing on my checklist to try. Allow me to vent for a sec by reminding you that at this point I have essentially bought an ENTIRE second computer trying to track down this issue... I just finished putting the new i5 into the system and, for the moment, it appears that the issue may be fixed. I am not convinced just yet, as things have appeared to be better before (like after the OS install) only for the issue to return. So I am going to leave this computer on for a while and repeatedly open and extract the Nvidia drivers to see if I ever get the CRC error on the new processor. So far it has not popped up. I went ahead and started the RMA process with Intel for my i9. I will update again in a few days with results.
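The "extract it over and over and watch for a CRC error" test can also be roughly approximated without the Nvidia installer. The following is only a sketch under assumptions: it uses Python's built-in `zlib` rather than the 7zip code inside the installer, and `count_roundtrip_failures` is a hypothetical name. The idea is that a healthy machine should decompress the same data to the same CRC every single time, so any non-zero count points at hardware (CPU or memory) rather than at the file.

```python
import os
import zlib


def count_roundtrip_failures(rounds=200, size=8 * 1024 * 1024):
    """Compress and decompress one buffer repeatedly; count bad round trips.

    The buffer is half random bytes and half zeros so the compressor has
    both incompressible and highly compressible regions to work through.
    """
    data = os.urandom(size // 2) + bytes(size - size // 2)
    expected_crc = zlib.crc32(data)
    failures = 0
    for _ in range(rounds):
        try:
            restored = zlib.decompress(zlib.compress(data, 6))
        except zlib.error:
            # The compressed stream itself was corrupted in memory.
            failures += 1
            continue
        if zlib.crc32(restored) != expected_crc:
            # Decompression "succeeded" but the bytes came back different.
            failures += 1
    return failures
```

Something like `print(count_roundtrip_failures())` run on the old and the new CPU would be one way to compare them. A result of zero does not prove the hardware is good, since intermittent faults can take many more rounds to show up, but any failure at all is a strong sign that it is not.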