UGTS Document #21 - Last Modified: 8/29/2015 3:23 PM
Debugging crash dump files with WinDbg and the Driver Verifier

When Windows crashes and goes to a blue screen, it writes a dump of memory (a .dmp file) to the C:\Windows\Minidump folder. If your computer has recently restarted itself for no apparent reason, it may have crashed. To find out, Start-Run 'minidump' to open this folder and check whether a crash dump exists for the time of the restart.
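If you would rather check the folder from a script, the following is a minimal sketch in Python. The function name list_crash_dumps is hypothetical, and the default folder is the one named above. It lists the .dmp files newest first so that their timestamps can be compared against the time of the unexplained restart.

```python
from datetime import datetime
from pathlib import Path


def list_crash_dumps(folder=r"C:\Windows\Minidump"):
    """Return (modified_time, path) pairs for .dmp files, newest first."""
    dump_dir = Path(folder)
    if not dump_dir.is_dir():
        return []  # folder is absent, so there is nothing to list
    dumps = [
        (datetime.fromtimestamp(p.stat().st_mtime), p)
        for p in dump_dir.iterdir()
        if p.is_file() and p.suffix.lower() == ".dmp"
    ]
    # Sort on the timestamp only, most recent dump first.
    return sorted(dumps, key=lambda item: item[0], reverse=True)


if __name__ == "__main__":
    for modified, path in list_crash_dumps():
        print(f"{modified:%Y-%m-%d %H:%M}  {path.name}")
```

Reading this folder may require an elevated (Administrator) prompt on some systems.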

Crash dump files can be analyzed with windbg.exe to determine, in some cases, the cause of the error. To use WinDbg, you must first install and configure it. To install it, download and install the latest version of the Windows SDK (WinDbg is part of its Debugging Tools for Windows feature). Then Start-Run windbg (run as Administrator on Vista or higher). Next, configure the symbol path so that WinDbg can download PDB files automatically from Microsoft and analyze crash dump files effectively. To do this, go to File, Symbol File Path. Enter the text below, replacing [localpath] with the full path to a local folder on your machine where you want to place the downloaded symbol files:
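For Microsoft's public symbol server, the symbol path conventionally has the form SRV*[localpath]*https://msdl.microsoft.com/download/symbols. As a small sketch, the hypothetical Python helper below builds that string for a chosen cache folder. The function name and the C:\Symbols folder are illustrative assumptions, not anything WinDbg requires.

```python
import os

# Microsoft's public symbol server.
MS_SYMBOL_SERVER = "https://msdl.microsoft.com/download/symbols"


def build_symbol_path(local_cache, server=MS_SYMBOL_SERVER):
    """Build a symbol path string of the form SRV*<local cache>*<server URL>."""
    if not local_cache or "*" in local_cache:
        raise ValueError("local_cache must be a folder path without '*'")
    return f"SRV*{local_cache}*{server}"


if __name__ == "__main__":
    path = build_symbol_path(r"C:\Symbols")  # example cache folder only
    print(path)
    # Optionally export it for debugger processes started from this script;
    # the debugging tools read _NT_SYMBOL_PATH when no path is set explicitly.
    os.environ["_NT_SYMBOL_PATH"] = path
```

The printed string can be pasted into the Symbol File Path dialog as-is.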


Now you can debug crash dump files. Go to File, Open Crash Dump... (or just press Ctrl+D) and browse to the .dmp file. Then, at the WinDbg command line, run:

!analyze -v 
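The same analysis can also be run without the GUI by using cdb.exe, the console debugger that installs alongside WinDbg. The sketch below is a hedged example, not the only way to do it: it assumes cdb.exe is on your PATH, and the function names are hypothetical. It relies on the documented -z (open dump file), -y (symbol path), and -c (run commands at startup) options.

```python
import subprocess


def build_analyze_command(dump_path, symbol_path=None, debugger="cdb.exe"):
    """Build the argument list that runs '!analyze -v' on one dump, then quits."""
    command = [debugger, "-z", str(dump_path)]
    if symbol_path:
        command += ["-y", symbol_path]
    # -c runs the given debugger commands at startup; 'q' ends the session.
    command += ["-c", "!analyze -v; q"]
    return command


def analyze_dump(dump_path, symbol_path=None, debugger="cdb.exe"):
    """Run the debugger and return its text output (needs the tools installed)."""
    result = subprocess.run(
        build_analyze_command(dump_path, symbol_path, debugger),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

Calling analyze_dump on a .dmp file returns the same report that WinDbg shows in its command window, which is convenient when you have many dumps to go through.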

It is usually hit or miss whether a crash dump file will contain useful information. More often than not, the crash causes the code to jump off to a random location in memory, so the original cause of the problem can't be found. In other cases the problem does not cause an immediate crash, but corrupts memory so that the crash occurs further down the line. However, you can usually at least tell what the stop code is.

IRQL_NOT_LESS_OR_EQUAL is a common stop code for a driver bug. This particular error occurs when a driver attempts to access memory that has been paged to disk while it is running at a raised IRQL (interrupt request level), for example while handling an interrupt. This is not allowed because servicing the page fault (a disk access) would require Windows to drop to a lower IRQL. This kind of error is most commonly seen when resuming your computer from sleep or hibernation, or after long periods of inactivity, because it is usually on wake-up that timer-driven drivers need to access resources which may have been paged out to disk while the computer was asleep. When this happens, you sometimes see the error reported as a disk access failure (which can be unnerving, because it looks like your hard disk drive is having trouble). The real cause is that one of the drivers is trying to access the disk indirectly, but is not allowed to because of the IRQL.
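The rule can be illustrated with a deliberately simplified toy model in Python. This is not how the kernel is implemented; it only encodes the rule that a page fault can be serviced when the current IRQL is below DISPATCH_LEVEL. The function name access_memory and its return strings are made up for illustration.

```python
from enum import IntEnum


class Irql(IntEnum):
    """The three lowest Windows IRQLs; device interrupt levels are higher."""
    PASSIVE_LEVEL = 0
    APC_LEVEL = 1
    DISPATCH_LEVEL = 2


def access_memory(current_irql, page_is_resident):
    """Toy model: describe what happens when code touches a page of memory."""
    if page_is_resident:
        return "ok"  # the page is in RAM, so no page fault occurs
    if current_irql < Irql.DISPATCH_LEVEL:
        return "page fault serviced from disk"
    # At DISPATCH_LEVEL or above, the fault cannot be serviced.
    return "bug check: IRQL_NOT_LESS_OR_EQUAL"


if __name__ == "__main__":
    print(access_memory(Irql.PASSIVE_LEVEL, page_is_resident=False))
    print(access_memory(Irql.DISPATCH_LEVEL, page_is_resident=False))
```

The model also shows why such bugs can hide for a long time: the same driver code works as long as the page happens to be resident, and fails only once the page has been written out to disk.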

Sometimes you can get the name of the driver - the Bugcheck Analysis might say something like: Probably caused by : [DRIVER].SYS. In this case, find out which device or program the driver belongs to and update it, in case the bug has already been fixed in a newer version.

Sometimes it takes multiple crash dump files to see the cause of the crashes. If you're done looking at one crash dump and want to look at another without having to restart WinDbg, go to Debug, Stop Debugging (or press Shift+F5), and then open another crash dump file.
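When comparing several dumps, it can help to tally which driver each analysis blames. The sketch below assumes you have saved the text output of !analyze -v for each dump. It looks only for the "Probably caused by" line mentioned above, and both function names are hypothetical.

```python
import re
from collections import Counter

# Matches lines such as "Probably caused by : example.sys ( example+1234 )".
_CAUSED_BY = re.compile(r"^Probably caused by\s*:\s*(\S+)", re.MULTILINE)


def probable_cause(analysis_text):
    """Return the module named on the 'Probably caused by' line, or None."""
    match = _CAUSED_BY.search(analysis_text)
    return match.group(1) if match else None


def tally_causes(analysis_texts):
    """Count how often each module is blamed across several analyses."""
    causes = (probable_cause(text) for text in analysis_texts)
    return Counter(cause for cause in causes if cause is not None)
```

A driver that tops the tally across many dumps is a stronger suspect than one that appears only once, since a single dump may blame whichever module happened to trip over memory that something else corrupted.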

One way you can greatly improve your chances of debugging crash dumps due to driver failures is to use the 'driver verifier'. The driver verifier works by subjecting selected drivers to more stringent testing, and if any of those tests fail, Windows immediately crashes with a blue screen and a crash dump to show you what the driver did wrong. In other words, the driver verifier enforces Murphy's law. It makes sure that anything that can go wrong, will go wrong, and that it goes wrong right away, so that you find out quickly whether you have bad drivers that need to be upgraded or removed.

To turn on the driver verifier, Start-Run verifier.exe, and then choose Create Standard Settings, Next. You may be tempted to turn on all checks here, but resist that urge unless you know what you're doing. The non-standard checks are for code developers only, and they test drivers in ways where failures aren't necessarily likely to cause trouble in the real world. In other words, if you turn on these checks, the verifier will crash your system over failed checks that may not be important to you, and you'll spend extra time weeding through the crashes to get to the ones that matter.

After selecting the standard checks, choose Select driver names from a list. Select all drivers that weren't written by Microsoft. The reason for this is that third-party drivers are the most likely to have bugs, because the developers who wrote them didn't work at Microsoft. Writing device drivers correctly is very hard, and it's easy to make incorrect assumptions about the way Windows works, or about the various situations that the driver may experience.

After selecting the drivers, press Finish and restart the computer when prompted.
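The same setup can be done from an elevated command prompt with verifier.exe's /standard and /driver switches, for example: verifier /standard /driver first.sys second.sys. The Python sketch below only builds that command line. The function name and the driver names in the comment are placeholders, and you would substitute the third-party drivers from your own list.

```python
def build_verifier_enable_command(driver_names):
    """Build a verifier.exe command enabling the standard checks for drivers."""
    names = [name.strip() for name in driver_names if name.strip()]
    if not names:
        raise ValueError("at least one driver name is required")
    # /standard selects the standard settings; /driver lists the targets.
    return ["verifier.exe", "/standard", "/driver", *names]


if __name__ == "__main__":
    # first.sys and second.sys are placeholder names, not real drivers.
    print(" ".join(build_verifier_enable_command(["first.sys", "second.sys"])))
```

As with the wizard, the settings take effect only after the computer is restarted.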

If your drivers are particularly bad, you might now get a nasty surprise - your computer will not start up but will instead go into a vicious crash-reboot cycle. To break that cycle, press F8 during startup to get the boot menu, and choose the Last Known Good Configuration, in which the driver verifier was not turned on. This will let you get into Windows so that you can open the crash dump files or copy them to another machine which has WinDbg installed.

After you've determined the cause of your failures, uninstall or update the drivers that were responsible. Note that if you uninstall a driver, it will not disappear from the list shown by the driver verifier until the system is restarted, so you should restart your machine after any uninstallations.

Finally, after you've fixed your machine, don't forget to turn the driver verifier OFF. There are two reasons for this. First, some anti-virus software runs much more slowly when the verifier is turned on for the drivers used by that software, to the point of maxing out your computer's CPU at 100% for long stretches of time. Second, the driver verifier causes your computer to crash more often if you have any drivers with flaws in them. Crashing more often is not what you want when you're busy at work.
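From an elevated command prompt, verifier /querysettings shows what is currently being verified and verifier /reset turns the verifier off (a restart is needed for the change to take effect). The sketch below wraps those two switches. The action names "show" and "off" and both function names are hypothetical.

```python
import subprocess

# Map friendly action names (made up for this sketch) to verifier.exe switches.
VERIFIER_SWITCHES = {"show": "/querysettings", "off": "/reset"}


def verifier_command(action):
    """Return the verifier.exe argument list for 'show' or 'off'."""
    try:
        return ["verifier.exe", VERIFIER_SWITCHES[action]]
    except KeyError:
        raise ValueError(f"unknown action: {action!r}") from None


def run_verifier(action):
    """Run verifier.exe (needs an elevated prompt) and return its output."""
    result = subprocess.run(
        verifier_command(action), capture_output=True, text=True, check=False
    )
    return result.stdout
```

Running the "show" action again after the restart is a quick way to confirm that no drivers are still being verified.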