How to Reliably Capture Logs When a Windows App Crashes from a Programming Bug - Designs That Don't Bet on In-Process Logging, Plus Best Practices for WER, Final Markers, and Watchdogs

Tags: Windows Development, Exception Handling, Logging, WER, Crash Dumps, Bug Investigation

The most important mindset

Accept that you cannot guarantee a log will be written from inside the crashing process itself.

Once you factor in stack corruption, heap corruption, and forced termination, an in-process final log is fundamentally a best-effort thing. So think in three layers instead.

  1. Routine time-series logs (always running)
  2. A final crash marker at the moment of death (minimal only)
  3. Evidence captured by the OS or another process (WER dumps, etc.)

Phase            | Goal                        | What to do
-----------------+-----------------------------+----------------------------------------------------------
Normal operation | Keep a time-series record   | Structured logs, boundary events
At crash         | Minimum evidence            | Final crash marker, WER dump
Just after exit  | Detect abnormal termination | Record exit code from another process, decide on restart
Next startup     | Heavy post-processing       | Compress, upload, notify the user

Minimal setup (for small tools)

  • Routine log: a local append-only file
  • Final crash marker: a dedicated short file
  • Dump: WER LocalDumps
  • On next startup: show “the previous run ended abnormally”
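
One way to detect the previous abnormal termination at startup is a "running" sentinel file: create it when the process starts and delete it only on a clean shutdown. The sketch below is a minimal illustration in Python, not a prescribed implementation. The sentinel technique itself, the function names, and the file location are assumptions made for this example. The same idea carries over to .NET or native code.

```python
import os

def begin_session(sentinel_path):
    """Return True if the previous run left its sentinel behind (abnormal exit)."""
    previous_run_crashed = os.path.exists(sentinel_path)
    # (Re)create the sentinel for this run. It stays on disk until a clean exit.
    with open(sentinel_path, "w", encoding="utf-8") as f:
        f.write(str(os.getpid()))
        f.flush()
        os.fsync(f.fileno())
    return previous_run_crashed

def end_session_cleanly(sentinel_path):
    """Call only on a normal, intentional shutdown."""
    try:
        os.remove(sentinel_path)
    except FileNotFoundError:
        pass

# Typical use (hypothetical path):
#   if begin_session(r"C:\ProgramData\MyApp\myapp.running"):
#       print("The previous run ended abnormally.")
#   ... normal work ...
#   end_session_cleanly(r"C:\ProgramData\MyApp\myapp.running")
```

Because the sentinel does not rely on the crash handler running at all, it also catches forced termination, where no final crash marker gets written.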

Heavier setup (for 24/7 operation)

  • Worker process: the actual workload
  • Launcher/watchdog: monitors startup, records exits, and restarts the worker
  • WER LocalDumps: configured for the worker
  • Next startup or watchdog: collect diagnostic information

Best practices for routine logs

At minimum, include the following in every log line.

  • UTC timestamp
  • PID and TID (process ID and thread ID)
  • App name, version, build number
  • Session ID
  • Operation ID (job ID, etc.)
  • Most recent external side effect (file write, DB update, device command, etc.)
  • Exception type, HRESULT, Win32 error code

The recommended format is JSON Lines, one event per line. Being able to correlate across multiple files later matters far more than having long, human-readable prose.
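
As a rough illustration, the fields above could be assembled into one JSON Lines record like this. The sketch is in Python for brevity. The key names (`ts`, `op`, and so on) and the app, version, and build values are hypothetical choices for this example, not a required schema.

```python
import json
import os
import threading
import uuid
from datetime import datetime, timezone

APP_NAME = "MyApp"             # hypothetical values for illustration
APP_VERSION = "1.4.2"
BUILD_NUMBER = "20817"
SESSION_ID = uuid.uuid4().hex  # generated once per process start

def make_event(event, operation_id=None, **fields):
    """Build one log event as a single JSON line (no trailing newline)."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),  # UTC timestamp
        "pid": os.getpid(),
        "tid": threading.get_native_id(),
        "app": APP_NAME,
        "version": APP_VERSION,
        "build": BUILD_NUMBER,
        "session": SESSION_ID,
        "op": operation_id,
        "event": event,
    }
    # Extra fields: last external side effect, exception type, HRESULT, etc.
    record.update(fields)
    return json.dumps(record, separators=(",", ":"))

# Example:
#   make_event("ExternalCommandSent", operation_id="job-42",
#              last_side_effect="device command: START")
```

Since `json.dumps` escapes newlines inside string values, each event stays on exactly one line, which keeps the file easy to filter and join by session or PID later.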

Write critical events synchronously

  • Fine-grained events: an async buffer is fine
  • Warning or above: flush early
  • Important boundary events: write synchronously (ProcessStart, ConfigLoaded, ExternalCommandSent, etc.)
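
The three tiers above can be sketched as a small writer that chooses how hard to push each line toward the disk. This is a simplified Python illustration: the class name, the level table, and the return values are assumptions for the example, and the ordinary in-memory file buffer stands in for a real asynchronous queue.

```python
import os

BOUNDARY_EVENTS = {"ProcessStart", "ConfigLoaded", "ExternalCommandSent"}
LEVELS = {"DEBUG": 10, "INFO": 20, "WARNING": 30, "ERROR": 40}

class TieredLogWriter:
    def __init__(self, path):
        # Buffered, append-only text file.
        self._file = open(path, "a", encoding="utf-8")

    def write(self, level, event, line):
        """Append one line and report which durability tier was applied."""
        self._file.write(line.rstrip("\n") + "\n")
        if event in BOUNDARY_EVENTS:
            # Boundary event: push through both the process and OS buffers.
            self._file.flush()
            os.fsync(self._file.fileno())
            return "sync"
        if LEVELS[level] >= LEVELS["WARNING"]:
            # Warning or above: hand the data to the OS right away.
            self._file.flush()
            return "flush"
        # Fine-grained event: leave it in the buffer.
        return "buffered"

    def close(self):
        self._file.close()
```

The point of the design is that the cost of a synchronous write is paid only for the handful of events that mark where the process was when it died.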

Rules for the final crash marker

This is not the place to build a full-featured logger. Write once, write short, write reliably.

What to include:

  • Time of occurrence (UTC)
  • PID / TID
  • Session ID
  • Version / build number
  • Which hook triggered it (UnhandledException, etc.)
  • Exception type or exception code
  • Most recent operation ID

What you must absolutely not do:

  • Resolve a logger from a DI container
  • Use async/await
  • Wait on a lock
  • Show a UI dialog
  • Compress or send over HTTP

All the crash handler should do is:

  1. Prevent re-entry
  2. Write a single line
  3. Flush
  4. Exit
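
The four steps above can be sketched as follows. This is a Python stand-in used only to show the shape: in a real Windows app the hook would be AppDomain.UnhandledException, SetUnhandledExceptionFilter, or similar. The function names, the pipe-separated line layout, and the `current_operation` callback are assumptions made for this example.

```python
import os
import sys
import threading
from datetime import datetime, timezone

_guard = threading.Lock()  # acquired at most once and never released

def write_crash_marker(path, hook, exc_type, session_id, version, operation_id):
    """Write one short line with raw OS calls. Returns False on re-entry."""
    # 1. Prevent re-entry. A non-blocking acquire never waits on the lock.
    if not _guard.acquire(blocking=False):
        return False
    line = "|".join([
        datetime.now(timezone.utc).isoformat(),
        str(os.getpid()), str(threading.get_native_id()),
        session_id, version, hook, exc_type, str(operation_id),
    ]) + "\n"
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND)
    try:
        os.write(fd, line.encode("utf-8"))  # 2. Write a single line
        os.fsync(fd)                         # 3. Flush
    finally:
        os.close(fd)
    return True

def install(path, session_id, version, current_operation):
    """Register the marker as the last-chance hook for unhandled exceptions."""
    def hook(exc_type, exc, tb):
        write_crash_marker(path, "sys.excepthook", exc_type.__name__,
                           session_id, version, current_operation())
        os._exit(1)                          # 4. Exit without further cleanup
    sys.excepthook = hook
```

Nothing here touches a logging framework, a DI container, or the network. Everything the marker needs is passed in as plain values that were captured before the crash.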

Framework-specific notes

  • WinForms: keeping the app alive via ThreadException is dangerous. It is the wrong tool for programming bugs.
  • WPF: DispatcherUnhandledException has the same problem. Use it as an entry point for recording, not for recovery.
  • .NET in general: AppDomain.UnhandledException is the last notification. Do not attempt heavy recovery there.
  • Native C++: in addition to SetUnhandledExceptionFilter, also install handlers with _set_invalid_parameter_handler and set_terminate.

Build on top of WER LocalDumps

WER LocalDumps is the easiest mechanism to work with, because the OS writes the dump for you.

Example configuration:

rem Run from an elevated command prompt. DumpType 2 = full dump (1 = minidump). DumpCount = number of dumps to keep.
reg add "HKLM\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps\MyApp.exe" /f
reg add "HKLM\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps\MyApp.exe" /v DumpFolder /t REG_EXPAND_SZ /d "C:\CrashDumps\MyApp" /f
reg add "HKLM\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps\MyApp.exe" /v DumpCount /t REG_DWORD /d 10 /f
reg add "HKLM\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps\MyApp.exe" /v DumpType /t REG_DWORD /d 2 /f

Always archive dumps and PDBs together. Without the matching EXE/DLL and PDB from that exact build, a debugger cannot resolve the addresses in a dump to function names and source lines.

What changes when you add a watchdog process

A watchdog (monitoring process) can record the following:

  • Child process start time and end time
  • Exit code
  • Number of restarts
  • Whether a dump exists

With just this much, you can finally tell whether “the process actually crashed,” “the OS shut down,” or “the user closed it.”
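
A minimal sketch of such a watchdog, written in Python for brevity, might look like the following. The function names, the record layout, and the policy of treating exit code 0 as the only clean exit are assumptions for this example. A production watchdog would also persist each record to its own log.

```python
import subprocess
import sys
import time
from datetime import datetime, timezone

def run_once(cmd):
    """Start the child, wait for it, and return what the watchdog observed."""
    started = datetime.now(timezone.utc).isoformat()
    proc = subprocess.Popen(cmd)
    exit_code = proc.wait()
    ended = datetime.now(timezone.utc).isoformat()
    return {"pid": proc.pid, "started": started, "ended": ended,
            "exit_code": exit_code}

def supervise(cmd, max_restarts=3, delay_seconds=1.0):
    """Restart the child after a non-zero exit, up to max_restarts times."""
    history = []
    while True:
        record = run_once(cmd)
        record["restarts_so_far"] = len(history)
        history.append(record)
        # Stop on a clean exit or once the restart budget is used up.
        if record["exit_code"] == 0 or len(history) > max_restarts:
            return history
        time.sleep(delay_seconds)

# Example: supervise([r"C:\Program Files\MyApp\Worker.exe"])  # hypothetical path
```

On Windows, a process that dies from an unhandled exception typically reports the exception code as its exit code (for example 0xC0000005 for an access violation), so the recorded exit code alone already separates most crashes from normal exits.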

Common anti-patterns

  1. catch(Exception) that just logs and continues -> half-broken state lingers and triggers more failures downstream
  2. Trusting only the queue of an async logger -> the queue vanishes the moment the process dies
  3. Sending HTTP from the crash handler -> DNS or auth issues now ride on top of an already broken context
  4. Dumps that cannot be tied back to routine logs -> session and PID are not shared
  5. Keeping a WinForms/WPF app alive through unhandled exception events -> easy to end up in a zombie state
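
To avoid anti-pattern 4, the PID recorded in the routine logs can be used to tie a dump back to a session. The Python sketch below assumes dump files are named like `MyApp.exe.1234.dmp` (executable name, PID, `.dmp`). Verify that pattern on your own machines before relying on it. The function names and the session record layout are hypothetical.

```python
import re

# Assumed file name shape: <exe name>.<pid>.dmp
DUMP_NAME = re.compile(r"^(?P<exe>.+)\.(?P<pid>\d+)\.dmp$", re.IGNORECASE)

def index_dumps(file_names):
    """Map PID -> dump file name for names that match the assumed pattern."""
    by_pid = {}
    for name in file_names:
        match = DUMP_NAME.match(name)
        if match:
            by_pid[int(match.group("pid"))] = name
    return by_pid

def attach_dumps(sessions, dump_folder_listing):
    """Return a copy of each session record with a 'dump' key (or None)."""
    by_pid = index_dumps(dump_folder_listing)
    return [dict(s, dump=by_pid.get(s["pid"])) for s in sessions]

# Example:
#   import os
#   attach_dumps(sessions_from_logs, os.listdir(r"C:\CrashDumps\MyApp"))
```

PIDs are reused over time, so in practice the match should also be checked against the session's start and end timestamps.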

Minimum rollout checklist

  • Routine logs are written one event per line
  • Every log line carries UTC, PID, TID, version, and session
  • A dedicated final crash marker file exists
  • WER LocalDumps is configured per application
  • PDBs are archived alongside the shipped binaries
  • The next startup can detect the previous abnormal termination
  • The app has been crashed on purpose on a test machine, and the evidence has been confirmed to survive

Summary

“Do not put your hopes on the crashing process alone.” That is the whole story.

  • Split into routine logs, a final crash marker, and OS / out-of-process evidence
  • At crash time, just write a short local record
  • Push heavy work to the next startup or to another process
  • Build on top of WER LocalDumps
  • Make “record and exit” the default, not “keep going”
