How to Reliably Capture Logs When a Windows App Crashes from a Programming Bug - Designs That Don't Bet on In-Process Logging, Plus Best Practices for WER, Final Markers, and Watchdogs
The most important mindset
Accept that you cannot guarantee a log will be written from inside the crashing process itself.
Once you factor in stack corruption, heap corruption, and forced termination, a final log written from inside the process is best-effort at most. So think in three layers instead.
- Routine time-series logs (always running)
- A final crash marker at the moment of death (minimal only)
- Evidence captured by the OS or another process (WER dumps, etc.)
Recommended architecture: split the responsibilities
| Phase | Goal | What to do |
|---|---|---|
| Normal operation | Keep a time-series record | Structured logs, boundary events |
| At crash | Minimum evidence | Final crash marker, WER dump |
| Just after exit | Detect abnormal termination | Record exit code from another process, decide on restart |
| Next startup | Heavy post-processing | Compress, upload, notify the user |
Minimal setup (for small tools)
- Routine log: a local append-only file
- Final crash marker: a dedicated short file
- Dump: WER LocalDumps
- On next startup: show “the previous run ended abnormally”
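One way to detect the previous abnormal termination is a sentinel file that exists only while the app is running. The following is a minimal sketch in Python, not a definitive implementation: the file name `session.running`, the helper names `begin_session` / `end_session`, and the record fields are all assumptions made for illustration.

```python
import json
import os
import time
import uuid
from pathlib import Path
from typing import Optional, Tuple


def begin_session(state_dir: Path) -> Tuple[str, Optional[dict]]:
    """Start a session and return (session_id, previous_run_or_None).

    A non-None second value means the previous run never reached
    end_session(), i.e. it ended abnormally.
    """
    state_dir.mkdir(parents=True, exist_ok=True)
    marker = state_dir / "session.running"  # hypothetical file name

    previous = None
    if marker.exists():
        # The last run left its marker behind, so it did not exit cleanly.
        try:
            previous = json.loads(marker.read_text(encoding="utf-8"))
        except ValueError:
            previous = {"session": "unknown"}  # the marker itself was torn

    session_id = uuid.uuid4().hex
    record = {"session": session_id, "pid": os.getpid(), "started_utc": time.time()}
    marker.write_text(json.dumps(record), encoding="utf-8")
    return session_id, previous


def end_session(state_dir: Path) -> None:
    """Call only on the clean shutdown path."""
    (state_dir / "session.running").unlink(missing_ok=True)
```

If `begin_session` returns a previous record, the app can show the "previous run ended abnormally" notice and use the recorded session ID and PID to look up the matching routine log and dump.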
Heavier setup (for 24/7 operation)
- Worker process: the actual workload
- Launcher/watchdog: monitors startup, records the exit, restarts the worker
- WER LocalDumps: configured for the worker
- Next startup or watchdog: collect diagnostic information
Best practices for routine logs
At minimum, include the following in every log line.
- UTC timestamp
- PID and TID (process ID and thread ID)
- App name, version, build number
- Session ID
- Operation ID (job ID, etc.)
- Most recent external side effect (file write, DB update, device command, etc.)
- Exception type, HRESULT, Win32 error code
The recommended format is JSON Lines, one event per line. Being able to correlate across multiple files later matters far more than having long, human-readable prose.
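As an illustration of the fields listed above, here is a minimal Python sketch that builds one JSON Lines record. The key names (`ts`, `op`, `last_side_effect`, and so on), the `make_event` helper, and the app/version values are assumptions for this example, not a prescribed schema.

```python
import json
import os
import threading
import uuid
from datetime import datetime, timezone

# Hypothetical identity values; a real app would take these from build metadata.
APP = {"app": "MyApp", "version": "1.4.2", "build": "2231"}
SESSION_ID = uuid.uuid4().hex  # generated once per process lifetime


def make_event(event, operation_id=None, last_side_effect=None, **fields):
    """Build one JSON Lines record that carries the correlation fields."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),  # UTC timestamp
        "pid": os.getpid(),
        "tid": threading.get_native_id(),
        **APP,
        "session": SESSION_ID,
        "op": operation_id,
        "last_side_effect": last_side_effect,
        "event": event,
        **fields,  # e.g. exception type, HRESULT, Win32 error code
    }
    # json.dumps escapes embedded newlines, so each event stays on one line.
    return json.dumps(record, separators=(",", ":")) + "\n"
```

A call such as `make_event("ExternalCommandSent", operation_id="job-17", last_side_effect="device command")` yields a single line that can later be joined with a crash marker or dump through `session` and `pid`.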
Write critical events synchronously
- Fine-grained events: an async buffer is fine
- Warning or above: flush early
- Important boundary events: write synchronously (ProcessStart, ConfigLoaded, ExternalCommandSent, etc.)
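The three tiers above could be sketched as follows. This is a simplified illustration in Python: the `EventLog` class and its `level` / `boundary` parameters are hypothetical, and the in-process write buffer stands in for whatever async queue a real logger uses.

```python
import os


class EventLog:
    """Append-only log: buffered for fine-grained events, durable for critical ones."""

    def __init__(self, path):
        # Binary append mode; writes first land in a buffer inside the process.
        self._file = open(path, "ab")

    def write(self, line, level="INFO", boundary=False):
        self._file.write(line.encode("utf-8"))
        if boundary:
            # Boundary events: push through the process buffer AND the OS cache.
            self._file.flush()
            os.fsync(self._file.fileno())
        elif level in ("WARNING", "ERROR", "CRITICAL"):
            # Warning or above: hand the bytes to the OS early. They survive a
            # process crash, though not necessarily a power loss.
            self._file.flush()
        # Anything else stays in the buffer and is lost if the process dies now.

    def close(self):
        self._file.flush()
        self._file.close()
```

The trade-off is cost: `os.fsync` is slow, which is why it is reserved for a handful of boundary events such as ProcessStart or ExternalCommandSent rather than applied to every line.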
Rules for the final crash marker
This is not the place to build a full-featured logger. Write once, write short, write reliably.
What to include:
- Time of occurrence (UTC)
- PID / TID
- Session ID
- Version / build number
- Which hook triggered it (UnhandledException, etc.)
- Exception type or exception code
- Most recent operation ID
What you must absolutely not do:
- Resolve a logger from a DI container
- Use async/await
- Wait on a lock
- Show a UI dialog
- Compress or send over HTTP
All the crash handler should do is:
- Prevent re-entry
- Write a single line
- Flush
- Exit
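The four steps above can be sketched as follows. Python is used here only to show the shape; a real handler would be written in the app's own language and wired to the hooks described in the next section. The function names, the marker line layout, and the exit code 70 are assumptions for illustration.

```python
import os
import threading
import time

_entered = threading.Lock()


def write_crash_marker(path, hook, exc_code, session_id, version, last_op):
    """Write one short line to a dedicated file. Returns False on re-entry."""
    # Prevent re-entry WITHOUT waiting: a second caller gives up immediately.
    if not _entered.acquire(blocking=False):
        return False
    line = "%s pid=%d tid=%d session=%s ver=%s hook=%s exc=%s op=%s\n" % (
        time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),  # UTC
        os.getpid(), threading.get_native_id(),
        session_id, version, hook, exc_code, last_op,
    )
    # Unbuffered OS-level calls only: no logger, no queue, no network.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.write(fd, line.encode("utf-8", "replace"))  # write a single line
        os.fsync(fd)                                   # flush
    finally:
        os.close(fd)
    return True


def crash_and_exit(path, hook, exc_code, session_id, version, last_op):
    """Record the marker, then leave immediately."""
    try:
        write_crash_marker(path, hook, exc_code, session_id, version, last_op)
    finally:
        os._exit(70)  # exit without running cleanup handlers; 70 is arbitrary
```

Note that the re-entry guard uses a non-blocking acquire, so it does not violate the "do not wait on a lock" rule: if the handler is already running on another thread, the second call simply returns.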
Framework-specific notes
- WinForms: keeping the app alive via `ThreadException` is dangerous. It is the wrong tool for programming bugs.
- WPF: `DispatcherUnhandledException` is the same. Use it as an entry point for recording, not for recovery.
- .NET in general: `AppDomain.UnhandledException` is the last notification. Do not attempt heavy recovery there.
- Native C++: in addition to `SetUnhandledExceptionFilter`, also register handlers with `_set_invalid_parameter_handler` and `set_terminate`.
Build on top of WER LocalDumps
WER LocalDumps is the easiest mechanism to work with, because the OS writes the dump for you.
Example configuration:
reg add "HKLM\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps\MyApp.exe" /f
reg add "HKLM\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps\MyApp.exe" /v DumpFolder /t REG_EXPAND_SZ /d "C:\CrashDumps\MyApp" /f
reg add "HKLM\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps\MyApp.exe" /v DumpCount /t REG_DWORD /d 10 /f
reg add "HKLM\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps\MyApp.exe" /v DumpType /t REG_DWORD /d 2 /f
Always archive dumps and PDBs together. A dump alone is unreadable without the matching EXE/DLL and PDB from that build.
What changes when you add a watchdog process
A watchdog (monitoring process) can record the following:
- Child process start time and end time
- Exit code
- Number of restarts
- Whether a dump exists
With just this much, you can finally tell whether “the process actually crashed,” “the OS shut down,” or “the user closed it.”
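A watchdog that records these four items can be quite small. The sketch below is in Python and makes several assumptions: the function names `run_once` and `supervise`, the restart limit and backoff, and the rule that any `.dmp` file modified after the worker started belongs to that run (coarse file timestamps could misattribute a dump).

```python
import subprocess
import time
from datetime import datetime, timezone
from pathlib import Path


def run_once(cmd, dump_dir):
    """Run the worker to completion and return what the watchdog observed."""
    started = datetime.now(timezone.utc)
    exit_code = subprocess.call(cmd)  # blocks until the child exits
    ended = datetime.now(timezone.utc)

    dumps = []
    dump_path = Path(dump_dir)
    if dump_path.is_dir():
        # Assumption: a dump modified after this run started belongs to it.
        dumps = [p.name for p in dump_path.glob("*.dmp")
                 if p.stat().st_mtime >= started.timestamp()]
    return {
        "started_utc": started.isoformat(),
        "ended_utc": ended.isoformat(),
        "exit_code": exit_code,
        "dumps": dumps,
    }


def supervise(cmd, dump_dir, max_restarts=3, backoff_s=1.0):
    """Restart the worker after abnormal exits, up to max_restarts times."""
    history = []
    while True:
        run = run_once(cmd, dump_dir)
        run["restarts_so_far"] = len(history)
        history.append(run)
        if run["exit_code"] == 0 or len(history) > max_restarts:
            return history
        time.sleep(backoff_s)
```

On Windows, a worker that dies from an unhandled native exception usually exits with the exception code (for example 0xC0000005 for an access violation), so a nonzero exit code combined with a fresh dump is a strong crash signal. Capping the restart count keeps a deterministic startup crash from turning into an endless restart loop.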
Common anti-patterns
- `catch (Exception)` that just logs and continues -> half-broken state lingers and triggers more failures downstream
- Trusting only the queue of an async logger -> the queue vanishes the moment the process dies
- Sending HTTP from the crash handler -> DNS or auth issues now ride on top of an already broken context
- Dumps that cannot be tied back to routine logs -> session and PID are not shared
- Keeping a WinForms/WPF app alive through unhandled exception events -> easy to end up in a zombie state
Minimum rollout checklist
- Routine logs are written one event per line
- Every log line carries UTC, PID, TID, version, and session
- A dedicated final crash marker file exists
- WER LocalDumps is configured per application
- PDBs are archived alongside the shipped binaries
- The next startup can detect the previous abnormal termination
- The app has been crashed on purpose on a test machine, and the evidence was confirmed to survive
Summary
“Do not put your hopes on the crashing process alone.” That is the whole story.
- Split into routine logs, a final crash marker, and OS / out-of-process evidence
- At crash time, just write a short local record
- Push heavy work to the next startup or to another process
- Build on top of WER LocalDumps
- Make “record and exit” the default, not “keep going”