When an Industrial Camera Control App Suddenly Crashes After a Month (Part 2) - What Application Verifier Is and How to Build Failure-Path Test Infrastructure
Application Verifier is a very strong tool when you want to surface strange behavior around Windows native code or Win32 boundaries earlier than normal testing would. It is especially useful when you want to test things such as handle misuse, heap corruption, and failure paths under low-resource conditions.
In the first part, When an Industrial Camera Control App Suddenly Crashes After a Month (Part 1) - How to Find a Handle Leak and Design Logging for Long-Running Operation, I organized a case where the root cause of a long-run crash turned out to be a handle leak. But strengthening logs alone is only half the story. What you really want is to know, in advance, whether if a future bug causes a memory leak, handle leak, partial failure, or cleanup omission, you will still be able to see clearly what happened.
That is where Application Verifier comes in. It can apply runtime checks and fault injection to Windows native / Win32-boundary behavior. One especially useful trait in real projects is that it lets you trigger memory-shortage-like or resource-shortage-like failure paths without actually exhausting the whole machine.
This second part organizes what Application Verifier is, what it can do, and how to use it to build a practical failure-path test foundation in the context of an industrial camera control application.
Contents
- Short version
- What Application Verifier is
- What Application Verifier can do
- Why we introduced it in this case
- How to trigger memory or resource shortage-like behavior
- How to inspect handle-related problems
- How to build failure-path test infrastructure
- Rough rule-of-thumb guide
- Summary
- References
1. Short version
- Application Verifier is a tool that makes it easier to catch misuse around native / unmanaged / Win32 boundaries at runtime
- Its value is not only “finding bugs,” but also forcing failure paths that normally stay hidden
Handlescan catch invalid handle usage,Heapscan help surface heap corruption, andLow Resource Simulationcan inject memory-shortage-like or resource-shortage-like failures- It is not ideal to rely on Application Verifier alone for long-lived EXE leak analysis; combining it with your own
Handle Countmonitoring and resource-lifetime logs is far more practical - In a failure-path test foundation, it is much easier to read results if you run normal verifier runs and fault-injection runs separately
- Even when the thing you want to test is a DLL, the target you enable in Application Verifier is usually the test EXE that actually loads it
In short, Application Verifier is a tool for dragging out the nasty native / Win32 bugs that otherwise stay half-hidden. In the kind of world where device-control software mixes native SDKs, P/Invoke, and Win32 APIs, it is an especially good fit.
2. What Application Verifier is
2.1. The one-sentence description
Application Verifier is a runtime validation tool for Windows user-mode applications. It observes how an application uses OS APIs and resources while it is actually running, and can both detect suspicious behavior and intentionally inject failures.
Unlike static analysis or ordinary unit tests, it shows what happens on the actual execution path. That is exactly why it is so useful for failure-path testing.
flowchart LR
A[Test harness] --> B[Control application / SDK wrapper]
B --> C[Application Verifier]
C --> D[Win32 APIs / native DLLs / OS resources]
C --> E[verifier stop]
C --> F[debugger output]
C --> G[AppVerifier logs]
B --> H[structured application log]
2.2. Where it is effective
It is especially effective in situations like these:
- the application calls into native DLLs or device SDKs
- it crosses P/Invoke or COM boundaries
- it uses handles, heaps, locks, or virtual memory heavily, directly or indirectly
- normal-path tests rarely fail, but failure-path lifetime handling looks suspicious
- the first visible symptom is not a crash, but occasional strange failures
On the other hand, Application Verifier is not the right main tool for chasing purely managed object-graph leaks. So even in a C# application, it becomes much more valuable when native SDKs or Win32 boundaries are substantial.
2.3. Why it is useful in practice
The practical benefits are mainly these:
- It can stop native-boundary misuse earlier
- invalid handle
- heap corruption
- lock misuse
- virtual memory API misuse
- It can force failure paths that are usually hard to trigger
- allocation-like failures
- file / event creation failures
- low-resource behavior without actually collapsing the whole machine
- It works well with debugger tooling
!avrf!htrace!heap -p -a- verifier stop information
In device-control applications, the hardest part is often not “there is a bug.” The hardest part is “we cannot clearly see what happened when the abnormal path ran.” Application Verifier helps a lot with that.
3. What Application Verifier can do
3.1. Basics: Handles / Heaps / Locks / Memory / TLS and more
The core test set in Application Verifier is usually the Basics area.
That includes the kinds of checks you actually want in many native-boundary applications.
| Layer | What it checks | How it helps in this context |
|---|---|---|
Handles |
invalid handle usage | catches use-after-close and bad-handle paths |
Heaps |
heap corruption | helps surface corruption or use-after-free near native SDK boundaries |
Leak |
unfreed resources at DLL unload | useful especially in short-lived harness scenarios |
Locks / SRWLock |
lock misuse | useful around reconnect and shutdown races |
Memory |
misuse of APIs such as VirtualAlloc |
useful for large buffers or mapped-memory behavior |
TLS |
thread-local-storage misuse | useful in native code with tricky thread assumptions |
Threadpool |
thread-pool worker and API correctness | useful when callbacks and async-native work are involved |
The core strength here is not “read a crash after the fact.” The strength is that suspicious usage can stop closer to the place where it actually happens.
3.2. Low Resource Simulation: surface memory and resource failures earlier
This is one of the most practically useful parts. You can trigger behavior that looks like memory shortage or resource shortage without really exhausting the whole development machine.
The basic idea is fault injection:
- choose certain resource-creation or allocation APIs
- force them to fail with a chosen probability
That makes it much easier to test error paths that would otherwise almost never run.
Typical examples include:
HeapAllocVirtualAllocCreateFileCreateEventMapViewOfFile- OLE / COM allocations such as
SysAllocString
3.3. Page Heap and the debugger
If the suspicion is heap corruption, Heaps plus page heap is especially strong.
Full page heap can make corruption stop much closer to the moment it happens, thanks to guard-page behavior.
But it is also heavy. So it is usually better used for:
- focused reproduction
- debugger-attached scenarios
- narrow investigation once suspicion has already become localized
A practical progression is often:
- use
Basicsbroadly first - if heap corruption becomes likely, move to full page heap
- if that is too heavy, dial back
- keep long-duration production-like runs focused more on counters and your own logs
3.4. !avrf / !htrace / logs
Application Verifier does not stop at “a verifier stop happened.” Debugger extensions and logs make it much easier to understand what happened.
!avrf- inspect current verifier configuration and verifier-stop state
!htrace- inspect handle open / close / misuse history
!heap -p -a- inspect heap blocks when page heap is involved
- AppVerifier logs
- retain stop information
When Handles is enabled, handle tracing becomes especially useful because it becomes much easier to ask:
where was this handle opened, and where was it closed?
4. Why we introduced it in this case
4.1. The goal was not only “find one bug”
The goal in this case was not just:
“find exactly one bug using AppVerifier”
The deeper goal was to confirm things like:
- if another future failure path leaks memory or a handle, can we still understand what happened?
- can we follow it all the way through with debugger information?
- can we avoid ending up in the state of “something failed, but we do not know why”?
So Application Verifier was used not only as a detector, but also as a way of testing the observability foundation itself.
4.2. Triggering memory-shortage-like conditions
Actually exhausting memory on a development machine is annoying. And once the whole machine becomes unstable, the test itself fills with noise.
So instead, Low Resource Simulation was used to intentionally step on the kinds of failure paths that would happen under memory or resource shortage.
That makes questions like these much easier to answer:
- if
CreateEventfails, do the logs still preservecameraIdandphase? - after partial initialization fails, does cleanup still run?
- if
VirtualAllocfails, does a retry path behave correctly? - if
CreateFilefails on a save path, does the handle count return?
The point here is not merely “make the app fail.” The point is:
can we still explain the way it fails?
4.3. Confirming that handle problems were actually traceable
As discussed in part 1, handle problems often have a large gap between where the app finally dies and where the original leak happened.
So we specifically wanted to confirm:
- when an invalid-handle stop appears, can
!htracefollow the open / close history? - can that history be linked to our own
resourceId/sessionId/phaselogs? - after fault scenarios, does
Handle Countreturn? - when the harness uses a short-lived process, do leaks become easier to compare?
That is how the discussion moves from:
“a bug happened”
to:
“the lifetime contract broke in this specific responsibility”
5. How to trigger memory or resource shortage-like behavior
5.1. The basic idea of Low Resource Simulation
Low Resource Simulation is basically fault injection. It does not attempt to perfectly recreate a truly low-resource machine. Instead, it intentionally introduces representative API failures that tend to happen in low-resource conditions.
That makes it especially useful for:
- checking cleanup in failure paths
- checking retry / reconnect robustness
- checking partially successful / partially failed initialization
- checking whether rare failures still leave enough evidence in the logs
One important practical rule:
do not turn everything on at once
If you enable every kind of failure from the start, the logs explode and it becomes hard to tell what you are actually looking at.
5.2. What kinds of failures can be injected
Typical targets include things like these:
| Type | Example | Example in a device-control app |
|---|---|---|
Heap_Alloc |
heap allocation | temporary buffers, metadata buffers, SDK wrapper allocations |
Virtual_Alloc |
virtual memory allocation | frame buffers, ring buffers |
File |
CreateFile and related operations |
save path open, log file open |
Event |
CreateEvent and related operations |
frame-ready notifications, stop / reconnect synchronization |
MapView |
memory mapping APIs | shared memory, mapped files |
Ole_Alloc |
SysAllocString and similar |
COM / OLE boundaries |
Wait |
WaitForXXX family |
checking wait-path failure handling |
Registry |
registry APIs | configuration and driver-adjacent settings |
In practice, it is much more effective to enable the kinds of failures that are close to the specific failure path you care about than to turn everything on at once.
5.3. How to aim it in practice
The command-line shape can look roughly like this:
appverif /verify CameraHarness.exe
appverif /verify CameraHarness.exe /faults
appverif -enable lowres -for CameraHarness.exe -with heap_alloc=20000 virtual_alloc=20000 file=20000 event=20000
appverif -query lowres -for CameraHarness.exe
The practical sequence is usually:
- run the normal path with
Basicsonly - then add
Low Resource Simulation - if needed, focus on just the failure types you care about, such as
fileorevent - if needed, aim it more narrowly at specific DLLs
The /faults shortcut is convenient, but it focuses mostly on OLE_ALLOC and HEAP_ALLOC.
If you want to test CreateFile or CreateEvent failure paths specifically, it is much clearer to explicitly write file=... or event=....
In device-control applications, it is often much easier to understand the result if you aim the fault injection more narrowly at the camera wrapper or save-path-related code than if you scatter it across the entire process.
6. How to inspect handle-related problems
6.1. The Handles check
For handle-heavy code, the first thing to use is Handles.
That makes invalid-handle usage much easier to detect.
Typical cases include:
- using a handle after it was closed
- passing a broken handle value
- using an uninitialized handle after a partial failure
- lifetime getting crossed and another thread using a handle incorrectly
In long-running programs, these may only show up as “occasionally some strange failure happens.” Under verifier, they often stop much closer to the misuse itself.
6.2. Using !htrace to inspect open / close stacks
One of the reasons Handles is so valuable is how well it works with handle tracing.
windbg -xd av -xd ch -xd sov CameraHarness.exe
!avrf
!htrace 0x00000ABC
What you usually want to know from !htrace is:
- where the handle was opened
- where it was closed
- whether it was referenced as an invalid handle
- whether opens are accumulating more than expected
Handle leaks and handle misuse are difficult precisely because the final failing API is often not where the bug started.
!htrace makes the lifetime history much more concrete.
6.3. How to combine it with your own logs
Even so, Application Verifier alone is not enough. For long-lived EXE leak analysis, relying on it alone is still quite painful.
In practice, it works best when combined with:
- periodic
Handle Count sessionIdresourceIdphase- lifecycle logs for
create/openandclose/dispose - dumps and debugger output when verifier stops
The pattern then becomes:
- heartbeat logging reveals a suspicious
Handle Countslope - lifecycle logs narrow down resources that are created without a matching close
- verifier runs surface invalid-handle use or misuse earlier
!htracereveals the open / close stacks
That combined view is much stronger than any one of those tools alone.
7. How to build failure-path test infrastructure
7.1. Push execution into a harness EXE
Application Verifier cannot be switched on after a process has already started. You configure first, then launch.
Also, its settings remain until you explicitly clear them. So in practice, it is usually easier to target a test harness EXE than the real production app itself.
A common shape looks like this:
flowchart LR
A[Scenario Runner] --> B[CameraHarness.exe]
B --> C[CameraSdkWrapper.dll]
C --> D[Vendor SDK]
B --> E[Structured Log]
B --> F[Dump / Debugger]
That gives you:
- one process per scenario
- easier leak comparison
- easy on/off control of AppVerifier settings
- a natural place to test DLL behavior through the EXE that actually loads it
Typical command shapes are:
appverif /verify CameraHarness.exe
appverif /n CameraHarness.exe
The important rule is:
- enable before launch
- disable explicitly when you are done
Harness-first execution reduces a lot of configuration confusion.
7.2. Split the test menu by purpose
In a failure-path test foundation, it is usually clearer not to run everything together. Three buckets are a very practical split:
- normal-path + Basics
- no injected failures
- just confirm that verifier stops do not occur in ordinary operation
- fault-injection runs
- use
Low Resource Simulation - target
event,file,heap_alloc,virtual_alloc, and similar failure types
- use
- heap-deep-dive runs
Heaps- full page heap
- debugger-attached focused reproduction
This keeps:
- “it is broken even in normal use”
- and “it breaks only under injected low-resource conditions”
from becoming mixed together.
7.3. What to collect
At minimum, these are worth collecting:
| Type | What to collect |
|---|---|
| application logs | cameraId, sessionId, phase, handleCount, error code |
| process state | Handle Count, Private Bytes, Thread Count |
| debugger info | !avrf, !htrace, and if needed !heap -p -a |
| dump | at verifier stop or abnormal termination |
| AppVerifier logs | stop records, optionally exported in XML |
If you need to, AppVerifier logs can also be exported and aggregated as XML. But in most practical projects they still make more sense when read side by side with your own logs.
The important thing is not “a lot of logs.” The important thing is:
can we reconstruct the causal chain later?
7.4. What the pass criteria should be
“It did not crash” is not enough as a pass condition. At least these were needed in this context:
- no verifier stop during normal-path + Basics runs
- in fault-injection runs, the intended failure is visible in the logs
- partially initialized resources are cleaned up correctly
Handle Countreturns near baseline after reconnect / retry- if a verifier stop occurs, it can be traced through
sessionId,phase, and stack information - failure never degrades into “we do not know what happened”
It is important to evaluate both:
- whether the app survives
- and whether the failure remains explainable
7.5. Things to keep in mind
Application Verifier is very powerful, but it is not magic.
- it only checks paths you actually execute
- full page heap is heavy
- verifier stops may happen inside third-party SDKs too
- fault-injection and normal-path runs can take very different code paths
- it is not the main tool for pure managed heap leaks
So the division of labor is usually:
- long-term growth trends → your own counters and logs
- native-boundary misuse → Application Verifier
- reconstruction of the causal chain → structured logs + dump + debugger
That split is much more practical in real work than hoping one tool explains everything.
8. Rough rule-of-thumb guide
- suspect invalid handle or double close
- start with
Handles+!htrace
- start with
- suspect heap corruption / use-after-free
- move to
Heaps+ full page heap +!heap -p -a
- move to
- want to trigger memory-shortage-like or resource-shortage-like behavior
- use
Low Resource Simulation
- use
- a long-running process deteriorates slowly
- start first with your own
Handle Count/Private Bytes/ lifecycle logs
- start first with your own
- you want to test a DLL
- enable Application Verifier for the harness EXE that actually loads it
If you enable everything at once, you usually get a fog of logs. It is much clearer to aim the sharpest edge first at the specific failure path you care about.
9. Summary
These are the key points to keep in mind.
What Application Verifier is:
- a runtime verifier for Windows native / Win32 boundary behavior
- a tool that can check Handles / Heaps / Locks / Memory / TLS / Low Resource Simulation
- a way to step on failure paths that usually stay hidden
What made it especially effective in this case:
- handle misuse becomes much easier to trace with
!htrace - low-resource-like failures can be triggered without wrecking the whole development machine
- your own structured logs can be tested to see whether they still explain the failure clearly
The practical way to use it:
- run normal-path Basics and fault-injection runs separately
- drive scenarios through a harness EXE
- combine your own logs, dumps, and debugger information
- use your own counters to watch long-term leak trends
Application Verifier is, in a very practical sense, a tool for going out to meet rare failures instead of waiting for them to happen by chance.
In device-control applications, it matters not only that the app works when everything is healthy. It matters just as much that when it breaks, you can explain what actually happened.
10. References
- Part 1: When an Industrial Camera Control App Suddenly Crashes After a Month (Part 1) - How to Find a Handle Leak and Design Logging for Long-Running Operation
- Application Verifier - Overview
- Application Verifier - Testing Applications
- Application Verifier - Tests within Application Verifier
- Application Verifier - Debugging Application Verifier Stops
- Application Verifier - Features
- !htrace (WinDbg)
- GetProcessHandleCount
Author GitHub
The author of this article, Go Komura, is on GitHub as gomurin0428 .
You can also find COM_BLAS and COM_BigDecimal there.