A Practical Guide to Soft Real-Time on Windows - A Checklist for Reducing Latency
1. Short version
- On ordinary Windows (10/11), the goal is not hard real-time guarantees, but smaller latency and jitter and fewer deadline misses
- The first thing to revisit is not the priority value, but what you put inside the periodic thread
- Split periodic work into a fast path (must hit timing) and a slow path (some delay is OK)
- In the fast path, avoid Sleep-based waiting, blocking I/O, per-iteration new/malloc, and unbounded queues
- In real deployments, AC power, power mode, timer resolution, power throttling, and background load all matter
- Don’t evaluate by averages alone. Look at p99 / p99.9 / max / miss count / queue depth / DPC / ISR / page faults
2. Scope
| In scope | Out of scope |
|---|---|
| General-purpose PCs running Windows 10/11 | Dedicated RTOSes or commercial RT extensions |
| Normal user-mode apps (C++/C#) | Control logic that lives mostly in a kernel driver |
| Standard APIs and standard settings | Moving everything onto an FPGA or microcontroller |
3. Basic terms
| Term | Meaning |
|---|---|
| Latency | How much later than scheduled a piece of work starts or finishes |
| Jitter | Variation in period or execution time: iterations do not run at the same interval each time |
| Deadline miss | Work that does not finish by the deadline you set |
4. Checklist: the big picture
| Item | What to do | Typical anti-pattern |
|---|---|---|
| Waiting strategy | Drive on absolute deadlines. Use events or a high-resolution waitable timer | Periodic loops based on Sleep(1) |
| Splitting work | Separate the fast path from the slow path | Saving, sending, or UI work inside the fast path |
| Queue | Make it bounded and decide what happens on overflow | Unbounded queues that just postpone the problem |
| Inside the fast path | Move out allocations, heavy logging, blocking I/O, and heavy locks | new/malloc/synchronous I/O |
| Priority | Raise only the threads that need it. Use MMCSS for audio/video | Jumping straight to REALTIME_PRIORITY_CLASS |
| OS / power | Check AC power, power mode, and timer resolution | Evaluating on battery or in a power-saving mode |
5. Each item in detail
5.1 Don’t leave the periodic loop to Sleep
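Sleep(1) does not mean “wait exactly 1 ms”; it means “wait at least 1 ms.” The actual wake-up depends on the system timer resolution and on scheduling, and if each period is computed from the wake-up time, that lateness accumulates as drift.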
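To make the accumulation concrete, here is a small simulation sketch. It assumes, purely for illustration, that every wake-up arrives a constant `wakeLateUs` microseconds late; `SimulateDrift` and `DriftResult` are hypothetical names, not part of any Windows API.

```cpp
#include <cstdint>

// Result of the simulation: how far each scheduling style has slipped
// from the ideal timeline after all iterations.
struct DriftResult {
    std::int64_t relativeDriftUs;  // "next = now + period" style
    std::int64_t absoluteDriftUs;  // "next += period" style
};

// Simulates `iterations` periods in which every wake-up is `wakeLateUs` late.
// A constant lateness is an assumption of this sketch; real lateness varies.
inline DriftResult SimulateDrift(int iterations, std::int64_t periodUs, std::int64_t wakeLateUs)
{
    std::int64_t relativeNow = 0;   // wake-up time when each period starts from "now"
    std::int64_t absoluteNext = 0;  // ideal deadline, advanced by exactly one period
    std::int64_t absoluteWake = 0;  // wake-up time when waiting for absoluteNext

    for (int i = 0; i < iterations; ++i) {
        // Relative style: the late wake-up becomes the base for the next period.
        relativeNow = relativeNow + periodUs + wakeLateUs;

        // Absolute style: the deadline ignores how late the last wake-up was.
        absoluteNext += periodUs;
        absoluteWake = absoluteNext + wakeLateUs;
    }

    const std::int64_t ideal = static_cast<std::int64_t>(iterations) * periodUs;
    DriftResult result = { relativeNow - ideal, absoluteWake - ideal };
    return result;
}
```

With a 1000 µs period and a 50 µs late wake-up, 1000 iterations leave the relative style 50,000 µs (50 ms) behind the ideal timeline, while the absolute style is still only 50 µs late.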
Switch to an absolute-deadline style:
int64_t next = QpcNow() + periodTicks;
while (running)
{
    WaitUntil(next - wakeMarginTicks);
    while (QpcNow() < next) { CpuRelax(); } // short spin only at the very end
    int64_t started = QpcNow();
    FastStep(); // no blocking, no alloc, no heavy lock
    int64_t finished = QpcNow();
    RecordTiming(next, started, finished);
    next += periodTicks; // key point: do NOT use next = now + period
    while (finished > next)
    {
        ++missedDeadlines; // record the slip
        next += periodTicks;
    }
}
Two points:
- Don’t use next = now + period each time; advance next by a fixed period so that the drift does not accumulate
- Decide in advance what to do when you do fall behind
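The loop above calls QpcNow, WaitUntil, and CpuRelax without defining them. One possible shape for those helpers is sketched below, not as the definitive implementation. Assumptions: ticks are nanoseconds taken from std::chrono::steady_clock rather than raw QueryPerformanceCounter values, the Windows branch uses a high-resolution waitable timer (available from Windows 10 version 1803), and other platforms fall back to a plain sleep so the sketch still compiles.

```cpp
#include <chrono>
#include <cstdint>
#include <thread>
#ifdef _WIN32
#include <windows.h>
#ifndef CREATE_WAITABLE_TIMER_HIGH_RESOLUTION
#define CREATE_WAITABLE_TIMER_HIGH_RESOLUTION 0x00000002  // missing in older SDK headers
#endif
#endif

// Monotonic "now" in ticks. In this sketch one tick is one nanosecond (assumption).
inline std::int64_t QpcNow()
{
    using namespace std::chrono;
    return duration_cast<nanoseconds>(steady_clock::now().time_since_epoch()).count();
}

// Brief pause for the final spin; never blocks for long.
inline void CpuRelax()
{
#ifdef _WIN32
    YieldProcessor();
#else
    std::this_thread::yield();
#endif
}

// Blocks until roughly targetTicks. Returns immediately if the target has passed.
inline void WaitUntil(std::int64_t targetTicks)
{
    const std::int64_t remainingNs = targetTicks - QpcNow();
    if (remainingNs <= 0) return;
#ifdef _WIN32
    // One timer per thread, created on first use and kept for the thread's lifetime
    // (this sketch never closes it).
    thread_local HANDLE timer = CreateWaitableTimerExW(
        nullptr, nullptr, CREATE_WAITABLE_TIMER_HIGH_RESOLUTION, TIMER_ALL_ACCESS);
    if (timer != nullptr) {
        LARGE_INTEGER due;
        due.QuadPart = -(remainingNs / 100);  // negative = relative time, 100 ns units
        if (due.QuadPart < 0 &&
            SetWaitableTimer(timer, &due, 0, nullptr, nullptr, FALSE)) {
            WaitForSingleObject(timer, INFINITE);
            return;
        }
    }
#endif
    // Fallback: ordinary sleep (coarser on Windows, used when the timer is unavailable).
    std::this_thread::sleep_for(std::chrono::nanoseconds(remainingNs));
}
```

Because the timer can still wake slightly early or late, the loop keeps its wakeMarginTicks and the short final spin; the helper only has to get close to the deadline.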
5.2 Split fast path and slow path
Device/timer -> [fast path: acquire / control / minimal copy] -> [bounded queue] -> [slow path: format / save / send / UI]
Only these belong in the fast path:
- Data acquisition
- Control value calculation
- Minimal copying
- Timestamping
- Enqueueing
- Recording misses and overruns
Everything else goes to the slow path.
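The handoff between the two paths can be sketched as a fixed-size single-producer/single-consumer ring. This is a minimal illustration under stated assumptions, not a production queue: `SpscRing` is a hypothetical name, exactly one fast-path thread calls TryPush, exactly one slow-path thread calls TryPop, and one slot is left unused to tell full from empty.

```cpp
#include <array>
#include <atomic>
#include <cstddef>

// Fixed-size ring for exactly one producer thread and one consumer thread.
// Holds at most Capacity - 1 items.
template <typename T, std::size_t Capacity>
class SpscRing {
    static_assert(Capacity >= 2, "Capacity must be at least 2");
public:
    // Fast path: never blocks and never allocates. Returns false when full.
    bool TryPush(const T& item)
    {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (tail + 1) % Capacity;
        if (next == head_.load(std::memory_order_acquire)) {
            return false;  // full: the caller applies its overflow policy
        }
        slots_[tail] = item;
        tail_.store(next, std::memory_order_release);  // publish the item
        return true;
    }

    // Slow path: returns false when there is nothing to process.
    bool TryPop(T& out)
    {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        if (head == tail_.load(std::memory_order_acquire)) {
            return false;  // empty
        }
        out = slots_[head];
        head_.store((head + 1) % Capacity, std::memory_order_release);  // free the slot
        return true;
    }

private:
    std::array<T, Capacity> slots_{};
    std::atomic<std::size_t> head_{0};  // next slot to read (owned by the consumer)
    std::atomic<std::size_t> tail_{0};  // next slot to write (owned by the producer)
};
```

The fast path only copies a sample into a slot and returns; formatting, saving, sending, and UI updates happen on the thread that drains the ring with TryPop.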
5.3 Queues should be bounded
An unbounded queue just hides the latency.
Pick one of three policies for overflow:
- Latest wins: drop older data
- No loss allowed: error / stop / alert
- Logging only: record the drop count
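The policies above can be sketched as an explicit setting on the queue. The names `BoundedQueue` and `OverflowPolicy` are hypothetical, and this version is deliberately single-threaded to keep the policy logic visible; sharing it between threads would need a lock or a lock-free design. The drop count is recorded under either policy, and a "no loss allowed" caller treats a false return from Push as the error to raise.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

enum class OverflowPolicy {
    DropOldest,  // latest wins: discard the oldest item to make room
    RejectNew    // keep what is queued: the caller decides to stop or alert
};

// Fixed-capacity FIFO that never allocates after construction.
// Not thread-safe (assumption of this sketch).
template <typename T, std::size_t Capacity>
class BoundedQueue {
    static_assert(Capacity > 0, "Capacity must be positive");
public:
    explicit BoundedQueue(OverflowPolicy policy) : policy_(policy) {}

    // Returns true when nothing was lost, false when an item was dropped or rejected.
    bool Push(const T& item)
    {
        bool lossless = true;
        if (count_ == Capacity) {
            ++dropCount_;
            if (policy_ == OverflowPolicy::RejectNew) {
                return false;  // new item is not stored
            }
            head_ = (head_ + 1) % Capacity;  // drop the oldest item
            --count_;
            lossless = false;
        }
        items_[(head_ + count_) % Capacity] = item;
        ++count_;
        return lossless;
    }

    // Returns false when the queue is empty.
    bool Pop(T& out)
    {
        if (count_ == 0) return false;
        out = items_[head_];
        head_ = (head_ + 1) % Capacity;
        --count_;
        return true;
    }

    std::size_t Depth() const { return count_; }
    std::uint64_t DropCount() const { return dropCount_; }

private:
    std::array<T, Capacity> items_{};
    std::size_t head_ = 0;
    std::size_t count_ = 0;
    std::uint64_t dropCount_ = 0;
    OverflowPolicy policy_;
};
```

Depth() and DropCount() are the two numbers worth logging every period: a depth that keeps growing or a drop count that keeps rising means the slow path cannot keep up.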
5.4 Don’t put heavy work in the fast path
Things to avoid:
- File writes
- Network sends
- Database writes
- Heavy log output
- Per-iteration new/malloc/List<T>.Add
- Per-iteration string concatenation or ToString()
- Heavy locks
- First-touch code that tends to trigger page faults
Watch out especially for:
- Allocation and deallocation: in the fast path, preallocate buffers and reuse them
- Blocking I/O: it may look fast on your dev machine, but it will jitter in production
- Page faults: touch the memory you need once at startup
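Preallocation and startup touching can be combined in a small buffer pool. This is a sketch under stated assumptions: `SampleBufferPool` is a hypothetical name, the 4096-byte page size is assumed rather than queried, the pool is used from one thread, and each buffer is released at most once per acquire.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// All memory is allocated and touched in the constructor (at startup),
// so Acquire/Release in the fast path never allocate.
class SampleBufferPool {
public:
    SampleBufferPool(std::size_t bufferCount, std::size_t bufferBytes)
        : storage_(bufferCount * bufferBytes), bufferBytes_(bufferBytes)
    {
        // std::vector zero-fills the storage, which already writes every page.
        // The explicit loop keeps the intent visible: one write per assumed
        // 4 KiB page, so first-touch page faults happen here and not later.
        for (std::size_t i = 0; i < storage_.size(); i += 4096) {
            storage_[i] = 0;
        }
        free_.reserve(bufferCount);  // the free list itself will never reallocate
        for (std::size_t i = 0; i < bufferCount; ++i) {
            free_.push_back(storage_.data() + i * bufferBytes_);
        }
    }

    // Returns nullptr when every buffer is in use; never allocates.
    std::uint8_t* Acquire()
    {
        if (free_.empty()) return nullptr;
        std::uint8_t* buffer = free_.back();
        free_.pop_back();
        return buffer;
    }

    // Returns a buffer obtained from Acquire to the pool.
    void Release(std::uint8_t* buffer) { free_.push_back(buffer); }

    std::size_t BufferBytes() const { return bufferBytes_; }

private:
    std::vector<std::uint8_t> storage_;
    std::vector<std::uint8_t*> free_;
    std::size_t bufferBytes_;
};
```

A nullptr from Acquire is the same signal as a full queue: the slow path is not returning buffers fast enough, and the overflow policy applies. Touching pages at startup does not stop the OS from paging them out later; VirtualLock, listed in the references, is the API for pinning them and is not shown here.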
5.5 Raise priority only for the threads that need it
Basic policy:
- Keep UI and ordinary worker threads at normal priority
- Raise only the fast-path thread, and only as needed
- Consider background mode for save/send/log-aggregation threads
- Think per thread, not per process
For continuous audio/video streams, consider MMCSS first:
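// Requires <avrt.h>; link with Avrt.lib
DWORD taskIndex = 0;
HANDLE hAvrt = AvSetMmThreadCharacteristicsW(L"Audio", &taskIndex);
// hAvrt is NULL on failure (check GetLastError); the thread then keeps its current priority
// ... work ...
if (hAvrt != NULL) { AvRevertMmThreadCharacteristics(hAvrt); }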
REALTIME_PRIORITY_CLASS has large side effects, so use it only on a dedicated machine, after thorough testing, and only when you really need it.
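For threads that are not audio/video streams, the per-thread policy above can be sketched with SetThreadPriority. The wrapper names and the `PriorityResult` enum are hypothetical, THREAD_PRIORITY_HIGHEST is only one possible level and should be chosen by measurement, and the non-Windows branch exists only so the sketch compiles elsewhere.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

enum class PriorityResult { Applied, Failed, Unsupported };

// Call from the fast-path thread itself. Raises only this thread,
// leaving the process priority class untouched.
inline PriorityResult RaiseFastPathThread()
{
#ifdef _WIN32
    return SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_HIGHEST)
               ? PriorityResult::Applied
               : PriorityResult::Failed;
#else
    return PriorityResult::Unsupported;
#endif
}

// Call from a save/send/log-aggregation thread. Background mode also lowers
// the thread's I/O and memory priority until THREAD_MODE_BACKGROUND_END is set.
inline PriorityResult EnterBackgroundMode()
{
#ifdef _WIN32
    return SetThreadPriority(GetCurrentThread(), THREAD_MODE_BACKGROUND_BEGIN)
               ? PriorityResult::Applied
               : PriorityResult::Failed;
#else
    return PriorityResult::Unsupported;
#endif
}
```

Both calls act on the calling thread only, which matches the "think per thread, not per process" rule: UI and ordinary workers stay at normal priority unless they opt in.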
5.6 Power-settings checklist
- Run on AC power (no amount of tuning will be stable on battery)
- Set the power mode toward “Best performance”
- Create a dedicated power plan if needed (Balanced for everyday use, the dedicated one only in production)
- Check minimum/maximum processor state (it’s worth trying 100%/100% on AC)
- Check process power throttling / EcoQoS
- Cut unnecessary background load (cloud sync, auto-update, resident monitors, and so on)
- Test minimized and hidden states too (on Windows 11 the timer resolution can change for hidden apps)
- Save BIOS/UEFI for last (C-states and quiet-mode settings vary a lot by hardware)
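Of the items above, process power throttling is the one an application can set for itself. A hedged sketch using SetProcessInformation follows: `OptOutOfPowerThrottling` and `ThrottleResult` are hypothetical names, the timer-resolution bit is only compiled in when the SDK headers define it, and the other settings in the list still have to be checked on the machine.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

enum class ThrottleResult { Applied, Failed, Unsupported };

// Asks Windows not to apply execution-speed throttling (EcoQoS) to this process
// and, where the SDK defines it, to keep honoring timer resolution requests.
inline ThrottleResult OptOutOfPowerThrottling()
{
#if defined(_WIN32) && defined(PROCESS_POWER_THROTTLING_CURRENT_VERSION)
    PROCESS_POWER_THROTTLING_STATE state = {};
    state.Version = PROCESS_POWER_THROTTLING_CURRENT_VERSION;
    state.ControlMask = PROCESS_POWER_THROTTLING_EXECUTION_SPEED;
    state.StateMask = 0;  // controlled bit cleared = throttling explicitly off
#ifdef PROCESS_POWER_THROTTLING_IGNORE_TIMER_RESOLUTION
    state.ControlMask |= PROCESS_POWER_THROTTLING_IGNORE_TIMER_RESOLUTION;
#endif
    if (SetProcessInformation(GetCurrentProcess(), ProcessPowerThrottling,
                              &state, sizeof(state))) {
        return ThrottleResult::Applied;
    }
    // Older builds may reject the timer-resolution bit: retry with speed only.
    state.ControlMask = PROCESS_POWER_THROTTLING_EXECUTION_SPEED;
    return SetProcessInformation(GetCurrentProcess(), ProcessPowerThrottling,
                                 &state, sizeof(state))
               ? ThrottleResult::Applied
               : ThrottleResult::Failed;
#else
    return ThrottleResult::Unsupported;
#endif
}
```

Calling this once at startup and logging the result makes the setting part of the measured configuration, so a run on a machine where it failed is not mistaken for a tuned one.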
6. Measurement and evaluation
6.1 What to record
- Scheduled period time, actual start time, actual finish time
- Lateness
- Execution time
- Missed deadline count / consecutive missed deadline count
- Queue depth / drop count
- DPC / ISR spikes
- Page faults
- Temperature / clock variation
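The in-app part of this list (lateness, execution time, miss counts) can be kept in a small allocation-free struct. `TimingStats` is a hypothetical name, and this sketch assumes a miss means "finished after the start of the next period," which is the same test the loop in 5.1 uses.

```cpp
#include <algorithm>
#include <cstdint>

// Per-thread timing counters; Record() does no allocation and no I/O.
struct TimingStats {
    std::int64_t maxLatenessTicks = 0;        // worst (actual start - scheduled start)
    std::int64_t maxExecutionTicks = 0;       // worst (finish - start)
    std::uint64_t missedDeadlines = 0;        // total misses
    std::uint64_t consecutiveMissed = 0;      // current run of misses
    std::uint64_t maxConsecutiveMissed = 0;   // longest run of misses seen

    void Record(std::int64_t scheduled, std::int64_t started,
                std::int64_t finished, std::int64_t periodTicks)
    {
        maxLatenessTicks = std::max(maxLatenessTicks, started - scheduled);
        maxExecutionTicks = std::max(maxExecutionTicks, finished - started);

        // Assumed deadline: the start of the next period.
        const bool missed = finished > scheduled + periodTicks;
        if (missed) {
            ++missedDeadlines;
            ++consecutiveMissed;
            maxConsecutiveMissed = std::max(maxConsecutiveMissed, consecutiveMissed);
        } else {
            consecutiveMissed = 0;
        }
    }
};
```

Consecutive misses are tracked separately because one isolated miss and a burst of ten usually have different causes and different consequences. Queue depth, DPC/ISR, page faults, and temperature come from the queue itself and from external tools.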
6.2 How to read p99 / p99.9 / max
- Average: the overall trend. Big delays are easily buried
- p99: out of 1000 samples, the upper bound after dropping the 10 slowest
- p99.9: out of 1000 samples, the upper bound after dropping the 1 slowest
- max: the worst case
If you want to catch “usually fine, occasionally hangs,” p99 / p99.9 / max are essential.
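One way to compute these figures from recorded samples is the nearest-rank method, sketched below. `Percentile` is a hypothetical helper; the percentile is passed as an integer fraction (99/100, 999/1000) so that floating-point rounding cannot shift the rank, and it sorts a copy, so it belongs in the slow path or in offline analysis.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Nearest-rank percentile: the smallest sample such that at least
// numerator/denominator of all samples are less than or equal to it.
// Returns 0 for empty input or a zero denominator (a choice of this sketch).
inline std::int64_t Percentile(std::vector<std::int64_t> samples,
                               std::size_t numerator, std::size_t denominator)
{
    if (samples.empty() || denominator == 0) return 0;
    std::sort(samples.begin(), samples.end());

    // rank = ceil(n * numerator / denominator), computed in integers
    std::size_t rank = (samples.size() * numerator + denominator - 1) / denominator;
    rank = std::min(std::max<std::size_t>(rank, 1), samples.size());
    return samples[rank - 1];
}
```

For 1000 samples with the values 1 to 1000, Percentile(samples, 99, 100) is 990 (the 10 slowest dropped), Percentile(samples, 999, 1000) is 999 (the 1 slowest dropped), and Percentile(samples, 100, 100) is the max, 1000.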
6.3 Tools to use
- In-app measurement: first, take period, lateness, execution time, and queue depth yourself
- ETW / WPR / WPA: look at CPU, context switches, DPC/ISR, page faults
- LatencyMon: get a feel for driver-induced jitter
- Temperature/clock monitoring: check the impact of thermal throttling
6.4 Test conditions
Test these conditions separately:
- Right after startup (before warm-up) / after warm-up
- Long continuous runs
- UI in the foreground / UI minimized or hidden
- AC power / battery
- With network or disk load
7. Rough rules of thumb
| Required level | Realistic approach |
|---|---|
| ~10-20 ms range, occasional jitter is OK | Fast/slow split, bounded queue, normal priority, event-driven; that’s enough |
| ~1-5 ms range, must keep up continuously | Allocation-free fast path, dedicated thread, MMCSS, high-resolution waitable timer, AC power |
| Sub-1 ms, sustained high load | Hard to do in user mode alone. Move the critical part elsewhere (FPGA, RTOS, etc.) |
| Lives alongside GUI / logging / networking / DB | Don’t cram it into one process and one loop. Separate responsibilities |
8. Summary
The order to improve in practice:
- Is the periodic loop relying on Sleep?
- Are the fast path and slow path separated?
- Is the queue bounded, and is the overflow policy decided?
- Is there I/O, allocation, or a heavy lock inside the fast path?
- Are priority and MMCSS used only on the threads that need them?
- Are power settings and measurement aligned?
Even on ordinary Windows, if you align design, waiting strategy, power settings, and measurement, soft real-time becomes very practical.
9. References
- Multimedia Class Scheduler Service
- AvSetMmThreadCharacteristicsW function
- SetThreadPriority function
- SetPriorityClass function
- timeBeginPeriod function
- CreateWaitableTimerExW function
- Acquiring high-resolution time stamps
- GetSystemTimePreciseAsFileTime function
- SetProcessInformation function
- VirtualLock function
- CPU Sets
- SetThreadIdealProcessor function
- SetThreadAffinityMask function
- Processor power management options
- Change the power mode for your Windows PC
- CPU Analysis (WPA / WPT)
- Stopwatch Class