Where Exceptions Should Be Caught, Logged, and Handled - A Practical Guide to Boundaries and Responsibilities in the Call Hierarchy
Contents
- 1. The short answer
- 2. Catching, logging, and error handling are different things
- 2.1. Catching
- 2.2. Logging
- 2.3. Error handling
- 2.4. Translating exceptions
- 3. The first decision table to look at
- 4. What to do at each level of the call hierarchy
- 4.1. The deepest helper / utility / private method
- 4.2. External I/O boundaries: Repository / Gateway / SDK wrapper
- 4.3. Application Service / UseCase
- 4.4. UI / HTTP / Job / Message boundaries
- 4.5. The final unhandled-exception handler
- 4.6. Looking at one call hierarchy end to end
- Save button → SaveOrderUseCase → PaymentGateway → HTTP
- 5. Separate expected failures from unexpected exceptions
- 6. Where to log, and how many times
- 7. Common anti-patterns
- 7.1. catch (Exception) in a deep layer that returns null / false
- 7.2. Writing Error in every layer and then rethrowing
- 7.3. Library layers or shared components directly showing UI
- 7.4. Logging OperationCanceledException as an outage
- 7.5. Retrying lightly when there are external side effects
- 7.6. Trying to recover from everything in the final unhandled-exception handler
- 8. Review checklist
- 9. A rough quick guide
- 10. Summary
- 11. References
- 12. Related Articles
1. The short answer
- As a rule, do not broadly catch in deep layers. Put catch closer to the boundary where you can define the unit of failure.
- For logs, the default should be one primary log for one failure. If every layer keeps writing Error for the same exception, the reader loses.
- The responsibility of the deepest layer is cleanup, local rollback, exception translation, and limited retry only when appropriate. If it rethrows, it normally does not write the primary log there.
- Processing boundaries such as one screen action, one HTTP request, one job run, or one message handling unit are usually the most natural places for the primary log.
- Expected failures should be converted into results at the unit of the use case. You do not have to keep throwing everything upward as an exception forever.
- AppDomain.UnhandledException, WPF’s DispatcherUnhandledException, WinForms ThreadException, ASP.NET Core exception handlers, and the host’s final exception handling are the last place to record, rather than the main place to recover.
- User cancellation or shutdown-related OperationCanceledException is normally not treated as Error.
- When in doubt, check in this order.
  - Can this place really make the decision?
  - Is the failed unit visible here?
  - Can the state be rolled back or rebuilt here?
  - If I log here, will the same exception also be logged above?
The core idea is simple: do not catch where you merely can catch; catch where you can make a responsible decision.
2. Catching, logging, and error handling are different things
2.1. Catching
catch means receiving an exception once and changing control flow because of it.
But that is not recovery by itself.
For example, even if a lower method catches an exception, it may still not know:
- what should be shown to the user
- whether the whole screen should stop or only this operation should fail
- whether the current request or job may continue
If it cannot answer those questions, that place is often not a good place for catch.
2.2. Logging
Logging is not only about recording that an exception happened. It is about recording which piece of work failed so you can trace it later.
That is why good logging points usually have one or more of these:
- requestId / traceId
- userId
- orderId / fileId / batchId
- which input item failed
- which UI action it was
- which queue or which message it was
Deep helpers and common functions often know the technical detail but do not have that operational context.
So the place that knows the technical details and the place that knows the operational context are often different places.
2.3. Error handling
Here, “error handling” means things like:
- showing an error message on the screen
- returning 4xx / 5xx in HTTP
- failing just one item and moving on to the next
- reinitializing a subsystem
- stopping the process and letting restart policy take over
- releasing resources and exiting safely
In other words, it means deciding the visible shape of failure for the caller or the user.
2.4. Translating exceptions
In real systems, there is one more important step between catch and “handle it.”
That step is translation.
For example:
- HttpRequestException
- IOException
- JsonException
- database-driver-specific exceptions
- vendor-SDK-specific exceptions
If those leak directly into a UI or Controller, higher layers start learning lower-layer implementation details.
So at a boundary, it is often better to translate them into failures that make sense at that layer, such as:
- “Could not connect to the payment service”
- “The CSV format was invalid”
- “Could not write to the save destination”
- “The device response was invalid”
The key point is that translation is not the same thing as logging.
If you only translate and rethrow, that is normally not the place for the main log.
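As a rough illustration of translating without logging, here is a minimal sketch. The exception types PaymentServiceUnavailableException and InvalidPaymentResponseException and the helper PaymentGatewayTranslation are hypothetical names invented for this example; only HttpRequestException and JsonException are real .NET types.

```csharp
using System;
using System.Net.Http;
using System.Text.Json;

// Hypothetical layer-level exceptions (not part of .NET).
public sealed class PaymentServiceUnavailableException : Exception
{
    public PaymentServiceUnavailableException(string message, Exception inner)
        : base(message, inner) { }
}

public sealed class InvalidPaymentResponseException : Exception
{
    public InvalidPaymentResponseException(string message, Exception inner)
        : base(message, inner) { }
}

public static class PaymentGatewayTranslation
{
    // Runs a lower-level call and translates implementation-specific
    // exceptions into failures that make sense to the caller.
    // It deliberately writes no log: the rethrown exception carries the
    // original as InnerException, and an upper boundary owns the primary log.
    public static T Translate<T>(Func<T> call)
    {
        try
        {
            return call();
        }
        catch (HttpRequestException ex)
        {
            throw new PaymentServiceUnavailableException(
                "Could not connect to the payment service", ex);
        }
        catch (JsonException ex)
        {
            throw new InvalidPaymentResponseException(
                "The payment service response was invalid", ex);
        }
    }
}
```

Any exception type not listed here passes through untouched, which keeps the catch narrow.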
3. The first decision table to look at
It becomes much easier to stay organized if you first decide the broad policy with a table like this.
| Place | Basic policy | Primary log | Main responsibility |
|---|---|---|---|
| helper / utility / private method | As a rule, do not broadly catch | none | cleanup in finally, local rollback, minimum context addition |
| Repository / Gateway / SDK wrapper | catch only specific exceptions | usually no | exception translation, limited retry, disposing broken connections or handles |
| Application Service / UseCase | turn expected failures into results | if swallowed, only as needed here | define the unit of failure, allow partial failure, make use-case-level decisions |
| UI / Controller / API / Job / Message boundary | main receiver of unexpected exceptions | often here | user response, HTTP response, continue-next-item or abort decision |
| unhandled-exception handler / host final boundary | last line of defense for missed cases | Critical | final recording, flush, dump, exit / restart path |
Visually, it usually looks like this:
flowchart TD
A["An exception happened"] --> B{"Can this place decide retry / result conversion / continue-or-stop?"}
B -- "No" --> C["As a rule, do not catch it here; send it upward"]
B -- "Yes" --> D{"Is this a layer boundary?"}
D -- "No" --> E["Only local cleanup"]
D -- "Yes" --> F["Translate into a meaningful exception if needed"]
E --> G{"Can the unit of failure and operational context be identified here?"}
F --> G
G -- "No" --> H["Do not write the primary log here; send upward"]
G -- "Yes" --> I["Write one primary log and decide the response"]
I --> J["Exit / reinitialize / continue next item if needed"]
There are two key ideas in that diagram.
- The first reason to catch is recovery or cleanup, not logging
- The first reason to log is that the operational context is complete, not simply that you noticed an exception
4. What to do at each level of the call hierarchy
4.1. The deepest helper / utility / private method
At this level, the default rule is do not catch broadly.
For places such as string conversion, parsing, calculations, internal formatting, or shared helpers, the code usually cannot decide:
- what screen action this belonged to
- what request it belonged to
- whether only this operation should fail
- whether the whole screen should stop
What this layer is allowed to do is mostly:
- release resources in finally
- roll back local state that it partially broke
- add only minimal context to the exception message
- replace it with a more suitable exception type
- dispose objects that are no longer reusable
Things better avoided here are:
- catch (Exception) and return null / false / an empty array
- show a MessageBox here
- write an Error log here and then rethrow
- “just continue somehow” when the state cannot really be restored
An especially dangerous pattern is continuing to use an object after a failure that happened halfway through mutating its internal state.
If the state can be restored locally, restore it. If not, assume the object must be discarded and recreated.
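The "restore locally, do not catch broadly" idea can be sketched as follows. PriceTable and ApplyAll are hypothetical names made up for this example; the point is only that the helper uses finally to roll back its own state and lets the exception propagate without logging.

```csharp
using System;
using System.Collections.Generic;

public sealed class PriceTable
{
    private Dictionary<string, decimal> _prices = new Dictionary<string, decimal>();

    public int Count => _prices.Count;

    // Applies all updates or none. There is no broad catch and no logging:
    // the helper only restores its own state and lets the exception travel upward.
    public void ApplyAll(IEnumerable<KeyValuePair<string, decimal>> updates)
    {
        var snapshot = new Dictionary<string, decimal>(_prices);
        var committed = false;
        try
        {
            foreach (var update in updates)
            {
                if (update.Value < 0)
                {
                    throw new ArgumentOutOfRangeException(
                        nameof(updates), $"Negative price for '{update.Key}'.");
                }
                _prices[update.Key] = update.Value;
            }
            committed = true;
        }
        finally
        {
            // Local rollback: undo the partial mutation if anything failed.
            if (!committed)
            {
                _prices = snapshot;
            }
        }
    }
}
```

A snapshot copy is the simplest rollback to reason about; for large state, discarding and recreating the object may be the more practical choice.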
4.2. External I/O boundaries: Repository / Gateway / SDK wrapper
This is a layer where the reasons to catch are much clearer.
Why? Because implementation-specific details from lower layers surface here:
- database-driver exceptions
- HTTP communication exceptions
- file I/O exceptions
- COM / P/Invoke / vendor SDK specific exceptions
- parser or serializer exceptions
Typical responsibilities at this layer are these four:
- Catch specific exceptions
  Catch meaningful concrete exceptions rather than a wide Exception.
- Translate them into meaningful failures
  So that upper layers do not need to know lower-layer implementation details directly.
- If local retry is appropriate, do it here
  But only under fairly strict conditions:
  - the failure is known to be transient
  - the operation is idempotent
  - retry count and delay policy are defined
  - final behavior after failure is clear
  Retry belongs here only when those conditions are met.
- Throw away broken connections or handles
  In many cases, “recreate the connection” is safer than “keep using the same object next time.”
For logging at this layer, the following policy helps avoid confusion:
- If you rethrow upward, normally do not write the primary log here
- If this layer swallows the exception and converts it to a result, then this layer owns the necessary log or metric
- During retry, keep individual attempts at Debug / Information / Warning, and record only the final failure at a higher level
This layer is usually where translation happens, not where the final visible decision is made.
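A bounded retry under the conditions listed above might look like the following sketch. TransientFailureException and BoundedRetry are hypothetical names for this example, and the caller is assumed to pass only idempotent operations; nothing here comes from a real retry library.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical marker for failures the gateway knows to be transient.
public sealed class TransientFailureException : Exception
{
    public TransientFailureException(string message) : base(message) { }
}

public static class BoundedRetry
{
    // Retries an idempotent operation up to maxAttempts times.
    // Only TransientFailureException is retried; every other exception,
    // including OperationCanceledException, propagates immediately.
    public static async Task<T> RunAsync<T>(
        Func<CancellationToken, Task<T>> idempotentOperation,
        int maxAttempts,
        TimeSpan delay,
        CancellationToken ct)
    {
        if (maxAttempts < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        }

        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await idempotentOperation(ct);
            }
            catch (TransientFailureException) when (attempt < maxAttempts)
            {
                // An individual attempt would be noted at Debug / Warning here,
                // not as the primary Error log.
                await Task.Delay(delay, ct);
            }
            // On the last attempt the filter is false, so the exception
            // leaves this method and the caller decides what the failure means.
        }
    }
}
```

The exception filter keeps the final failure untouched, so the upper boundary still sees the original exception and can write the one primary log.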
4.3. Application Service / UseCase
This is the layer that decides how this unit of work should fail.
Examples include:
- a save operation
- finalizing an order
- importing a CSV
- processing one batch item
- applying one message
These are coherent use-case-level units.
This layer can make decisions such as:
- a validation error should fail only this time
- NotFound should become something equivalent to 404
- a business-rule violation should be returned for user correction
- one invalid CSV row should be recorded as Warning and processing should continue
- a temporary external-service failure should fail the whole operation
- partial work should be discarded and retried from the beginning
In other words, this is where the unit of failure can often be defined.
Typical good uses for this layer are:
- converting expected failures into Result objects or failure DTOs
- aggregating partial failures
- deciding how many failures may be tolerated before continuing
- converting to error codes or user-message keys
What this layer should usually avoid is bringing in too much UI rendering or HTTP response body construction.
It is cleaner if this layer decides use-case meaning, and leaves final presentation to the outer boundary.
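Converting expected failures into results could be sketched like this. SaveOrderResult, SaveOrderError, PaymentRejectedException, the message keys, and the Action<decimal> standing in for the payment gateway are all hypothetical and exist only for illustration.

```csharp
using System;

public enum SaveOrderError { None, ValidationFailed, PaymentRejected }

// Minimal result type; a hypothetical shape, not a library type.
public sealed class SaveOrderResult
{
    private SaveOrderResult(bool succeeded, SaveOrderError error, string messageKey)
    {
        Succeeded = succeeded;
        Error = error;
        MessageKey = messageKey;
    }

    public bool Succeeded { get; }
    public SaveOrderError Error { get; }
    public string MessageKey { get; }

    public static SaveOrderResult Ok() =>
        new SaveOrderResult(true, SaveOrderError.None, "");

    public static SaveOrderResult Fail(SaveOrderError error, string messageKey) =>
        new SaveOrderResult(false, error, messageKey);
}

// Hypothetical exception the gateway throws for a business-level rejection.
public sealed class PaymentRejectedException : Exception
{
    public PaymentRejectedException(string message) : base(message) { }
}

public sealed class SaveOrderUseCase
{
    private readonly Action<decimal> _charge; // stands in for the payment gateway

    public SaveOrderUseCase(Action<decimal> charge) => _charge = charge;

    public SaveOrderResult Execute(decimal amount)
    {
        if (amount <= 0)
        {
            return SaveOrderResult.Fail(SaveOrderError.ValidationFailed, "order.amount.invalid");
        }

        try
        {
            _charge(amount);
            return SaveOrderResult.Ok();
        }
        catch (PaymentRejectedException)
        {
            // Expected failure: becomes a result with a message key.
            // Unexpected exceptions are not caught here and reach the outer boundary.
            return SaveOrderResult.Fail(SaveOrderError.PaymentRejected, "payment.rejected");
        }
    }
}
```

The use case returns a message key rather than display text, so the UI or Controller still owns the final presentation.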
4.4. UI / HTTP / Job / Message boundaries
This is where the primary log point often lives in real applications.
Examples include:
- one click on a Save button in WinForms / WPF
- one HTTP request in ASP.NET Core
- one worker message
- one input item in a batch
- one scheduled job run
This location knows:
- what operation it was
- who initiated it
- which item number it was
- which request / batch / message it was
- what should be returned to the user or caller if it fails
That makes it a natural place to:
- catch unexpected exceptions broadly
- write one primary log with context
- convert into an error dialog, HTTP 500, Problem Details, job failure, or continue-next-item behavior
The important point is not merely that it catches broadly, but that what should be returned after catching is already defined here.
For batch or queue processing, it often helps to separate two levels:
- Catch at the one-item boundary
  Decide whether one failed item should be skipped and the next item should continue
- Do not broadly swallow at the parent loop
  If the parent loop dies from an unexpected exception, prefer process-level restart
“Fail one item and continue” and “the parent loop survives every unexpected exception silently” are very different designs.
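The difference can be made concrete with a small sketch of a one-item boundary. BatchRunner and its logError callback are hypothetical; the callback stands in for whatever logger the application really uses.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class BatchRunner
{
    // Processes items one by one and returns the number of failed items.
    // The catch sits at the one-item boundary, where the failed unit
    // (batchId plus item index) is known.
    public static int Run<T>(
        string batchId,
        IReadOnlyList<T> items,
        Action<T> processItem,
        Action<string, Exception> logError,
        CancellationToken ct)
    {
        var failed = 0;
        for (var i = 0; i < items.Count; i++)
        {
            // Cancellation is checked outside the item try block,
            // so it stops the loop instead of counting as an item failure.
            ct.ThrowIfCancellationRequested();
            try
            {
                processItem(items[i]);
            }
            catch (Exception ex) when (!(ex is OperationCanceledException))
            {
                failed++;
                // One primary log for this failed item, with its context.
                logError($"batchId={batchId} itemIndex={i} failed", ex);
            }
        }
        return failed;
    }
}
```

The loop itself has no surrounding catch, so a failure outside item processing still ends the run and leaves the decision to the job boundary or restart policy.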
4.5. The final unhandled-exception handler
This is the last line of defense.
It is not a magical recovery point.
Typical examples are:
- AppDomain.UnhandledException
- WPF Application.DispatcherUnhandledException
- WinForms Application.ThreadException
- ASP.NET Core exception middleware or handlers
- the final exception handling around Generic Host / worker / BackgroundService
Its main responsibilities are things like:
- final logging
- flushing
- arranging dump collection
- storing session or near-last context
- setting exit codes and restart paths
It is also better not to expect too much from it:
- an exception that reaches this point usually means a case was missed in the layers above
- the state may already be corrupted
- locks may still be held, so heavy work is dangerous
- even if the app appears able to continue, that does not mean it is safe to continue
There are also practical .NET-specific points worth keeping in mind:
- AppDomain.UnhandledException is for notification and recording of an unhandled exception. Putting too much recovery logic after that point is risky.
- In WPF DispatcherUnhandledException, there is a path where Handled = true keeps the app alive, but the first question is whether recovery is actually safe.
- WinForms ThreadException can also leave the application in an unknown state even after handling.
- ASP.NET Core exception middleware needs to be placed early enough in the pipeline to receive downstream exceptions.
- Since .NET 6, an unhandled exception in BackgroundService is logged and, by default, stops the host (BackgroundServiceExceptionBehavior.StopHost). In many cases, stopping and letting restart policy take over is safer than broadly swallowing everything in the parent loop.
Desktop applications in particular often do have a path that appears to “catch and continue” after an unhandled exception.
But being able to continue and being safe to continue are not the same thing.
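A final handler that only records might be sketched like this. FinalExceptionHandler, Record, and Install are hypothetical names, and the TextWriter sink is an assumption standing in for a real log target; AppDomain.CurrentDomain.UnhandledException and its ExceptionObject / IsTerminating properties are the actual .NET API.

```csharp
using System;
using System.IO;

public static class FinalExceptionHandler
{
    // Writes one last record and flushes. Kept deliberately small:
    // no recovery, no UI, no calls into application state that may be corrupted.
    public static void Record(object exceptionObject, bool isTerminating, TextWriter sink)
    {
        sink.WriteLine(
            $"CRITICAL unhandled exception (terminating={isTerminating}): {exceptionObject}");
        sink.Flush();
    }

    // Registers the recorder for the current AppDomain.
    public static void Install(TextWriter sink)
    {
        AppDomain.CurrentDomain.UnhandledException += (sender, e) =>
            Record(e.ExceptionObject, e.IsTerminating, sink);
    }
}
```

Calling FinalExceptionHandler.Install(Console.Error) once at startup would be one way to wire it up; dump collection and restart are better left to the host or operating system than to code running inside this handler.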
4.6. Looking at one call hierarchy end to end
For example, imagine a flow like this:
flowchart LR
A["UI / Controller / Job boundary"] --> B["Application Service / UseCase"]
B --> C["Domain / business logic"]
C --> D["Repository / Gateway / SDK wrapper"]
D --> E["DB / HTTP / File / Vendor SDK"]
In that case, the roles usually separate roughly like this.
Save button → SaveOrderUseCase → PaymentGateway → HTTP
- PaymentGateway
  - catches communication failures and invalid response formats
  - translates them into something like “payment service connection failed” or “invalid payment service response”
  - performs retry here only when the conditions justify it
  - if it rethrows, it normally does not write the primary log
- SaveOrderUseCase
  - turns expected failures such as payment rejection into results
  - treats the failure as “only this order finalization failed”
  - shapes the failure so UI or API layers can return it cleanly
- UI button handler / Controller
  - broadly catches unexpected exceptions
  - writes the primary log with orderId, userId, and requestId
  - converts the failure into a dialog, 500, or 503 response
- Unhandled-exception handler
  - records only what leaked that far
  - performs dump collection or final flush
  - prioritizes the exit path rather than recovery
With that split, technical details stay closed lower down, operational context is attached higher up, and decisions are made at boundaries.
5. Separate expected failures from unexpected exceptions
The most important thing in this whole topic is not treating everything as the same kind of “exception.”
It helps to separate them roughly like this:
| Kind of failure | First place to handle it | Typical treatment |
|---|---|---|
| validation issue | UseCase / request boundary | return as an input error |
| NotFound / Conflict | UseCase / Controller | 404 / 409 or screen message |
| user cancel / shutdown | operation boundary | cancellation; usually not an Error |
| one invalid CSV row | one-item boundary | record as Warning and continue |
| transient timeout that still ends in failure | I/O boundary to request boundary | fail after retry |
| NullReferenceException, broken assumptions | request / job boundary | primary log and failure response |
| AccessViolationException, severe OutOfMemoryException, or native-boundary corruption smell | final boundary | treat as Critical and move toward shutdown |
Expected failures are failures you can design for ahead of time.
Unexpected exceptions are failures where it is questionable whether the state should still be trusted afterward.
Separating those two alone reduces problems like:
- logging NotFound as Error every time
- treating user cancellation as an outage
- letting truly dangerous broken-assumption failures continue as if they were only “this request failed”
6. Where to log, and how many times
When designing logs, it is often more important to decide who owns the primary log than to decide the exact place of catch.
The basic rules are:
- One primary Error / Critical log for one failure
- Lower layers add translation and context only when needed
- Upper boundaries write the primary log with the unit of failure and operational context
- Only a layer that swallows the failure fully owns the responsibility to record that swallowed failure
- Expected failures should not always become Error
- OperationCanceledException should be separated from ordinary failure logs
A rough table of logging points looks like this:
| Situation | Main logging place | Typical level | Note |
|---|---|---|---|
| validation error | request / use-case boundary | Information or no log | not an outage, but a contract failure |
| user cancel / shutdown | operation boundary | Debug / Information | normally not Error |
| transient failure during retry | the layer that owns retry | Debug / Warning | do not make noise before final failure |
| failure after all retries are exhausted | request / job boundary, or the layer that swallows it | Warning / Error | record with the unit of failure |
| one bad row and continue | item boundary | Warning | include fileId and rowNumber |
| unexpected exception that kills a whole request | request / UI / job boundary | Error | include requestId, userId, entityId |
| process-ending severity | unhandled-exception boundary | Critical | flush, dump, restart path |
In practice, one of the most common problems is duplicate logging like this:
- Repository writes Error
- Service writes Error for the same exception
- Controller writes Error again
- the final unhandled-exception handler also writes Critical
Then one outage produces multiple copies of the same stack trace.
What the operator actually needs is not four copies of the same stack trace, but one primary log and, at most, a small number of supporting logs.
Another way to say it is: one log, as much context as needed.
7. Common anti-patterns
7.1. catch (Exception) in a deep layer that returns null / false
This tends to erase the real cause.
It also makes the caller unable to tell whether “the data truly was not there” or “something broke halfway through.”
7.2. Writing Error in every layer and then rethrowing
This is one of the biggest causes of duplicate logs.
If you split responsibilities into:
- lower layers translate
- upper boundaries write the main log
the noise drops a lot.
In C#, if you rethrow, the basic form is throw; so you do not damage the stack trace.
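The difference between the two rethrow forms can be observed with a small sketch. RethrowDemo and its members are hypothetical names used only for this demonstration; NoInlining is applied so the frames stay visible in the trace.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class RethrowDemo
{
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void Fail() =>
        throw new InvalidOperationException("original failure");

    // Rethrows with "throw;", which keeps the original stack trace.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void RethrowPreserving()
    {
        try { Fail(); }
        catch (InvalidOperationException) { throw; }
    }

    // Rethrows with "throw ex;", which restarts the trace at this method.
#pragma warning disable CA2200 // intentionally showing the form to avoid
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void RethrowResetting()
    {
        try { Fail(); }
        catch (InvalidOperationException ex) { throw ex; }
    }
#pragma warning restore CA2200

    // Runs the action and returns the stack trace of whatever it throws.
    public static string TraceOf(Action action)
    {
        try
        {
            action();
            return "";
        }
        catch (Exception ex)
        {
            return ex.StackTrace ?? "";
        }
    }
}
```

TraceOf(RethrowDemo.RethrowPreserving) still includes the Fail frame where the exception originated, while TraceOf(RethrowDemo.RethrowResetting) starts at RethrowResetting, so the origin is lost to whoever reads the primary log.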
7.3. Library layers or shared components directly showing UI
If a shared component opens a MessageBox or directly decides an HTTP response body, both reuse and responsibility boundaries collapse.
Lower layers are safer when they stop at returning or throwing a meaningful failure.
7.4. Logging OperationCanceledException as an outage
Cancellation is part of control flow.
If you write Error every time, real failures get buried.
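One way to keep cancellation out of the error path is a separate catch with an exception filter, sketched below. OperationBoundary, Outcome, and the two log callbacks are hypothetical and stand in for a real logger; the filter treats the exception as cancellation only when the token given to this boundary was actually cancelled.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public enum Outcome { Completed, Cancelled, Failed }

public static class OperationBoundary
{
    // Runs one operation and classifies how it ended.
    public static async Task<Outcome> RunAsync(
        Func<CancellationToken, Task> operation,
        CancellationToken ct,
        Action<string> logInformation,
        Action<string, Exception> logError)
    {
        try
        {
            await operation(ct);
            return Outcome.Completed;
        }
        catch (OperationCanceledException) when (ct.IsCancellationRequested)
        {
            // Requested cancellation is control flow, not an outage.
            logInformation("Operation cancelled by request");
            return Outcome.Cancelled;
        }
        catch (Exception ex)
        {
            // Everything else, including an OperationCanceledException that
            // nobody requested (for example an internal timeout), is a failure.
            logError("Operation failed", ex);
            return Outcome.Failed;
        }
    }
}
```

The filter matters because some libraries surface timeouts as OperationCanceledException too, and those usually deserve to be looked at as failures.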
7.5. Retrying lightly when there are external side effects
For things like email sending, billing, device commands, or file moves, doing the same operation one more time can easily cause damage.
Retry belongs only where transient failure and idempotency are both visible.
7.6. Trying to recover from everything in the final unhandled-exception handler
That place is only the last insurance policy.
It should not become the center of the design.
Recovery strategy is usually safer when it lives one step earlier, at the request / job / subsystem boundary.
8. Review checklist
When reviewing exception handling, it helps to look through the following in order:
- Can this catch be explained in one sentence as what decision it exists to make?
- Can this place really decide retry, result conversion, continue-or-stop behavior, or user response?
- If it logs here, will the same failure also be logged as Error above?
- Are lower-layer-specific exceptions translated into meaningful failures at the boundary?
- Can this place restore corrupted state? If not, is discard-and-recreate the assumption?
- Is OperationCanceledException separated from normal failure?
- Is it clear whether continuation happens per item, per request, or only after process restart?
- Is the final unhandled-exception handler treated as a recording point rather than a recovery point?
- Does the log include the failure-unit context such as requestId, userId, batchId, fileId, or rowNumber?
- Are “expected failures” and “broken assumptions” being treated differently?
An especially effective question is: “What exactly does this catch decide?”
If that cannot be answered clearly, the catch is often unnecessary or too deep.
9. A rough quick guide
Finally, in a much shorter form, the split usually looks like this:
| Situation | catch | Logging | Error handling |
|---|---|---|---|
| helper / utility | usually no | no | no |
| Repository / Gateway / SDK wrapper | catch only specific exceptions | usually not the primary log | translation, local retry, disposing connections |
| UseCase / Application Service | catch expected failures | only if swallowing as needed | result conversion, partial failure handling |
| UI / Controller / request / item / job boundary | catch unexpected exceptions broadly | primary log | response, message, continue / abort |
| unhandled-exception handler | only what leaked through | Critical | final recording, exit path |
When you are unsure, these five rules are usually enough:
- Do not broadly swallow in deep layers
- Catch at boundaries
- Write the primary log once
- The layer that swallows owns the responsibility
- The final unhandled exception is for recording and exit routing
10. Summary
Exception handling is not a story of “you can catch anywhere, so you should catch anywhere.”
In practice, this order of questions is usually enough:
- Can this place really make the decision?
- Is the unit of failure visible here?
- Can the state be restored or rebuilt here?
- Will logging here create duplicates?
- Is this a recovery point, or only the last recording point?
Once you look in that order, the call hierarchy becomes much easier to organize.
The three most important ideas are these:
- deep layers mainly translate and clean up
- boundaries mainly decide and write the primary log
- the final unhandled-exception handler mainly records and routes termination
Put differently,
exceptions should be caught at boundaries, enriched with context, and only fully handled where recovery is actually possible.
Once that is decided, both code reviews and incident investigations become much more consistent.
11. References
- .NET: Best practices for exceptions
- .NET: System.AppDomain.UnhandledException event
- WPF: Application.DispatcherUnhandledException event
- Windows Forms: Application.ThreadException event
- Handle errors in ASP.NET Core
- ASP.NET Core middleware
- Use BackgroundService to create Windows Services
12. Related Articles
- Checklist for Unexpected Exceptions - Should the App Exit or Continue? A Practical Decision Table
- If You Have to Build Your Own Logger, What Is the Minimum You Actually Need? - Practical Requirements and Integration Test Checks
- What the .NET Generic Host Is - DI, Configuration, Logging, and BackgroundService Explained
- Where Unit Tests End and Integration Tests Begin - A Practical Boundary Guide