If You Have to Build Your Own Logger, What Is the Minimum You Actually Need? - Practical Requirements and Integration Test Checks

Windows Development, Logging, Integration Testing, Test Design, Reliability

If you can use an established logging framework, that is usually the safer choice. Even so, there are times when application constraints or operational requirements push you toward building a custom logger. The hard part is deciding how much is enough for a first version without ending up with something either too weak to trust or too heavy to maintain.

This article narrows the scope to diagnostic application logs. It does not try to solve audit trails, distributed tracing, metrics pipelines, or cloud aggregation all at once. The goal is a practical minimum that is useful during incident investigation, plus a short list of integration tests that make the implementation worth trusting.

Contents

  1. Conclusion Up Front
  2. Narrow the Scope First
  3. The Minimum Requirements
    • 1. Use UTF-8 JSON Lines
    • 2. Lock the Required Fields Early
    • 3. Default to One Process, One File
    • 4. Choose the Write Strategy by Volume
    • 5. Decide the Flush Rules
    • 6. Add Rotation and Retention from Version 1
    • 7. Do Not Silently Fail Over to a Mystery Location
  4. A Reasonable Version 1 Shape
  5. Common Mistakes
  6. Think About Integration Tests with Real Files, Threads, and Processes
  7. Integration Tests Worth Keeping
    • Single-Write Sanity
    • Concurrency Inside One Process
    • Flush and Shutdown Behavior
    • Rotation and Retention
    • Failure Cases
    • Multiple Processes
  8. Six Tests Worth Shipping First
  9. Summary

Conclusion Up Front

For a first release, the essentials are usually these.

  • Use UTF-8 JSON Lines
  • Keep one record per line
  • Require timestamp, level, category, message, fields, sessionId, and processId
  • Default to one process, one file
  • Use synchronous writes for low volume, or single writer + bounded queue for higher volume
  • Synchronously flush Error / Critical and session start / end events
  • Add rotation and retention from version 1
  • If the primary log location fails, surface that failure explicitly instead of silently writing somewhere else

That is enough to stay practical while avoiding the most common reliability problems.

Narrow the Scope First

Custom loggers get complicated when they are asked to solve too many problems at once. If you mix diagnostic logs, audit logs, performance counters, distributed traces, and user analytics into one design, the requirement list explodes immediately.

Here the target is much smaller: logs used to investigate application failures. In other words, you want to reconstruct when something happened, in which part of the application, and with what surrounding context. Once the scope is limited to that job, design decisions become much easier.

The Minimum Requirements

1. Use UTF-8 JSON Lines

Plain text concatenation is easy to start with, but it becomes awkward to process later. A heavy custom binary format has the opposite problem: it is harder to inspect during operations.

UTF-8 JSON Lines is a good middle ground. Each line is one record, which keeps the file human-readable and easy to parse from scripts and tools. If a write is interrupted, you can usually isolate the damage to a single line instead of losing the structure of the entire file.
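The article does not fix an implementation language, so here is a minimal sketch in Python of what "one record per line" buys you: each line round-trips independently through a standard JSON parser, and a damaged line would only invalidate itself.

```python
import json
import os
import tempfile

# Two records, written as UTF-8 JSON Lines: one json.dumps call per line
# keeps every record self-delimiting and independently parseable.
records = [
    {"level": "Info", "message": "startup"},
    {"level": "Error", "message": "disk probe failed"},
]

path = os.path.join(tempfile.mkdtemp(), "app.log")
with open(path, "w", encoding="utf-8") as f:
    for r in records:
        # ensure_ascii=False keeps non-ASCII text human-readable in the file
        f.write(json.dumps(r, ensure_ascii=False) + "\n")

# Reading back: each line is parsed on its own, so a truncated final line
# would fail in isolation without breaking the rest of the file.
with open(path, encoding="utf-8") as f:
    parsed = [json.loads(line) for line in f]
```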

2. Lock the Required Fields Early

A practical minimum is the following seven fields.

  • timestamp
  • level
  • category
  • message
  • fields
  • sessionId
  • processId

If you only keep a message string, you will regret it as soon as search or correlation requirements grow. If you add too many fields up front, every call site gets heavier than necessary. This smaller fixed set is a reasonable starting point.
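As a sketch, a small Python helper (the names `make_record` and `SESSION_ID` are illustrative, not part of any prescribed API) can pin those seven fields at every call site:

```python
import json
import os
import uuid
from datetime import datetime, timezone

# One sessionId per process lifetime; every record carries it for correlation.
SESSION_ID = str(uuid.uuid4())

def make_record(level, category, message, **fields):
    """Build one log record with the seven required fields."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "level": level,
        "category": category,
        "message": message,
        "fields": fields,        # structured context, not string concatenation
        "sessionId": SESSION_ID,
        "processId": os.getpid(),
    }

rec = make_record("Error", "storage", "write failed", attempt=3)
line = json.dumps(rec, ensure_ascii=False)
```

Because context goes into `fields` rather than the message string, later search and correlation can filter on `fields.attempt` without regex scraping.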

3. Default to One Process, One File

Letting multiple processes append to the same file creates more failure modes than it seems. Locking, partial writes, rotation timing, and crash behavior all become harder.

A safer default is one process, one file. If you later need aggregated logs, do the aggregation explicitly through a separate process or a downstream collector.

4. Choose the Write Strategy by Volume

At low volume, synchronous writes are often the best choice because the behavior is simple and easy to reason about. Premature async logging tends to create ambiguity around shutdown, flush timing, and lost final records.

If log volume becomes high enough that synchronous I/O is a bottleneck, switch to single writer + bounded queue. The key design decision is what happens when the queue is full. Decide that policy explicitly instead of leaving it accidental.
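A minimal sketch of the single writer + bounded queue shape, in Python with an assumed drop-and-count overflow policy (blocking the caller is the other common choice; the point is that the policy is written down):

```python
import queue
import threading

class QueuedWriter:
    """Single background writer fed by a bounded queue."""

    _SENTINEL = object()  # close marker pushed behind the last real record

    def __init__(self, sink, maxsize=1000):
        self._q = queue.Queue(maxsize=maxsize)
        self._sink = sink            # any callable that accepts one line
        self.dropped = 0             # visible counter for the drop policy
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def log(self, line):
        try:
            self._q.put_nowait(line)  # never block the calling thread
        except queue.Full:
            self.dropped += 1         # explicit, observable overflow policy

    def close(self):
        self._q.put(self._SENTINEL)   # writer drains everything queued first
        self._thread.join()

    def _run(self):
        while True:
            item = self._q.get()
            if item is self._SENTINEL:
                return
            self._sink(item)

out = []
w = QueuedWriter(out.append, maxsize=8)
for i in range(5):
    w.log(f"record {i}")
w.close()  # drains the queue before returning
```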

5. Decide the Flush Rules

Error, Critical, and session start / end events are usually worth flushing synchronously because they matter most during incident review. Treating every Info message the same way often hurts performance without adding much operational value.
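A level-based flush rule can be a few lines. This Python sketch flushes and fsyncs only the levels that matter for incident review; everything else rides on normal buffering:

```python
import os
import tempfile

# Levels that must be durable immediately; session start/end records
# would be routed through the same path.
FLUSH_LEVELS = {"Error", "Critical"}

def write_record(f, level, line):
    f.write(line + "\n")
    if level in FLUSH_LEVELS:
        f.flush()               # push the runtime buffer to the OS
        os.fsync(f.fileno())    # ask the OS to persist to disk

path = os.path.join(tempfile.mkdtemp(), "app.log")
with open(path, "w", encoding="utf-8") as f:
    write_record(f, "Info", "buffered info")
    write_record(f, "Error", "flushed error")
    # The Error record forced a flush, so both lines are already on disk
    # even though the file handle is still open.
    with open(path, encoding="utf-8") as r:
        visible = r.read().splitlines()
```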

6. Add Rotation and Retention from Version 1

Rotation is often postponed, but it quickly becomes an operational problem if you do. The exact policy can vary, but the system should at least avoid unbounded growth and define how many files are retained.
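A size-based variant can be sketched like this in Python. The naming scheme (`app.log.1`, `app.log.2`, ...) and the thresholds are illustrative; the two properties that matter are the size trigger and the retention limit:

```python
import os
import tempfile

def _rotated(directory, name):
    """Rotated files for `name`, oldest first by numeric suffix."""
    files = [f for f in os.listdir(directory) if f.startswith(name + ".")]
    return sorted(files, key=lambda f: int(f.rsplit(".", 1)[1]))

def rotate_if_needed(path, max_bytes, keep_files):
    """Rotate when the active file exceeds max_bytes; keep_files >= 1 assumed."""
    if not os.path.exists(path) or os.path.getsize(path) < max_bytes:
        return
    directory, name = os.path.split(path)
    suffixes = [int(f.rsplit(".", 1)[1]) for f in _rotated(directory, name)]
    os.rename(path, f"{path}.{max(suffixes, default=0) + 1}")
    # Retention: delete everything older than the newest keep_files files.
    for old in _rotated(directory, name)[:-keep_files]:
        os.remove(os.path.join(directory, old))

d = tempfile.mkdtemp()
path = os.path.join(d, "app.log")
for _ in range(4):
    with open(path, "a", encoding="utf-8") as f:
        f.write("x" * 100 + "\n")
    rotate_if_needed(path, max_bytes=50, keep_files=2)

remaining = sorted(os.listdir(d))  # only the two newest rotated files survive
```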

7. Do Not Silently Fail Over to a Mystery Location

If the intended log directory is unavailable, silently redirecting writes to another place usually makes incident response worse. The operator expects logs in one place and loses time when they are missing.

If logging fails, surface that failure clearly through the application, standard error, an event log, or another explicit mechanism. The worst outcome is logs that moved somewhere nobody knows to check.
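In code, the rule is simply: attempt the configured location, and on failure report loudly rather than pick a new location. A Python sketch (standard error is used here; an event log entry would serve the same purpose):

```python
import os
import sys
import tempfile

def write_line(path, line):
    """Append one line; on failure, report explicitly and return False."""
    try:
        with open(path, "a", encoding="utf-8") as f:
            f.write(line + "\n")
        return True
    except OSError as e:
        # Surface the failure where an operator will see it.
        # Deliberately no fallback directory: a hidden fallback is the
        # "mystery location" this section warns against.
        print(f"LOGGING FAILED for {path}: {e}", file=sys.stderr)
        return False

# A path whose parent directory does not exist triggers the failure branch.
missing = os.path.join(tempfile.mkdtemp(), "missing", "app.log")
ok = write_line(missing, "hello")
```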

A Reasonable Version 1 Shape

A practical first version is often no more than this.

  • UTF-8 JSON Lines
  • one process, one file
  • a session-oriented file name
  • size-based or startup-based rotation
  • a retention limit
  • synchronous flush for Error / Critical
  • an API that accepts structured fields

Anything beyond that should usually wait until real operational pain makes the next requirement obvious.

Common Mistakes

These are the mistakes worth avoiding.

  • putting all context into a single message string
  • sharing one file across multiple processes
  • making everything asynchronous without a clear flush policy
  • postponing rotation and retention
  • silently falling back to another folder when writes fail
  • adding network shipping or local database storage in version 1

All of these can feel convenient early on and become expensive during troubleshooting.

Think About Integration Tests with Real Files, Threads, and Processes

A logger is hard to trust if it is only covered by unit tests. Verifying string formatting or JSON serialization alone does not tell you much about the real operational risks, which usually live in file I/O, concurrency, rotation, shutdown, and permission failures.

That is why integration tests should use real files, real threads, and, when relevant, real processes. The goal is to avoid a logger that works in the happy path but cannot be trusted under failure conditions.

Integration Tests Worth Keeping

Single-Write Sanity

  • each line is exactly one JSON record
  • the file can be read back as UTF-8
  • every record contains the required fields
  • embedded newlines do not break record boundaries
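The checklist above fits in one small test against a real file. In this Python sketch the logger under test is stubbed with a direct `json.dumps` write; in a real suite you would call your logger's API instead:

```python
import json
import os
import tempfile

REQUIRED = {"timestamp", "level", "category", "message",
            "fields", "sessionId", "processId"}

path = os.path.join(tempfile.mkdtemp(), "app.log")
record = {
    "timestamp": "2024-01-01T00:00:00Z",
    "level": "Info",
    "category": "test",
    "message": "line one\nline two",   # embedded newline must stay one record
    "fields": {},
    "sessionId": "s-1",
    "processId": os.getpid(),
}
with open(path, "w", encoding="utf-8") as f:
    f.write(json.dumps(record, ensure_ascii=False) + "\n")

with open(path, encoding="utf-8") as f:   # file reads back as UTF-8
    lines = f.read().splitlines()

assert len(lines) == 1                    # JSON escaped the embedded newline
parsed = json.loads(lines[0])             # exactly one JSON record per line
assert REQUIRED <= set(parsed)            # all required fields are present
assert parsed["message"] == "line one\nline two"
```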

Concurrency Inside One Process

  • simultaneous writes from multiple threads do not corrupt records
  • record counts match expectations
  • when a queue is used, ordering or drop behavior matches the documented policy
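The first two checks can be sketched with real threads. This Python example serializes appends through a lock (a stand-in for whatever synchronization your logger uses) and then verifies that every line still parses and the count matches:

```python
import json
import os
import tempfile
import threading

path = os.path.join(tempfile.mkdtemp(), "app.log")
lock = threading.Lock()

def worker(tid, n):
    for i in range(n):
        line = json.dumps({"thread": tid, "seq": i})
        with lock, open(path, "a", encoding="utf-8") as f:
            f.write(line + "\n")

threads = [threading.Thread(target=worker, args=(t, 50)) for t in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Any interleaved or torn record would fail to parse here.
with open(path, encoding="utf-8") as f:
    records = [json.loads(line) for line in f]

assert len(records) == 4 * 50   # record count matches expectations
```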

Flush and Shutdown Behavior

  • Error / Critical records become visible immediately
  • normal shutdown drains the queue
  • important end-of-session records survive shutdown paths triggered by exceptions

Rotation and Retention

  • rotation happens when the threshold is reached
  • old files are removed according to the retention rule
  • records written around the rotation boundary still remain valid JSON lines

Failure Cases

  • behavior when the target directory does not exist
  • behavior when write permission is missing
  • behavior when a disk-full-like error occurs
  • behavior when the bounded queue overflows

Multiple Processes

If the specification says one process, one file, you can test that another process does not join the same file. If you use an aggregation process instead, test delivery failures on that handoff path as well.

Six Tests Worth Shipping First

If you try to automate every scenario in the first pass, the suite can become heavier than the logger itself. A pragmatic starting set is the following six tests.

  1. normal single-threaded write
  2. concurrent multi-threaded write
  3. synchronous flush for Error / Critical
  4. rotation and retention
  5. explicit failure when the target path is unavailable
  6. drain and final flush on normal shutdown

That small set already moves the logger a long way from “it prints strings” to something operationally dependable.

Summary

The first goal of a custom logger is not feature richness. It is trust during failure investigation. That usually means fixing the format to UTF-8 JSON Lines, limiting the required fields, defaulting to one process, one file, and deciding flush, rotation, retention, and failure behavior early.

Then verify those decisions with integration tests that use real files, threads, and processes. If you establish the minimum design and the minimum test set first, the logger becomes much easier to grow later without turning into a maintenance burden.
