
This Concurrency Bug Stayed Hidden for a Year

A subtle race condition in parallel batch processing caused silent data corruption despite using thread-safe collections. The bug surfaced only under specific runtime timing.

By Harshit Singhal | 2026

We had a background job that processed thousands of records in parallel. Each batch ran concurrently, and we kept track of total successful and failed records.

Everything worked perfectly.

For almost a year.

Then one day, the totals started becoming… wrong.

The Setup

  • Records processed in chunks
  • Multiple chunks running concurrently
  • Shared counters tracking totals
  • Periodic database updates with progress

The Symptom

  • Some runs showed fewer successful records than expected
  • Re-running the same data produced different counts
  • The issue appeared only in one environment

What Was Actually Happening

Initial total = 10
Worker A reads total (10)
Worker B reads total (10)

Worker A increments → 11
Worker B increments → 11

Final total = 11 (should be 12)

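This is a classic lost-update race condition: both workers read the same starting value, so the second write overwrites the first and one increment disappears.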
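
The interleaving above can be replayed deterministically on a single thread by writing out the separate read and write steps that an increment performs. This is only an illustrative sketch; the class and variable names are made up for the example.

```csharp
using System;

static class LostUpdateReplay
{
    // Replays the trace on one thread: each "worker" reads first, then writes.
    public static int Run()
    {
        int total = 10;

        int seenByA = total;   // Worker A reads 10
        int seenByB = total;   // Worker B reads 10

        total = seenByA + 1;   // Worker A writes 11
        total = seenByB + 1;   // Worker B also writes 11, overwriting A's update

        return total;
    }

    static void Main()
    {
        Console.WriteLine(Run()); // prints 11, not 12
    }
}
```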

The Buggy Code

int totalSuccess = 0;

Parallel.ForEach(records, record =>
{
    if (Process(record))
    {
        totalSuccess++; // not atomic: a separate read, add, and write
    }
});
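
A minimal harness for reproducing the symptom, assuming a trivial stand-in for the real work (the hypothetical `Process` here always succeeds, so every record should be counted). Because the race depends on timing, the printed count varies by run and machine, and on some runs it may even match the expected value.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

static class RacyCounterDemo
{
    // Hypothetical stand-in for the real per-record work; always succeeds.
    static bool Process(int record) => true;

    public static int CountWithRace(int recordCount)
    {
        int[] records = Enumerable.Range(0, recordCount).ToArray();
        int totalSuccess = 0;

        Parallel.ForEach(records, record =>
        {
            if (Process(record))
            {
                totalSuccess++; // unsynchronized read-modify-write
            }
        });

        return totalSuccess;
    }

    static void Main()
    {
        const int expected = 1_000_000;
        int actual = CountWithRace(expected);

        // Any shortfall here is the number of lost updates in this run.
        Console.WriteLine($"expected {expected}, counted {actual}");
    }
}
```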

Why volatile Alone Doesn't Fix It

private static volatile int totalSuccess = 0;

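Marking the field volatile ensures that each thread sees the latest written value, but it does not make the increment atomic. totalSuccess++ is still a separate read, add, and write, so two threads can interleave and lose an update.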
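
To make the gap concrete, the sketch below spells out roughly what an increment of a volatile field amounts to, next to a compare-and-swap retry loop that only publishes its result if nobody else changed the value in the meantime. Both helper names are hypothetical and exist only for illustration; in real code `Interlocked.Increment` does the job of the second one.

```csharp
using System;
using System.Threading;

static class IncrementSteps
{
    // Roughly what `field++` does on a volatile int: three separate steps.
    // Another thread can write between the read and the write.
    public static int NonAtomicIncrement(ref int location)
    {
        int seen = Volatile.Read(ref location);   // 1. read the current value
        int updated = seen + 1;                   // 2. compute
        Volatile.Write(ref location, updated);    // 3. write, possibly over a newer value
        return updated;
    }

    // Atomic version: store the result only if the value is still the one
    // we read; otherwise retry with the fresh value.
    public static int AtomicIncrement(ref int location)
    {
        while (true)
        {
            int seen = Volatile.Read(ref location);
            int updated = seen + 1;
            if (Interlocked.CompareExchange(ref location, updated, seen) == seen)
            {
                return updated;
            }
        }
    }

    static void Main()
    {
        int counter = 10;
        Console.WriteLine(NonAtomicIncrement(ref counter)); // 11
        Console.WriteLine(AtomicIncrement(ref counter));    // 12
    }
}
```

On a single thread both helpers behave the same; the difference only shows when another thread writes between steps 1 and 3.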

The Fix: Atomic Counters

int totalSuccess = 0;

Parallel.ForEach(records, record =>
{
    if (Process(record))
    {
        Interlocked.Increment(ref totalSuccess);
    }
});
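
A self-contained version of the fix that tracks both totals, assuming a hypothetical `Process` in which even record ids succeed and odd ones fail. Since every increment is atomic, the two totals should always add up to the record count, however the work is scheduled.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class AtomicCounterDemo
{
    // Hypothetical stand-in for the real work: even ids succeed, odd ids fail.
    static bool Process(int record) => record % 2 == 0;

    public static (int Success, int Failed) Count(int recordCount)
    {
        int[] records = Enumerable.Range(0, recordCount).ToArray();
        int totalSuccess = 0;
        int totalFailed = 0;

        Parallel.ForEach(records, record =>
        {
            if (Process(record))
            {
                Interlocked.Increment(ref totalSuccess);
            }
            else
            {
                Interlocked.Increment(ref totalFailed);
            }
        });

        return (totalSuccess, totalFailed);
    }

    static void Main()
    {
        var (success, failed) = Count(1_000_000);
        Console.WriteLine($"{success} succeeded, {failed} failed"); // 500000 succeeded, 500000 failed
    }
}
```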

Snapshot-Based Progress Reporting

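// Count this chunk as finished; Increment returns the new value atomically.
var finished = Interlocked.Increment(ref completedChunks);

// Report roughly once per wave of chunks instead of after every chunk.
// If the chunk count is not a multiple of maxConcurrency, this branch does
// not fire for the last chunks, so a final update after all chunks is needed.
if (finished % maxConcurrency == 0)
{
    // Read each counter once and report exactly those values. The two reads
    // are not taken together atomically, so mid-run they can be slightly
    // out of step with each other.
    var successSnapshot = Volatile.Read(ref totalSuccess);
    var failureSnapshot = Volatile.Read(ref totalFailed);

    job.TotalSuccessfulRecords = successSnapshot;
    job.TotalFailedRecords = failureSnapshot;

    await UpdateJobProgress(job);
}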
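
For context, here is one way the reporting snippet could sit inside a complete job. Everything around it is an assumption made for this sketch: the `Job` class, the `BatchRunner` wrapper, the no-op `UpdateJobProgress`, and the `Task.Run` plus `SemaphoreSlim` throttling are all hypothetical. The lock around reporting and the final update after all chunks finish are additions in this sketch, not something the original job is known to have.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical job record; the real one presumably maps to a database row.
sealed class Job
{
    public int TotalSuccessfulRecords { get; set; }
    public int TotalFailedRecords { get; set; }
}

sealed class BatchRunner
{
    private int totalSuccess;
    private int totalFailed;
    private int completedChunks;
    private readonly SemaphoreSlim progressLock = new SemaphoreSlim(1, 1);

    // Stand-ins: even ids succeed; the "database" write is just a short delay.
    static bool Process(int record) => record % 2 == 0;
    static Task UpdateJobProgress(Job job) => Task.Delay(1);

    public async Task RunAsync(Job job, IReadOnlyList<int[]> chunks, int maxConcurrency)
    {
        using var throttle = new SemaphoreSlim(maxConcurrency);

        var tasks = chunks.Select(chunk => Task.Run(async () =>
        {
            await throttle.WaitAsync();
            try
            {
                foreach (int record in chunk)
                {
                    if (Process(record)) Interlocked.Increment(ref totalSuccess);
                    else Interlocked.Increment(ref totalFailed);
                }

                int finished = Interlocked.Increment(ref completedChunks);
                if (finished % maxConcurrency == 0)
                {
                    await ReportAsync(job);
                }
            }
            finally
            {
                throttle.Release();
            }
        })).ToList();

        await Task.WhenAll(tasks);
        await ReportAsync(job); // final update once every chunk has finished
    }

    // Only one reporter at a time touches the shared job object.
    private async Task ReportAsync(Job job)
    {
        await progressLock.WaitAsync();
        try
        {
            job.TotalSuccessfulRecords = Volatile.Read(ref totalSuccess);
            job.TotalFailedRecords = Volatile.Read(ref totalFailed);
            await UpdateJobProgress(job);
        }
        finally
        {
            progressLock.Release();
        }
    }
}

static class Program
{
    static async Task Main()
    {
        // 10 chunks of 100 records each, covering ids 0..999.
        var chunks = Enumerable.Range(0, 10)
            .Select(c => Enumerable.Range(c * 100, 100).ToArray())
            .ToList();

        var job = new Job();
        await new BatchRunner().RunAsync(job, chunks, maxConcurrency: 4);
        Console.WriteLine($"{job.TotalSuccessfulRecords} ok, {job.TotalFailedRecords} failed"); // 500 ok, 500 failed
    }
}
```

With 10 chunks and a concurrency of 4, the modulo check fires only after the 4th and 8th chunk, which is why the sketch reports once more after `Task.WhenAll`.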

Lessons Learned

  • Thread-safe collections ≠ thread-safe logic
  • ++ is not atomic
  • volatile ensures visibility, not atomicity
  • Use Interlocked for counters
  • Use snapshot reads for reporting
  • Reduce shared mutable state
  • Concurrency bugs are timing-dependent
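
One way to act on "reduce shared mutable state" is to let each worker keep a private subtotal and touch the shared counter only once, when that worker finishes. The sketch below uses the `Parallel.ForEach` overload that takes a per-worker initializer and finalizer; it is an alternative to the per-record `Interlocked.Increment` fix rather than what the original job did, and `Process` is again a hypothetical stand-in.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class LocalTotalsDemo
{
    // Hypothetical stand-in for the real work: even ids succeed.
    static bool Process(int record) => record % 2 == 0;

    public static int CountSuccesses(int recordCount)
    {
        int[] records = Enumerable.Range(0, recordCount).ToArray();
        int totalSuccess = 0;

        Parallel.ForEach(
            records,
            // Each worker starts with its own private subtotal of 0.
            () => 0,
            // The body only touches the private subtotal, so no synchronization is needed here.
            (record, loopState, localSuccess) =>
                Process(record) ? localSuccess + 1 : localSuccess,
            // One atomic add per worker, when that worker finishes.
            localSuccess => Interlocked.Add(ref totalSuccess, localSuccess));

        return totalSuccess;
    }

    static void Main()
    {
        Console.WriteLine(CountSuccesses(1_000_000)); // 500000
    }
}
```

The shared counter is written once per worker instead of once per record, which also cuts contention on it.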

Takeaway

If you're running parallel batch jobs and tracking totals:

  • Use atomic counters
  • Take snapshot reads for reporting
  • Avoid frequent shared writes

Otherwise, everything may look fine… until it doesn't.
