DEV Community

Vaibhav Shakya

Payment Failure Architecture: Designing Retry, Reversal, Refund, Settlement, and Reconciliation Flows

Payment failures are rarely just “failed payments” ⚙️

A customer may see money debited.

The app may show a timeout.

The backend may keep the transaction pending.

The payment processor may confirm success a few minutes later.

Each system can be correct within its own boundary, and the product experience still breaks.


The real architecture problem

The mistake is treating a payment as a single success/failure flag.

In real systems, these are separate flows:

  • Retry
  • Reversal
  • Refund
  • Settlement
  • Reconciliation

Each flow has different owners, timelines, risks, audit requirements, and customer impact.
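
To make the "no single flag" point concrete, here is a minimal sketch of a record that tracks each flow separately. The enum names, the states, and the `PaymentRecord` class are illustrative assumptions, not a model taken from any particular gateway.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ChargeStatus(Enum):
    CREATED = "created"
    PENDING_VERIFICATION = "pending_verification"
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    REVERSED = "reversed"


class RefundStatus(Enum):
    INITIATED = "initiated"
    PROCESSED = "processed"
    FAILED = "failed"


class SettlementStatus(Enum):
    UNSETTLED = "unsettled"
    SETTLED = "settled"
    MISMATCHED = "mismatched"


@dataclass
class PaymentRecord:
    """Hypothetical record: one status per flow instead of one flag."""
    attempt_id: str
    charge: ChargeStatus = ChargeStatus.CREATED
    refund: Optional[RefundStatus] = None  # None until a refund is requested
    settlement: SettlementStatus = SettlementStatus.UNSETTLED

    def summary(self) -> str:
        # A support-readable line that keeps the three flows apart.
        refund = self.refund.value if self.refund else "none"
        return (
            f"charge={self.charge.value} "
            f"refund={refund} "
            f"settlement={self.settlement.value}"
        )
```

With this shape, "succeeded but not yet settled" and "refund initiated but not processed" are ordinary, representable states rather than edge cases hidden behind one boolean.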


What strong payment systems usually need

A production-grade payment architecture should include:

  • Durable payment attempt IDs
  • Idempotent retry handling
  • Controlled state transitions
  • Event deduplication
  • Pending verification states
  • Append-only ledger entries
  • Settlement mapping
  • Reconciliation jobs
  • Support-visible operational status
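
Three items from this list (idempotent retries, controlled state transitions, and event deduplication) fit in one small sketch. Everything here is an assumption made for illustration: the `PaymentStore` class, the transition table, and the in-memory dictionaries that stand in for a durable database.

```python
from typing import Dict, Set

# Hypothetical transition table: each state lists the states it may move to.
ALLOWED: Dict[str, Set[str]] = {
    "created": {"pending_verification", "succeeded", "failed"},
    "pending_verification": {"succeeded", "failed"},
    "succeeded": {"reversed"},
    "failed": set(),
    "reversed": set(),
}


class PaymentStore:
    """In-memory stand-in for a durable store (assumption for this sketch)."""

    def __init__(self) -> None:
        self.attempts: Dict[str, str] = {}   # idempotency key -> attempt ID
        self.states: Dict[str, str] = {}     # attempt ID -> current state
        self.seen_events: Set[str] = set()   # event IDs already processed

    def create_attempt(self, idempotency_key: str) -> str:
        # A retried request with the same key gets the same attempt back,
        # so a client retry never creates a second charge.
        if idempotency_key in self.attempts:
            return self.attempts[idempotency_key]
        attempt_id = f"att_{len(self.attempts) + 1}"
        self.attempts[idempotency_key] = attempt_id
        self.states[attempt_id] = "created"
        return attempt_id

    def apply_event(self, event_id: str, attempt_id: str, new_state: str) -> bool:
        """Process an event at most once; return True only if state changed."""
        if event_id in self.seen_events:
            return False  # duplicate delivery
        self.seen_events.add(event_id)
        current = self.states[attempt_id]
        if new_state not in ALLOWED[current]:
            return False  # illegal transition, state is left untouched
        self.states[attempt_id] = new_state
        return True
```

In a real service the key lookup and the state change would need to happen atomically in the database. The sketch only shows the rules, not the locking.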

The goal is not only to process payments.

The goal is to explain what happened when money moved, a confirmation was delayed, a refund failed, or a settlement did not match.
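
A reconciliation job is where that explanation usually comes from. Below is a minimal sketch that compares append-only ledger entries with a processor settlement report. The `LedgerEntry` fields, the finding labels, and the report shape (attempt ID mapped to net settled amount in minor units) are assumptions for the example. Real settlement files differ by processor.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)  # frozen: entries are appended, never edited
class LedgerEntry:
    attempt_id: str
    kind: str          # e.g. "charge" or "refund" (labels assumed here)
    amount_minor: int  # minor units; refunds are recorded as negatives


def ledger_totals(entries: List[LedgerEntry]) -> Dict[str, int]:
    """Net amount per attempt, derived by summing the entries."""
    totals: Dict[str, int] = {}
    for entry in entries:
        totals[entry.attempt_id] = totals.get(entry.attempt_id, 0) + entry.amount_minor
    return totals


def reconcile(entries: List[LedgerEntry],
              settlement_report: Dict[str, int]) -> Dict[str, str]:
    """Return a finding for every attempt where the two sides disagree."""
    ours = ledger_totals(entries)
    findings: Dict[str, str] = {}
    for attempt_id in sorted(set(ours) | set(settlement_report)):
        if attempt_id not in settlement_report:
            findings[attempt_id] = "missing_in_settlement"
        elif attempt_id not in ours:
            findings[attempt_id] = "missing_in_ledger"
        elif ours[attempt_id] != settlement_report[attempt_id]:
            findings[attempt_id] = "amount_mismatch"
    return findings
```

Matching attempts produce no finding, so the output is exactly the list someone in finance or support needs to look at.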


Mobile should not decide payment truth

The mobile app should not be the final source of truth for payment completion.

It should recover safely from:

  • Timeout
  • App switch
  • SDK return
  • Browser return
  • Delayed callbacks
  • Network loss

After returning from a payment flow, the app should ask the backend for the authoritative payment status instead of assuming success or failure locally.
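
A rough sketch of that recovery logic follows. It is written in Python for readability, and the same shape applies in Kotlin or Swift. `fetch_status` is a hypothetical function that calls your backend. The state names and the backoff values are assumptions too.

```python
import time
from typing import Callable

# States the backend treats as final (names assumed for this sketch).
TERMINAL = {"succeeded", "failed", "reversed"}


def resolve_status(
    fetch_status: Callable[[], str],
    attempts: int = 5,
    delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll the backend until it reports a final state or we give up."""
    for i in range(attempts):
        try:
            status = fetch_status()
        except ConnectionError:
            status = "unknown"  # network loss is not a payment result
        if status in TERMINAL:
            return status
        if i < attempts - 1:
            sleep(delay_s * (2 ** i))  # simple exponential backoff
    # Never guess: show "we are confirming your payment", not success or failure.
    return "pending_verification"
```

The important design choice is the last line. When polling runs out, the app reports a pending state and lets the backend finish verification, instead of turning a timeout into a failure.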


Why this matters

A timeout is not always a failed payment.

A successful payment is not always settled.

A refund-initiated event does not always mean money has reached the customer.

A callback should not directly overwrite business state without validation.

That is the difference between integrating a payment gateway and designing a payment system.
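
The callback point above can be sketched like this. The HMAC-SHA256 signature scheme, the JSON payload fields (`attempt_id`, `status`), and the return labels are all assumptions for illustration. Every processor documents its own signing format, so check theirs before copying any of this.

```python
import hashlib
import hmac
import json
from typing import Dict

FINAL = {"succeeded", "failed"}  # states a callback may not overwrite


def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature_hex)


def handle_callback(secret: bytes, body: bytes, signature_hex: str,
                    states: Dict[str, str]) -> str:
    """Validate a callback before it is allowed to touch business state."""
    if not verify_signature(secret, body, signature_hex):
        return "rejected_bad_signature"
    event = json.loads(body)
    attempt_id, claimed = event["attempt_id"], event["status"]
    current = states.get(attempt_id)
    if current is None:
        return "rejected_unknown_attempt"
    if current in FINAL:
        return "ignored_already_final"  # late or duplicate callback
    if claimed not in FINAL:
        return "ignored_unsupported_status"
    states[attempt_id] = claimed
    return "applied"
```

A stricter version would also confirm the claimed status with the processor's status API before applying it. That call is left out here because its shape depends on the processor.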


Read the full article

I wrote a detailed Medium article on designing payment failure architecture across mobile, backend, platform, finance, and support boundaries.

👉 Read the full article here:
https://medium.com/@vaibhav.shakya786/payment-failure-architecture-designing-retry-reversal-refund-settlement-and-reconciliation-f5556de08038


Top comments (1)

arun rajkumar

This nails the part most gateway integrations miss — the mobile return is an event, not a verdict. We run open banking payments where the bank confirms async on its own clock, so the app coming back "success" means "go ask the backend," never "show the receipt." The addition I'd make to your list: treat the reconciliation job, not your own DB, as the final arbiter. Your DB being internally consistent is the easy half; staying consistent with money that already moved at the rail is the half that actually pages you. Splitting retry/reversal/refund/settlement into separate state machines instead of one status flag is the best call in here.