Status: Active
Created: August 2025
Last Updated: August 2025
Context
We are using Stripe Subscriptions, meaning that Stripe is responsible for creating the subscriptions and managing the retries of failed payments. Initially, we want to mimic the Stripe retry logic to achieve similar recovery results.
Solution
Approach 1: New Dunning concept
This approach involves creating a new DunningAttempt model that keeps track of the failed payment attempts for subscriptions. A dedicated worker would manage the state transitions.
Overview:
- New DunningAttempt model: Introduce a new class and table to process failed subscription payments. This new class will have:
  - id: UUID
  - order_id: foreign key to Order (unique; we should have only 1 row per order)
  - status: retrying, succeeded, failed
  - attempt_number: the current retry count (from 1 through 4)
  - next_payment_attempt_at: date when the next retry is scheduled
  - last_failure_reason: the error message of the most recent failed attempt
  - started_at: when the dunning process began
- Renewal Job: A scheduled job (subscription.cycle) runs regularly to find subscriptions due for renewal.
- Payment Attempt: For each due subscription order, it calls PaymentService to attempt a charge.
- State Transition:
  - On Failure: The subscription status is changed to past_due, and a new DunningAttempt record is created with attempt_number set to 1, status retrying, and a next_payment_attempt_at. The benefits will be revoked.
  - On Success: The subscription is renewed, and the status remains active.
- Dunning Worker: A new periodic worker (subscription.dunning) runs hourly.
  - It queries for dunning_attempts with status = "retrying" and next_payment_attempt_at in the past.
  - It re-attempts payment. If it fails again, it updates next_payment_attempt_at for the next retry.
  - After a configured number of retries, it moves the DunningAttempt to failed and the subscription to unpaid or canceled.
- Recovery: If a payment succeeds during the dunning process, the subscription status is set back to active and the DunningAttempt is marked as succeeded.
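The state transitions described for Approach 1 can be sketched in plain Python as follows. This is a minimal, in-memory illustration rather than the actual model: the fixed 3-day RETRY_DELAY, the helper names start_dunning and record_retry, and the use of a dataclass instead of a database table are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional
from uuid import UUID, uuid4

MAX_ATTEMPTS = 4                 # attempt_number goes "from 1 through 4"
RETRY_DELAY = timedelta(days=3)  # hypothetical fixed delay, for illustration only


class DunningStatus(str, Enum):
    RETRYING = "retrying"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


@dataclass
class DunningAttempt:
    order_id: UUID
    started_at: datetime
    next_payment_attempt_at: Optional[datetime]
    last_failure_reason: Optional[str] = None
    status: DunningStatus = DunningStatus.RETRYING
    attempt_number: int = 1
    id: UUID = field(default_factory=uuid4)


def start_dunning(order_id: UUID, reason: str, now: datetime) -> DunningAttempt:
    """Create the record after the renewal payment fails (hypothetical helper)."""
    return DunningAttempt(
        order_id=order_id,
        started_at=now,
        next_payment_attempt_at=now + RETRY_DELAY,
        last_failure_reason=reason,
    )


def record_retry(
    attempt: DunningAttempt, now: datetime, failure_reason: Optional[str] = None
) -> None:
    """Apply the outcome of one retry; failure_reason=None means the charge succeeded."""
    if attempt.status is not DunningStatus.RETRYING:
        raise ValueError("dunning process already finished")
    if failure_reason is None:
        attempt.status = DunningStatus.SUCCEEDED
        attempt.next_payment_attempt_at = None
        return
    attempt.last_failure_reason = failure_reason
    if attempt.attempt_number >= MAX_ATTEMPTS:
        # Last retry failed: stop retrying.
        attempt.status = DunningStatus.FAILED
        attempt.next_payment_attempt_at = None
    else:
        attempt.attempt_number += 1
        attempt.next_payment_attempt_at = now + RETRY_DELAY
```

In this sketch the subscription status changes (past_due, active, unpaid or canceled) are left out; the caller would apply them alongside each transition.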
Sequence Diagram
First payment: [diagram]
Dunning retries: [diagram]
User updated payment method: TODO
Approach 2: Store the retries in the Subscription model
I discarded this solution because it increases the complexity of the Subscription model.
(Recommended) Approach 3: Using Order and Payment Models
This approach relies on the existing Order and Payment models, making it a lightweight and integrated solution. It avoids a new table and needs only a minimal change to the Order model.
Overview:
- Change Order Model: Add a single, nullable timestamp field to the Order model:
  - next_payment_attempt_at: Schedules the next retry. If NULL, no retry is pending.
- Use Existing Payment Model: The Payment model, which already has a foreign key to Order, will serve as the log for all payment attempts.
  - attempt_number can be derived by counting the Payment records associated with the order.
  - last_failure_reason can be retrieved from the most recent failed Payment record.
- Renewal and Initial Payment: The process starts as before. A scheduled job creates an Order and a Payment is attempted.
- State Transition on Failure:
  - When a payment fails, the Payment record is marked as failed.
  - The Order is updated by setting next_payment_attempt_at to schedule the first retry (e.g., 3 days from now).
  - The Subscription status is set to past_due. Benefits are revoked.
- Dunning Worker: A periodic worker (order.process_dunning) runs hourly.
  - It queries for Orders where next_payment_attempt_at is not NULL and is in the past.
  - For each due order, it creates a new Payment record and attempts a charge.
  - On Failure: It reschedules next_payment_attempt_at for the next attempt.
  - On Final Failure: After the last attempt, next_payment_attempt_at is set to NULL. The Subscription is moved to unpaid or canceled.
- Recovery:
  - If a payment succeeds, the Payment is marked as succeeded, the Order is marked as paid, and next_payment_attempt_at is set to NULL.
  - The Subscription status is set back to active. Benefits are re-granted.
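To make the Approach 3 flow concrete, here is a small in-memory sketch of the worker logic. It is only an illustration under several assumptions: the Order and Payment classes are simplified stand-ins for the real models, the subscription status is stored directly on the order, the retry delay is a fixed 3 days, the final state is canceled (the text leaves unpaid or canceled open), and the charge callable (returning None on success or a failure reason) is hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Callable, List, Optional

MAX_RETRIES = 4                  # retries after the initial failed payment
RETRY_DELAY = timedelta(days=3)  # hypothetical fixed delay, for illustration only


@dataclass
class Payment:
    status: str                          # "succeeded" or "failed"
    failure_reason: Optional[str] = None


@dataclass
class Order:
    paid: bool = False
    subscription_status: str = "active"  # stand-in for the related Subscription
    next_payment_attempt_at: Optional[datetime] = None
    payments: List[Payment] = field(default_factory=list)

    @property
    def attempt_number(self) -> int:
        # Derived by counting Payment records, as described above.
        return len(self.payments)

    @property
    def last_failure_reason(self) -> Optional[str]:
        # Taken from the most recent failed Payment record.
        for payment in reversed(self.payments):
            if payment.status == "failed":
                return payment.failure_reason
        return None


# Hypothetical charge function: returns None on success, or a failure reason.
Charge = Callable[[Order], Optional[str]]


def attempt_payment(order: Order, charge: Charge, now: datetime) -> None:
    reason = charge(order)
    if reason is None:
        # Recovery: mark paid, clear the schedule, reactivate.
        order.payments.append(Payment("succeeded"))
        order.paid = True
        order.next_payment_attempt_at = None
        order.subscription_status = "active"
        return
    order.payments.append(Payment("failed", reason))
    retries_done = len(order.payments) - 1  # the first payment is not a retry
    if retries_done >= MAX_RETRIES:
        # Final failure: stop scheduling retries.
        order.next_payment_attempt_at = None
        order.subscription_status = "canceled"
    else:
        order.next_payment_attempt_at = now + RETRY_DELAY
        order.subscription_status = "past_due"


def process_dunning(orders: List[Order], charge: Charge, now: datetime) -> None:
    """One run of the periodic worker: retry every order whose retry is due."""
    for order in orders:
        due = order.next_payment_attempt_at
        if due is not None and due <= now:
            attempt_payment(order, charge, now)
```

In a real implementation the loop in process_dunning would be a database query on next_payment_attempt_at rather than a scan over a list.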
Comparison with Approach 1
Pros:
- Maintainability: Avoids a new table and only adds one nullable column to Order.
- Leverages Existing Models: Reuses the Payment model for its intended purpose, keeping the design clean and logical.
- Clear and Explicit: The Order model is only concerned with when the next payment is due, while the Payment model correctly stores the history of what happened.
Cons:
- Complex Queries: Retrieving the full dunning history for an order (like the attempt count or last error) requires querying the associated Payment records. We only expect 4 payments per order. Mitigation: it can be cached or denormalized if it becomes expensive.
- Race conditions: As before, a race condition can occur when the retry mechanism is running and the user tries to pay manually at the same time. Mitigation: lock the Order while a payment is being processed.
Sequence Diagrams
First Subscription Payment Failure
This diagram shows the flow when a recurring subscription payment fails for the first time, initiating the dunning process.
Dunning Retries
This diagram illustrates how the periodic worker handles scheduled payment retries.
User Updates Payment Method
TODO
Questions
How does the retry mechanism look in our Stripe account?
In Stripe we have:
- 4 retries at most.
- These 4 retries should happen within 3 weeks.
- If all retries fail, we set the status of the subscription to canceled.
- 2682 with at least 1 failed payment
- 1108 recovered orders
- That gives a 41% recovery rate.
- 1253 total
- 541 recovered
- That gives a 43% recovery rate.
invoice and payment_intent objects. Each payment intent can have multiple transactions associated with it. From what I see, we don’t expose these payment intents in our API. I will keep this as a separate task if we want to do it.
What happens when the customer updates the payment method?
The customer can trigger a manual payment action. If the payment succeeds the order will be marked as paid; if it fails, it will follow the original dunning schedule.
What is the backoff strategy of Stripe?
Stripe has different backoffs, but based on (t1, t2, t3) we have the following ranges:
- (D+0) Subscription payment fails
- (D+2/3) First attempt
- (D+7/12) Second attempt
- (D+14/17) Third attempt
- (D+21) Fourth attempt.
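As a rough illustration of how such a schedule could be encoded, the sketch below computes the next retry date from the date of the initial failure. It assumes that each range above (for example D+2/3) means "between 2 and 3 days after the initial failure" and picks the earliest day of each range; both the interpretation and the names RETRY_OFFSETS_DAYS and next_retry_at are assumptions made for the example.

```python
from datetime import datetime, timedelta
from typing import Optional

# Days after the initial failure (D+0) for each retry.
# Assumption: the earliest day of each range listed above is used.
RETRY_OFFSETS_DAYS = (2, 7, 14, 21)


def next_retry_at(first_failure_at: datetime, retries_done: int) -> Optional[datetime]:
    """Return when the next retry should run, or None if all retries are used.

    retries_done is the number of retries already attempted (0 right after
    the initial failure).
    """
    if retries_done >= len(RETRY_OFFSETS_DAYS):
        return None
    return first_failure_at + timedelta(days=RETRY_OFFSETS_DAYS[retries_done])
```

Anchoring every retry to the initial failure date, rather than to the previous retry, keeps the schedule stable even if the hourly worker runs late.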
unpaid or canceled?
In Stripe, we are moving it to canceled.
