Idempotency in SaaS APIs — How to Make Your Endpoints Safe to Retry

Your user clicks "Subscribe." The request hits your server, the charge succeeds, the subscription is activated, and then — network timeout. The client never got the 200. So it retries. Now you have two active subscriptions and one very confused customer who was charged twice.
This is the exact problem idempotency solves. And if your SaaS API handles payments, creates resources, or triggers side effects on POST requests, not having it is not a bug — it is a liability waiting for a network blip to discover.
We learned this the way most teams do: a client's integration retried a charge creation after a timeout, and our API processed both requests. The customer was refunded, the client was apologised to, and a proper API idempotency implementation became the next item on the sprint board. This post is the implementation we built — a NestJS interceptor that makes any POST endpoint idempotent, backed by PostgreSQL deduplication, with race condition handling and testing.

What Idempotency Actually Means and Why Every Payment API Needs It
An operation is idempotent if executing it multiple times produces the same result as executing it once. For HTTP, GET and DELETE are naturally idempotent — fetching a resource twice or deleting it twice does not change the outcome after the first call. POST is the dangerous one: every POST potentially creates a new resource or triggers a side effect.
The mechanism that makes POST safe to retry is the idempotency key: a unique identifier the client generates, sent as the Idempotency-Key header, that the server uses to detect duplicates and answer them with the original response instead of processing them again. (The Wikipedia article on idempotence traces the concept back to abstract algebra, but the practical version is simpler: the same request, sent twice, produces the same result as sending it once.) Stripe made this pattern standard for payment APIs (see Stripe's Idempotent Requests documentation), and it is the right pattern for any SaaS API that mutates state. We covered it briefly in our third-party API reliability patterns post as a pattern for safe retries, but it deserves its own deep dive, because getting it wrong means charging a customer twice.
The three failure scenarios idempotency guards against are:
- Pre-server failure — the request never reached the server. Retrying is safe without idempotency because no operation was performed.
- Mid-processing failure — the request reached the server, processing started, but the server crashed before completing. Retrying without idempotency means the operation runs twice.
- Post-processing failure — the server completed the operation successfully but the response was lost in transit. Retrying without idempotency means the operation runs twice.
Only scenario one is safe to retry naively. Scenarios two and three — the most common in production — require idempotency.
The Idempotency-Key Header Pattern
The pattern is simple: the client generates a UUID (or any sufficiently unique string), includes it as the Idempotency-Key header on every POST request, and sends the same key if it retries.
The server side follows this flow:
- Extract the Idempotency-Key from the request headers
- Look up the key in the idempotency store
- If found, return the stored response without processing
- If not found, process the request, store the response under the key, and return it
The critical implementation detail is that steps two, three, and four must be atomic — you cannot have two concurrent requests both checking, both finding nothing, and both processing. The database constraint handles this.
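As a rough illustration of that flow, here is a minimal in-memory sketch. The `Map` stands in for the idempotency store and `handleOnce` is a hypothetical helper name, not part of our implementation. Because the sketch is single-threaded and synchronous, it sidesteps the atomicity requirement that a real store has to enforce.

```typescript
// In-memory stand-in for the idempotency store (an assumption for this sketch).
interface StoredResponse {
  statusCode: number;
  body: unknown;
}

const responsesByKey = new Map<string, StoredResponse>();

// Runs `work` at most once per key and replays the stored response afterwards.
function handleOnce(key: string, work: () => StoredResponse): StoredResponse {
  const existing = responsesByKey.get(key);
  if (existing) {
    return existing; // duplicate: replay without processing
  }
  const result = work();
  responsesByKey.set(key, result);
  return result;
}

let chargesCreated = 0;
const createCharge = (): StoredResponse => {
  chargesCreated += 1;
  return { statusCode: 201, body: { id: `ch_${chargesCreated}` } };
};

const firstResponse = handleOnce('key-1', createCharge);
const retryResponse = handleOnce('key-1', createCharge);
console.log(firstResponse.body, retryResponse.body, chargesCreated);
```

The retry receives the same stored object and `createCharge` runs only once.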
Here is the interface our implementation uses:
```typescript
export interface IdempotencyRecord {
  key: string;
  responseStatusCode: number;
  responseBody: string;
  createdAt: Date;
}
```

The client side generates a UUID per unique operation and sends it with every retry. If the request payload changes (user edits a form and resubmits), the client generates a new key: the same key must never map to different request payloads.
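A client-side sketch of that rule might look like the following. The `Send` type and `sendWithRetry` are hypothetical names, and the transport is injected so the example does not depend on any HTTP library. The point is only that the key is created once per operation, outside the retry loop.

```typescript
import { randomUUID } from 'node:crypto';

// Hypothetical transport: returns an HTTP status or throws on timeout.
type Send = (headers: Record<string, string>, payload: unknown) => number;

// Sends one logical operation, reusing a single key across all attempts.
function sendWithRetry(send: Send, payload: unknown, maxAttempts = 3): number {
  const idempotencyKey = randomUUID(); // one key per operation, not per attempt
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    try {
      return send({ 'Idempotency-Key': idempotencyKey }, payload);
    } catch (error) {
      lastError = error; // for example a timeout: retry with the same key
    }
  }
  throw lastError;
}

// Fake transport that times out once, then succeeds.
const seenKeys: string[] = [];
const flakySend: Send = (headers) => {
  seenKeys.push(headers['Idempotency-Key']);
  if (seenKeys.length === 1) {
    throw new Error('timeout');
  }
  return 201;
};

const finalStatus = sendWithRetry(flakySend, { amount: 2000, currency: 'usd' });
console.log(finalStatus, seenKeys[0] === seenKeys[1]);
```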
Database Implementation with PostgreSQL
We store idempotency records in PostgreSQL rather than Redis for most projects. The reason is not performance — Redis is faster — but operational simplicity: one fewer service to deploy, configure, and monitor, and the guarantee that idempotency records survive a restart.
The Database Table
```typescript
// src/idempotency/entities/idempotency.entity.ts
import { Entity, PrimaryColumn, Column, Index } from 'typeorm';

@Entity('idempotency_records')
export class IdempotencyEntity {
  // The client-supplied key; the primary key doubles as the uniqueness guard
  @PrimaryColumn({ type: 'varchar', length: 255 })
  key: string;

  @Column({ type: 'int' })
  responseStatusCode: number;

  @Column({ type: 'text' })
  responseBody: string;

  // Indexed so that deleting expired records by age stays cheap
  @Index()
  @Column({ type: 'timestamp with time zone' })
  createdAt: Date;
}
```

The key column is the primary key. This is deliberate: PostgreSQL's unique constraint on a primary key is what prevents duplicate processing. Two concurrent inserts with the same key will cause one to fail with a unique violation, and our code takes that to mean "another request already won, so return its result."
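To make the "one insert wins" behaviour concrete, here is a small sketch that uses a fake table rather than a real database driver. `FakeTable` and `claimKey` are hypothetical names. The only detail borrowed from PostgreSQL is the SQLSTATE code `23505`, which it reports for a unique violation.

```typescript
// Minimal fake of a table with a primary key (not a real driver).
class FakeTable {
  private readonly rows = new Map<string, { status: number }>();

  insert(key: string): void {
    if (this.rows.has(key)) {
      // Mirrors PostgreSQL's SQLSTATE for unique_violation.
      throw Object.assign(new Error('duplicate key'), { code: '23505' });
    }
    this.rows.set(key, { status: 0 });
  }
}

// Returns true if this caller won the key, false if another request holds it.
function claimKey(table: FakeTable, key: string): boolean {
  try {
    table.insert(key);
    return true;
  } catch (error) {
    if ((error as { code?: string }).code === '23505') {
      return false;
    }
    throw error; // anything else is a real failure
  }
}

const fakeTable = new FakeTable();
const outcomes = [claimKey(fakeTable, 'key-1'), claimKey(fakeTable, 'key-1')];
console.log(outcomes); // [ true, false ]
```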
Setting a TTL
Idempotency keys should not live forever. Stripe expires them after 24 hours, and we follow the same window. A scheduled cleanup job handles expiration:
```typescript
// src/idempotency/idempotency-cleanup.service.ts
import { Injectable } from '@nestjs/common';
import { Cron, CronExpression } from '@nestjs/schedule';
import { InjectRepository } from '@nestjs/typeorm';
import { Repository, LessThan } from 'typeorm';
import { IdempotencyEntity } from './entities/idempotency.entity';

@Injectable()
export class IdempotencyCleanupService {
  constructor(
    @InjectRepository(IdempotencyEntity)
    private readonly repository: Repository<IdempotencyEntity>,
  ) {}

  // Runs hourly and deletes every record older than 24 hours
  @Cron(CronExpression.EVERY_HOUR)
  async cleanupExpiredKeys(): Promise<void> {
    const cutoff = new Date(Date.now() - 24 * 60 * 60 * 1000);
    await this.repository.delete({ createdAt: LessThan(cutoff) });
  }
}
```

Redis Alternative
For high-throughput endpoints (10K+ charges per second), PostgreSQL latency matters and Redis is worth the operational cost. The same pattern works with SET NX and an expiry:
```typescript
const key = `idempotency:${idempotencyKey}`;
const acquired = await redis.set(key, 'processing', 'NX', 'EX', 24 * 3600);
if (!acquired) {
  // Another request is processing or has already processed this key:
  // wait for the result
}
```

We reach for Redis when a single endpoint exceeds about 500 writes per second and the extra complexity budget has room. Most SaaS APIs never hit that threshold.
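For readers unfamiliar with the semantics of SET with NX and an expiry, the sketch below imitates them in memory. `ExpiringStore` is a hypothetical class, not a Redis client, and the clock is passed in explicitly so the behaviour is easy to follow.

```typescript
// In-memory imitation of "SET key value NX EX seconds" (not a Redis client).
class ExpiringStore {
  private readonly entries = new Map<string, { value: string; expiresAt: number }>();

  // Sets the key only if it is absent or expired; returns whether it was set.
  setIfAbsent(key: string, value: string, ttlSeconds: number, nowMs: number): boolean {
    const current = this.entries.get(key);
    if (current && current.expiresAt > nowMs) {
      return false; // key is still held
    }
    this.entries.set(key, { value, expiresAt: nowMs + ttlSeconds * 1000 });
    return true;
  }
}

const locks = new ExpiringStore();
const dayInSeconds = 24 * 3600;

const firstAttempt = locks.setIfAbsent('idempotency:abc', 'processing', dayInSeconds, 0);
const duplicate = locks.setIfAbsent('idempotency:abc', 'processing', dayInSeconds, 1000);
const afterExpiry = locks.setIfAbsent(
  'idempotency:abc',
  'processing',
  dayInSeconds,
  dayInSeconds * 1000,
);
console.log(firstAttempt, duplicate, afterExpiry); // true false true
```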
NestJS Interceptor for API Idempotency Implementation
An interceptor is the cleanest home for idempotency logic in NestJS. It wraps every POST request, checks the idempotency key, and either returns the cached response or processes the request and caches the result.
```typescript
// src/idempotency/idempotency.interceptor.ts
import {
  Injectable,
  NestInterceptor,
  ExecutionContext,
  CallHandler,
  ConflictException,
  HttpException,
} from '@nestjs/common';
import { InjectRepository } from '@nestjs/typeorm';
import { Repository } from 'typeorm';
import { Observable, of } from 'rxjs';
import { catchError, mergeMap } from 'rxjs/operators';
import { IdempotencyEntity } from './entities/idempotency.entity';

@Injectable()
export class IdempotencyInterceptor implements NestInterceptor {
  constructor(
    @InjectRepository(IdempotencyEntity)
    private readonly repository: Repository<IdempotencyEntity>,
  ) {}

  async intercept(
    context: ExecutionContext,
    next: CallHandler,
  ): Promise<Observable<unknown>> {
    const request = context.switchToHttp().getRequest();
    const response = context.switchToHttp().getResponse();
    const idempotencyKey = request.headers['idempotency-key'];

    // Only POST and PATCH are intercepted; other methods pass through
    if (request.method !== 'POST' && request.method !== 'PATCH') {
      return next.handle();
    }

    // Skip if no key provided
    if (!idempotencyKey) {
      return next.handle();
    }

    // Fast path: a record for this key already exists
    const existing = await this.repository.findOne({
      where: { key: idempotencyKey },
    });
    if (existing) {
      return this.replay(existing, response);
    }

    // Insert a placeholder. The primary key makes this the race condition
    // guard: only one concurrent request can succeed here.
    try {
      await this.repository.insert({
        key: idempotencyKey,
        responseStatusCode: 0,
        responseBody: '{}',
        createdAt: new Date(),
      });
    } catch (error) {
      if ((error as any).code === '23505') {
        // Unique violation: another request inserted first
        const record = await this.repository.findOne({
          where: { key: idempotencyKey },
        });
        if (record) {
          return this.replay(record, response);
        }
      }
      throw error;
    }

    return next.handle().pipe(
      // Handler failures arrive here as thrown exceptions, not as values
      catchError(async (err) => {
        const status = err instanceof HttpException ? err.getStatus() : 500;
        if (status < 500) {
          // Cache client errors (4xx), shaped like Nest's default error body
          const body = (err as HttpException).getResponse();
          await this.repository.update(idempotencyKey, {
            responseStatusCode: status,
            responseBody: JSON.stringify(
              typeof body === 'string' ? { statusCode: status, message: body } : body,
            ),
          });
        } else {
          // Do not cache 5xx: remove the placeholder so retries reprocess
          await this.repository.delete(idempotencyKey);
        }
        throw err;
      }),
      // Successful handlers: store the real status and body, then pass through
      mergeMap(async (responseBody) => {
        if (response.statusCode < 500) {
          await this.repository.update(idempotencyKey, {
            responseStatusCode: response.statusCode,
            responseBody: JSON.stringify(responseBody ?? null),
          });
        } else {
          await this.repository.delete(idempotencyKey);
        }
        return responseBody;
      }),
    );
  }

  // Replays a stored response, or rejects while the original is unfinished
  private replay(record: IdempotencyEntity, response: any): Observable<unknown> {
    if (record.responseStatusCode === 0) {
      // Still the placeholder: the first request has not finished (or crashed)
      throw new ConflictException(
        'A request with this Idempotency-Key is still being processed',
      );
    }
    response.status(record.responseStatusCode);
    return of(JSON.parse(record.responseBody));
  }
}
```

Key design decisions in this interceptor:
- Skip everything except POST and PATCH: GET, PUT, DELETE, and HEAD are idempotent by HTTP spec
- Skip requests without a key — idempotency is opt-in per request; clients that do not send a key are processed normally (though we strongly encourage all clients to send one)
- Atomic insert as the race guard — PostgreSQL's unique constraint on the primary key means only one concurrent request succeeds in inserting the placeholder record; the rest find the conflict and retrieve the winner's result
- Do not cache 5xx errors — a 503 today might succeed tomorrow; caching it would permanently poison the idempotency key
- Remove placeholder on failure — if processing crashes, the placeholder is removed so the same key with the same request can be retried cleanly
Registering the Interceptor
Register it globally so every POST endpoint is automatically protected:
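```typescript
// src/app.module.ts
import { Module } from '@nestjs/common';
import { APP_INTERCEPTOR } from '@nestjs/core';
import { TypeOrmModule } from '@nestjs/typeorm';
import { IdempotencyEntity } from './idempotency/entities/idempotency.entity';
import { IdempotencyInterceptor } from './idempotency/idempotency.interceptor';

@Module({
  imports: [
    // TypeOrmModule.forRoot(...) with your connection settings also belongs
    // in this array (omitted here). forFeature makes the repository
    // injectable into the interceptor.
    TypeOrmModule.forFeature([IdempotencyEntity]),
  ],
  providers: [
    {
      provide: APP_INTERCEPTOR,
      useClass: IdempotencyInterceptor,
    },
  ],
})
export class AppModule {}
```

Or apply it selectively with @UseInterceptors(IdempotencyInterceptor) on individual controllers. We use the global approach because the overhead of a primary-key lookup is negligible and the safety is universal.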
Storing and Returning Cached Responses
The interceptor caches two things: the HTTP status code and the response body. When a duplicate key arrives, we reconstruct the original response:
```typescript
response.status(existing.responseStatusCode);
return of(JSON.parse(existing.responseBody));
```

Cache invalidation for idempotency is straightforward: the key is the invalidation. After 24 hours the key is pruned, and a new request with the same key is treated as a fresh operation. (In practice, clients should never reuse a key that old.)
One subtle point: parameter fingerprinting. Stripe hashes the request body and stores it alongside the response. If a client reuses an idempotency key with a different request body, Stripe rejects it with a 400. This prevents accidental misuse where a client sends the same key for a different charge amount:
```typescript
import * as crypto from 'crypto';

// Returns true when the incoming body hashes to the same value as the stored one
function validateParameters(
  storedBody: string,
  incomingBody: string,
): boolean {
  const storedHash = crypto.createHash('sha256').update(storedBody).digest('hex');
  const incomingHash = crypto.createHash('sha256').update(incomingBody).digest('hex');
  return storedHash === incomingHash;
}
```

We include parameter validation in production deployments but omit it from the interceptor above for clarity. Add it when you expose idempotency to third-party developers who might misuse keys.
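One practical wrinkle with hashing raw bodies is that two JSON payloads with the same fields in a different order hash differently. The sketch below shows one possible way around that, assuming the body has already been parsed from JSON. `canonicalize` and `fingerprint` are hypothetical helpers, not part of the interceptor above.

```typescript
import { createHash } from 'node:crypto';

// Serialises parsed JSON with object keys sorted, so that
// {"a":1,"b":2} and {"b":2,"a":1} produce the same string.
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map((item) => canonicalize(item)).join(',')}]`;
  }
  if (value !== null && typeof value === 'object') {
    const entries = Object.entries(value as Record<string, unknown>)
      .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
      .map(([k, v]) => `${JSON.stringify(k)}:${canonicalize(v)}`);
    return `{${entries.join(',')}}`;
  }
  return JSON.stringify(value);
}

// SHA-256 over the canonical form; store this next to the cached response.
function fingerprint(body: unknown): string {
  return createHash('sha256').update(canonicalize(body)).digest('hex');
}

const original = fingerprint({ amount: 2000, currency: 'usd' });
const reordered = fingerprint({ currency: 'usd', amount: 2000 });
const changed = fingerprint({ amount: 2500, currency: 'usd' });
console.log(original === reordered, original === changed); // true false
```

A mismatch between the stored and incoming fingerprint is the signal to reject the request instead of replaying the cached response.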
Edge Cases: Partial Failures and Timeouts
Idempotency introduces its own failure modes. Here are the ones that will hit you in production:

Concurrent Requests With the Same Key
Two requests with the same idempotency key arrive at the exact same time. Both check the database, both find nothing, and both attempt to process. This is why the INSERT with primary key guard is non-negotiable — without it, concurrent requests bypass idempotency entirely.
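Our implementation handles this with the unique constraint: the first insert wins, and the second hits 23505 and reads the winner's record. If the winner has already finished, the stored response is returned. If it is still running, only the placeholder exists, so the loser should wait for the result or answer 409 Conflict rather than replay the placeholder. Either way the operation itself runs once, regardless of how many concurrent retries were in flight.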
Server Crash During Processing
The request reaches the server, we insert the placeholder idempotency record, and then the server crashes before completing. The client retries. Now the key exists in the database with a 0 status and "{}" body — the placeholder.
A retry must never be served the placeholder as if it were a real response. Treat a record whose status code is still 0 as incomplete: reject the retry with 409 Conflict while the original request may still be running, and reprocess only once the placeholder is older than your longest request timeout. A more robust approach adds a status column (processing, completed, failed) so we can differentiate between "still processing" and "done."
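A status column turns the retry decision into a small state machine. The sketch below is one way to express it. `TrackedRecord`, `decideRetry`, and the `staleAfterMs` threshold are all hypothetical, and in a real system the "reprocess" branch would itself need an atomic takeover (for example a conditional update) so that two retries do not both reprocess.

```typescript
type RecordStatus = 'processing' | 'completed' | 'failed';

interface TrackedRecord {
  status: RecordStatus;
  createdAt: Date;
}

type Decision = 'replay' | 'conflict' | 'reprocess';

// Decides what a retry should do when a record already exists for its key.
// `staleAfterMs` should be longer than any healthy request could take.
function decideRetry(record: TrackedRecord, now: Date, staleAfterMs: number): Decision {
  if (record.status === 'completed') {
    return 'replay'; // serve the stored response
  }
  if (record.status === 'failed') {
    return 'reprocess'; // the earlier attempt gave up cleanly
  }
  const ageMs = now.getTime() - record.createdAt.getTime();
  // A young "processing" record is probably in flight; an old one probably crashed.
  return ageMs > staleAfterMs ? 'reprocess' : 'conflict';
}

const currentTime = new Date('2024-01-01T00:10:00Z');
const inFlight: TrackedRecord = {
  status: 'processing',
  createdAt: new Date('2024-01-01T00:09:30Z'),
};
const abandoned: TrackedRecord = {
  status: 'processing',
  createdAt: new Date('2024-01-01T00:00:00Z'),
};
const oneMinuteMs = 60 * 1000;
console.log(
  decideRetry(inFlight, currentTime, oneMinuteMs),
  decideRetry(abandoned, currentTime, oneMinuteMs),
); // conflict reprocess
```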
Key Expiration During Long-Running Operations
If your endpoint takes longer than 24 hours to process (unlikely for most SaaS APIs, but possible for batch operations), the idempotency key could expire before the client retries. Set your TTL to the maximum expected processing time plus the maximum retry window. For most APIs, 24 hours is conservative.
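The TTL rule above is simple arithmetic, sketched here with hypothetical helper names. The 30-hour batch job is an invented example figure, not a measurement from our systems.

```typescript
const HOUR_MS = 60 * 60 * 1000;

// TTL = longest expected processing time + longest window in which clients retry.
function idempotencyTtlMs(maxProcessingMs: number, maxRetryWindowMs: number): number {
  return maxProcessingMs + maxRetryWindowMs;
}

// True once a record is older than the TTL and may be pruned.
function isExpired(createdAt: Date, now: Date, ttlMs: number): boolean {
  return now.getTime() - createdAt.getTime() > ttlMs;
}

// Example: a batch endpoint that can run for 30 hours, with a 24-hour retry window.
const batchTtlMs = idempotencyTtlMs(30 * HOUR_MS, 24 * HOUR_MS);
const recordCreatedAt = new Date('2024-01-01T00:00:00Z');
const twoDaysLater = new Date('2024-01-03T00:00:00Z');
console.log(batchTtlMs / HOUR_MS, isExpired(recordCreatedAt, twoDaysLater, batchTtlMs)); // 54 false
```

With a flat 24-hour TTL the same record would already have been pruned after 48 hours, even though the batch job's retry window had not closed.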
Which Endpoints Need Idempotency
Not every endpoint needs idempotency protection, and applying it everywhere adds unnecessary database writes. Here is the decision framework we use:
| Endpoint Type | Idempotency Required | Reasoning |
|---|---|---|
| POST /charges | Yes | Duplicate = lost money |
| POST /subscriptions | Yes | Duplicate = two bills |
| POST /orders | Yes | Duplicate = two shipments |
| POST /webhooks (inbound) | Yes | Duplicate = double processing |
| POST /users | Recommended | Duplicate = support ticket |
| POST /send-email | Recommended | Duplicate = spam complaint |
| POST /api-keys | Recommended | Annoying rather than catastrophic |
| PATCH /profile | Optional | Natural idempotency if upsert-based |
| POST /search | No | Read-only, no side effects |
| POST /login | No | Stateless, no persisted side effect |
The rule: if processing the request twice causes any harm or inconsistency, protect it. Payment endpoints are the most obvious, but webhook handlers, email sending, and resource provisioning are equally vulnerable.
(The "just log in" endpoints are safe because a duplicate login creates a second token scoped to the same user, which is harmless. A duplicate charge is not harmless. The distinction is whether the side effect compounds.)
Testing Idempotent Endpoints
Testing idempotency requires verifying that the same request with the same key returns the same response, and that concurrent requests with the same key do not duplicate processing.
```typescript
// test/idempotency.e2e-spec.ts
import { randomUUID } from 'crypto';
import { Test, TestingModule } from '@nestjs/testing';
import { INestApplication } from '@nestjs/common';
import * as request from 'supertest';
import { AppModule } from '../src/app.module';

describe('IdempotencyInterceptor (e2e)', () => {
  let app: INestApplication;

  beforeAll(async () => {
    const moduleFixture: TestingModule = await Test.createTestingModule({
      imports: [AppModule],
    }).compile();

    app = moduleFixture.createNestApplication();
    await app.init();
  });

  afterAll(async () => {
    await app.close();
  });

  it('returns the same response for duplicate idempotency keys', async () => {
    // Fresh key per run, so records left by earlier runs cannot interfere
    const idempotencyKey = randomUUID();

    const first = await request(app.getHttpServer())
      .post('/charges')
      .set('Idempotency-Key', idempotencyKey)
      .send({ amount: 2000, currency: 'usd' });

    expect(first.status).toBe(201);

    const second = await request(app.getHttpServer())
      .post('/charges')
      .set('Idempotency-Key', idempotencyKey)
      .send({ amount: 2000, currency: 'usd' });

    expect(second.status).toBe(first.status);
    expect(second.body).toEqual(first.body);
  });

  it('processes independently for different idempotency keys', async () => {
    const first = await request(app.getHttpServer())
      .post('/charges')
      .set('Idempotency-Key', randomUUID())
      .send({ amount: 1000, currency: 'usd' });

    const second = await request(app.getHttpServer())
      .post('/charges')
      .set('Idempotency-Key', randomUUID())
      .send({ amount: 2000, currency: 'usd' });

    expect(first.body.id).not.toBe(second.body.id);
  });

  it('handles requests without idempotency key normally', async () => {
    const response = await request(app.getHttpServer())
      .post('/charges')
      .send({ amount: 3000, currency: 'usd' });

    expect(response.status).toBe(201);
  });

  it('does not cache 5xx errors', async () => {
    const idempotencyKey = randomUUID();

    // Assumes a test-only route, POST /charges/fail, that always responds 500
    const first = await request(app.getHttpServer())
      .post('/charges/fail')
      .set('Idempotency-Key', idempotencyKey)
      .send({ amount: 1000, currency: 'usd' });

    // The second request should be processed fresh (no cached error)
    const second = await request(app.getHttpServer())
      .post('/charges/fail')
      .set('Idempotency-Key', idempotencyKey)
      .send({ amount: 1000, currency: 'usd' });

    // Both should fail independently (no caching of 5xx)
    expect(first.status).toBe(500);
    expect(second.status).toBe(500);
  });
});
```

The fourth test is the important one: it targets the rule that 5xx errors are not cached under the idempotency key. If our interceptor cached a 503 and the upstream service recovered, the client would see the cached error forever. As written, though, both calls return 500 whether or not the first was cached, so to make the test conclusive, have the failing route count its invocations (or fail only on its first call) and assert on that.
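One way to build a failing handler whose behaviour distinguishes "cached" from "reprocessed" is a test double that fails a fixed number of times and counts its calls. The sketch below is self-contained and uses an in-memory cache in place of the interceptor. `failFirst` and `runWithKey` are hypothetical names.

```typescript
interface HandlerResult {
  statusCode: number;
}

// Builds a test double that returns 500 for the first `failures` calls,
// 201 afterwards, and records how often it ran.
function failFirst(failures: number) {
  let calls = 0;
  const handler = (): HandlerResult => {
    calls += 1;
    return { statusCode: calls <= failures ? 500 : 201 };
  };
  return { handler, callCount: () => calls };
}

// Stand-in for the interceptor: replays stored results, never stores 5xx.
const responseCache = new Map<string, HandlerResult>();
function runWithKey(key: string, handler: () => HandlerResult): HandlerResult {
  const cached = responseCache.get(key);
  if (cached) {
    return cached;
  }
  const result = handler();
  if (result.statusCode < 500) {
    responseCache.set(key, result);
  }
  return result;
}

const flaky = failFirst(1);
const attempts = [
  runWithKey('test-key-error', flaky.handler),
  runWithKey('test-key-error', flaky.handler),
  runWithKey('test-key-error', flaky.handler),
].map((result) => result.statusCode);
console.log(attempts, flaky.callCount()); // [ 500, 201, 201 ] 2
```

The second attempt succeeding proves the 500 was not cached, and the call count of 2 proves the third attempt was a replay.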
Conclusion
Idempotency is not a "nice to have" for SaaS APIs that handle money or resources. It is the minimum viable reliability pattern — the difference between a network blip that causes a support ticket and one that nobody notices because the system handled it safely on retry.
Our implementation uses a NestJS interceptor, PostgreSQL unique constraints for atomic deduplication, 24-hour key expiration, and parameter fingerprinting for safety. It handles concurrent requests, server crashes during processing, and the critical rule that 5xx errors must never be cached.
The database idempotency pattern is boring, well-understood, and has been handling Stripe-scale traffic for over a decade without incident. (We use the same pattern in our Stripe subscription billing implementation for webhook deduplication — if Stripe delivers the same invoice.paid event twice, idempotency prevents double-processing.) When a retry saved you from a double charge you did not need the cache invalidation strategy, the sharding plan, or the eventual consistency notes — you just needed a UUID and a unique constraint. That is the pattern. It works. Use it before the network timeout finds you.
We built this after a client's retry charged a customer twice. We refunded it, apologised, and spent the next sprint making sure it never happened again. If you are staring at your API right now and wondering which endpoints are safe to retry and which are not — that is exactly the question a senior engineer should be asking. If you want to talk through which endpoints in your API need protection first, we have opinions about that.

The most expensive idempotency bug is the one you discover after a payment failure causes a double charge at 2am. The cheapest fix is a UUID, a database column, and an interceptor that costs about 80 lines of code and an hour to write. (For more on designing APIs that third-party developers actually trust, see our post on building a public developer API — idempotency is one of the reliability guarantees that separates a hobby API from a platform.) Ship it this sprint.
Frequently Asked Questions
What is an idempotency key?
An idempotency key is a unique identifier the client generates and sends with every mutating API request. The server stores the key and its response after processing. If the client retries with the same key, the server returns the cached response instead of processing the request again. This prevents duplicate charges, duplicate orders, and duplicate resource creation when network timeouts or failures cause clients to retry requests.
Which endpoints need idempotency?
All POST endpoints that create resources or trigger side effects need idempotency: charge creation, order placement, email sending, subscription provisioning, and webhook handlers. GET, PUT, DELETE, and HEAD requests are idempotent by HTTP specification. PATCH is idempotent only if designed carefully. In practice, protect any endpoint where processing the same request twice would cause harm or inconsistency.
How do I implement idempotency in NestJS?
Implement idempotency as a NestJS interceptor that extracts the Idempotency-Key header, checks a database or Redis store for an existing result, and either returns the cached response or processes the request and caches the result. Use PostgreSQL ON CONFLICT or Redis SET NX to handle race conditions. Add a unique constraint on the idempotency key column for database-level deduplication. Set a TTL of 24 hours for key expiration.
Should I store idempotency keys in PostgreSQL or Redis?
Use PostgreSQL for most SaaS applications. It eliminates an infrastructure dependency, works with existing database transactions for atomic writes, and provides strong consistency guarantees. Redis is better for high-throughput scenarios (10K+ requests per second) where sub-millisecond latency matters, but adds operational complexity, persistence concerns, and the risk of cache loss during a restart. Start with PostgreSQL and add Redis only when you measure a latency problem.
How long should idempotency keys be stored?
Store idempotency keys for 24 hours, matching the window Stripe uses. This covers the longest reasonable retry window: a client that retries over minutes or hours will find the key still valid. After 24 hours, the key can be safely pruned because clients should not retry requests older than one day. Set up a background job or scheduled task to clean expired keys and prevent unbounded storage growth.
How do I handle concurrent requests with the same idempotency key?
Use a database-level unique constraint combined with INSERT ON CONFLICT DO NOTHING to ensure only the first concurrent request wins. Subsequent requests with the same key receive the cached response. For higher throughput, use an advisory lock or a distributed lock via Redis SET NX with a short TTL. The key insight is that the check of existing keys and the insertion of new keys must happen in a single atomic operation to prevent race conditions in high-concurrency scenarios.
