If you’re building a SaaS product on AWS Marketplace, there’s a subtle bug waiting for you in the subscription flow. It won’t show up in testing. It won’t throw an error. Your customer will just land on a broken page, and you’ll spend hours figuring out why.
I’ve shipped 4 SaaS products on AWS Marketplace. This race condition bit me on the first one. Here’s what it is and how to fix it.
How AWS Marketplace Subscription Works
When a customer subscribes to your SaaS product on AWS Marketplace, two things happen:
Flow A: The redirect. The customer clicks “Subscribe” and AWS sends them to your fulfillment URL with a registration token. You call ResolveCustomer to validate it, create a tenant record in your database, and redirect them to your signup page.
Flow B: The SQS notification. AWS also publishes a subscribe-success notification to your product’s SNS topic, which you typically subscribe an SQS queue to. Your backend polls this queue and uses the message to update the tenant’s subscription status.
Here’s the problem: these two flows are completely independent. AWS does not guarantee ordering between them.
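Before either flow can be handled, the queue message has to be turned into something your code can use. Here is a minimal sketch of that step, assuming the queue is subscribed to the SNS topic without raw message delivery, so each SQS body is an SNS envelope whose Message field is a JSON string. The function name parseMarketplaceNotification and the sample identifiers are made up for illustration; the hyphenated keys are the ones I would expect in the notification, so check them against a real message before relying on this.

```javascript
// Normalize an SQS message body that wraps a Marketplace SNS notification.
// Assumption: no raw message delivery, so the body is an SNS envelope and
// the actual notification is a JSON string in envelope.Message.
function parseMarketplaceNotification(sqsBody) {
  const envelope = JSON.parse(sqsBody);

  // If raw message delivery is enabled there is no envelope; use the body itself.
  const payload = typeof envelope.Message === 'string'
    ? JSON.parse(envelope.Message)
    : envelope;

  return {
    action: payload['action'],
    customerIdentifier: payload['customer-identifier'],
    productCode: payload['product-code'],
    offerIdentifier: payload['offer-identifier'] ?? null,
    raw: sqsBody, // keep the original body for the audit log
  };
}

// Example with a made-up payload (all identifiers are placeholders).
const body = JSON.stringify({
  Type: 'Notification',
  Message: JSON.stringify({
    action: 'subscribe-success',
    'customer-identifier': 'cust-abc-123',
    'product-code': 'prod-example',
  }),
});

console.log(parseMarketplaceNotification(body).action); // subscribe-success
```

The camelCase field names in the result match the ones the handler code later in this post destructures.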
The Race
The happy path looks like this:
1. Customer clicks Subscribe on AWS Marketplace
2. Customer is redirected to your /register endpoint
3. You call ResolveCustomer, create tenant row (status: subscribed)
4. Customer completes signup
... minutes later ...
5. SQS delivers subscribe-success
6. You UPDATE the tenant row -> status stays 'subscribed' (no-op)
Everything works. But here’s what actually happens sometimes:
1. Customer clicks Subscribe on AWS Marketplace
2. SQS delivers subscribe-success <-- this arrives FIRST
3. You try to UPDATE the tenant row
4. ... but the row doesn't exist yet
5. UPDATE affects 0 rows. No error. Silent failure.
6. SQS message is deleted from the queue. <-- it's gone now
... seconds later ...
7. Customer is redirected to your /register endpoint
8. You call ResolveCustomer, create tenant row
9. But you missed the subscribe-success event
10. What status do you set?
The SQS event arrived before your customer did. Your UPDATE hit nothing. The message was deleted from the queue. And now you have a customer with no subscription status, or worse, a customer stuck on a “subscription pending” screen forever.
This isn’t a theoretical edge case. It happens in production. The time between the customer clicking Subscribe and actually landing on your registration page can vary wildly – they might have a slow connection, they might get distracted, or your redirect might take a few seconds while SQS delivers in milliseconds.
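To make the silent failure concrete, here is a small in-memory reproduction. Everything in it is a stand-in: the Map plays the tenants table, updateSubscriptionStatus mimics an SQL UPDATE that reports the number of changed rows, and naiveHandle is a hypothetical handler that updates and then deletes. It is a sketch of the failure mode, not real Marketplace code.

```javascript
// In-memory stand-in for the tenants table (hypothetical, for illustration).
const tenants = new Map();

// Mimics an SQL UPDATE: reports how many rows changed and never throws on a miss.
function updateSubscriptionStatus(customerIdentifier, status) {
  const row = tenants.get(customerIdentifier);
  if (!row) return { changes: 0 };
  row.status = status;
  return { changes: 1 };
}

// Naive handler: update the tenant, then treat the message as handled.
function naiveHandle(message) {
  const result = updateSubscriptionStatus(message.customerIdentifier, 'subscribed');
  return { deleted: true, changes: result.changes };
}

// SQS wins the race: the tenant row does not exist yet.
const outcome = naiveHandle({
  action: 'subscribe-success',
  customerIdentifier: 'cust-abc-123',
});
console.log(outcome); // { deleted: true, changes: 0 }

// The customer arrives later. Nothing records that subscribe-success happened.
tenants.set('cust-abc-123', { status: 'pending' });
console.log(tenants.get('cust-abc-123').status); // pending
```

No exception, no log line, and the only copy of the event is gone.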
The Wrong Fix
The obvious fix is: “Just don’t delete the SQS message if the tenant doesn’t exist yet. Let it retry.”
This is fragile. You’re now relying on SQS redelivery timing. If the customer takes 5 minutes to complete the redirect, you’re burning SQS visibility timeouts and retries. If they never complete registration, you have a poison message bouncing around until it expires or lands in a dead-letter queue. And you’ve coupled your SQS processing to the state of a completely separate HTTP flow.
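A rough back-of-the-envelope check shows how tight the timing gets. The sketch below assumes one receive per visibility timeout, no backoff, and a dead-letter queue with a maxReceiveCount of 5 (a value I picked for illustration); 30 seconds is the SQS default visibility timeout. The function name is hypothetical.

```javascript
// Approximate how long a message keeps retrying before a dead-letter queue takes it.
// Assumptions: one receive per visibility timeout, no backoff, DLQ configured.
function retryWindowSeconds(visibilityTimeoutSeconds, maxReceiveCount) {
  return visibilityTimeoutSeconds * maxReceiveCount;
}

const windowSeconds = retryWindowSeconds(30, 5); // default timeout, illustrative receive count
const customerDelaySeconds = 5 * 60;             // customer takes 5 minutes to land on /register

console.log(windowSeconds);                         // 150
console.log(windowSeconds >= customerDelaySeconds); // false
```

With those numbers the message gives up after roughly two and a half minutes, well before the customer shows up. You can stretch the window, but then every knob you tune is a guess about human behavior.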
The Fix: Event Sourcing Lite
The solution is to decouple the two concerns:
- Always persist the SQS event, regardless of whether the tenant exists.
- Reconcile at registration time by reading the event history.
Here’s how it works in practice.
Step 1: Always save the event
When an SQS message arrives, write it to a subscription_events table first, unconditionally. Then attempt to update the tenant:
saveSubscriptionEvent(message) {
  const { action, customerIdentifier, productCode } = message;

  // Always write to the event log first. If this write throws, we never
  // reach deleteMessage below, so SQS redelivers the message later.
  db.subscriptionEvents.add(action, customerIdentifier, productCode, message);

  // Attempt to update the tenant (may not exist yet)
  if (action === 'subscribe-success') {
    const result = db.customers.updateSubscriptionStatus(
      customerIdentifier, 'subscribed'
    );
    if (result.changes === 0) {
      // Tenant hasn't registered yet. That's fine.
      // The event is safely persisted in subscription_events.
      logger.warn(
        `Customer ${customerIdentifier} not found in tenants table. ` +
        `Status will be reconciled at registration time.`
      );
    }
  }

  // Delete from SQS -- safe because the event is persisted locally
  this.deleteMessage(message);
}
The key insight: the subscription_events table is your durable log. It doesn’t depend on any other table existing. The SQS message can be safely deleted because the information has been transferred to your database.
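The ordering is the whole trick, so here is a minimal sketch that isolates it: persist, then update, then delete. The makeHandler factory and its three injected functions are hypothetical stand-ins for the database and SQS calls, which makes it easy to see what happens when the write fails.

```javascript
// Hypothetical stand-ins are injected so the ordering is easy to observe.
function makeHandler({ saveEvent, updateTenant, deleteMessage }) {
  return function handle(message) {
    // 1. Persist first. If this throws, deleteMessage is never reached,
    //    so the message stays in the queue and is redelivered.
    saveEvent(message);

    // 2. Best-effort tenant update. Zero changed rows is not an error.
    const { changes } = updateTenant(message.customerIdentifier, message.action);

    // 3. Only now is it safe to remove the message from the queue.
    deleteMessage(message);
    return { changes };
  };
}

const events = [];
const deleted = [];

const handle = makeHandler({
  saveEvent: (m) => events.push(m),
  updateTenant: () => ({ changes: 0 }), // tenant has not registered yet
  deleteMessage: (m) => deleted.push(m),
});
handle({ action: 'subscribe-success', customerIdentifier: 'cust-abc-123' });
console.log(events.length, deleted.length); // 1 1

// A failing write leaves the message in the queue.
const failing = makeHandler({
  saveEvent: () => { throw new Error('db unavailable'); },
  updateTenant: () => ({ changes: 0 }),
  deleteMessage: (m) => deleted.push(m),
});
try {
  failing({ action: 'subscribe-success', customerIdentifier: 'cust-xyz' });
} catch (err) {
  // Nothing was deleted, so SQS will deliver this message again.
}
console.log(deleted.length); // 1
```

The only case where a message is deleted is the case where its event is already in your database.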
Step 2: Reconcile at registration
When the customer finally hits /register, check the event history before creating the tenant:
// POST /register
app.post('/register', async (req, res) => {
  let resolved;
  try {
    resolved = await resolveCustomer(req.body.token);
  } catch (err) {
    // Invalid or expired token -- do not create a tenant
    return res.status(400).send('Invalid registration token');
  }
  const { customerIdentifier, customerAWSAccountId } = resolved;

  const existingTenant = db.customers.getByAwsAcctId(customerAWSAccountId);
  if (existingTenant) {
    // Returning customer -- redirect to login
    return res.redirect('/login');
  }

  // New customer -- check if SQS events arrived before they did
  const latestEvent = db.subscriptionEvents.getLatestByCustomer(
    customerIdentifier
  );
  const subscriptionStatus = latestEvent?.action === 'unsubscribe-success'
    ? 'unsubscribed'
    : 'subscribed';

  // NOTE: offerType is not defined in this excerpt; it is assumed to be
  // determined elsewhere in the handler.
  db.customers.add(
    customerAWSAccountId,
    customerIdentifier,
    offerType,
    subscriptionStatus // <-- reconciled from event history
  );
  res.redirect('/signup');
});
The query is simple:
SELECT action, customer_identifier, created_at
FROM subscription_events
WHERE customer_identifier = ?
ORDER BY created_at DESC, id DESC  -- id breaks ties between events saved in the same second
LIMIT 1
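If the latest event is subscribe-success, the tenant is created as subscribed. If the latest event is unsubscribe-success, the tenant is created as unsubscribed. If no events exist yet (normal flow where the customer arrived before SQS), the default is subscribed, which is a reasonable default because ResolveCustomer only succeeds for a valid registration token that AWS issued as part of the subscribe flow. Note that any other latest action also falls through to subscribed: that is right for unsubscribe-pending, where the customer still has access, but if your queue also receives subscribe-fail you will want to handle it explicitly.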
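The mapping from "latest event" to "initial status" is small enough to pull out into a pure function, which makes it trivial to test. This is a sketch that mirrors the ternary in the /register handler; the name reconcileStatus is my own and does not appear in the handler above.

```javascript
// Map the latest persisted event (or none) to the initial tenant status.
// Mirrors the ternary in the /register handler: only unsubscribe-success
// produces 'unsubscribed'; everything else, including no event, is 'subscribed'.
function reconcileStatus(latestEvent) {
  return latestEvent?.action === 'unsubscribe-success'
    ? 'unsubscribed'
    : 'subscribed';
}

console.log(reconcileStatus(undefined));                         // subscribed
console.log(reconcileStatus({ action: 'subscribe-success' }));   // subscribed
console.log(reconcileStatus({ action: 'unsubscribe-pending' })); // subscribed
console.log(reconcileStatus({ action: 'unsubscribe-success' })); // unsubscribed
```

Keeping this logic in one place also gives you a single spot to extend if you decide to treat other actions differently.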
Why This Works
The subscription_events table acts as a write-ahead log. It decouples event persistence from tenant existence. No matter what order things happen:
Normal order (customer registers first):
  /register creates tenant as 'subscribed' (default)
  SQS arrives later, UPDATEs tenant -> no-op, already correct

Race condition (SQS arrives first):
  SQS handler writes to subscription_events, UPDATE hits 0 rows -> that's fine
  /register reads subscription_events, finds subscribe-success
  Creates tenant as 'subscribed' -> correct

Edge case (unsubscribe before register):
  SQS delivers unsubscribe-success, persisted to subscription_events
  Customer visits /register
  Latest event is unsubscribe-success -> tenant created as 'unsubscribed'
  Access correctly denied
Every path converges to the correct state.
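You can convince yourself of this with a tiny in-memory simulation of the three orderings. Everything here is a stand-in: makeSystem, onSqs, and onRegister are hypothetical names, an array plays subscription_events, and a Map plays the tenants table. One assumption beyond the code shown earlier: the simulated SQS handler also updates an existing tenant on unsubscribe-success.

```javascript
function makeSystem() {
  const events = [];         // stand-in for subscription_events
  const tenants = new Map(); // stand-in for the tenants table

  const statusFor = (action) =>
    action === 'unsubscribe-success' ? 'unsubscribed' : 'subscribed';

  return {
    // SQS flow: always persist, then update the tenant if it exists.
    onSqs(action, customerIdentifier) {
      events.push({ action, customerIdentifier });
      const tenant = tenants.get(customerIdentifier);
      const isFinal = action === 'subscribe-success' || action === 'unsubscribe-success';
      if (tenant && isFinal) tenant.status = statusFor(action);
    },
    // Redirect flow: create the tenant from the latest persisted event.
    onRegister(customerIdentifier) {
      if (tenants.has(customerIdentifier)) return;
      const mine = events.filter((e) => e.customerIdentifier === customerIdentifier);
      const latest = mine[mine.length - 1];
      tenants.set(customerIdentifier, { status: statusFor(latest?.action) });
    },
    status: (customerIdentifier) => tenants.get(customerIdentifier)?.status,
  };
}

// Normal order: customer registers first.
const normal = makeSystem();
normal.onRegister('cust-a');
normal.onSqs('subscribe-success', 'cust-a');
console.log(normal.status('cust-a')); // subscribed

// Race: SQS arrives first.
const race = makeSystem();
race.onSqs('subscribe-success', 'cust-b');
race.onRegister('cust-b');
console.log(race.status('cust-b')); // subscribed

// Edge case: unsubscribe before register.
const edge = makeSystem();
edge.onSqs('subscribe-success', 'cust-c');
edge.onSqs('unsubscribe-success', 'cust-c');
edge.onRegister('cust-c');
console.log(edge.status('cust-c')); // unsubscribed
```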
The Schema
You need one extra table:
CREATE TABLE subscription_events (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  action TEXT NOT NULL,
  customer_identifier TEXT NOT NULL,
  product_code TEXT NOT NULL,
  offer_identifier TEXT,
  raw_payload TEXT,
  created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX idx_subscription_events_customer
  ON subscription_events(customer_identifier, created_at DESC);
The descending index on created_at makes the “get latest event” query fast. The raw_payload column stores the full SQS message body – useful for debugging and audit.
Bonus: You Get an Audit Trail for Free
This pattern gives you a complete history of every subscription lifecycle event. When a customer opens a support ticket saying “I subscribed but can’t access the product,” you can query:
SELECT action, created_at
FROM subscription_events
WHERE customer_identifier = 'cust-abc-123'
ORDER BY created_at DESC;

subscribe-success     2024-03-15 14:23:01
unsubscribe-pending   2024-03-15 14:22:58
subscribe-success     2024-01-10 09:15:33
You’ll know exactly what happened and when, without digging through CloudWatch logs.
Takeaway
The general pattern here is older than AWS: persist events before acting on them, and reconcile state from the event log. It’s event sourcing applied to a very specific problem, and it’s the simplest version of it – just one table, one query at registration time, and zero retry logic.
If you’re building an AWS Marketplace SaaS integration, save yourself the debugging session. Add the subscription_events table from day one.
I’ve packaged the production code behind this (and all the other AWS Marketplace plumbing – ResolveCustomer, auth, entitlements, metering) into a self-hosted Node.js gateway kit. If you’re listing a SaaS product on AWS Marketplace and don’t want to rebuild this from scratch, check it out here.














