Not Everything Needs to Happen Synchronously
A new tenant signs up. Your system needs to create their database record, provision resources, set up billing, seed sample data, send a welcome email, and trigger a product tour.
All in one HTTP request? That request takes 8 seconds. The user stares at a loading spinner. One failure in any step fails the entire signup.
Event-driven architecture (EDA) breaks this into independent, asynchronous steps. Publish an event, let subscribers handle their piece. Each can fail, retry, and scale independently.
It’s not always the right call. But when it is, it changes how your SaaS operates.
The Core Pattern
A producer publishes an event. Consumers subscribe and react. The producer doesn’t know or care what the consumers do.
Tenant signs up. System publishes TenantCreated. The billing service subscribes and creates a Stripe customer. The email service subscribes and sends the welcome message.
The provisioning service subscribes and sets up resources. Three independent reactions, zero coupling.
If the email service goes down, billing still works. If provisioning takes 10 seconds, the signup response isn’t blocked. Each concern is isolated.
This is the publish-subscribe pattern. Our complete SaaS architecture guide covers how EDA fits into the broader platform.
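As a minimal sketch of the pattern, the snippet below uses an in-process EventBus class as a stand-in for a real broker. The class, the handler names, and the tenant ID are all hypothetical, and delivery here is synchronous, whereas a real broker would deliver asynchronously. The email handler simulates an outage to show that the other two reactions still run.

```python
from collections import defaultdict
from typing import Callable


class EventBus:
    """In-process stand-in for a real broker; illustrates decoupling only."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> list[str]:
        """Deliver to every subscriber; return the names of handlers that failed."""
        failed = []
        for handler in self._subscribers[event_type]:
            try:
                handler(payload)
            except Exception:
                # One consumer failing must not stop the others.
                failed.append(handler.__name__)
        return failed


log: list[str] = []


def create_billing_customer(event: dict) -> None:
    log.append(f"billing:{event['tenant_id']}")


def send_welcome_email(event: dict) -> None:
    raise RuntimeError("email provider is down")  # simulated outage


def provision_resources(event: dict) -> None:
    log.append(f"provisioning:{event['tenant_id']}")


bus = EventBus()
for handler in (create_billing_customer, send_welcome_email, provision_resources):
    bus.subscribe("TenantCreated", handler)

failed = bus.publish("TenantCreated", {"tenant_id": "t_123"})
```

The publisher never references billing, email, or provisioning directly. Adding a fourth reaction means adding one subscribe call, not changing the signup code.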
When EDA Makes Sense for SaaS
Not every operation needs to be event-driven. Synchronous request-response is simpler and should be your default.
Use events when the benefits outweigh the complexity. Here’s where they do.
Tenant lifecycle events
Signup, upgrade, downgrade, cancellation. Each triggers multiple downstream actions. Billing updates, feature flag changes, notification sends.
One event, several reactions. Perfect for pub/sub.
Heavy operations
Report generation, data exports, bulk imports. These take seconds or minutes.
Queue them as events, process asynchronously, notify the user when done. We covered this pattern in scaling from 100 to 10,000 users.
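A rough sketch of that flow, using only the Python standard library: the request handler enqueues a job and returns at once, while a background worker does the slow part and records a notification. The function names, the job shape, and the notifications list are assumptions for illustration. In production the queue would be a broker and the worker a separate process.

```python
import queue
import threading

jobs = queue.Queue()
notifications: list[str] = []


def request_export(tenant_id: str) -> dict:
    """Request handler: enqueue the work and return immediately."""
    jobs.put({"type": "ExportRequested", "tenant_id": tenant_id})
    return {"status": "accepted"}


def worker() -> None:
    while True:
        job = jobs.get()
        if job is None:  # sentinel value: shut the worker down
            jobs.task_done()
            break
        # Placeholder for the slow part, such as building the export file.
        notifications.append(f"export ready for {job['tenant_id']}")
        jobs.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()

response = request_export("t_1")  # returns before the export is built

jobs.put(None)
jobs.join()  # demo only: wait so the result can be inspected
thread.join()
```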
Cross-service communication
If your SaaS has separate services for billing, auth, notifications, and core logic, events are how they communicate without coupling.
Direct API calls between services create tight dependencies. Events create loose coupling. When service A publishes an event, it doesn’t need to know that services B, C, and D consume it.
Real-time features
Activity feeds, live dashboards, collaborative editing. Events propagate changes to connected clients.
WebSockets or server-sent events deliver them to the browser. The event bus is the backbone.
Choosing a Message Broker
Two practical choices for most SaaS products. Each serves different needs.
Kafka
High-throughput event streams, ordered within each partition. Events are persisted to a log. Consumers can replay from any offset that is still within the retention window.
SumUp processes millions of payment events daily across 30+ countries on Kafka. Use it when you need event ordering guarantees, replay capability, or high-throughput processing.
It’s overkill for a pre-revenue SaaS. It’s essential for a platform processing thousands of events per second.
RabbitMQ
Simpler pub/sub with flexible routing. Easier to set up and operate. Better for task queues and work distribution.
Use RabbitMQ when your event volume is moderate and you need routing flexibility. Direct exchanges, topic exchanges, headers-based routing. Most SaaS startups start here and graduate to Kafka if they outgrow it.
Tenant Isolation in Async Workflows
Here’s where multi-tenant SaaS adds complexity to event-driven architecture. Every event must carry tenant_id. Every consumer must scope its work to the right tenant.
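One way to enforce this is to make tenant_id a required field on the event type itself, so an event cannot be constructed without it. The sketch below assumes a hypothetical Event dataclass and an in-memory invoices_by_tenant store. Neither comes from a particular framework.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    type: str
    tenant_id: str  # no default, so it cannot be omitted
    payload: dict
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))


# Hypothetical per-tenant data store keyed by tenant_id.
invoices_by_tenant: dict[str, list[dict]] = {"t_1": [], "t_2": []}


def handle_invoice_created(event: Event) -> None:
    if not event.tenant_id:
        raise ValueError("event is missing tenant_id")
    # Every read and write is scoped to the tenant named in the event.
    invoices_by_tenant[event.tenant_id].append(event.payload)


handle_invoice_created(Event("InvoiceCreated", "t_1", {"amount": 4900}))
```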
Queue isolation
Dedicated queues per tenant prevent one tenant’s event backlog from blocking another’s. Not always practical at scale, but essential for enterprise tiers.
For shared queues, use priority levels. Enterprise tenant events get processed first. Free tier events wait in line.
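A shared queue with tier priorities might look like the following sketch, built on the standard heapq module. The tier names and their ordering are assumptions. A counter breaks ties so events within the same tier stay in arrival order.

```python
import heapq
import itertools

TIER_PRIORITY = {"enterprise": 0, "pro": 1, "free": 2}  # lower number runs first


class PriorityEventQueue:
    def __init__(self) -> None:
        self._heap: list[tuple[int, int, dict]] = []
        self._counter = itertools.count()  # tie-breaker keeps FIFO order within a tier

    def put(self, event: dict) -> None:
        priority = TIER_PRIORITY[event["tier"]]
        heapq.heappush(self._heap, (priority, next(self._counter), event))

    def get(self) -> dict:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


shared_queue = PriorityEventQueue()
shared_queue.put({"tenant_id": "t_free", "tier": "free", "type": "ExportRequested"})
shared_queue.put({"tenant_id": "t_ent", "tier": "enterprise", "type": "ExportRequested"})
shared_queue.put({"tenant_id": "t_free", "tier": "free", "type": "ReportRequested"})

order = [shared_queue.get()["tenant_id"] for _ in range(len(shared_queue))]
```

Strict priority can starve the lowest tier under sustained load, so a real deployment would likely pair this with a fairness rule or a maximum wait time.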
Dead letter queues per tenant
When an event fails processing, it goes to a dead letter queue. If those queues are shared, one tenant’s failures can obscure another’s.
Separate them. When debugging, you need to see failures per tenant, not a mixed pile of errors from every account.
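The sketch below keeps one dead letter queue per tenant_id in memory. The retry budget, the handler, and the tenant IDs are hypothetical. With a real broker, the same idea usually maps to a DLQ name or routing key that includes the tenant.

```python
from collections import defaultdict, deque

MAX_ATTEMPTS = 3  # assumed retry budget
dead_letters: dict[str, deque] = defaultdict(deque)  # one DLQ per tenant_id


def process_with_dlq(event: dict, handler) -> bool:
    """Try the handler up to MAX_ATTEMPTS times, then park the event in its tenant's DLQ."""
    last_error = ""
    for _ in range(MAX_ATTEMPTS):
        try:
            handler(event)
            return True
        except Exception as exc:
            last_error = repr(exc)
    dead_letters[event["tenant_id"]].append({"event": event, "error": last_error})
    return False


def import_handler(event: dict) -> None:
    if event["tenant_id"] == "t_bad":
        raise ValueError("malformed payload")  # simulated tenant-specific failure


bad_ok = process_with_dlq({"tenant_id": "t_bad", "type": "ImportRequested"}, import_handler)
good_ok = process_with_dlq({"tenant_id": "t_ok", "type": "ImportRequested"}, import_handler)
```

Debugging a single account then means inspecting only that tenant’s queue.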
Rate limiting on event production
A tenant’s runaway integration shouldn’t flood your event bus. Set per-tenant production limits.
Without them, one misbehaving tenant can create a backlog that delays events for every other tenant. We covered noisy-neighbor mitigation in scaling your SaaS.
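A token bucket is one common way to implement such a limit. The sketch below is an illustration under assumed numbers (one event per second with a burst of two), and the class name is hypothetical. The clock is injectable so the example runs deterministically.

```python
import time
from typing import Callable


class TenantRateLimiter:
    """Token bucket per tenant: `rate` tokens refill each second, up to `burst`."""

    def __init__(self, rate: float, burst: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.burst = burst
        self.clock = clock
        self._buckets: dict[str, tuple[float, float]] = {}  # tenant_id -> (tokens, last_seen)

    def allow(self, tenant_id: str) -> bool:
        now = self.clock()
        tokens, last = self._buckets.get(tenant_id, (float(self.burst), now))
        # Refill for the time elapsed since the last call, capped at the burst size.
        tokens = min(float(self.burst), tokens + (now - last) * self.rate)
        if tokens < 1:
            self._buckets[tenant_id] = (tokens, now)
            return False  # caller should reject or delay the publish
        self._buckets[tenant_id] = (tokens - 1, now)
        return True


fake_now = [0.0]  # controllable clock for the demo
limiter = TenantRateLimiter(rate=1.0, burst=2, clock=lambda: fake_now[0])

burst_results = [limiter.allow("t_noisy") for _ in range(3)]
other_tenant = limiter.allow("t_quiet")  # unaffected by the noisy tenant
fake_now[0] = 1.0
after_refill = limiter.allow("t_noisy")
```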
Idempotent consumers
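Events will be delivered more than once. Retries, consumer restarts, and redeliveries after a missed acknowledgement all produce duplicates, so your consumers must handle them gracefully.
Apply each event’s effect exactly once, even when you receive it three times. Use event IDs and a processed-events table. Check before processing, skip if already handled, and record the ID in the same transaction as the side effect so a crash between the two can’t apply it twice.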
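Here is a minimal sketch of a processed-events table using SQLite from the standard library. The table names, the credits example, and the event shape are assumptions. Instead of a separate lookup, it inserts the event ID first and relies on the primary key to reject duplicates, which avoids a race between the check and the write.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE processed_events (event_id TEXT PRIMARY KEY)")
db.execute("CREATE TABLE credits (tenant_id TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
db.execute("INSERT INTO credits VALUES ('t_1', 0)")
db.commit()


def handle_credit_granted(event: dict) -> bool:
    """Apply the event once; return False if it was a duplicate."""
    try:
        with db:  # one transaction: dedupe marker and side effect commit together
            db.execute("INSERT INTO processed_events VALUES (?)", (event["event_id"],))
            db.execute(
                "UPDATE credits SET balance = balance + ? WHERE tenant_id = ?",
                (event["amount"], event["tenant_id"]),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # primary key conflict: already processed, so skip


event = {"event_id": "evt_42", "tenant_id": "t_1", "amount": 100}
results = [handle_credit_granted(event) for _ in range(3)]  # delivered three times
balance = db.execute("SELECT balance FROM credits WHERE tenant_id = 't_1'").fetchone()[0]
```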
Event Sourcing: Worth It?
Event sourcing stores every state change as an immutable event. Instead of updating a record, you append a new event. The current state is computed by replaying all events.
For most SaaS products? No. Event sourcing adds significant complexity to your data layer.
Queries become expensive because you’re replaying events instead of reading rows. Debugging requires understanding event history. Schema evolution is painful.
Where it shines: audit-heavy domains. Financial services, compliance-critical systems, collaborative applications. If you need a complete, append-only history of every change, event sourcing provides it.
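To make the replay idea concrete, the sketch below models a subscription as an append-only list of events and derives the current state by folding over them. The event types and the SubscriptionState fields are hypothetical.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class SubscriptionState:
    plan: str = "none"
    seats: int = 0
    active: bool = False


def apply(state: SubscriptionState, event: dict) -> SubscriptionState:
    """Return the state that results from applying one event."""
    kind = event["type"]
    if kind == "SubscriptionStarted":
        return SubscriptionState(plan=event["plan"], seats=event["seats"], active=True)
    if kind == "SeatsAdded":
        return SubscriptionState(state.plan, state.seats + event["seats"], state.active)
    if kind == "PlanChanged":
        return SubscriptionState(event["plan"], state.seats, state.active)
    if kind == "SubscriptionCancelled":
        return SubscriptionState(state.plan, state.seats, active=False)
    return state  # unknown event types are ignored


def replay(events: list[dict]) -> SubscriptionState:
    return reduce(apply, events, SubscriptionState())


event_log: list[dict] = []  # append-only; nothing is updated in place
event_log.append({"type": "SubscriptionStarted", "plan": "starter", "seats": 3})
event_log.append({"type": "SeatsAdded", "seats": 2})
event_log.append({"type": "PlanChanged", "plan": "pro"})

current = replay(event_log)
as_of_second_event = replay(event_log[:2])  # historical state comes for free
```

The cost described above is visible here too: every read is a replay unless you add snapshots or projections.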
CQRS
Command Query Responsibility Segregation often accompanies event sourcing. Separate your write model from your read model.
The write side appends events. The read side is a traditional database, rebuilt from events as optimized projections. Powerful pattern. Significant complexity. Don’t adopt it unless your domain demands it.
Saga Pattern for Multi-Step Workflows
Some operations span multiple services and need coordination. Tenant provisioning: create database, set up billing, configure auth, seed data.
What happens if step 3 fails after steps 1 and 2 succeeded? The saga pattern handles this.
Each step publishes an event on success. If a step fails, it publishes a failure event, and the services that already completed their steps run compensating actions to undo them. Database created but billing failed? Publish BillingFailed, and the database service rolls back.
Choreography-based sagas work for simple workflows. Orchestration-based sagas work better for complex provisioning. We covered tenant provisioning in SaaS onboarding architecture.
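An orchestration-based saga can be sketched as a list of steps, each paired with a compensating action. The run_saga function, the step names, and the simulated auth failure below are all hypothetical. When a step fails, the orchestrator undoes the completed steps in reverse order.

```python
from typing import Callable

# Each step is (name, action, compensation).
Step = tuple[str, Callable[[dict], None], Callable[[dict], None]]


def run_saga(steps: list[Step], ctx: dict) -> bool:
    completed: list[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action(ctx)
            completed.append(step)
        except Exception as exc:
            ctx["log"].append(f"{name} failed: {exc}")
            # Undo the finished steps, most recent first.
            for _, _, compensate in reversed(completed):
                compensate(ctx)
            return False
    return True


def create_database(ctx: dict) -> None:
    ctx["log"].append("database created")


def drop_database(ctx: dict) -> None:
    ctx["log"].append("database dropped")


def set_up_billing(ctx: dict) -> None:
    ctx["log"].append("billing set up")


def cancel_billing(ctx: dict) -> None:
    ctx["log"].append("billing cancelled")


def configure_auth(ctx: dict) -> None:
    raise RuntimeError("identity provider unavailable")  # simulated step 3 failure


def remove_auth(ctx: dict) -> None:
    ctx["log"].append("auth removed")


ctx = {"tenant_id": "t_1", "log": []}
ok = run_saga(
    [
        ("create_database", create_database, drop_database),
        ("set_up_billing", set_up_billing, cancel_billing),
        ("configure_auth", configure_auth, remove_auth),
    ],
    ctx,
)
```

In a real system the compensations can fail too, so they should be idempotent and retried.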
When to Keep It Simple
Event-driven architecture is powerful. It’s also complex.
If your SaaS has one service, five tenants, and straightforward workflows, synchronous request-response is the right choice. Don’t adopt EDA because it’s architecturally interesting. Adopt it because your system needs it.
One team we worked with went event-driven for everything on day one. User login? Event. Page view? Event. Button click? Event.
Their event bus processed 100x more events than necessary. Debugging was a nightmare. They spent three months simplifying back to synchronous defaults with events only where they were genuinely needed.
Monitoring Event-Driven Systems
Event-driven systems are harder to debug than synchronous ones. A request doesn’t follow a single thread. It fans out across multiple consumers.
Distributed tracing is essential. Every event carries a correlation ID. Every consumer logs that ID. When something fails, you trace the entire chain.
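A small sketch of correlation ID propagation: events caused by another event inherit its correlation ID, so every log line in the chain shares one value. The new_event helper, the field names, and the event types are assumptions, and a tracing library would normally handle this for you.

```python
import uuid
from typing import Optional


def new_event(event_type: str, tenant_id: str, payload: dict,
              caused_by: Optional[dict] = None) -> dict:
    return {
        "event_id": str(uuid.uuid4()),
        "type": event_type,
        "tenant_id": tenant_id,
        # Reuse the parent's correlation ID so the whole chain shares one ID.
        "correlation_id": caused_by["correlation_id"] if caused_by else str(uuid.uuid4()),
        "payload": payload,
    }


trace_log: list[tuple[str, str, str]] = []


def log_event(consumer: str, event: dict) -> None:
    trace_log.append((event["correlation_id"], consumer, event["type"]))


signup = new_event("TenantCreated", "t_1", {})
log_event("billing", signup)

customer = new_event("BillingCustomerCreated", "t_1", {}, caused_by=signup)
log_event("email", customer)

unrelated = new_event("TenantCreated", "t_2", {})
log_event("billing", unrelated)

# Filtering by one ID reconstructs the full chain for the first signup.
chain = [entry for entry in trace_log if entry[0] == signup["correlation_id"]]
```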
Monitor consumer lag per tenant. If one tenant’s events are processing 30 seconds behind, that tenant is seeing stale data. Alert on lag thresholds.
Dead letter queue depth is your canary. Rising DLQ depth means events are failing faster than you’re fixing them. Monitor it per tenant and per consumer.
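As a rough illustration of per-tenant lag monitoring, the sketch below defines lag as the age of each tenant’s oldest unprocessed event and flags tenants above a threshold. The 30-second threshold, the published_at field, and the function names are assumptions. A real setup would read these numbers from broker metrics.

```python
LAG_THRESHOLD_SECONDS = 30.0  # assumed alert threshold


def lag_by_tenant(pending_events: list[dict], now: float) -> dict[str, float]:
    """Lag is the age of the oldest unprocessed event for each tenant."""
    oldest: dict[str, float] = {}
    for event in pending_events:
        tenant = event["tenant_id"]
        published = event["published_at"]
        oldest[tenant] = min(oldest.get(tenant, published), published)
    return {tenant: now - ts for tenant, ts in oldest.items()}


def tenants_over_threshold(lags: dict[str, float],
                           threshold: float = LAG_THRESHOLD_SECONDS) -> list[str]:
    return sorted(tenant for tenant, lag in lags.items() if lag > threshold)


pending = [
    {"tenant_id": "t_a", "published_at": 100.0},
    {"tenant_id": "t_a", "published_at": 140.0},
    {"tenant_id": "t_b", "published_at": 145.0},
]

lags = lag_by_tenant(pending, now=150.0)
alerts = tenants_over_threshold(lags)
```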
Our Recommendation
Start synchronous. Add events for tenant lifecycle, heavy operations, and cross-service communication. Keep everything else as direct function calls or API requests.
When you do go event-driven, carry tenant_id in every event. Isolate queues for enterprise tiers. Build idempotent consumers. Monitor event lag and dead letter queues per tenant.
For how event-driven architecture fits with your data isolation strategy, see multi-tenancy patterns. For how it connects to billing events, see our Stripe architecture guide.
Building event-driven architecture for your SaaS platform? Let’s design it right. We’ve built async, multi-tenant systems and know when events help and when they hurt.