Tags: startups, architecture, guide, best practices

From Zero to Production: Webhook Infrastructure for Startups

A practical guide to setting up webhook infrastructure at each stage of startup growth — from MVP to Series A to scale — with architecture diagrams and cost benchmarks.

Yuki Tanaka
Founding Engineer
November 28, 2025
10 min read

Webhook infrastructure is one of those things every SaaS company needs but few founders anticipate. You're three weeks from launch, Stripe is set up, and then you realize you need to handle payment_intent.succeeded, customer.subscription.deleted, invoice.payment_failed, and twenty other event types, all of them reliably.

This guide walks through webhook infrastructure decisions at three stages: MVP, growth, and scale.


Stage 1: MVP (< 10K events/month)

At the MVP stage, you have one integration, a few dozen customers, and limited engineering bandwidth. The right approach is to keep it simple and move fast.

What You Need

  • A webhook endpoint that accepts events
  • Basic signature verification
  • A simple async handler
  • Logging so you can debug failures

What You Don't Need Yet

  • Retry infrastructure
  • Fan-out routing
  • Dead-letter queues
  • Event replay
  • Multi-tenant isolation

The MVP Architecture

Stripe ──→ POST /webhooks/stripe ──→ SQS/Background job ──→ Your handler

Or even simpler for a true MVP:

Stripe ──→ POST /webhooks/stripe ──→ Your handler (synchronous, 5 second timeout)

Yes, synchronous is acceptable for MVP. Stripe has built-in retry. Your handler is simple. Just make sure to:

  1. Verify the signature
  2. Respond 200 before doing anything expensive (push slow work to a background job)
  3. Keep handlers under 3 seconds, which leaves headroom inside the 5-second timeout
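
To make the first two steps concrete, here is a minimal sketch of Stripe-style signature verification using only the Python standard library. It assumes the documented `Stripe-Signature` header layout (`t=<timestamp>,v1=<hex signature>`, signed over `<timestamp>.<raw body>` with HMAC-SHA256). The function names, the `enqueue` callback, and the 300-second tolerance are illustrative assumptions. In production you would more likely call the verification helper in Stripe's official library.

```python
import hashlib
import hmac
import time
from typing import Callable, Optional


def verify_stripe_signature(payload: bytes, header: str, secret: str,
                            tolerance: int = 300,
                            now: Optional[float] = None) -> bool:
    """Return True if `header` carries a valid v1 signature for `payload`."""
    timestamp = None
    signatures = []
    for part in header.split(","):
        key, _, value = part.strip().partition("=")
        if key == "t":
            timestamp = value
        elif key == "v1":
            signatures.append(value)
    if timestamp is None or not timestamp.isdigit() or not signatures:
        return False

    # Reject stale timestamps to limit replay of captured requests.
    current = time.time() if now is None else now
    if abs(current - int(timestamp)) > tolerance:
        return False

    signed_payload = timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed_payload, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    return any(hmac.compare_digest(expected.encode(), sig.encode())
               for sig in signatures)


def handle_webhook(payload: bytes, header: str, secret: str,
                   enqueue: Callable[[bytes], None],
                   now: Optional[float] = None) -> int:
    """Hypothetical handler: verify, hand off the work, return an HTTP status."""
    if not verify_stripe_signature(payload, header, secret, now=now):
        return 400
    enqueue(payload)  # expensive processing happens later, off the request path
    return 200
```

Always verify against the raw request bytes. Re-serializing parsed JSON can change whitespace or key order and break the signature.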

Recommended Stack

  • Receiver: Your existing web server. For local dev, use Stripe's built-in webhook testing tool, or expose your local endpoint with ngrok or Cloudflare Tunnel
  • Processing: Simple function call or background job queue (Sidekiq, Celery, BullMQ)
  • Database: Just write to your existing app DB
  • Monitoring: Stripe Dashboard → Webhooks tab (shows delivery attempts for free)

Cost

$0 additional. Stripe's webhook delivery is free. Your existing web server handles it. You don't need GetHook yet.

When to move to Stage 2: When you have more than 3 webhook integrations, start seeing delivery failures, need fan-out routing, or want to replay missed events.


Stage 2: Growth (10K–500K events/month)

You've shipped the MVP. Customers are using it. You've added Stripe, GitHub, Shopify, and maybe Twilio. Event failures are causing occasional customer complaints.

This is when webhook reliability becomes a real investment.

What Goes Wrong at This Stage

Problem                                      | Frequency  | Impact
---------------------------------------------|------------|---------------------------------
Provider webhook bursts (traffic spike)      | 1–2×/week  | Queue backup, delayed processing
Destination restart during deploy            | 3–5×/day   | Missed events for 30s windows
Multi-provider signature format differences  | Ongoing    | Developer confusion, bugs
Customer asks "why didn't I get this event?" | 1–5×/week  | Support burden
Need to backfill a new destination           | 1×/quarter | Manual work

The Growth Architecture

                                      ┌─── Billing Service
Stripe  ──┐                           │
GitHub  ──┼──→ GetHook Gateway ───────┼─── Fulfillment Service
Shopify ──┤    (ingest + queue)       │
Twilio  ──┘                           └─── Analytics / Reporting

Each provider's events are accepted at the GetHook ingest layer, verified, queued, and fanned out to the right destinations based on event type patterns.

Key Capabilities You Now Need

1. Fan-out routing

Different event types go to different destinations:

payment.succeeded → billing-service, email-service, analytics
payment.failed    → billing-service, alerting
order.shipped     → fulfillment-service, email-service, sms-service
user.signup       → crm-service, onboarding-service
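
If you were to build this routing table yourself, it might look something like the sketch below. It uses shell-style patterns from Python's `fnmatch` module. The `ROUTES` structure, the `destinations_for` function, and the extra `payment.*` wildcard route to an `audit-log` destination are assumptions for illustration, not GetHook's configuration format.

```python
from fnmatch import fnmatchcase

# (pattern, destinations) pairs; every matching pattern contributes destinations.
ROUTES = [
    ("payment.succeeded", ["billing-service", "email-service", "analytics"]),
    ("payment.failed", ["billing-service", "alerting"]),
    ("order.shipped", ["fulfillment-service", "email-service", "sms-service"]),
    ("user.signup", ["crm-service", "onboarding-service"]),
    ("payment.*", ["audit-log"]),  # hypothetical wildcard route
]


def destinations_for(event_type: str) -> list[str]:
    """Return the de-duplicated destinations for an event type, in route order."""
    matched: list[str] = []
    for pattern, targets in ROUTES:
        if fnmatchcase(event_type, pattern):
            for target in targets:
                if target not in matched:
                    matched.append(target)
    return matched
```

An event type that matches no pattern returns an empty list. It is worth logging that case rather than dropping the event silently.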

2. Independent retry per destination

If your email service is down, fulfillment shouldn't be blocked. Each destination has its own retry queue.
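
A rough in-memory sketch of the idea: each destination owns its queue, so a failing destination only backs up its own deliveries. The class, the backoff schedule (5s, 10s, 20s, capped at one hour), and the five-attempt limit are illustrative assumptions. A real system would persist this state in a durable queue.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable


def backoff_seconds(attempt: int, base: float = 5.0, cap: float = 3600.0) -> float:
    """Exponential backoff after the given attempt number (1-based)."""
    return min(cap, base * 2 ** (attempt - 1))


@dataclass
class DestinationQueue:
    name: str
    send: Callable[[dict], bool]  # returns True when delivery succeeded
    max_attempts: int = 5
    pending: deque = field(default_factory=deque)  # (event, attempt, not_before)
    dead_letter: list = field(default_factory=list)

    def enqueue(self, event: dict) -> None:
        self.pending.append((event, 1, 0.0))

    def drain(self, now: float) -> None:
        """Try each due delivery once; reschedule or dead-letter the failures."""
        for _ in range(len(self.pending)):
            event, attempt, not_before = self.pending.popleft()
            if not_before > now:
                # Not due yet: put it back untouched.
                self.pending.append((event, attempt, not_before))
                continue
            if self.send(event):
                continue
            if attempt >= self.max_attempts:
                self.dead_letter.append(event)
            else:
                retry_at = now + backoff_seconds(attempt)
                self.pending.append((event, attempt + 1, retry_at))
```

Because `drain` runs per destination, an email outage fills only the email queue while fulfillment keeps delivering.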

3. Event replay

When you deploy a new service, you need to replay 30 days of events. When a bug causes incorrect processing, you need to re-process specific events.
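
Replay boils down to selecting stored events by time window and type, then re-delivering them in order. A minimal sketch, assuming you keep an event store with an ID, a type, and a receive timestamp per event (the `StoredEvent` shape and `select_for_replay` name are hypothetical):

```python
from dataclasses import dataclass
from datetime import datetime
from fnmatch import fnmatchcase
from typing import Iterable


@dataclass(frozen=True)
class StoredEvent:
    id: str
    type: str
    received_at: datetime


def select_for_replay(events: Iterable[StoredEvent], pattern: str,
                      since: datetime) -> list[StoredEvent]:
    """Return events matching `pattern` received at or after `since`, oldest first."""
    matching = [
        e for e in events
        if e.received_at >= since and fnmatchcase(e.type, pattern)
    ]
    return sorted(matching, key=lambda e: e.received_at)
```

Replaying oldest-first keeps state transitions in their original order. Replayed events will reach handlers a second time, so those handlers need to be idempotent.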

4. Delivery observability

"Did webhook X reach service Y?" should be answerable in 30 seconds with a timestamp and HTTP response code.

Provider-Specific Considerations

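At this stage, you're integrating with multiple providers. Each has its own signature format and its own event naming. The events you'll typically need from each:

Provider | Events typically needed
---------|------------------------------------------------------
Stripe   | payment_intent.*, customer.subscription.*, invoice.*
GitHub   | push, pull_request, deployment
Shopify  | orders/*, products/*, fulfillments/*
Twilio   | message-status.*, call.*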

Using GetHook abstracts the signature format differences — you configure the verification preset per source, and your handlers receive pre-verified events with a consistent format.
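
To see why the differences matter, here is a sketch of what a per-source verification preset does, written with the standard library. It covers two documented schemes: GitHub sends `X-Hub-Signature-256` as `sha256=` plus a hex HMAC-SHA256 of the raw body, and Shopify sends `X-Shopify-Hmac-Sha256` as a base64 HMAC-SHA256 of the raw body. The `VERIFIERS` table and the `verify` dispatcher are illustrative assumptions, not GetHook's API.

```python
import base64
import hashlib
import hmac


def verify_github(body: bytes, header: str, secret: bytes) -> bool:
    """GitHub: header is 'sha256=' followed by the hex HMAC-SHA256 of the body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode(), header.encode())


def verify_shopify(body: bytes, header: str, secret: bytes) -> bool:
    """Shopify: header is the base64-encoded HMAC-SHA256 of the body."""
    expected = base64.b64encode(hmac.new(secret, body, hashlib.sha256).digest())
    return hmac.compare_digest(expected, header.encode())


# One verifier per source; unknown sources are rejected rather than trusted.
VERIFIERS = {"github": verify_github, "shopify": verify_shopify}


def verify(source: str, body: bytes, header: str, secret: bytes) -> bool:
    verifier = VERIFIERS.get(source)
    return verifier is not None and verifier(body, header, secret)
```

Same hash function, different encodings and prefixes. This is the kind of detail that causes bugs when every handler reimplements it.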

Cost Benchmark (Stage 2)

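Approach            | Monthly Cost                                  | Engineering Time
--------------------|-----------------------------------------------|---------------------------
Build in-house      | $300–$600/month infra + 40h/month maintenance | 8–12h/month ongoing
GetHook Growth plan | $49/month                                     | ~2h integration, ~0 ongoing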

Stage 3: Scale (500K–10M events/month)

You've raised a Series A or B. Your engineering team is 10–30 people. Webhooks are now serious infrastructure: downtime has a direct revenue impact measured in thousands of dollars per hour.

What Changes at Scale

Multi-tenancy becomes critical. You're now both a consumer (receiving from providers) and a producer (delivering to your customers' endpoints). Your platform needs per-customer:

  • Separate signing secrets
  • Independent retry and dead-letter
  • Delivery logs visible to customers via your own dashboard
  • Custom domains for white-labeled delivery

Compliance and audit requirements arrive. SOC 2, PCI-DSS, and enterprise customer security reviews start asking about event audit trails, data retention policies, and encryption at rest.

Outbound webhooks become a product feature. Your largest customers want to configure webhooks from your platform to their own systems. This is 3–6 months of engineering work if built in-house.

The Scale Architecture

                                   ┌──────────────────────────────┐
External Providers                 │ GetHook                      │
Stripe, GitHub, ──────────────────→│ Ingest → Queue → Fan-out     │
Shopify, etc.                      │                              │
                                   │ Per-source HMAC verification │
                                   │ Per-destination retry        │
Your Platform                      │ Per-tenant isolation         │
(your app) ───────────────────────→│ Outbound delivery            │
                                   │                              │
                                   └──────────────────┬───────────┘
                                                      │
                                   ┌──────────────────┼───────────┐
                                   │ Your customers   │           │
                                   │ Customer A ──────┘           │
                                   │ Customer B ──────────────────│
                                   │ Customer C ──────────────────│
                                   └──────────────────────────────┘

White-Labeling for Outbound

When your customers configure webhook endpoints in your product, they receive signed events from a domain like webhooks.yourapp.com (not webhooks.gethook.to). GetHook's custom domain support makes this transparent.

Each customer has:

  • A unique signing secret (used to verify events you send them)
  • A custom domain for the webhook portal
  • Independent delivery logs and retry controls
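
As a sketch of what per-customer signing involves on the sending side: each customer gets a randomly generated secret, and every outbound payload is signed with that customer's secret only. The `TenantSecrets` class, the in-memory dictionary, and the `<timestamp>.<body>` signing format are assumptions for illustration. A real store would encrypt secrets at rest.

```python
import hashlib
import hmac
import secrets


class TenantSecrets:
    """Hypothetical in-memory store of one signing secret per customer."""

    def __init__(self) -> None:
        self._secrets: dict[str, bytes] = {}

    def create(self, customer_id: str) -> bytes:
        """Generate and remember a fresh 256-bit secret for this customer."""
        secret = secrets.token_bytes(32)
        self._secrets[customer_id] = secret
        return secret

    def sign(self, customer_id: str, timestamp: int, body: bytes) -> str:
        """Hex HMAC-SHA256 over '<timestamp>.<body>' using the customer's secret."""
        secret = self._secrets[customer_id]
        signed_payload = f"{timestamp}.".encode() + body
        return hmac.new(secret, signed_payload, hashlib.sha256).hexdigest()
```

The customer recomputes the same HMAC with their copy of the secret. Because each secret is independent, rotating or revoking one customer's secret never touches another's.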

Compliance at Scale

Requirement             | GetHook Feature
------------------------|----------------------------------------------------------------
Data at rest encryption | AES-256-GCM for secrets, Postgres-level encryption for payloads
API key audit trail     | Key prefix + creation time logged, full keys never stored
Data retention controls | Configurable retention period, automatic cleanup
Tenant data isolation   | account_id filtering enforced at all queries
Immutable delivery logs | delivery_attempts table is append-only

Choosing Between Build vs. Buy at Each Stage

Stage        | Events/month | Recommendation           | Reason
-------------|--------------|--------------------------|--------------------------------------
MVP          | < 10K        | Build basic              | Too early to invest
Early growth | 10K–100K     | Use GetHook              | Provider complexity, fan-out needs
Growth       | 100K–1M      | Use GetHook              | Reliability SLA, multi-tenant needs
Scale        | 1M–10M       | Use GetHook (enterprise) | Compliance, white-labeling, outbound
Hyper-scale  | > 10M        | Evaluate options         | May need custom infrastructure

Common Mistakes Startup Founders Make

1. Building retry before building idempotency

Retry without idempotency = duplicate charges. Always implement idempotency first.
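
The core of idempotency is remembering which event IDs you've already processed. A minimal in-memory sketch (the `IdempotentHandler` wrapper is hypothetical). In production the "seen" set would typically be a database table with a unique constraint, written in the same transaction as the side effect.

```python
from typing import Callable


class IdempotentHandler:
    """Wrap a handler so each event ID is processed at most once."""

    def __init__(self, handler: Callable[[dict], None]) -> None:
        self._handler = handler
        self._seen: set[str] = set()

    def handle(self, event: dict) -> bool:
        """Run the handler for new events; return False for duplicates."""
        event_id = event["id"]
        if event_id in self._seen:
            return False
        self._handler(event)
        # Mark as done only after success, so a failed attempt can be retried.
        self._seen.add(event_id)
        return True
```

With this in place, a retried delivery of the same event becomes a no-op instead of a second charge.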

2. Using the same webhook secret for all customers

A leaked secret from one customer compromises all of them. Per-customer secrets are non-negotiable at Stage 3+.

3. Not monitoring the dead-letter queue

Dead-letter events accumulate silently. Alert when DLQ grows, and review them weekly.
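
The alert itself can be very simple. A sketch of a check you might run on a schedule, comparing the current DLQ depth against the previous reading (the function name and the default limit of 100 are arbitrary assumptions):

```python
from typing import Optional


def dlq_alert(previous_depth: int, current_depth: int,
              max_depth: int = 100) -> Optional[str]:
    """Return an alert message if the DLQ grew or exceeds `max_depth`, else None."""
    if current_depth > max_depth:
        return f"DLQ depth {current_depth} exceeds limit {max_depth}"
    if current_depth > previous_depth:
        return f"DLQ grew from {previous_depth} to {current_depth}"
    return None
```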

4. Logging raw webhook bodies

Webhook bodies often contain PII and sensitive data. Log event IDs and types, not bodies.
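
One way to enforce this is to extract the loggable fields up front and never pass the body to the logger. A small sketch, assuming JSON payloads with top-level `id` and `type` fields (the function names are hypothetical):

```python
import json
import logging

logger = logging.getLogger("webhooks")


def log_fields(raw_body: bytes) -> dict:
    """Extract only non-sensitive identifiers from a webhook body."""
    event = json.loads(raw_body)
    return {
        "event_id": event.get("id"),
        "event_type": event.get("type"),
        "size_bytes": len(raw_body),
    }


def log_received(raw_body: bytes) -> None:
    """Log that an event arrived without recording its contents."""
    fields = log_fields(raw_body)
    logger.info("webhook received id=%s type=%s bytes=%d",
                fields["event_id"], fields["event_type"], fields["size_bytes"])
```

The event ID is enough to look up the full payload in the provider's dashboard or your event store when you need it.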

5. Treating webhook infrastructure as a "later" problem

Retrofitting reliability after events have already been lost usually costs more than adding it as volume starts to grow. If you're at 10K+ events/month and still using a simple HTTP handler with no retry, upgrade now.


Quick-Start Checklist

MVP → Growth transition:

  • Set up GetHook account (10 minutes)
  • Configure one source per provider (Stripe, GitHub, etc.)
  • Set up destinations for each internal service
  • Create routes with event type patterns
  • Test delivery end-to-end
  • Set up dead-letter queue alerting

Growth → Scale transition:

  • Enable per-customer signing secrets
  • Configure custom domain for outbound delivery
  • Set up brand settings for white-labeled portal
  • Review data retention policies
  • Enable delivery logs for customer-facing observability
  • Test replay from dead-letter queue

Conclusion

Webhook infrastructure isn't glamorous, but it's load-bearing. Get the foundation right at the MVP stage (verify signatures, don't lose events), invest in reliability at the growth stage (retry, fan-out, observability), and build for multi-tenancy at the scale stage (per-customer secrets, white-labeling, compliance).

GetHook is designed to grow with you through all three stages without changing your integration code.
