From 0 to 10k Followers:
The Engineering Behind Scale
How we built a system that absorbs tens of thousands of webhook events per second without falling over.
PostEngageAI Team
Engineering Team
Feb 20, 2026
System Architecture: High Throughput
When we launched PostEngage.ai, we thought handling 100 requests per minute was a lot. Last week, we peaked at 12,000 requests per second during a major influencer campaign.
Scaling isn't just about adding more servers. It's about rethinking how data flows through your system. Here is the story of how our architecture evolved.
The Bottleneck: Database Locks
In V1, every incoming webhook from Instagram triggered a direct database write: Webhook -> API -> Postgres.
It worked fine until we onboarded our first customer with 500k followers. When they posted, thousands of comments flooded in within seconds. Our database CPU spiked to 100%, and connections timed out. We dropped events. It was a disaster.
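The failure mode is easy to reproduce. Here is a toy simulation of it (the pool size and burst size are illustrative, not our production numbers): a fixed connection pool with one blocking write per webhook, hit by a burst larger than the pool.

```typescript
// Sketch of the V1 failure mode: each webhook holds a Postgres
// connection until its INSERT commits. A comment flood drains a
// fixed-size pool in milliseconds, and everything after that is dropped.
class Pool {
  private available: number;
  constructor(size: number) {
    this.available = size;
  }
  // Returns false when no connection is free -- a timeout in practice.
  acquire(): boolean {
    if (this.available === 0) return false;
    this.available--;
    return true;
  }
  release(): void {
    this.available++;
  }
}

// 1,000 concurrent webhooks against a 20-connection pool,
// with no writes completing during the burst.
const pool = new Pool(20);
let dropped = 0;
for (let i = 0; i < 1000; i++) {
  if (!pool.acquire()) dropped++; // event lost
}
console.log(`dropped ${dropped} of 1000 events`); // dropped 980 of 1000 events
```

The exact numbers don't matter; the point is that under a spike, loss scales with burst size, not with anything you control.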
The Solution: Async Queues
We decoupled ingestion from processing. Now, the flow looks like this: Webhook -> API (enqueue only) -> Redis Queue -> Workers -> Postgres. The API acknowledges the webhook immediately; workers drain the queue and write to the database at a pace it can sustain.
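A minimal sketch of the decoupled flow. The queue here is an in-memory array so the example runs standalone; in production we enqueue to Upstash Redis via QStash, and the worker is a separate process. Names are illustrative.

```typescript
type WebhookEvent = { commentId: string; text: string };

// In-memory stand-in for the Redis queue (illustrative only).
const queue: WebhookEvent[] = [];

// Ingestion: no database work here. Push and acknowledge immediately,
// so the webhook endpoint stays fast no matter how big the burst is.
function ingest(event: WebhookEvent): { status: number } {
  queue.push(event);
  return { status: 202 }; // accepted, will be processed later
}

// Processing: a worker drains the queue in fixed-size batches,
// so Postgres sees steady load instead of spikes.
function drainBatch(batchSize: number): WebhookEvent[] {
  return queue.splice(0, batchSize);
}

// A burst of 5 events is absorbed instantly...
for (let i = 0; i < 5; i++) ingest({ commentId: `c${i}`, text: "🔥" });
// ...and written in controlled batches.
const batch = drainBatch(2);
console.log(batch.length, queue.length); // 2 3
```

The key property: ingestion latency is now independent of database load, and a slow worker means a longer queue, not dropped events.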
Distributed Rate Limiting
Sending replies is harder than receiving webhooks because Instagram has strict API limits. We implemented a "Token Bucket" algorithm using Redis Lua scripts.
Before any worker sends a reply, it must "buy" a token from the user's bucket in Redis. This operation is atomic. If the bucket is empty, the job is re-queued with a delay. This ensures we never exceed Instagram's rate limits, no matter how many parallel workers are running.
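The algorithm itself is compact. In production the whole thing runs as a single atomic Lua script inside Redis; the TypeScript version below shows the same logic in-process for readability. The capacity and refill rate are illustrative, not Instagram's actual limits.

```typescript
// Token bucket: a per-user bucket refills at a steady rate up to a
// fixed capacity. Each reply must "buy" one token; an empty bucket
// means the job is re-queued with a delay instead of sent.
interface Bucket {
  tokens: number;
  lastRefillMs: number;
}

const CAPACITY = 10;      // max burst size (illustrative)
const REFILL_PER_SEC = 2; // sustained rate (illustrative)

function tryTake(bucket: Bucket, nowMs: number): boolean {
  // Refill proportionally to elapsed time, capped at capacity.
  const elapsedSec = (nowMs - bucket.lastRefillMs) / 1000;
  bucket.tokens = Math.min(CAPACITY, bucket.tokens + elapsedSec * REFILL_PER_SEC);
  bucket.lastRefillMs = nowMs;
  if (bucket.tokens < 1) return false; // empty: caller re-queues with delay
  bucket.tokens -= 1;
  return true;
}

// 15 parallel workers all try to send at the same instant:
const bucket: Bucket = { tokens: CAPACITY, lastRefillMs: 0 };
let sent = 0;
for (let i = 0; i < 15; i++) {
  if (tryTake(bucket, 0)) sent++;
}
console.log(sent); // 10 -- the burst is capped at bucket capacity
```

Moving this logic into a Lua script is what makes it safe under concurrency: Redis executes the script as one atomic operation, so two workers can never both spend the last token.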
Our Current Stack
- Framework: Next.js 15 (App Router)
- Database: PostgreSQL (Neon Serverless)
- Queue: Upstash Redis + QStash
- Hosting: Vercel
- Monitoring: OpenTelemetry + Grafana
The beauty of this stack is that it scales to zero. If no one is posting, we pay almost nothing. When a viral post hits, we scale up instantly to handle the load.