Ever had your API get hammered by a bot or someone trying to spam your endpoints? Yeah, we've all been there. Let's fix that with some solid rate limiting for your oRPC APIs.
This isn't just another tutorial - we're building the same rate limiting system that protects Zero Locker in production. No fluff, just the good stuff.
🎯 What We're Building
By the end of this, you'll have:
- ✅ A multi-tier rate limiting system that actually works
- ✅ oRPC middleware that's a breeze to use
- ✅ Smart IP detection (because proxies are everywhere)
- ✅ Interactive testing so you can see it in action
🧪 See It In Action First
Before we dive into the code, let's see what we're building! Try clicking these buttons rapidly - you'll hit the rate limits and see exactly how the system protects your APIs.
- Strict Rate Limiting (5 requests/minute) - perfect for email endpoints; very restrictive to prevent spam
- Moderate Rate Limiting (30 requests/minute) - good for general API calls; balanced protection
Pretty cool, right? Now let's build this thing step by step.
🚀 Step 1: The Core Rate Limiting Engine
First things first - we need the brain of our rate limiting system. This implements a fixed-window counter (each client gets a fresh allowance every window) and keeps track of who's been naughty.
Let's start with the basic types (all of this lives in lib/utils/rate-limit.ts):
```typescript
export interface RateLimitConfig {
  maxRequests: number
  windowSeconds: number
  identifier?: string
}

export interface RateLimitResult {
  allowed: boolean
  remaining: number
  limit: number
  resetAt: number
  retryAfter?: number
}
```

Now let's add a simple cache to store our rate limit data:
```typescript
interface RateLimitEntry {
  count: number
  resetAt: number
}

class RateLimitCache {
  private cache = new Map<string, RateLimitEntry>()
  private cleanupInterval: NodeJS.Timeout | null = null

  constructor() {
    // Sweep expired entries every minute so the map doesn't grow forever
    this.cleanupInterval = setInterval(() => this.cleanup(), 60000)
  }

  private cleanup() {
    const now = Math.floor(Date.now() / 1000)
    for (const [key, entry] of this.cache.entries()) {
      if (entry.resetAt < now) this.cache.delete(key)
    }
  }

  get(key: string) {
    const entry = this.cache.get(key)
    // Treat expired entries as misses even if the sweep hasn't run yet
    if (entry && entry.resetAt < Math.floor(Date.now() / 1000)) {
      this.cache.delete(key)
      return undefined
    }
    return entry
  }

  set(key: string, entry: RateLimitEntry) {
    this.cache.set(key, entry)
  }
}

const rateLimitCache = new RateLimitCache()
```

Add a helper function to generate cache keys:
```typescript
function generateKey(ip: string, identifier?: string) {
  return identifier ? `ratelimit:${identifier}:${ip}` : `ratelimit:${ip}`
}
```

Now the main rate limiting logic:
```typescript
export async function checkRateLimit(
  ip: string,
  config: RateLimitConfig
): Promise<RateLimitResult> {
  const { maxRequests, windowSeconds, identifier } = config
  const key = generateKey(ip, identifier)
  const now = Math.floor(Date.now() / 1000)

  let entry = rateLimitCache.get(key)

  // First request we've seen from this key: start a fresh window
  if (!entry) {
    entry = { count: 1, resetAt: now + windowSeconds }
    rateLimitCache.set(key, entry)
    return {
      allowed: true,
      remaining: maxRequests - 1,
      limit: maxRequests,
      resetAt: entry.resetAt,
    }
  }

  // Window expired: reset the counter
  if (entry.resetAt < now) {
    entry = { count: 1, resetAt: now + windowSeconds }
    rateLimitCache.set(key, entry)
    return {
      allowed: true,
      remaining: maxRequests - 1,
      limit: maxRequests,
      resetAt: entry.resetAt,
    }
  }

  entry.count++
  rateLimitCache.set(key, entry)

  if (entry.count > maxRequests) {
    return {
      allowed: false,
      remaining: 0,
      limit: maxRequests,
      resetAt: entry.resetAt,
      retryAfter: entry.resetAt - now,
    }
  }

  return {
    allowed: true,
    remaining: maxRequests - entry.count,
    limit: maxRequests,
    resetAt: entry.resetAt,
  }
}
```

Finally, add some convenient presets:
```typescript
export const RATE_LIMIT_PRESETS = {
  STRICT: { maxRequests: 5, windowSeconds: 60 },
  MODERATE: { maxRequests: 30, windowSeconds: 60 },
  LENIENT: { maxRequests: 100, windowSeconds: 60 },
  VERY_LENIENT: { maxRequests: 300, windowSeconds: 60 },
} as const
```
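Before wiring this into oRPC, you can sanity-check the engine on its own. Here's a throwaway script (hypothetical - not part of the app) that fires seven requests from one fake IP against the STRICT preset; requests six and seven should get blocked:

```typescript
import { checkRateLimit, RATE_LIMIT_PRESETS } from "@/lib/utils/rate-limit"

async function main() {
  for (let i = 1; i <= 7; i++) {
    const result = await checkRateLimit("127.0.0.1", {
      ...RATE_LIMIT_PRESETS.STRICT, // 5 requests per 60 seconds
      identifier: "sanity-check",
    })
    console.log(
      `request ${i}:`,
      result.allowed
        ? `allowed (${result.remaining} remaining)`
        : `blocked, retry in ${result.retryAfter}s`
    )
  }
}

main()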
🔧 Step 2: The Middleware Magic
Now let's create the middleware that makes this all work seamlessly with oRPC. This is where the magic happens - it intercepts requests and checks if they're allowed through.
Create middleware/rate-limit.ts:
```typescript
import { ORPCError } from "@orpc/server"
import type { MiddlewareNextFn } from "@orpc/server"
import { checkRateLimit, RATE_LIMIT_PRESETS, type RateLimitConfig } from "@/lib/utils/rate-limit"
import type { PublicContext } from "@/orpc/types"

export const rateLimitMiddleware = (config: RateLimitConfig) => {
  return async ({
    context,
    next,
  }: {
    context: PublicContext
    next: MiddlewareNextFn<unknown>
  }) => {
    const result = await checkRateLimit(context.ip, config)

    if (!result.allowed) {
      throw new ORPCError("TOO_MANY_REQUESTS", {
        message: "Rate limit exceeded. Please try again later.",
        data: {
          retryAfter: result.retryAfter,
          limit: result.limit,
          resetAt: result.resetAt,
        },
      })
    }

    // Thread the rate limit info through so handlers can expose it if they want
    return next({
      context: {
        ...context,
        rateLimit: {
          remaining: result.remaining,
          limit: result.limit,
          resetAt: result.resetAt,
        },
      },
    })
  }
}

export const strictRateLimit = () =>
  rateLimitMiddleware({ ...RATE_LIMIT_PRESETS.STRICT, identifier: "strict" })
export const moderateRateLimit = () =>
  rateLimitMiddleware({ ...RATE_LIMIT_PRESETS.MODERATE, identifier: "moderate" })
export const lenientRateLimit = () =>
  rateLimitMiddleware({ ...RATE_LIMIT_PRESETS.LENIENT, identifier: "lenient" })
export const veryLenientRateLimit = () =>
  rateLimitMiddleware({ ...RATE_LIMIT_PRESETS.VERY_LENIENT, identifier: "very-lenient" })
```
🌐 Step 3: Smart IP Detection (Because Proxies Are Tricky)
Here's the thing - getting the real client IP is harder than it should be. Between Vercel, Cloudflare, and other proxies, we need to be smart about this.
Update orpc/context.ts:
```typescript
import { headers } from "next/headers"
// `auth` is assumed to be your existing auth instance (e.g. a better-auth setup)

function getClientIp(headersList: Headers): string {
  const forwardedFor = headersList.get("x-forwarded-for")
  const vercelIp = headersList.get("x-vercel-forwarded-for")
  const cfConnectingIp = headersList.get("cf-connecting-ip")
  const realIp = headersList.get("x-real-ip")

  // x-forwarded-for can be a comma-separated chain; the first entry is the client
  if (forwardedFor) {
    const ips = forwardedFor.split(",").map((ip) => ip.trim())
    if (ips[0]) return ips[0]
  }

  if (vercelIp) {
    const ips = vercelIp.split(",").map((ip) => ip.trim())
    if (ips[0]) return ips[0]
  }

  if (cfConnectingIp) return cfConnectingIp
  if (realIp) return realIp

  return "UNKNOWN-IP"
}

export async function createContext(): Promise<ORPCContext> {
  try {
    const headersList = await headers()
    const ip = getClientIp(headersList)
    const authResult = await auth.api.getSession({ headers: headersList })

    return {
      session: authResult?.session || null,
      user: authResult?.user || null,
      ip,
    }
  } catch {
    // Never let context creation crash a request - fall back to an anonymous context
    return { session: null, user: null, ip: "UNKNOWN-IP" }
  }
}
```

Update orpc/types.ts:
```typescript
export interface ORPCContext {
  session: Session | null
  user: User | null
  ip: string
}

export interface PublicContext {
  session: Session | null
  user: User | null
  ip: string
}

export interface RateLimitInfo {
  remaining: number
  limit: number
  resetAt: number
}
```
🛡️ Step 4: Protect Your Routes (The Fun Part)
Now comes the satisfying part - actually protecting your endpoints. We'll create different procedures for different levels of protection.
Update orpc/routers/user.ts:
```typescript
import { os } from "@orpc/server"
import { strictRateLimit, lenientRateLimit } from "@/middleware/rate-limit"
import type { ORPCContext } from "@/orpc/types"

const baseProcedure = os.$context<ORPCContext>()

const publicProcedure = baseProcedure.use(({ context, next }) =>
  lenientRateLimit()({ context, next })
)

const strictPublicProcedure = baseProcedure.use(({ context, next }) =>
  strictRateLimit()({ context, next })
)

// Apply to endpoints
export const joinWaitlist = strictPublicProcedure
  .input(waitlistInputSchema)
  .output(waitlistJoinOutputSchema)
  .handler(async ({ input }) => {
    // 5 requests/minute limit
  })

export const getUserCount = publicProcedure
  .input(emptyInputSchema)
  .output(userCountOutputSchema)
  .handler(async () => {
    // 100 requests/minute limit
  })
```
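Since the middleware passes a rateLimit object back through the context, handlers behind these procedures can surface quota info too. A tiny hypothetical endpoint (the name and the cast are made up for illustration - how the extra context field is typed depends on your middleware setup):

```typescript
import type { RateLimitInfo } from "@/orpc/types"

export const getQuota = publicProcedure.handler(async ({ context }) => {
  // rateLimitMiddleware attached this before the handler ran
  const info = (context as { rateLimit?: RateLimitInfo }).rateLimit
  return {
    remaining: info?.remaining ?? null,
    limit: info?.limit ?? null,
    resetAt: info?.resetAt ?? null,
  }
})
```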
🧪 Step 5: Let's Test This Thing
Time to see our rate limiting in action! We'll create some test endpoints and interactive components so you can actually see it working.
```typescript
export const testRateLimit = baseProcedure
  .input(z.object({
    endpoint: z.enum(["strict", "moderate"]),
    timestamp: z.string(),
  }))
  .output(z.object({
    success: z.boolean(),
    remaining: z.number(),
    limit: z.number(),
    resetAt: z.number(),
    endpoint: z.string(),
  }))
  .handler(async ({ input, context }) => {
    const config = input.endpoint === "strict"
      ? { maxRequests: 5, windowSeconds: 60, identifier: "test-strict" }
      : { maxRequests: 30, windowSeconds: 60, identifier: "test-moderate" }

    const result = await checkRateLimit(context.ip, config)

    if (!result.allowed) {
      throw new ORPCError("TOO_MANY_REQUESTS", {
        message: "Rate limit exceeded. Please try again later.",
        data: {
          retryAfter: result.retryAfter,
          limit: result.limit,
          resetAt: result.resetAt,
        },
      })
    }

    return {
      success: true,
      remaining: result.remaining,
      limit: result.limit,
      resetAt: result.resetAt,
      endpoint: input.endpoint,
    }
  })
```
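For completeness, here's roughly how the demo buttons at the top could call this from the client. This is a sketch, not the actual demo code - it assumes a standard oRPC fetch client, an /rpc route, and that testRateLimit sits at the top level of your router:

```typescript
import { createORPCClient } from "@orpc/client"
import { RPCLink } from "@orpc/client/fetch"
import type { RouterClient } from "@orpc/server"
import type { router } from "@/orpc/router" // hypothetical path to your app router

const link = new RPCLink({ url: "http://localhost:3000/rpc" })
const client: RouterClient<typeof router> = createORPCClient(link)

async function hammer() {
  // Requests 6 and 7 within the same minute should get rejected
  for (let i = 1; i <= 7; i++) {
    try {
      const result = await client.testRateLimit({
        endpoint: "strict",
        timestamp: new Date().toISOString(),
      })
      console.log(`request ${i}: ok, ${result.remaining} remaining`)
    } catch (error) {
      console.log(`request ${i}: rate limited`, error)
    }
  }
}

hammer()
```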
🚀 Step 6: Deploy and Scale
Vercel Ready (Zero Config!)
The best part? This works on Vercel with zero configuration - we're already reading Vercel's forwarding headers for IP detection, so it just works. One caveat worth knowing: the in-memory cache lives inside each server instance, so on serverless platforms every instance counts separately. That's fine at small scale, and the Redis option below covers the rest.
When to Scale Up
Current setup is perfect for:
- ✅ MVP and early-stage apps
- ✅ Under 1000 daily users
- ✅ Single-region deployments
Need Redis when you hit (see the sketch after this list):
- 🔥 1000+ concurrent users
- 🌍 Multi-region deployments
- 📈 High-traffic production apps
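When you get there, the same fixed-window logic ports over almost unchanged. Here's a minimal sketch assuming ioredis - checkRateLimitRedis is a hypothetical drop-in for checkRateLimit with the same signature, except the counter lives in Redis so every instance and region shares it:

```typescript
import Redis from "ioredis"
import type { RateLimitConfig, RateLimitResult } from "@/lib/utils/rate-limit"

const redis = new Redis(process.env.REDIS_URL!)

export async function checkRateLimitRedis(
  ip: string,
  config: RateLimitConfig
): Promise<RateLimitResult> {
  const { maxRequests, windowSeconds, identifier } = config
  const key = identifier ? `ratelimit:${identifier}:${ip}` : `ratelimit:${ip}`
  const now = Math.floor(Date.now() / 1000)

  // INCR is atomic, so concurrent requests across instances can't double-count
  const count = await redis.incr(key)
  if (count === 1) {
    // First hit in this window: start the clock
    await redis.expire(key, windowSeconds)
  }

  const ttl = await redis.ttl(key)
  const resetAt = now + (ttl > 0 ? ttl : windowSeconds)

  if (count > maxRequests) {
    return { allowed: false, remaining: 0, limit: maxRequests, resetAt, retryAfter: resetAt - now }
  }
  return { allowed: true, remaining: maxRequests - count, limit: maxRequests, resetAt }
}
```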
🎯 Pro Tips
Pick the right limits:
- Email stuff: `strictRateLimit()` (5/min) - nobody likes spam
- User signups: `moderateRateLimit()` (30/min) - reasonable for humans
- Public data: `lenientRateLimit()` (100/min) - let the good traffic flow
Start conservative, adjust later:
- Begin with tight limits
- Watch your logs for violations (see the logging sketch below)
- Loosen up as you learn your traffic patterns
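A cheap way to watch for violations is to log every rejection inside the middleware before throwing, then grep for them later. A sketch of what the rejection branch from Step 2 could look like with logging added:

```typescript
if (!result.allowed) {
  // Log who got blocked, by which limiter, and when they can retry -
  // exactly the data you need to decide whether a limit is too tight
  console.warn(
    `[rate-limit] blocked ip=${context.ip} ` +
      `identifier=${config.identifier ?? "default"} ` +
      `retryAfter=${result.retryAfter}s`
  )
  throw new ORPCError("TOO_MANY_REQUESTS", {
    message: "Rate limit exceeded. Please try again later.",
    data: { retryAfter: result.retryAfter, limit: result.limit, resetAt: result.resetAt },
  })
}
```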
Error messages matter:
```typescript
throw new ORPCError("TOO_MANY_REQUESTS", {
  message: "Rate limit exceeded. Please try again later.",
  data: { retryAfter: result.retryAfter, limit: result.limit },
})
```
🎉 You're Done!
That's it! You now have bulletproof rate limiting for your oRPC APIs. No more worrying about bots hammering your endpoints or someone trying to crash your server.
Want to see this in action? Try Zero Locker - we're using this exact system in production. Or check out our open source implementation to adapt it for your own projects.
This is the real deal - the same rate limiting system protecting Zero Locker in production. No toy examples, just battle-tested code that actually works.