🦥Rate Limiting For Dummies

The Coding Sloth
November 26, 2024

Hello friends!

Welcome to this week’s Sloth Bytes. I hope you had a great week.

Stay up-to-date with AI

The Rundown is the most trusted AI newsletter in the world, with 1,000,000+ readers and exclusive interviews with AI leaders like Mark Zuckerberg, Demis Hassabis, Mustafa Suleyman, and more.

Their expert research team spends all day learning what’s new in AI and talking with industry experts, then distills the most important developments into one free email every morning.

Plus, complete the quiz after signing up and they’ll recommend the best AI tools, guides, and courses – tailored to your needs.

Two-fingered and three-fingered sloths are not closely related.

Although two-fingered and three-fingered sloths look and act similar, they are not closely related to each other. Instead, they are examples of convergent evolution, which is when two different animals evolve similar traits to adapt to the same niche in an ecosystem.

Rate Limiting

Ever wondered why you get those "Too Many Requests" (HTTP 429) errors, or wanted to stop people from spamming your APIs and servers? This is where rate limiting comes in.

What is Rate Limiting?

Rate limiting controls how many requests a user/client can make to your API within a specific time window. Think of it like a nightclub:

The door is your API endpoint
The bouncer is your rate limiter
The guest list is your authenticated users
The maximum capacity is your rate limit

Why You Need Rate Limiting

Prevent Abuse: Stop malicious users from overwhelming your system
Resource Management: Ensure fair distribution of server resources
Cost Control: Especially important when using paid services
Service Reliability: Maintain performance during traffic spikes

Common Rate Limiting Algorithms

1. Fixed Window (I’ve actually had to implement this in an interview)

from datetime import datetime
from collections import defaultdict

class FixedWindowRateLimiter:
    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # user_id -> [window index, requests counted in that window]
        self.windows = defaultdict(lambda: [None, 0])

    def is_allowed(self, user_id):
        now = datetime.now().timestamp()
        # Every timestamp falls into exactly one fixed window
        current_window = int(now // self.window_seconds)
        state = self.windows[user_id]

        # A new window started: reset the counter
        if state[0] != current_window:
            state[0] = current_window
            state[1] = 0

        # Check if under limit
        if state[1] < self.max_requests:
            state[1] += 1
            return True

        return False
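
Fixed windows have one well-known quirk: the counter resets at the window boundary, so a client can fit close to double the limit into a short stretch of time around that boundary. Here's a minimal sketch of that, assuming a hypothetical `allow` helper that takes the timestamp as an argument (so no waiting around), plus made-up values for the limit and window size:

```python
from collections import defaultdict

MAX_REQUESTS = 3      # assumed limit for this demo
WINDOW_SECONDS = 60   # assumed window size for this demo

# (user_id, window index) -> number of requests seen in that window
counts = defaultdict(int)

def allow(user_id, timestamp):
    """Hypothetical helper: fixed-window check with an explicit timestamp."""
    window = int(timestamp // WINDOW_SECONDS)
    if counts[(user_id, window)] < MAX_REQUESTS:
        counts[(user_id, window)] += 1
        return True
    return False

# Three requests at the very end of window 0...
late = [allow("sloth", t) for t in (57, 58, 59)]
# ...a fourth one in the same window gets rejected...
blocked = allow("sloth", 59.5)
# ...but three more go through as soon as window 1 starts
early = [allow("sloth", t) for t in (60, 61, 62)]
```

All six requests in `late` and `early` are allowed even though they land within about five seconds of each other. If that kind of burst is a problem for you, that's usually the reason to reach for a different algorithm.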

2. Token Bucket

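from datetime import datetime

class TokenBucket:
    def __init__(self, capacity, refill_rate):
        self.capacity = capacity          # Maximum tokens the bucket can hold
        self.tokens = capacity            # Start with a full bucket
        self.refill_rate = refill_rate    # Tokens added per second
        self.last_refill = datetime.now()

    def get_token(self):
        now = datetime.now()
        # Refill tokens based on elapsed time. total_seconds() keeps the
        # fractional part, so frequent calls still accumulate tokens
        elapsed = (now - self.last_refill).total_seconds()
        self.tokens = min(
            self.capacity,
            self.tokens + elapsed * self.refill_rate
        )
        self.last_refill = now

        # Spend one token per request if one is available
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False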
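
To see how a token bucket behaves, here's a small sketch. `DemoBucket` is a hypothetical variant that takes the current time as an argument instead of reading the clock, and the capacity and refill rate are just assumed demo values:

```python
class DemoBucket:
    """Hypothetical token bucket that receives the time as an argument."""

    def __init__(self, capacity, refill_rate, start=0.0):
        self.capacity = capacity
        self.tokens = float(capacity)   # start with a full bucket
        self.refill_rate = refill_rate  # tokens added per second
        self.last_refill = start

    def get_token(self, now):
        # Add the tokens earned since the last call, capped at capacity
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = DemoBucket(capacity=3, refill_rate=0.5)  # one new token every 2 seconds

# A burst of 4 requests at t=0: the first 3 drain the bucket, the 4th is rejected
burst = [bucket.get_token(now=0.0) for _ in range(4)]
# One second later only half a token has been earned back
too_soon = bucket.get_token(now=1.0)
# After two seconds a full token is available again
refilled = bucket.get_token(now=2.0)
```

So the capacity decides how big a burst you tolerate, and the refill rate decides the long-term average.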

Real-World Implementation

Real-world rate limiting is much easier (because someone already did it for us).

Here's how to implement rate limiting in Express.js using the popular express-rate-limit package:

const express = require('express');
const rateLimit = require('express-rate-limit');

const app = express();

const limiter = rateLimit({
    windowMs: 15 * 60 * 1000, // 15 minutes
    max: 100, // Limit each IP to 100 requests per window
    message: 'Too many requests, please try again later.'
});

// Apply to all routes
app.use(limiter);

// Or apply to specific routes only (use this instead of the line above)
// app.use('/api/', limiter);

Best Practices

Clear communication

// Include rate limit info in response headers
'X-RateLimit-Limit': '100',
'X-RateLimit-Remaining': '75',
'X-RateLimit-Reset': '1640995200'
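
These `X-RateLimit-*` names are a common convention rather than a formal standard, so check what your clients expect. As a rough sketch of where the values come from, here's a hypothetical `rate_limit_headers` helper, assuming you track the limit, the number of requests used, and when the current window started (as a Unix timestamp):

```python
import math

def rate_limit_headers(limit, used, window_start, window_seconds):
    """Hypothetical helper: build X-RateLimit-* headers for a response."""
    remaining = max(0, limit - used)                      # never report a negative number
    reset_at = math.ceil(window_start + window_seconds)   # Unix timestamp of the reset
    return {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(remaining),
        "X-RateLimit-Reset": str(reset_at),
    }

# 25 of 100 requests used in a 15 minute window that started at 1640994300
headers = rate_limit_headers(
    limit=100, used=25, window_start=1640994300, window_seconds=900
)
```

With those inputs, `headers` ends up with the same three values as the example above.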

Multiple tiers (if you’re doing a SaaS)

const freeTierLimit = rateLimit({
    windowMs: 60 * 60 * 1000, // 1 hour
    max: 100 // normies get 100
});

const proTierLimit = rateLimit({
    windowMs: 60 * 60 * 1000,
    max: 1000 // pro people get 1000 
});
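
The part that's left is picking the right limit for each user. Here's a minimal sketch of that lookup, assuming each user is a dict with a `plan` field (the field name, the plan names, and the fallback to the free tier are all assumptions for illustration):

```python
# Assumed plan names and limits, mirroring the two tiers above
TIER_LIMITS = {
    "free": {"window_seconds": 60 * 60, "max_requests": 100},
    "pro": {"window_seconds": 60 * 60, "max_requests": 1000},
}

def limit_for(user):
    """Hypothetical lookup: unknown or missing plans fall back to the free tier."""
    plan = user.get("plan", "free")
    return TIER_LIMITS.get(plan, TIER_LIMITS["free"])

pro_limit = limit_for({"id": 1, "plan": "pro"})
anonymous_limit = limit_for({"id": 2})
```

Falling back to the strictest tier is a design choice: a typo in a plan name should never accidentally hand out the biggest limit.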

Error handling (still using express)

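const limiter = rateLimit({
    windowMs: 15 * 60 * 1000,
    max: 100,
    // express-rate-limit sends the 429 response itself instead of calling
    // next(err), so customize the response through the handler option
    handler: (req, res, next, options) => {
        res.status(options.statusCode).json({
            error: 'Too many requests',
            retryAfter: Math.ceil(options.windowMs / 1000) // seconds (upper bound)
        });
    }
});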

The lazy way to handle this stuff

If you don’t want to do all this, I understand. There are a lot of tools and services that handle this for us.

Redis-based rate limiting (https://upstash.com/)
Nginx rate limiting module (https://blog.nginx.org/blog/rate-limiting-nginx)
Cloud provider solutions (AWS WAF, Cloudflare)
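
If you're curious what the Redis-based approach looks like under the hood, it often boils down to a counter with an expiry. Here's a minimal sketch, assuming a client object with `incr(key)` and `expire(key, seconds)` methods like the ones redis-py's client provides. `FakeRedis`, the `is_allowed` helper, and the key naming scheme are hypothetical and only there so the example runs without a server:

```python
import time

class FakeRedis:
    """Hypothetical in-memory stand-in exposing only incr() and expire()."""

    def __init__(self):
        self.values = {}
        self.ttl = {}

    def incr(self, key):
        self.values[key] = self.values.get(key, 0) + 1
        return self.values[key]

    def expire(self, key, seconds):
        # Only records the TTL; a real server would delete the key later
        self.ttl[key] = seconds
        return True

def is_allowed(client, user_id, max_requests, window_seconds, now=None):
    now = time.time() if now is None else now
    window = int(now // window_seconds)
    key = f"ratelimit:{user_id}:{window}"  # assumed key naming scheme
    count = client.incr(key)               # a single atomic step on a real server
    if count == 1:
        # First hit in this window: let the key clean itself up afterwards
        client.expire(key, window_seconds)
    return count <= max_requests

client = FakeRedis()
# Limit of 2 requests per 60 seconds, three requests in the same window
results = [is_allowed(client, "sloth", 2, 60, now=120) for _ in range(3)]
```

Keeping the counter in a shared store is what lets several app servers enforce one limit together. One caveat with this sketch: the increment and the expiry are two separate calls, so a crash between them could leave a key without a TTL.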

There are a lot more, so feel free to do your own research.

Introducing the Model Context Protocol (4 minute read)

The Model Context Protocol is an open standard that enables developers to build secure, two-way connections between their data sources and AI-powered tools. Its aim is to help frontier models produce better, more relevant responses.

React Router v7 (3 minute read)

React Router v7 brings all the great things you love from Remix back to React Router

Ai2 OpenScholar: Scientific literature synthesis with retrieval-augmented language models (10 minute read)

Ai2’s & UW’s OpenScholar, a retrieval-augmented LM, helps scientists navigate and synthesize scientific literature.

Tailwind CSS v4.0 Beta 1 (2 minute read)

After a long alpha period, we're excited to release the first public beta of Tailwind CSS v4.0.

Try the internet’s easiest file API

Easy file uploads and retrieval in minutes
No complex setup or infrastructure needed
Focus on building, not configurations.

*A message from our sponsor.

How to improve search without looking at queries or results (12 minute read)

How we improved Canva’s private design search while respecting the privacy of our community.

A Single Korean-Speaking Intern Saved Valve From Going Bankrupt (5 minute read)

Steam, Portal, Left 4 Dead, Team Fortress 2 – all of those wouldn't have been possible if it wasn't for a pure stroke of luck.

Thank you to everyone who submitted 😃

ddat828, clsmv, pyGuy152, agentNinjaK, JamesHarryT, dropbearII, ravener, mc-milo, codiling, and taypham88

Splitting Up Numbers

Create a function that takes a number num and returns each place value in the number.

Examples

num_split(39)
output = [30, 9]

num_split(-434)
output = [-400, -30, -4]

num_split(100)
output = [100, 0, 0]

How To Submit Answers

Reply with

A link to your solution (github, twitter, personal blog, portfolio, replit, etc)
or if you’re on the web version leave a comment!

Video should be coming out within the next 2 days

Hopefully…

That’s all from me!

Have a great week, be safe, make good choices, and have fun coding.

If I made a mistake or you have any questions, feel free to comment below or reply to the email!

See you all next week.
