Speed up your website 4x, reduce your cost 75% with HTTP caching

Hi! I’ve been using Django for 10 years now, on everything from dozens of personal projects to highly active websites serving millions of users. I’m using Django to showcase the idea, but this is really about the Web and HTTP, so there’s no reason it can’t be applied in {insert your favorite web framework}.

The simplicity and success of the HTTP protocol have always impressed me. It brings a lot of benefits, provided it’s used correctly. Today I’ll share a magic trick that speeds up your website tremendously while bringing down your costs. The more anonymous visitors you have, the more useful it is. Nowadays most funnels begin with a Google search, so I assume this applies to most of your projects. You can find the source code for both the project and the load-testing tool at the end of this post.

Used Stack:

  1. HTML
  2. Django
  3. Cloudflare
  4. Locust (for Loadtesting)

1. Trivial Scenario

The trivial scenario is when all your website content is static: there are no logins, no updates, and no real-time comments. Before we improve this, let’s run a load test with our Locust script to see what kind of performance we are getting.

To simulate this environment I’ve deployed a very basic Django application with only 1 view and 1 HTML template. I’ve added time.sleep(1) to simulate the calculations your server is probably doing. It’s deployed at https://httpcaching.eralpbayraktar.com, which you can query yourself.


<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Hello World Title</title>
</head>
<body>
  Hello World!
</body>
</html>


import time

from django.views.generic import TemplateView


class Homepage(TemplateView):
    template_name = 'index.html'

    def dispatch(self, request, *args, **kwargs):
        time.sleep(1)  # simulating your business logic
        return super().dispatch(request, *args, **kwargs)


from django.urls import path  # Homepage is the view defined above
urlpatterns = [
    path('', Homepage.as_view(), name='homepage'),
]

Here is how to run the loadtest and the loadtesting results:

pip3 install locust
locust -f locustfiledynamic.py

https://i.ibb.co/6gMsW8q/image.png Loadtesting results

I’ve deployed my Django application with 2 gunicorn workers, and each request takes 1 second, so it’s expected that we are capped at 2 RPS (requests per second). Since I’m sending more requests than the server can handle, the response time keeps growing until some users hit timeouts at various levels. This is very dangerous if your website gets featured somewhere and users flood in.
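The 2 RPS ceiling follows from simple arithmetic, which can be sketched as below. This is a minimal model, assuming each gunicorn worker handles one request at a time; the function names and the arrival rate of 10 requests per second are illustrative assumptions, not measurements from the load test.

```python
def max_rps(workers: int, seconds_per_request: float) -> float:
    """Upper bound on throughput when each worker handles one request at a time."""
    return workers / seconds_per_request


def backlog_after(seconds: float, arrival_rps: float, capacity_rps: float) -> float:
    """Requests still waiting after `seconds` when arrivals outpace capacity."""
    return max(0.0, (arrival_rps - capacity_rps) * seconds)


capacity = max_rps(workers=2, seconds_per_request=1.0)
print(capacity)  # 2.0

# Assumed load: 10 requests per second sustained for one minute.
queued = backlog_after(60, arrival_rps=10, capacity_rps=capacity)
print(queued)             # 480.0 requests waiting
print(queued / capacity)  # 240.0 seconds until the last of them is served
```

The backlog grows linearly for as long as the arrival rate stays above capacity, which is why the response time in the chart keeps climbing instead of levelling off.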


In this scenario all you have to do is add a Cloudflare Page Rule that matches all your URLs and sets the Cache Level to Cache Everything. This will cache all the GET requests to your domain.


Cache Level

This is very easy to test: run curl with the -I parameter, which shows the response headers.

curl -I https://reaktup.eralpbayraktar.com/
cf-cache-status: HIT
age: 2152

Here cf-cache-status tells us whether the cache is working. HIT means the cache was hit and CF (Cloudflare) is serving the request without your server even knowing about it, so it’ll be super fast. age is how many seconds ago the response was cached.

As I said, the cache only works for GET requests by design. If you try a POST request instead, you will see DYNAMIC, which means your server is handling the request and no cache is involved.

curl -I -X POST https://reaktup.eralpbayraktar.com/
cf-cache-status: DYNAMIC

There’s a third common value, MISS. It means CF knows it should cache the response, but it isn’t in the cache yet, so your server handles this request and the response is cached for subsequent requests.
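If you want to check these values from a script rather than by eye, a small helper along these lines could work. It is only a sketch: the function name and the wording of the descriptions are mine, and it covers just the three values discussed here, while Cloudflare can return others.

```python
# Meanings of the three cf-cache-status values discussed above.
STATUS_MEANINGS = {
    "HIT": "served from Cloudflare's cache; the origin server was not contacted",
    "MISS": "not in the cache yet; served by the origin and stored for later requests",
    "DYNAMIC": "not eligible for caching; served by the origin every time",
}


def describe_cache_status(headers: dict) -> str:
    """Explain the cf-cache-status header found in a dict of response headers."""
    # Header names are case-insensitive, so normalise them before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    status = normalised.get("cf-cache-status", "").upper()
    return STATUS_MEANINGS.get(status, "unknown or missing cf-cache-status")


print(describe_cache_status({"CF-Cache-Status": "HIT", "Age": "2152"}))
print(describe_cache_status({"cf-cache-status": "DYNAMIC"}))
```

The headers dict can come from any HTTP client, for example dict(response.headers) after an urllib.request.urlopen call.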

Anyway enough with the theory! Let’s run our loadtests again.

https://i.ibb.co/tP3dnww/image.png

I told you it was going to be fast: we see 2248 RPS. Compared to the 2 RPS we had, that’s a 1000x+ improvement. My machine couldn’t keep up, so I had to switch to FastHttpUser in the Locust code. Your local client is rarely the bottleneck in load testing, but as you can see, it can happen once the server side is fast enough. Remember that our Python code is still the same, so it still takes the server 1 second to compute the page; the beauty is that you don’t have to recompute the whole page for every single request.
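To make the mechanism concrete, here is a toy stand-in for a caching proxy sitting in front of a slow origin. It is an illustration under simplifying assumptions, not how Cloudflare is implemented: the class and function names are made up, the cache never expires, and the 1 second delay is scaled down to 10 ms so the example runs quickly.

```python
import time


class CachingProxy:
    """Toy caching proxy: stores GET responses by path, passes everything else through."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}
        self.origin_calls = 0

    def request(self, method: str, path: str) -> str:
        if method == "GET" and path in self.cache:
            return self.cache[path]  # HIT: the origin never sees this request
        self.origin_calls += 1
        body = self.origin(method, path)
        if method == "GET":
            self.cache[path] = body  # MISS: store the response for later requests
        return body  # non-GET requests always reach the origin (DYNAMIC)


def slow_origin(method: str, path: str) -> str:
    time.sleep(0.01)  # scaled-down stand-in for time.sleep(1)
    return f"<html>Hello World! ({path})</html>"


proxy = CachingProxy(slow_origin)
for _ in range(1000):
    proxy.request("GET", "/")
print(proxy.origin_calls)  # 1

proxy.request("POST", "/")
proxy.request("POST", "/")
print(proxy.origin_calls)  # 3
```

A thousand GET requests cost the origin a single computation, while every POST still goes through.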

Scaling is all about the right caching, and sometimes database indexes. - Eralp Bayraktar

Remember, we have not only improved your visitors’ experience but also reduced the load on your server, which means a huge cost saving. If your visitors mostly see the same thing, what’s the point of calculating everything from scratch when Cloudflare (or any other caching proxy in between) can serve it for free?

2. Non-trivial Scenario

The previous case was trivial because every user saw exactly the same content. What if you have localization, so that a user from the US sees Hello World while a user from Germany sees Hallo Welt? Or what if you want to display Welcome {first name}! in the top right corner when the user is logged in?

These are valid questions, and sometimes they’re not trivial to answer. It depends on how much personalization each request needs. Cloudflare has advanced tools to cache per country, bypass the cache if a user is logged in, and so on, so it’s definitely still doable. I won’t go into the details of those advanced features; instead I’ll show you how you can structure your application logic.

Even if it’s not trivial to plan this for logged-in users, everyone can go and turn this on for anonymous users in 5 minutes; there is no excuse. Here is a blog post by Cloudflare specifically about anonymous visits: https://blog.cloudflare.com/caching-anonymous-page-views/
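The rule behind "cache only for anonymous users" can be sketched in a few lines. This is an illustration of the decision, not Cloudflare’s configuration or API: the function name is hypothetical, and it assumes a logged-in visitor can be recognised by Django’s default session cookie name, sessionid. If your application also creates sessions for anonymous visitors, you would need a different signal.

```python
from http.cookies import SimpleCookie

# Django's default session cookie name; adjust if you changed SESSION_COOKIE_NAME.
SESSION_COOKIE_NAME = "sessionid"


def may_serve_from_cache(method: str, cookie_header: str = "") -> bool:
    """True when a request looks anonymous and is safe to answer from a shared cache."""
    if method.upper() != "GET":
        return False  # only GET responses are cached
    cookies = SimpleCookie()
    cookies.load(cookie_header)  # parse the raw Cookie request header
    return SESSION_COOKIE_NAME not in cookies


print(may_serve_from_cache("GET"))                                  # True
print(may_serve_from_cache("GET", "theme=dark"))                    # True
print(may_serve_from_cache("GET", "sessionid=abc123; theme=dark"))  # False
print(may_serve_from_cache("POST"))                                 # False
```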

Let’s go with the scenario where you want to display Welcome {first name} at the top of the webpage.


This is how I imagine your architecture: you do a GET request to load the mostly static content (which is cached), then an additional POST (which is not cached) to fetch the personalized data, and fill it into the page with some JavaScript. This is trivial in React applications, but also easy with any frontend framework. I’m modifying the Homepage code as follows:


<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Hello World Title</title>
</head>
<body>
  Hello World!
  <div id="hello"></div>
  <script>
    // Fill the placeholder div with the personalized greeting.
    function personalizePage(data) {
      let welcomeDiv = document.getElementById('hello')
      if (data['authenticated']) {
        welcomeDiv.innerText = 'Welcome ' + data['username']
      } else {
        welcomeDiv.innerText = 'Welcome stranger, please sign up!'
      }
    }
    fetch('/dynamic', {
      credentials: "same-origin", // This includes user's cookies
      method: "POST"
    }).then(response => response.json())
      .then(data => personalizePage(data));
  </script>
</body>
</html>

The idea is simple: load the exact same content for everyone, then make an additional Ajax fetch with the user’s cookies to personalize a little on top. Content popping in right after the page loads doesn’t look amazing, but you can use all sorts of loading indicators and fade-ins to make it pleasant.


import time
from django.http import JsonResponse
from django.views.generic import TemplateView

class DynamicHomepage(TemplateView):
    template_name = 'dynamicIndex.html'

    def dispatch(self, request, *args, **kwargs):
        return super().dispatch(request, *args, **kwargs)

    def get(self, request, *args, **kwargs):
        time.sleep(0.5)  # static content calculation
        return super().get(request, *args, **kwargs)

    def post(self, request, *args, **kwargs):
        # With CsrfViewMiddleware enabled, this POST needs a CSRF token or an exemption.
        time.sleep(0.5)  # user profile calculations
        if request.user.is_authenticated:
            return JsonResponse({'authenticated': True, 'username': 'Eralp'})
        return JsonResponse({'authenticated': False})

Here you can see that get returns the page like it normally would, while post answers the Ajax call and returns the personalised flavor on top. This is where you include data that updates or changes per user. To keep the comparison fair, I dropped the sleep times to half a second each: a user needs both the static and the personalised part, so together they add up to the previous 1 second, when everything was calculated at once.

I’ve deployed this to https://httpcaching.eralpbayraktar.com/dynamic

I’m also changing the loadtesting tool to do a GET + POST as follows:

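class HomepageUser(FastHttpUser):
    @task  # FastHttpUser and task are imported from locust
    def homepage(self):
        self.client.get('/dynamic')   # static page, served from the cache
        self.client.post('/dynamic')  # personalized data, always hits the server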

Let’s run the loadtest!

https://i.ibb.co/3Y5mz8W/image.png

It’s 7.6 RPS. That’s not as fascinating as 2000+ RPS, but considering the initial version managed 2.0 RPS, this is almost a 4x speedup. That figure is incredible. The speedup can range from 1.1x to 1000x depending on how much static content you serve on your webpage and how personalised it is.
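Here is a rough model of where a figure close to 7.6 RPS could come from. It rests on assumptions of mine rather than on measurements: the GET is always answered from the cache at negligible cost, each of the 2 gunicorn workers handles one POST at a time, and the load test total counts the GET and the POST as separate requests.

```python
def request_rate_ceiling(workers: int, uncached_seconds: float, requests_per_page: int) -> float:
    """Upper bound on total requests/second when only the uncached work reaches the server."""
    page_loads_per_second = workers / uncached_seconds
    return page_loads_per_second * requests_per_page


# Before: one uncached GET taking 1 second per page load.
before = request_rate_ceiling(workers=2, uncached_seconds=1.0, requests_per_page=1)
# After: one cached GET plus one uncached POST taking 0.5 seconds per page load.
after = request_rate_ceiling(workers=2, uncached_seconds=0.5, requests_per_page=2)

print(before)          # 2.0
print(after)           # 8.0
print(after / before)  # 4.0
```

Under this model the ceiling is 8 requests per second, which corresponds to 4 complete page loads per second, and the measured 7.6 RPS sits just below it.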

3. Closing thoughts

This might sound like magic, but it’s not. I’ve seen this work on websites with millions of users. It will most likely work for your current website as well. This is all about utilizing HTTP well: GET was designed to be cacheable at the protocol level.

I’ve deployed this to https://httpcaching.eralpbayraktar.com and https://httpcaching.eralpbayraktar.com/dynamic. You can find the source code on GitHub: https://github.com/EralpB/httpcaching.


Congratulations! Keep caching like a champion, and please follow me on @EralpBayraktar :)