Make Your Own CDN with NetBSD


Introduction

This article is a spin-off from a previous post on how to create a self-hosted CDN based on OpenBSD; this time, we'll focus on NetBSD. The idea is to create reverse proxies with local caching: each proxy caches content on the first request and serves it directly afterward. The proxies are distributed across different regions, and DNS routes each request to the nearest proxy based on the caller's location. All of this is achieved without relying on external CDNs, using self-managed tools instead.

NetBSD is a lightweight, stable, and secure operating system that supports a wide range of hardware, making it an excellent choice for a caching reverse proxy. Devices that other operating systems may soon abandon, such as early Raspberry Pi models or the i386 architecture, are still fully supported by NetBSD and will continue to be. Additionally, NetBSD is an outstanding platform for virtualization (using Xen or qemu/nvmm) and deserves more attention than it currently receives.

The choice of Varnish is based on several factors, with the main ones being the ability to keep the cache in RAM (which means it can run on read-only systems) and the ability to flush the cache remotely. For example, with each change to my blog, I can choose whether to perform an immediate flush (such as for a new article or an error) or wait for the cache's "natural" expiration (such as for a typo or minor, non-critical changes).

While I won't detail the installation process for NetBSD, as it depends heavily on the hardware you have available, I will guide you through setting up a self-hosted CDN using NetBSD, Varnish, nginx, and the acme.sh or lego tool for SSL certificate management.

Installation

During the installation of NetBSD, ensure that you enable support for binary package management. This will install pkgin, a tool that simplifies package management on NetBSD. If you skip this step during installation, you can still install pkgin later, but it's easier to let the installer handle it.
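If you did skip that step, pkgin can be bootstrapped later with pkg_add from the official binary repository. A minimal sketch (the repository URL pattern is the standard one; verify it matches your NetBSD release and architecture):

```shell
# Point pkg_add at the binary package repository for this release/architecture
export PKG_PATH="https://cdn.NetBSD.org/pub/pkgsrc/packages/NetBSD/$(uname -m)/$(uname -r)/All"

# Install pkgin itself; from then on, use pkgin for package management
pkg_add pkgin
```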

Once your system is up and running, use pkgin to install the necessary packages: Varnish, nginx, and, depending on your preference, either acme.sh or Go (if you plan to compile lego). lego is not available as a precompiled package, but it is easy to compile locally with Go; for simplicity, though, I recommend acme.sh.

For this setup, I will present two methods for generating and renewing certificates: using acme.sh or compiling lego. Both reach the same final outcome as the OpenBSD article.

Option 1: Using acme.sh (Recommended)

acme.sh is a simple, yet powerful, shell script that handles certificate generation and renewal with ease. To install acme.sh:

pkgin in acmesh varnish nginx

Once acme.sh is installed, you can proceed to configure it for certificate management. It supports DNS authentication and integrates with many DNS providers, making it a flexible choice.
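As a sketch of how issuance could look with DNS-01 validation (the `dns_cf` plugin is just one example; pick the plugin for your DNS provider and export its API credentials as its documentation describes, and adjust the certificate paths to wherever your nginx configuration expects them):

```shell
# Issue a certificate via DNS-01 (dns_cf is a placeholder; use your provider's plugin)
acme.sh --issue --dns dns_cf -d it-notes.dragas.net

# Install the certificate to the paths nginx will read, reloading nginx on each renewal
acme.sh --install-cert -d it-notes.dragas.net \
    --key-file /etc/ssl/private/it-notes.dragas.net.key \
    --fullchain-file /etc/ssl/certs/it-notes.dragas.net.crt \
    --reloadcmd "service nginx reload"
```

acme.sh installs a cron job for renewals automatically, so `--reloadcmd` is enough to keep nginx serving a fresh certificate.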

Option 2: Compiling Lego

If you prefer to use lego, you will need to compile it manually, as it is not available as a precompiled package for NetBSD. First, install Go and other necessary packages:

pkgin in go varnish nginx

To compile lego, you’ll need some disk space. Since the /tmp directory in NetBSD is often mounted as tmpfs (using RAM), you may run out of space during compilation if your system has limited memory. You can temporarily disable tmpfs by editing /etc/fstab and commenting out the relevant line:

#tmpfs           /tmp    tmpfs   rw,-m=1777,-s=ram%25

After rebooting, compile lego:

# pkgsrc installs a versioned Go binary (go122 for Go 1.22; adjust to the version you installed)
export GO111MODULE=on
go122 install github.com/go-acme/lego/v4/cmd/lego@latest

# "go install" places the binary in go/bin under your home directory
cp go/bin/lego /usr/pkg/bin/

Once lego is compiled and installed, you can uncomment the /tmp line in /etc/fstab and reboot again.
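A sketch of how lego can then be used (the e-mail address and the `cloudflare` provider are placeholders; lego supports many DNS providers via `--dns`, each configured through its own environment variables):

```shell
# Issue a certificate via DNS-01; e-mail and provider are placeholders
lego --email you@example.com --dns cloudflare -d it-notes.dragas.net run

# Renew (lego only renews when the certificate is close to expiry; run this from cron)
lego --email you@example.com --dns cloudflare -d it-notes.dragas.net renew
```

When run from /root, lego stores its certificates under /root/.lego/certificates/, which matches the paths used in the nginx configuration below.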

Configuring Varnish and nginx

First, copy the necessary rc.d scripts for nginx and Varnish:

cp /usr/pkg/share/examples/rc.d/nginx /etc/rc.d/
cp /usr/pkg/share/examples/rc.d/varnishd /etc/rc.d/

Then, add the following to /etc/rc.conf to enable and configure nginx and Varnish:

nginx=YES
varnishd=YES
varnishd_flags="-f /usr/pkg/etc/varnish/default.vcl -T localhost:9999 -a /var/run/varnish.sock,user=nginx,group=varnish,mode=660 -s default,500m"

In this configuration, Varnish listens on a Unix socket, and nginx connects to it. This approach is more efficient and helps avoid some issues that may arise when exposing Varnish over an IP/port.
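Once varnishd is running, you can sanity-check the management interface exposed by `-T` (this assumes the rc script does not add an `-S` secret file; if it does, pass the same `-S` to varnishadm):

```shell
# Should answer with PONG if the management interface is up
varnishadm -T localhost:9999 ping

# List the loaded VCL configurations
varnishadm -T localhost:9999 vcl.list
```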

Creating the Varnish VCL Configuration

Next, create the VCL configuration file for Varnish at /usr/pkg/etc/varnish/default.vcl:

vcl 4.1;
import std;

# Backend - it-notes.dragas.net
backend it_notes {
    .host = "myBackendIP";
    .port = "80";
}

# ACL - purge - it-notes.dragas.net
acl purge_it_notes {
    "allowedToPurge_IP";
}

sub vcl_recv {
    # it-notes.dragas.net
    if (req.http.Host == "it-notes.dragas.net") {
        set req.backend_hint = it_notes;
        set req.http.Host = "it-notes.dragas.net";

        # PURGE - it-notes.dragas.net
        if (req.method == "PURGE") {
            std.log("Purge request received for " + req.url);

            if (!std.ip(req.http.X-Forwarded-For, "0.0.0.0") ~ purge_it_notes) {
                return (synth(405, "Not allowed."));
            }

            if (req.url == "/" || req.url == "/*") {
                ban("req.http.host == " + req.http.host);
                return(synth(200, "Entire cache has been cleared."));
            }
            return (purge);
        }

    } else {
        # Other domains - 404
        return (synth(404, "Domain not found"));
    }

    if (req.method != "GET" && req.method != "HEAD") {
        return (pipe);
    }

    return (hash);
}

sub vcl_backend_response {
    # TTL - it-notes.dragas.net
    if (bereq.http.host == "it-notes.dragas.net") {
        if (bereq.url ~ "\.(gif|jpg|jpeg|png|webp|ico|css|js)$") {
            set beresp.ttl = 1w;
            set beresp.grace = 1d;
            set beresp.keep = 7d;
            unset beresp.http.Set-Cookie;
            unset beresp.http.Cache-Control;
            set beresp.http.Cache-Control = "public, max-age=604800";
        } else {
            set beresp.ttl = 15m;
            set beresp.grace = 48h;
            set beresp.keep = 7d;
        }
    }

    # Remove some headers
    unset beresp.http.Server;
    unset beresp.http.X-Powered-By;
    unset beresp.http.Via;

    return (deliver);
}

sub vcl_deliver {
    # Add X-Cache header
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
    } else {
        set resp.http.X-Cache = "MISS";
    }

    std.log("Delivering content for " + req.url + " - Cache: " + resp.http.X-Cache);

    # Remove Varnish headers
    unset resp.http.Via;
    unset resp.http.X-Varnish;

    return (deliver);
}

sub vcl_hash {
    hash_data(req.url);
    if (req.http.host) {
        hash_data(req.http.host);
    } else {
        hash_data(server.ip);
    }
    return (lookup);
}

sub vcl_hit {
    return (deliver);
}

sub vcl_miss {
    return (fetch);
}

sub vcl_purge {
    std.log("Purge executed for " + req.url);
    return (synth(200, "Purge successful"));
}

sub vcl_synth {
    set resp.http.Content-Type = "text/html; charset=utf-8";
    set resp.http.Retry-After = "5";
    synthetic({"<!DOCTYPE html>
        <html>
            <head>
                <title>"} + resp.status + " " + resp.reason + {"</title>
            </head>
            <body>
                <h1>Status "} + resp.status + " " + resp.reason + {"</h1>
                <p>"} + resp.reason + {"</p>
                <h3>Guru Meditation:</h3>
                <p>XID: "} + req.xid + {"</p>
                <hr>
                <p>Varnish cache server</p>
            </body>
        </html>
    "});
    return (deliver);
}

Configuring nginx

Now, modify the nginx configuration file at /usr/pkg/etc/nginx/nginx.conf. Set the number of worker processes to "auto" to take advantage of all server cores, and configure the reverse proxy for your site(s). Here's an example configuration:

[...]
worker_processes  auto;
[...]

server {
    server_name it-notes.dragas.net;

    location / {
        proxy_method $request_method;
        proxy_set_header Host $http_host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        proxy_pass http://unix:/var/run/varnish.sock;
    }

    access_log /var/log/nginx/access.it-notes.dragas.net.log;
    error_log /var/log/nginx/error.it-notes.dragas.net.log;

    listen [::]:443 ssl;
    listen 443 ssl;
    http2 on;
    # If you're using acme.sh, just change the location of the certificates
    ssl_certificate /root/.lego/certificates/it-notes.dragas.net.crt;
    ssl_certificate_key /root/.lego/certificates/it-notes.dragas.net.key;
}

server {
    if ($host = it-notes.dragas.net) {
        return 301 https://$host$request_uri;
    }
    server_name it-notes.dragas.net;
    listen 80;
    listen [::]:80;
    return 404;
}

Starting Varnish and nginx

Finally, start the Varnish and nginx services:

service varnishd start
service nginx start

If everything is configured correctly, both Varnish and nginx will be up and running, ready to handle incoming connections.
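To keep an eye on how well the cache is performing, Varnish's own tooling is enough (a sketch; counter names are from the standard MAIN counter group):

```shell
# One-shot dump of hit/miss counters
varnishstat -1 | grep -E 'cache_hit|cache_miss'

# Live, per-request view of the shared memory log
varnishlog -g request
```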

Conclusion

Congratulations, you have successfully set up your own CDN on NetBSD. This solution is lightweight, stable, and fully under your control, allowing you to break free from the constraints of major service providers. With NetBSD's broad hardware support and minimal overhead, this setup can run on a wide variety of devices, making it a versatile choice for self-hosted solutions.

I'm currently running it as a test, with a read-only root filesystem and a RAM-only local cache for my blog, on a Raspberry Pi Zero W (first edition). As soon as I get my new FTTH connection, I'll probably make it reachable via IPv6 for Italy, putting it physically into production.

If your goal is geo-replication, you can use DNS providers that offer location-based routing or set up your own DNS infrastructure to manage and resolve requests according to the user’s location. With multiple reverse proxies, separate DNS servers, and a well-configured cache, you can achieve a highly resilient system with minimal risk of a single point of failure.