Traffic Management with Caddy

A quick refresher on the gateway model I employ: one machine is accessible from the internet, minimising the exposed surface area of my network. This machine is the gateway and serves all requests. From a DNS perspective, all of my subdomains are CNAME records pointing to the gateway.
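
For example, resolving the blog's hostname should walk the CNAME chain to the gateway (the IP below is a placeholder from the documentation range; the hostnames are mine):

$ dig +short blog.lchapman.dev
gateway.lchapman.dev.
203.0.113.10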

Nginx and Apache are the main names in the web server game, and while I've used them both, my eye was drawn to Caddy. It's sleek and modern and addresses most of my gripes with the older players. I quickly fell in love with Caddy and one day it will be my downfall, but until then, I'll use it as my hammer.

I began with a simple configuration and have since found myself running multiple layers of Caddy, but I'll get to that further down.

As I run Debian-based operating systems most of the time, I followed the official instructions to install the stable release. If you're not familiar with systemd or just want a refresher, there's a page in the Caddy docs that explains how to manage the service.
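
For reference, the Debian route boils down to trusting Caddy's apt repository and installing the package. Check the docs for the current commands, but at the time of writing it looks roughly like this:

sudo apt install -y debian-keyring debian-archive-keyring apt-transport-https curl
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/gpg.key' | sudo gpg --dearmor -o /usr/share/keyrings/caddy-stable-archive-keyring.gpg
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt' | sudo tee /etc/apt/sources.list.d/caddy-stable.list
sudo apt update && sudo apt install caddy

From there it's the usual systemctl status caddy and journalctl -u caddy for day-to-day management.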

Configuring Caddy is done by loading a "Caddyfile" into the running service, or at process launch with caddy run --config /path/to/Caddyfile. It's easy enough to make a formatting mistake, so I keep a "working document" in my home directory. Caddy provides a formatter that turns the document into a tidy Caddyfile, invoked as caddy fmt working > Caddyfile. Copy that to /etc/caddy, reload, and you're good to go.

#!/bin/bash

# Format the working config, install it system-wide, and reload the service
caddy fmt working > Caddyfile && \
sudo cp Caddyfile /etc/caddy/ && \
rm Caddyfile && \
caddy reload --config /etc/caddy/Caddyfile
update_Caddyfile_and_reload.sh

So what does my working config look like?

www.lchapman.dev, lchapman.dev {
        # Not doing a website yet, redirect to the blog
        redir https://blog.lchapman.dev
}

blog.lchapman.dev {
        reverse_proxy 10.8.0.2
}

# Other services in similar formats
# ...
working

You might find you don't need to use fmt as part of your workflow if you don't make mistakes, but I move fast and make plenty.

Now we get to the part where Caddy is brought into line with the likes of Nginx and Apache: the Caddyfile is just shorthand for the real config. The adapt subcommand parses a Caddyfile and prints the JSON it expands to.
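
The --pretty flag keeps the output human-readable:

$ caddy adapt --config /etc/caddy/Caddyfile --pretty

Run that against the config above and you'll be greeted with this: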

{
  "apps": {
    "http": {
      "servers": {
        "srv0": {
          "listen": [":443"],
          "routes": [{
            "match": [{
              "host": ["blog.lchapman.dev"]
            }],
            "handle": [{
              "handler": "subroute",
              "routes": [{
                "handle": [{
                  "handler": "reverse_proxy",
                  "upstreams": [{
                    "dial": "10.8.0.2:80"
                  }]
                }]
              }]
            }],
            "terminal": true
          }, {
            "match": [{
              "host": ["www.lchapman.dev", "lchapman.dev"]
            }],
            "handle": [{
              "handler": "subroute",
              "routes": [{
                "handle": [{
                  "handler": "static_response",
                  "headers": {
                    "Location": ["https://blog.lchapman.dev"]
                  },
                  "status_code": 302
                }]
              }]
            }],
            "terminal": true
          }]
        }
      }
    }
  }
}

It is verbose but that's standard for web server configuration files. Caddy allows us to avoid maintaining this JSON behemoth by hand, and I am very grateful for it.

Note the "listen": [":443"] entry near the top. By not specifying http:// or https:// in the Caddyfile, Caddy defaults to HTTPS via Let's Encrypt, handling the process of requesting, serving and renewing certificates behind the scenes. TLS is terminated at the gateway and all traffic in the lchapman.dev private network is passed around via plain HTTP, with the eventual response returned and encrypted by the gateway on the way out into the WAN.
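
A quick external sanity check: if the certificates are being managed properly, a plain curl over HTTPS should print something like this.

$ curl -sI https://blog.lchapman.dev | head -n 1
HTTP/2 200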

This is a great configuration - it's simple, easy, and requires no maintenance. But what if I don't want to terminate TLS at the gateway? Say, perhaps, some software running downstream expects encrypted traffic. Say, perhaps, the correct course of action is to modify an Nginx config in a container to expect plaintext traffic - but I don't feel like doing that, so what to do?

Caddy does offer a solution! Alas, it's not so simple. I'll do it anyway though, because Caddy is my hammer and I spy a shiny new nail.

If you refer back to the JSON config above, you'll see the "apps" key at the top level and "http" as its first and only child. The http app is great, but it has limits, and unfortunately we're looking for a feature that's out of scope. Enter layer4. Layer4, or caddy-l4, or Project Conncept, is an app written by Matt Holt (the creator and lead maintainer of Caddy) to extend Caddy's capabilities to layer 4, where Caddy's base functionality mainly deals with layer 7. I expect layer4 will one day ship with Caddy proper, but for now we must build it ourselves.

Enter xcaddy - Custom Caddy Builder. They really have thought of everything. Install it with go or grab it with apt. Build your own caddy with it. It's too easy.
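
If you take the go route it's a one-liner (this is xcaddy's official module path):

go install github.com/caddyserver/xcaddy/cmd/xcaddy@latest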

$ xcaddy build latest \
    --output caddy-l4 \
    --with github.com/mholt/caddy-l4

I opted to name the binary differently from the one installed with apt and copied it to /usr/bin alongside the stock caddy. I also copied the service files in /lib/systemd/system (caddy.service and caddy-api.service) and made new ones for caddy-l4.
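
Concretely, the deployment was something like this (paths as on my Debian box; adjust to taste):

sudo cp caddy-l4 /usr/bin/caddy-l4
sudo cp /lib/systemd/system/caddy.service /lib/systemd/system/caddy-l4.service
# edit caddy-l4.service as shown below, then:
sudo systemctl daemon-reload
sudo systemctl enable --now caddy-l4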

[Unit]
Description=Caddy Layer4
Documentation=https://caddyserver.com/docs/
After=network.target network-online.target
Requires=network-online.target

[Service]
Type=notify
User=caddy
Group=caddy
ExecStart=/usr/bin/caddy-l4 run --environ --config /etc/caddy/config.json
ExecReload=/usr/bin/caddy-l4 reload --config /etc/caddy/config.json --force
TimeoutStopSec=5s
LimitNOFILE=1048576
LimitNPROC=512
PrivateDevices=yes
PrivateTmp=true
ProtectSystem=full
AmbientCapabilities=CAP_NET_BIND_SERVICE

[Install]
WantedBy=multi-user.target
caddy-l4.service

The idea is that caddy-l4 listens on ports 443 and 80, follows a minimal set of rules, and delegates most of the work back to the regular Caddy, which now listens on 8443 (and 8080 for plain HTTP). Layer4 doesn't yet have Caddyfile support, so the config has to be written in JSON. I built on the examples in the GitHub repo, got help from the community Discord server, and came up with this:

{
  "apps": {
    "layer4": {
      "servers": {
        "first_layer": {
          "listen": [":443", ":80"],
          "routes": [{
            "match": [{
              "tls": {
                "sni": ["nginx.service.lchapman.dev"]
              }
            }],
            "handle": [{
              "handler": "proxy",
              "upstreams": [{
                "dial": ["10.8.0.2:443"]
              }]
            }]
          }, {
            "match": [{
              "tls": {}
            }],
            "handle": [{
              "handler": "proxy",
              "upstreams": [{
                "dial": ["localhost:8443"]
              }]
            }]
          }, {
            "match": [{
              "http": []
            }],
            "handle": [{
              "handler": "proxy",
              "upstreams": [{
                "dial": ["localhost:8080"]
              }]
            }]
          }]
        }
      }
    }
  }
}
config.json

If the request is TLS with a particular SNI, we match it and forward it to the downstream machine with TLS intact. Everything else goes to the inner Caddy: port 443 traffic to 8443, and port 80 traffic to 8080. Port 80 is interesting here because I won't receive browser traffic on it via my domain: the entire .dev TLD is on the HSTS preload list, meaning web browsers refuse to send an insecure HTTP request to lchapman.dev at all. This is good, because developers should be expected to lead the way with security.

Why open port 80 at all then, and why bother forwarding it? Caddy uses ACME to manage certificates, and the HTTP-01 challenge it uses to obtain or renew a certificate requires port 80 - it's difficult to bootstrap TLS if TLS is required to do so. Our outer caddy-l4 instance doesn't manage any certificates, so we forward all port 80 traffic to the inner Caddy, which does.
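
You can confirm the passthrough route is matching by asking the gateway for each SNI and comparing the certificates it hands back (openssl assumed available; hostnames are mine):

# served by the downstream Nginx, TLS untouched by the gateway
openssl s_client -connect lchapman.dev:443 -servername nginx.service.lchapman.dev </dev/null 2>/dev/null | openssl x509 -noout -subject

# served by the inner Caddy on 8443
openssl s_client -connect lchapman.dev:443 -servername blog.lchapman.dev </dev/null 2>/dev/null | openssl x509 -noout -subject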

A slight modification to the original Caddyfile is required to make this work. Add the following to the top:

{
	admin 0.0.0.0:2020
	http_port 8080
	https_port 8443
}

The admin line prevents the two Caddy processes from trying to bind the same port (2019) for their admin APIs. The other two lines move the inner Caddy's HTTP and HTTPS listeners onto the ports that caddy-l4 forwards to.
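
With both processes running, you can poke each admin API to check they're alive and distinct (the /config/ endpoint is standard in Caddy's admin API):

curl -s localhost:2019/config/ | head -c 200   # caddy-l4
curl -s localhost:2020/config/ | head -c 200   # inner caddy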

That's the full explanation of the gateway's multi-Caddy configuration. To complete the picture, here's a brief rundown of the third and final instance of Caddy, which resides at 10.8.0.2. It also runs from a Caddyfile, but one that specifies http:// or https:// explicitly rather than omitting them.

# ______           __     __         ___ __ __        
#|      |.---.-.--|  |.--|  |.--.--.'  _|__|  |.-----.
#|   ---||  _  |  _  ||  _  ||  |  |   _|  |  ||  -__|
#|______||___._|_____||_____||___  |__| |__|__||_____|
#                            |_____|

#	port	subdomain	service
#	2368	blog		Ghost
#	3000	nginx.service	TLS-only service
#	...	...		...

# Blog
http://blog.lchapman.dev {
	reverse_proxy localhost:2368 {
		trusted_proxies private_ranges
	}
}

# TLS-only service
https://nginx.service.lchapman.dev {
	reverse_proxy https://localhost:3000 {
		trusted_proxies private_ranges
		transport http {
			tls_server_name nginx.service.lchapman.dev
		}
	}
}

# ...
Caddyfile (10.8.0.2)

Some services work without specifying trusted proxies and some don't - Ghost requires it. It never hurts to include the directive though, so go nuts with it. The TLS-only service also needs tls_server_name: without it, Caddy would present localhost as the server name to the upstream, but the whole problem we're solving is that this particular Nginx setup expects all traffic to be addressed to nginx.service.lchapman.dev and encrypted with the matching certificate. We override the server name here and life is good.

That's all there is to it. Let's say I want to spin up a new container with a new service: hello-world, using a hello-world image from Docker Hub.

It's pretty easy:

  1. Set up a compose file with a hello-world service using the hello-world image and expose it on port 3001
  2. docker compose up -d
  3. Set up a DNS entry with type CNAME, hostname hello.lchapman.dev, destination gateway.lchapman.dev.
  4. Add an entry to the gateway working config with matcher hello.lchapman.dev and reverse_proxy to 10.8.0.2, then reload with the update script from earlier.
  5. Add an entry to the 10.8.0.2 working config with matcher http://hello.lchapman.dev and reverse_proxy to http://localhost:3001, then reload likewise (both entries are sketched after this list).
  6. Send a request to https://hello.lchapman.dev/ and receive an error because Caddy doesn't have a certificate for that domain yet
  7. Come back a minute later, refresh and see hello world
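
The entries from steps 4 and 5 are as minimal as they sound; assuming the hostname and port from the steps above:

# gateway working config
hello.lchapman.dev {
	reverse_proxy 10.8.0.2
}

# 10.8.0.2 working config
http://hello.lchapman.dev {
	reverse_proxy localhost:3001
}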

The thought of going back to Nginx for this sends shivers down my spine.

BONUS CONTENT: Caddy in Docker

Yes. Do it. I almost did do it, especially when I was in the process of deploying caddy-l4, but I ultimately didn't because my gateway is a tiny VM with minimum specs (for minimum pricetag) and I didn't want to install Docker on it at all.

The way I see it, anything Nginx/Apache can do, Caddy can probably do in a way that is nicer to work with. Better? I'll let somebody with a lot more knowledge decide that. Caddy is nicer though, I'll die on that hill.