How I Learned to Stop Worrying and Love Docker Networking (Or: A Tale of Relay Woes)

A cautionary tale about deploying a Nostr relay, featuring duplicate routes, DNS shenanigans, and the eternal struggle between “it works on my machine” and “why is everything on fire?”

The Setup: What Could Possibly Go Wrong?

It all started innocently enough. I had a simple goal: deploy a personal Nostr relay with a nice web interface. You know, the kind of project that should take “an afternoon” and definitely won’t spiral into a weekend-consuming debugging marathon involving container networking, nginx configurations, and questioning my life choices.

Spoiler alert: It took considerably longer than the afternoon.

Chapter 1: The Great Scheduler Addition

Things were working beautifully. The relay was humming along, clients were connecting, and I was feeling pretty good about myself. Then I had what seemed like a perfectly reasonable idea: “You know what this needs? A background scheduler to update content automatically!”

Because apparently I subscribe to the philosophy of “if it’s not broken, add more moving parts until it is.”

I implemented a lovely little scheduler using Python’s threading module. Clean code, proper separation of concerns, the works. What could go wrong?
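The scheduler itself was nothing exotic. As a hedged sketch (function names and the interval are illustrative, not the actual code), the general shape was something like this:

```python
import threading

def schedule_updates(interval_seconds, task):
    """Run task every interval_seconds on a daemon thread.

    Returns the stop Event so the caller can shut the loop down cleanly.
    (Illustrative sketch, not the real relay code.)
    """
    stop = threading.Event()

    def loop():
        # Event.wait doubles as an interruptible sleep: it returns False
        # on timeout (keep looping) and True once stop.set() is called.
        while not stop.wait(interval_seconds):
            task()

    threading.Thread(target=loop, daemon=True).start()
    return stop
```

The daemon flag matters here: it lets the process exit without waiting on the background thread, which is usually what you want for a best-effort cache refresher.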

Chapter 2: The 502 Bad Gateway Blues

After deploying the scheduler, suddenly everything was returning 502 Bad Gateway errors. Classic. The kind of error message that’s about as helpful as a chocolate teapot.

The logs were helpfully reporting: AssertionError: View function mapping is overwriting an existing endpoint function: update_cache

Ah yes, the dreaded duplicate route. Turns out, in my enthusiasm to add scheduler functionality, I had somehow managed to define the same Flask route twice. Because apparently, like a bad pop song, once just wasn’t enough.
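The failure is easy to reproduce in a few lines. This is a hedged sketch, not the actual app code, but the mechanism is the same: two view functions with the same name register the same endpoint, and Flask refuses to overwrite the first one.

```python
from flask import Flask

app = Flask(__name__)

@app.route("/update_cache")
def update_cache():
    return "cache updated"

# A second view under the same function name reuses the endpoint name
# "update_cache", which Flask rejects at registration time:
try:
    @app.route("/update_cache")
    def update_cache():  # noqa: F811 (deliberate duplicate)
        return "cache updated, again"
except AssertionError as exc:
    # AssertionError: View function mapping is overwriting an existing
    # endpoint function: update_cache
    print(exc)
```

In the real app this blows up at import time, so gunicorn's worker dies before it can serve a single request, which is exactly the "Worker failed to boot" symptom below.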

The Detective Work

This is where things got interesting. The web interface was down, but the containers were running. The relay process was starting, then immediately dying with the grace of a swan having an existential crisis.

[ERROR] Worker (pid:205) exited with code 3
[ERROR] Shutting down: Master  
[ERROR] Reason: Worker failed to boot.

Worker failed to boot? More like “developer failed to not duplicate code,” but who’s keeping track?

Chapter 3: The Great Container Rebuild

The fix was simple enough once I found it: delete the duplicate route definition. But here’s where Docker’s “helpful” caching behavior comes into play.

You see, the app code wasn’t mounted as a volume—it was baked into the container image during build. So my local fix was sitting there, mocking me, while the container cheerfully continued using the broken version like nothing happened.
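Had the code been bind-mounted instead, the fix would have taken effect on a simple restart. A minimal compose sketch (service and path names are illustrative, not from the actual setup):

```yaml
services:
  web:
    build: .
    volumes:
      # Bind-mount the app source over the copy baked into the image,
      # so local edits show up without a full docker-compose build
      - ./app:/app
```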

Cue the Docker rebuild process, which in 2025 still feels like watching paint dry, but with more anxiety about whether it will actually work this time.

Chapter 4: The DNS Comedy Hour

With the duplicate routes fixed, the web app was working again. Victory! Time to check if the relay was accessible to Nostr clients.

“The relay is showing as disconnected on clients again.”

Record scratch. Freeze frame.

This is where our story takes a delightful detour into the wonderful world of Docker networking and nginx configuration.

The Investigation

WebSocket connections were failing with—you guessed it—another 502 Bad Gateway. But this time, it wasn’t the app. It was nginx.

The configuration looked perfectly reasonable:

proxy_pass http://nostr-relay:8080$1;

Except for one tiny detail: that hostname didn’t exist.

The Plot Twist

Docker Compose had named the container nostr-home_rnostr-relay_1, not nostr-relay. So nginx was essentially trying to phone a number that was disconnected.

$ docker exec nginx_container nslookup nostr-relay
** server can't find nostr-relay: SERVFAIL

The actual working hostname? rnostr-relay. Because of course it was.
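In hindsight the naming follows a convention: on the compose network, containers are reachable by their service name, while the container itself gets a generated name of the form project_service_index (older Compose versions use underscores; newer ones use hyphens). Roughly:

```yaml
services:
  rnostr-relay:        # this is the DNS name nginx should resolve
    image: rnostr      # image name is illustrative
    # default container name: <project>_<service>_1,
    # e.g. nostr-home_rnostr-relay_1 — what docker-compose ps shows,
    # but NOT the hostname other containers resolve
```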

Chapter 5: The Victory Lap

One simple find-and-replace later:

sed -i 's/nostr-relay:8080/rnostr-relay:8080/g' nginx.conf

And suddenly everything worked again. WebSocket connections returned HTTP 101 (the good kind of switching protocols), relay info was being served properly, and Nostr clients could connect without throwing tantrums.
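For reference, a WebSocket-friendly nginx location block looks roughly like this (the hostname and port come from the setup above; the Upgrade/Connection headers are what make the 101 handshake possible):

```nginx
location / {
    # Service name as reported by docker-compose ps, not a guess
    proxy_pass http://rnostr-relay:8080;

    # Required for the WebSocket upgrade handshake (HTTP 101)
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_set_header Host $host;
}
```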

Lessons Learned (The Hard Way)

  1. Docker networking is like a box of chocolates — you never know what hostname you’re gonna get, and it’s probably not the one you expected.

  2. Always check your container names — docker-compose ps is your friend, even when it tells you things you don’t want to hear.

  3. Volume mounts are your friend — If you want to edit code and have it actually take effect, mount it as a volume. Revolutionary concept, I know.

  4. Duplicate code is the enemy — Flask will absolutely refuse to start if you define the same route twice, and it will do so with all the grace of a toddler having a meltdown in a grocery store.

  5. DNS resolution works until it doesn’t — And when it doesn’t, everything fails in the most spectacular way possible.

The Epilogue

After all the debugging, container rebuilding, and nginx configuration wrestling, I now have:

  • ✅ A working Nostr relay
  • ✅ WebSocket connections that actually connect
  • ✅ A background scheduler that doesn’t crash everything
  • ✅ A web interface with real-time statistics
  • ✅ Several new gray hairs

The relay is now happily serving connections at wss://nostr.pleb.one, the web interface shows live statistics, and the scheduler updates content every 30 minutes without setting anything on fire.

Was it worth the several hours of debugging? Ask me after I’ve had more coffee.

Final Thoughts

Deploying software is like trying to assemble IKEA furniture while blindfolded, using instructions written in a language you don’t speak, with tools that may or may not be the right ones.

But when everything finally clicks into place, and you see that beautiful HTTP/1.1 101 Switching Protocols response, all the pain seems worth it.

Until the next deployment, anyway.


The relay is live at wss://nostr.pleb.one if you want to test your own Nostr client’s patience with my networking skills. You can also visit the web interface to watch real-time statistics and marvel at how something so simple can be so complicated to deploy.

No containers were permanently harmed in the making of this deployment. Several developer sanity points may have been lost in the process.

