@frode
Feeling inspired today, so I've got some more work done:
* Rate limiting of signin attempts
* Improved error handling
* Now that I have a working rate limiter, I've enabled signups! No button anywhere that will take you there, though, so if you're reading this, head to /signup and fill in the form, and I'll probably activate your account.
A side note: I have a Linux desktop and a MacBook, and I've previously only built and deployed micronotal from the Linux machine. Running the Ansible playbook from my Mac, I hit a snag with mattn/go-sqlite3 and CGO. Luckily, there's a very straightforward how-to in the README (https://github.com/mattn/go-sqlite3?tab=readme-ov-file#cross-compiling-from-macos):
$ brew install FiloSottile/musl-cross/musl-cross
$ CC=x86_64-linux-musl-gcc CXX=x86_64-linux-musl-g++ GOARCH=amd64 GOOS=linux CGO_ENABLED=1 go build -ldflags "-linkmode external -extldflags -static"
Made a couple of incremental improvements today! Yet to actually deploy the updates, but the short changelist is:
* Fixed pagination buttons. Renamed them from next/previous to older/newer, as the semantics are a bit clearer. Also, it's no longer possible to paginate into the void, as the buttons are hidden if there are no more results to paginate to.
* Switched from storing passwords with bcrypt to argon2id. Added some logic to auto-upgrade users on login, since that's the only point in time where I have access to the password in plain text (see the sketch below). Not exactly going to be a big migration, as I'm the only user, but it's still nice to do it properly.
* Slight improvement of the profile page. It now displays some stats like when you joined, and how many threads/posts you've written.
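The upgrade-on-login flow is simple enough to sketch. This isn't my actual code; the library choice (github.com/alexedwards/argon2id) and every name here are illustrative:

package auth

import (
	"strings"

	"github.com/alexedwards/argon2id"
	"golang.org/x/crypto/bcrypt"
)

// VerifyAndUpgrade checks a login attempt against the stored hash. If the hash
// is still bcrypt, it verifies it, then rehashes with argon2id while we still
// have the plaintext. updateHash is a made-up callback that persists the new hash.
func VerifyAndUpgrade(password, storedHash string, updateHash func(string) error) (bool, error) {
	if strings.HasPrefix(storedHash, "$argon2id$") {
		return argon2id.ComparePasswordAndHash(password, storedHash)
	}
	// Legacy bcrypt hash: verify, then upgrade.
	if err := bcrypt.CompareHashAndPassword([]byte(storedHash), []byte(password)); err != nil {
		return false, nil
	}
	newHash, err := argon2id.CreateHash(password, argon2id.DefaultParams)
	if err != nil {
		return false, err
	}
	return true, updateHash(newHash)
}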
Deployed, and it looks to be working great!
I've been thinking a lot about https://eieio.games/essays/scaling-one-million-checkboxes/ lately. Not only do I think it's a fascinating story, but on a technical level it's also got a bunch of interesting challenges to solve. So, naturally, I've been considering alternative solutions, and I'm starting to play with the idea of creating my own version of OMCB. Something I've read about, but never actually had a need for myself, is sharding. So instead of doing the pragmatic thing and using Redis like the original, I was thinking of creating a distributed monstrosity where state is spread out over several different instances. The basic architecture would consist of 3 layers:
1) caddy to serve static content and act as a reverse proxy for 2)
2) webserver responsible for handling the requests from clients, including:
* establishing and maintaining websocket connections
* forwarding updates to the correct shard
* subscribing to incremental updates from all shards and forwarding them to all connected clients
* regularly broadcasting the full state to all connected clients to reset state in case we lose some incremental updates along the way
3) application responsible for maintaining the state of a given shard, which:
* stores state in-memory as a bitmap
* regularly backs up state to disk so that we can recover in case of failure
* has methods for updating shard state, fetching the whole shard state, and broadcasting incremental updates to subscribed webservers (see the sketch after this list)
There's probably a billion things that can go wrong with this architecture, and that is among the reasons why I'd like to try it out. It should provide ample opportunity for learning new things.
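The shard application is the part I'm most eager to build. A minimal sketch of layer 3's in-memory state, with all names made up and the pub/sub side left out:

package shard

import (
	"os"
	"sync"
)

// Shard holds one shard's checkboxes as an in-memory bitmap.
type Shard struct {
	mu   sync.RWMutex
	bits []byte
}

func New(n int) *Shard {
	return &Shard{bits: make([]byte, (n+7)/8)}
}

// Set flips checkbox i on or off; the returned pair is what would be
// broadcast to subscribed webservers as an incremental update.
func (s *Shard) Set(i int, on bool) (int, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if on {
		s.bits[i/8] |= 1 << (i % 8)
	} else {
		s.bits[i/8] &^= 1 << (i % 8)
	}
	return i, on
}

// Snapshot copies the whole bitmap, for full-state broadcasts and backups.
func (s *Shard) Snapshot() []byte {
	s.mu.RLock()
	defer s.mu.RUnlock()
	out := make([]byte, len(s.bits))
	copy(out, s.bits)
	return out
}

// Persist writes a snapshot to disk so the shard can recover after a crash.
func (s *Shard) Persist(path string) error {
	return os.WriteFile(path, s.Snapshot(), 0o600)
}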
Some more random ideas:
* Create an orchestrator/control plane that can handle redistribution of data if we increase or decrease the number of shards
* The control plane can also be responsible for backing up the entire grid
* It should be able to configure the web servers without having to restart them, to let them know of changes in the shards
* It should be possible to dynamically set the shard state so that we can recover the entire grid from a snapshot
I read a bit more on SQLite this morning and came across this: https://www.sqlite.org/np1queryprob.html. Being used to working with client/server databases, and accustomed to thinking that n+1 is the enemy, I found this a little surprising. A quick refactor of the function responsible for getting threads (roughly the shape sketched below), and it's now not only more readable, but threads and posts are even correctly limited when doing pagination.
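For reference, the rough shape of the refactor. A sketch with made-up table and column names, not the real function:

package store

import "database/sql"

type Post struct {
	ID   string
	Body string
}

type Thread struct {
	ID    string
	Title string
	Posts []Post
}

// GetThreads pages threads newest-first, then runs one query per thread for
// its posts: the n+1 pattern that's cheap in SQLite since there's no network
// round-trip, and that lets each thread get its own post LIMIT.
func GetThreads(db *sql.DB, before string, limit, postsPerThread int) ([]Thread, error) {
	rows, err := db.Query(
		`SELECT id, title FROM threads WHERE id < ? ORDER BY id DESC LIMIT ?`,
		before, limit)
	if err != nil {
		return nil, err
	}
	var threads []Thread
	for rows.Next() {
		var t Thread
		if err := rows.Scan(&t.ID, &t.Title); err != nil {
			rows.Close()
			return nil, err
		}
		threads = append(threads, t)
	}
	rows.Close()
	if err := rows.Err(); err != nil {
		return nil, err
	}
	// The +1 queries, one per thread.
	for i := range threads {
		prows, err := db.Query(
			`SELECT id, body FROM posts WHERE thread_id = ? ORDER BY id LIMIT ?`,
			threads[i].ID, postsPerThread)
		if err != nil {
			return nil, err
		}
		for prows.Next() {
			var p Post
			if err := prows.Scan(&p.ID, &p.Body); err != nil {
				prows.Close()
				return nil, err
			}
			threads[i].Posts = append(threads[i].Posts, p)
		}
		prows.Close()
		if err := prows.Err(); err != nil {
			return nil, err
		}
	}
	return threads, nil
}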
One thing missing from micronotal is pagination. Let's see how quickly I can get something working.
Turns out I'd actually already implemented pagination using thread UUIDs as cursors! So this should be quick. I'd managed to mix up > and < when comparing the UUIDs, so ?after=<uuid> gives you every thread _before_ the provided ID, but that's a quick fix.
No, I was actually right earlier. I'd gotten the after and before semantics mixed up when combined with newest-first ordering.
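Writing it down so I don't mix it up a third time. With a newest-first listing, the queries look roughly like this (made-up table name; the cursor comparisons work assuming the thread UUIDs sort by creation time, which they appear to, being UUIDv7):

-- "older" page: threads below the cursor, already in display order
SELECT id, title FROM threads WHERE id < :before ORDER BY id DESC LIMIT :n;
-- "newer" page: threads above the cursor, fetched ascending, then reversed for display
SELECT id, title FROM threads WHERE id > :after ORDER BY id ASC LIMIT :n;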
Got a very basic implementation working now. It's far from flawless, as it allows you to navigate to empty pages, and pagination buttons are never hidden. But now it's actually possible to see old threads!
Got to do a bit more work on the styling as well.
Calling it good enough now. You can still paginate into the void, but the buttons look ok at least. Biggest issue remaining to solve with pagination is how the limiting in the query to fetch threads and posts is applied. Right now I’m joining thread and post tables and taking the limit on the entire result. This can cause the last thread in the result to not have all posts included. Not critical to fix, but it’s definitely on the todo list.
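To make the problem concrete, the current query is shaped roughly like this (illustrative names, not the real schema):

SELECT t.id, t.title, p.id, p.body
FROM threads t
JOIN posts p ON p.thread_id = t.id
ORDER BY t.id DESC, p.id
LIMIT 50;
-- LIMIT counts joined rows, so the cutoff can land mid-thread and
-- silently drop the last thread's remaining posts.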
Fixed pagination by embracing n+1 https://micronotal.com/t/019289d4-e55b-7b19-b46a-215840607cf0
I don’t know exactly what it is about this microblogging style, but it’s got me started writing, and I’m enjoying it. I just finished and published a blog post on my personal site: https://frodejac.dev/blog/go-poor-mans-cron.html. It’s not groundbreaking stuff, but it’s actually written by me (not a smidgen of gen AI involved)!
Speaking of throwaway projects: I’ve been working on setting up a little service that scrapes the access logs from Caddy and the block logs from UFW. For now I’m just pushing it into BoltDB, but I’m looking to use it to generate some stats on who’s accessing (and attempting to access) my VPS. Side note: is there really any difference between a honeypot and a VPS? There are lots of small things to get right that I’ve stumbled a bit on:
* Using journalctl’s cursor to retrieve the logs emitted since the last time we stored something, as opposed to storing a timestamp, which could lead to reading the same entries multiple times or missing some entries (see the sketch after this list)
* Whatever is served up by the API should use a read-only transaction with Bolt
* Only one process can access a Bolt database file at a time, so any cron-like job needs to run in the same process as the web server with the read-only connection. I’m using a separate goroutine with a time.Ticker to check for new logs at a regular interval and write them to the DB. Works like a charm
* Using journalctl -o json gives you access to lots of great stuff, plus it’s way easier to parse when you can just use json.Unmarshal
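The cursor handling, roughly. Not my actual code; the bucket names, key names, and unit wiring are all made up for the sketch:

package honeylog

import (
	"bufio"
	"bytes"
	"encoding/json"
	"os/exec"

	bolt "go.etcd.io/bbolt"
)

// pullLogs fetches everything journald has emitted for a unit since the cursor
// stored on the previous run, and writes each entry to Bolt keyed by its cursor.
func pullLogs(db *bolt.DB, unit string) error {
	return db.Update(func(tx *bolt.Tx) error {
		logs, err := tx.CreateBucketIfNotExists([]byte("logs"))
		if err != nil {
			return err
		}
		meta, err := tx.CreateBucketIfNotExists([]byte("meta"))
		if err != nil {
			return err
		}
		args := []string{"-u", unit, "-o", "json", "--no-pager"}
		if cur := meta.Get([]byte("cursor")); cur != nil {
			// Resume exactly where the last run stopped: no duplicates, no gaps.
			args = append(args, "--after-cursor="+string(cur))
		}
		out, err := exec.Command("journalctl", args...).Output()
		if err != nil {
			return err
		}
		sc := bufio.NewScanner(bytes.NewReader(out))
		sc.Buffer(make([]byte, 0, 1024*1024), 1024*1024) // journal lines can be long
		for sc.Scan() {
			// journalctl -o json emits one JSON object per line, with the
			// entry's cursor in the __CURSOR field.
			var entry struct {
				Cursor string `json:"__CURSOR"`
			}
			if err := json.Unmarshal(sc.Bytes(), &entry); err != nil {
				continue
			}
			line := append([]byte(nil), sc.Bytes()...) // Bolt keeps the slice until commit
			if err := logs.Put([]byte(entry.Cursor), line); err != nil {
				return err
			}
			if err := meta.Put([]byte("cursor"), []byte(entry.Cursor)); err != nil {
				return err
			}
		}
		return sc.Err()
	})
}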
Note to self: A guide on how to set up a poor man’s cronjob with a goroutine and a ticker is a decent first blog post
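The core of it fits right here, actually. A sketch with made-up names:

package cron

import (
	"log"
	"time"
)

// Every runs job at a fixed interval in its own goroutine.
// A goroutine and a ticker: that's the whole cron.
func Every(interval time.Duration, job func() error) {
	go func() {
		t := time.NewTicker(interval)
		defer t.Stop()
		for range t.C {
			if err := job(); err != nil {
				log.Printf("job failed: %v", err)
			}
		}
	}()
}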
I got a working POC for this running at https://honey.frodejac.dev. Next steps are to do geolocation on the IPs and group the request counts by country.
I’m really enjoying this whole having-my-own-VPS thing. I’ve now successfully moved my personal site frodejac.dev from fly.io to my VPS. It used to be a Go app that served some basic static pages, plus acted as a playground for small ideas I had. Now my personal site has been revamped, and is simply served as a static site by Caddy. No SSG, and no JS, just some handcrafted HTML+CSS. The fun thing about having a whole VPS to play with, instead of just a free-tier fly.io app, is that I can now run each of the little ideas as a separate systemd service and route traffic to them with subdomains instead of paths (the Caddy config is a couple of lines per service; see below). That makes all the throwaway projects isolated and easy to take up and down.
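Each subdomain is just a tiny site block in the Caddyfile. An illustrative example; the port and paths here are made up:

frodejac.dev {
	root * /srv/frodejac.dev
	file_server
}

honey.frodejac.dev {
	reverse_proxy localhost:9000
}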
Never used systemd to manage a service before. I'm liking it so far. It's really easy to set up a sandboxed environment, and systemd-analyze security <service> does a good job of listing the various security-related settings and how they're currently set, and calculating a rough 'exposure level' for the given service.
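A few directives in the unit file go a long way. For illustration, a hardened [Service] section might look something like this; the exact set depends on what the service actually needs:

[Service]
ExecStart=/usr/local/bin/myservice
DynamicUser=yes
NoNewPrivileges=yes
ProtectSystem=strict
ProtectHome=yes
PrivateTmp=yes
PrivateDevices=yes
RestrictAddressFamilies=AF_INET AF_INET6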
I'm always learning something new when using Go to build stuff. If you strive for minimal dependencies, it's fairly bare-bones, and you end up having to figure out and deal with a bunch of stuff that is typically hidden away in the bowels of web frameworks. Today it was browser caching of static files. Go's http.FileServer can use modification timestamps on the files to set cache headers. However, since I'm embedding the static files in the binary, this info gets stripped away. This results in the file server not setting any cache headers at all, and the browser requesting all static assets on every request. Not optimal. The most obvious symptom was a flash of unstyled content on every page load, as the fonts served up by my webserver were requested anew each time. To fix this, I implemented a small wrapper around http.FileServer that returns an http.HandlerFunc that sets the Cache-Control, Expires, and Last-Modified headers (sketched below). The timestamp used for Last-Modified is set at build time using the -ldflags option with go build. I found that to be a good combination of reasonable and easy, as the static files cannot change unless the binary changes. Of course, the binary might change more often than the static files, but I have a plan to solve that using ETags. Just needed to get something basic and working out.
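The wrapper is small. A sketch with made-up names; the real code differs a bit, but the mechanics are the same:

package main

import (
	"embed"
	"net/http"
	"time"
)

//go:embed static
var staticFS embed.FS

// buildTime is injected at build time, e.g.:
//   go build -ldflags "-X main.buildTime=$(date -u +%Y-%m-%dT%H:%M:%SZ)"
var buildTime string

// cacheStatic wraps a file server and adds the caching headers that embedded
// files can't provide on their own, using the build time as Last-Modified.
func cacheStatic(next http.Handler) http.HandlerFunc {
	modTime, err := time.Parse(time.RFC3339, buildTime)
	if err != nil {
		modTime = time.Now() // fallback if the flag wasn't set
	}
	return func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Cache-Control", "public, max-age=86400")
		w.Header().Set("Expires", time.Now().Add(24*time.Hour).UTC().Format(http.TimeFormat))
		w.Header().Set("Last-Modified", modTime.UTC().Format(http.TimeFormat))
		next.ServeHTTP(w, r)
	}
}

func main() {
	http.Handle("/static/", cacheStatic(http.FileServer(http.FS(staticFS))))
	http.ListenAndServe(":8080", nil)
}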