<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"><channel xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><title>IT Notes - fedimeteo</title><link>https://it-notes.dragas.net/categories/fedimeteo/</link><description>Articles in category fedimeteo</description><language>en</language><lastBuildDate>Mon, 25 May 2026 09:14:00 +0000</lastBuildDate><atom:link href="https://it-notes.dragas.net/categories/fedimeteo/feed.xml" rel="self" type="application/rss+xml"></atom:link><item><title>FediMeteo, timezones, and the art of not breaking what already works</title><link>https://it-notes.dragas.net/2026/05/25/fedimeteo-timezones-and-the-art-of-not-breaking-what-already-works/</link><description>&lt;p&gt;&lt;img src="https://unsplash.com/photos/ZVhm6rEKEX8/download?ixid=M3wxMjA3fDB8MXxhbGx8fHx8fHx8fHwxNzQwNTEzNjE5fA&amp;force=true&amp;w=640" alt="FediMeteo, timezones, and the art of not breaking what already works"&gt;&lt;/p&gt;&lt;p&gt;I have already written about &lt;a href="https://it-notes.dragas.net/2025/02/26/fedimeteo-how-a-tiny-freebsd-vps-became-a-global-weather-service-for-thousands/"&gt;how FediMeteo was born&lt;/a&gt;, and about how &lt;a href="https://it-notes.dragas.net/2026/05/18/fedimeteo-haproxy-and-the-art-of-not-wasting-snac-threads/"&gt;HAProxy helps reduce the number of requests that reach snac&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Seen from the outside, FediMeteo almost seems still. There is a static homepage, regenerated every hour. There are the city pages, with their forecasts. There are RSS feeds waiting to be fetched, JSON objects waiting to be requested, Fediverse instances refreshing data, subscribing, unsubscribing, retrieving profiles, and reading notes.&lt;/p&gt;
&lt;p&gt;That is the visible part.&lt;/p&gt;
&lt;p&gt;Behind it, however, &lt;a href="https://fedimeteo.com"&gt;FediMeteo&lt;/a&gt; is much more than a homepage, a few ActivityPub accounts, and a well-behaved reverse proxy. It is a chain of small pieces, in proper Unix style, each trying to do one thing and do it as well as possible.&lt;/p&gt;
&lt;p&gt;That chain, although almost invisible from the outside, was not born already tidy. It changed, was rewritten, adapted to new countries, timezones, ambiguous city names, external service limits, and also to my own mistakes.&lt;/p&gt;
&lt;p&gt;Some mistakes were small. Others were much less so.&lt;/p&gt;
&lt;p&gt;Because FediMeteo is a human project and, as such, imperfect. Imperfect in the way humans are imperfect, which today almost seems unfashionable. I like that.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;The first version of the bot was almost embarrassingly simple, and I was proud of that.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;It took a city name as input, asked &lt;a href="https://nominatim.org"&gt;Nominatim&lt;/a&gt; for the coordinates through &lt;code&gt;geopy&lt;/code&gt;, called the &lt;a href="https://open-meteo.com"&gt;Open-Meteo&lt;/a&gt; API for the current weather and the next several days, and printed a markdown block with current conditions, the forecast for today, the next twelve hours, and the coming days. The text was in Italian. The cities were Italian. The timezone was &lt;code&gt;Europe/Rome&lt;/code&gt;. There was nothing to calculate.&lt;/p&gt;
&lt;p&gt;Around the script, a small &lt;code&gt;sh&lt;/code&gt; wrapper read a list of cities and, for each one, ran the Python program and piped its output into &lt;code&gt;snac note_unlisted&lt;/code&gt;. A cron job ran the wrapper every six hours. The output was loose markdown, which snac happily renders, and the integration was: standard output goes into standard input. Nothing fancier than that.&lt;/p&gt;
&lt;p&gt;I like this kind of design. It is the part of the Unix philosophy that survives even when fashions change.&lt;/p&gt;
&lt;p&gt;When I started adding other European countries, I did not need to change much. I separated the operational logic from the localized strings, moved the strings into one JSON file per country, and spread the cron entries so that not every country posted in the same minute. Each country had its own snac instance, in its own FreeBSD jail, with its own dataset. The bot, internally, was almost the same script as before.&lt;/p&gt;
&lt;p&gt;This worked because Europe is, in essence, two or three timezones across most of the countries I cared about.&lt;/p&gt;
&lt;p&gt;Then I added Germany, and Germany taught me my first lesson about names.&lt;/p&gt;
&lt;p&gt;There are several places called Neustadt in Germany. There is a Frankfurt am Main, and a Frankfurt an der Oder, and they are not the same city. There is a Halle in Saxony-Anhalt and a Halle in North Rhine-Westphalia. Asking Nominatim for "Frankfurt, Germany" produced one of the two, consistently, but not always the one I wanted. Some German users wrote to me, politely, to point out that the forecast for "their" Frankfurt was, in fact, for the other one.&lt;/p&gt;
&lt;p&gt;I started thinking about disambiguation, but only enough to fix the immediate cases. The bot still took a single city name. The ambiguous ones I worked around by editing the cities file and hoping for the best.&lt;/p&gt;
&lt;p&gt;In hindsight, this was the seed of what would happen later.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The United States broke every assumption the bot had grown up with&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The first problem was the number of cities. I wanted reasonable coverage at state level, which meant identifying the main cities for each of the fifty states. The list ended up at more than 1200 entries. That alone is more cities than every other country in the project combined.&lt;/p&gt;
&lt;p&gt;The second problem was timezones. The contiguous United States covers four of them, and Alaska and Hawaii bring the total to six. A "current weather at 12:00" line generated at the same moment for New York and for Los Angeles describes the same instant, but the two cities are living different parts of the day, and the forecast window for "today" is not quite the same either. A bot that pretended every city was on the same clock would be wrong, sometimes embarrassingly so, every single day.&lt;/p&gt;
&lt;p&gt;The third problem was the name problem again, only larger. There are dozens of Springfields. There is a Portland in Oregon and a Portland in Maine. The Germany workaround - editing the cities file by hand and hoping Nominatim picked the right city - was clearly not going to scale to a country where a city name can also be the name of a state.&lt;/p&gt;
&lt;p&gt;I sat with this for a couple of days before admitting what I already knew.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The bot needed to be rewritten&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;What made this hard was not the rewriting itself. It was the requirement to do it without breaking everything else.&lt;/p&gt;
&lt;p&gt;By the time I decided to add the United States, the infrastructure around the bot had grown into something I trusted. Jails, snapshots, backup jobs, cron schedules, snac instances on production paths, the HAProxy layer, the homepage cron that aggregated follower counts, and a long list of cities being processed in series every six hours. None of that knew or cared about the bot's internal shape. All of it cared, very much, about the bot's external behavior: a city name and a country code go in, valid markdown comes out, and that markdown ends up in a timeline.&lt;/p&gt;
&lt;p&gt;So the contract was clear, even if I had never written it down anywhere. The command-line interface, the output format, the exit codes, the way the wrapper script invoked it, the structure of the JSON country configs - all of it had to keep working. Italian had to keep working. German had to keep working. The cron job that ran every six hours had to keep producing the same shape of output, just with new countries added.&lt;/p&gt;
&lt;p&gt;What I changed was almost everything below the surface.&lt;/p&gt;
&lt;p&gt;The city argument grew an optional &lt;code&gt;__state&lt;/code&gt; suffix, with a double underscore as separator:&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-text"&gt;python3 main.py springfield__illinois us
python3 main.py springfield__massachusetts us
python3 main.py new_york__new_york us
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;A city without the suffix continued to work exactly as before, which is what every European country needed. The country config gained a &lt;code&gt;timezone&lt;/code&gt; field that could be a fixed string or the literal &lt;code&gt;"auto"&lt;/code&gt;; when it was &lt;code&gt;"auto"&lt;/code&gt;, the bot used &lt;code&gt;timezonefinder&lt;/code&gt; against the resolved coordinates to determine the right zone for that specific city. Internally, I put the weather provider behind an interface, so Open-Meteo could remain the primary while MET Norway and &lt;code&gt;wttr.in&lt;/code&gt; sat behind it as alternatives, with automatic fallback when the primary failed. Units became configurable per country: temperature, wind speed, precipitation. The United States needed Fahrenheit, miles per hour, and inches. Most of Europe wanted Celsius, kilometers per hour, and millimeters. The bot now does either, on a per-country basis, without caring which is which.&lt;/p&gt;
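&lt;p&gt;As an illustration of the optional suffix, here is a minimal sketch of how such an argument could be split. It is not the bot's actual code: the function name &lt;code&gt;split_city_argument&lt;/code&gt; is hypothetical, and only the double underscore separator comes from the text above.&lt;/p&gt;

```python
from typing import Optional, Tuple


def split_city_argument(argument: str) -> Tuple[str, Optional[str]]:
    """Split 'springfield__illinois' into ('springfield', 'illinois').

    An argument without the double underscore has no state part, so a
    plain European city name passes through unchanged.
    """
    city, separator, state = argument.partition("__")
    return city, (state if separator and state else None)


print(split_city_argument("springfield__illinois"))
print(split_city_argument("new_york__new_york"))
print(split_city_argument("ferrara"))
```

&lt;p&gt;Single underscores stay inside the city or state name, so only the first double underscore acts as the separator.&lt;/p&gt;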
&lt;p&gt;I am skipping a lot of small detail here, but the principle was always the same: every new degree of freedom had to be expressible as an optional field in the config or as an optional CLI flag. If a country did not set the new field, the old behavior continued, identical to before.&lt;/p&gt;
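&lt;p&gt;The "optional field, old behavior by default" principle can be sketched as a defaults table overlaid by each country's JSON. Everything here is an assumption made for illustration: apart from &lt;code&gt;timezone&lt;/code&gt;, the field names, the default values, and the function &lt;code&gt;load_country_config&lt;/code&gt; are hypothetical.&lt;/p&gt;

```python
import json

# Hypothetical defaults standing in for the behavior of the original bot.
DEFAULTS = {
    "timezone": "Europe/Rome",
    "temperature_unit": "celsius",
    "wind_speed_unit": "kmh",
    "precipitation_unit": "mm",
}


def load_country_config(text: str) -> dict:
    """Overlay a country's JSON config on the defaults.

    A field the country does not set keeps its default, so an old
    config file produces the same settings as before.
    """
    config = dict(DEFAULTS)
    config.update(json.loads(text))
    return config


print(load_country_config('{"timezone": "auto", "temperature_unit": "fahrenheit"}'))
```

&lt;p&gt;With this shape, adding a new option means adding one default; no existing config file has to change.&lt;/p&gt;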
&lt;p&gt;I tested this by running the new bot against the old country configs and comparing the output line by line. Where it differed, it was a bug in the new bot. Not in the test.&lt;/p&gt;
&lt;p&gt;The first cycle after deploying the rewrite was, for every country except the United States, indistinguishable from the cycle before. That was the point.&lt;/p&gt;
&lt;p&gt;This is the part of the story I dislike telling, which is precisely why I should tell it.&lt;/p&gt;
&lt;p&gt;At some point during the development, while debugging an Open-Meteo response that did not look right, I added a &lt;code&gt;print&lt;/code&gt; statement to the error path that dumped the full request URL whenever something went wrong. The full URL of the Open-Meteo customer endpoint includes the &lt;code&gt;apikey&lt;/code&gt; query parameter. The print was meant for development. I forgot to remove it.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;I deployed&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;The next time Open-Meteo had an outage - and small ones happen, sometimes for several minutes at a time - the bot dutifully printed the failing request URL into the post body. For every city. For every cycle that ran during the outage. The wrapper script piped the output into &lt;code&gt;snac note_unlisted&lt;/code&gt; without complaint. The posts went out, federated across the Fediverse, with my API key sitting in the text for anyone who cared to read.&lt;/p&gt;
&lt;p&gt;Some users were kind enough to write to me and tell me. Others were less kind, and made fun of me. Both groups were correct. This should not have happened.&lt;/p&gt;
&lt;p&gt;I reported the incident to the Open-Meteo team, who were extremely understanding. They rotated the key immediately and gave me a fresh one. I removed the debug print, and then I did the slightly more useful thing, which was to add redaction at multiple layers - in the bot's output, in the daemon's logging, and in the debug helpers themselves. URL query parameters that look like API keys are masked. Environment variables and config keys named &lt;code&gt;apikey&lt;/code&gt; or &lt;code&gt;OPEN_METEO_APIKEY&lt;/code&gt; are redacted before any string reaches stdout or a log file. Even JSON-like fields that include &lt;code&gt;open_meteo_apikey&lt;/code&gt; are scrubbed if they ever appear in something the program prints.&lt;/p&gt;
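&lt;p&gt;One of those layers, masking secret-looking query parameters, might look roughly like the sketch below. It is an assumption-laden illustration, not the bot's real code: the function name &lt;code&gt;redact_url&lt;/code&gt;, the placeholder text, and the exact set of parameter names are all hypothetical.&lt;/p&gt;

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Parameter names treated as secrets; this list is an assumption.
SECRET_PARAMETERS = {"apikey", "api_key", "open_meteo_apikey"}


def redact_url(url: str) -> str:
    """Return the URL with the values of secret query parameters masked."""
    parts = urlsplit(url)
    query = [
        (name, "REDACTED" if name.lower() in SECRET_PARAMETERS else value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    # urlencode rebuilds the query string, so other values may come back
    # percent-encoded differently from the original.
    return urlunsplit(parts._replace(query=urlencode(query)))


print(redact_url("https://example.invalid/v1/forecast?apikey=secret"))
```

&lt;p&gt;Parsing the URL instead of pattern-matching the raw string keeps the masking tied to parameter names, which is easier to reason about than guessing what a key looks like.&lt;/p&gt;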
&lt;p&gt;The lesson is not "be more careful." The lesson is that debug paths leak, sooner or later, so the secrets have to be unreachable from the debug paths in the first place. Now they are.&lt;/p&gt;
&lt;p&gt;That afternoon, when I realized what was happening, I closed everything for a minute and looked out of the window. Then I started fixing things.&lt;/p&gt;
&lt;p&gt;Nominatim is a public service, and it is generous, but it is not infinite. Every city in the project needs coordinates, and at the start of the project every cycle would re-ask Nominatim for every city. Most of the time this worked. Sometimes it did not.&lt;/p&gt;
&lt;p&gt;There was one cycle, before I added caching, when Nominatim simply did not respond to one of my queries. The geopy call timed out. The bot raised an exception. The wrapper script gave up on that city and moved on to the next one. A few users noticed that a particular city had not received its forecast that day, and asked what had happened.&lt;/p&gt;
&lt;p&gt;I added a coordinate cache, and I am still grateful that I did.&lt;/p&gt;
&lt;p&gt;The cache is intentionally boring. The first time the bot resolves a city, it writes the latitude and longitude into a small file under &lt;code&gt;/tmp&lt;/code&gt;, named after the city, and the state when present. Every subsequent run reads the file. If the file exists, no Nominatim call is made. If the file is missing, the bot calls Nominatim and writes the file. After the first successful lookup, the cache becomes the source of truth for the coordinates of that city.&lt;/p&gt;
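&lt;p&gt;A file-per-city cache of this kind can be sketched in a few lines. The details below are assumptions for illustration only: the JSON layout, the &lt;code&gt;.coords.json&lt;/code&gt; suffix, and the names &lt;code&gt;cached_coordinates&lt;/code&gt; and &lt;code&gt;lookup&lt;/code&gt; are hypothetical, and &lt;code&gt;lookup&lt;/code&gt; stands in for whatever performs the Nominatim request.&lt;/p&gt;

```python
import json
from pathlib import Path
from typing import Callable, Optional, Tuple

Coordinates = Tuple[float, float]


def cached_coordinates(
    city: str,
    state: Optional[str],
    lookup: Callable[[str, Optional[str]], Coordinates],
    cache_dir: Path,
) -> Coordinates:
    """Return coordinates from the per-city file, geocoding only on a miss."""
    name = city if state is None else f"{city}__{state}"
    cache_file = cache_dir / f"{name}.coords.json"
    if cache_file.exists():
        # Cache hit: the file is the source of truth, no geocoder call.
        data = json.loads(cache_file.read_text())
        return data["latitude"], data["longitude"]
    latitude, longitude = lookup(city, state)  # e.g. a Nominatim request
    cache_file.write_text(
        json.dumps({"latitude": latitude, "longitude": longitude})
    )
    return latitude, longitude
```

&lt;p&gt;If the lookup raises, nothing is written, so a transient geocoder failure does not poison the cache; the next run simply tries again.&lt;/p&gt;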
&lt;p&gt;This is lighter on Nominatim, faster for every cycle, and much more resilient against transient failures. It is also nice for a reason I did not anticipate.&lt;/p&gt;
&lt;p&gt;Nominatim is a geocoder, and like every geocoder it has opinions.&lt;/p&gt;
&lt;p&gt;I live in Ferrara, so when I added Italy I made sure Ferrara was in the list, and I checked the first cycle to make sure everything looked right. The forecast came out fine. The temperature was reasonable. The icon matched the sky outside my window. I closed the laptop and forgot about it.&lt;/p&gt;
&lt;p&gt;Then, one evening months later, I looked more carefully at the coordinates Nominatim had returned for "Ferrara, Italy", and I realized they did not point to the city. They pointed to a location closer to the centroid of the &lt;em&gt;province&lt;/em&gt;, which is a much larger area and mostly countryside. The forecast had been, in effect, for a field somewhere outside town, not for the city center.&lt;/p&gt;
&lt;p&gt;I am not entirely sure why I had not noticed earlier. Probably because the weather in Ferrara and the weather in the fields outside Ferrara are, on most days, indistinguishable to anyone who is not paying attention. But this is the kind of detail I do not want to leave wrong, especially for my own city.&lt;/p&gt;
&lt;p&gt;There are other places where geocoding lands slightly off. Sometimes it is a few kilometers, sometimes a different neighborhood, sometimes genuinely the wrong place.&lt;/p&gt;
&lt;p&gt;Because the cache is just a file per city, the fix is also just a file per city. I open the cache file, replace the latitude and longitude with the correct values, save. The next cycle uses the corrected coordinates. No code change, no redeploy, no special tooling. I keep a small list of patched cities in a separate text file, so that if I ever rebuild the cache, I do not lose the manual corrections.&lt;/p&gt;
&lt;p&gt;This is the kind of operational simplicity I like. A cache made of plain files costs almost nothing and quietly pays back every time a small problem appears.&lt;/p&gt;
&lt;p&gt;For every report it generates, the bot also writes a simplified English text snapshot to &lt;code&gt;/tmp/&amp;lt;city&amp;gt;.txt&lt;/code&gt;, or &lt;code&gt;/tmp/&amp;lt;city&amp;gt;__&amp;lt;state&amp;gt;.txt&lt;/code&gt; when there is a state.&lt;/p&gt;
&lt;p&gt;This is intentional, and it is not a debug artifact. I am not ready to say what I am doing with it yet, but it is part of a future direction for the project. Text is a useful intermediate format, and having a clean, language-neutral representation of every forecast sitting on disk costs almost nothing and might be worth a great deal later.&lt;/p&gt;
&lt;p&gt;I prefer to let ideas mature in private before I commit to them in public. So I will leave it at this for the moment.&lt;/p&gt;
&lt;p&gt;A full cycle for the United States takes hours.&lt;/p&gt;
&lt;p&gt;It is not because the work is heavy. It is because I deliberately inserted a small &lt;code&gt;sleep&lt;/code&gt; between cities, to give snac time to dispatch the previous post before the next one is generated. With more than 1200 cities in series, even a short pause adds up. I am not in a hurry. Forecasts that arrive a few minutes apart from each other are not a problem, and the bot was already a polite citizen elsewhere. A polite cycle is fine.&lt;/p&gt;
&lt;p&gt;The problem with a slow cycle is not the duration. The problem is what happens to it.&lt;/p&gt;
&lt;p&gt;In the original design, the cycle was launched by cron. Every six hours, cron called the wrapper script, the wrapper iterated through the cities file, and for each city it ran the bot and piped the output into snac. There was no scheduler in the project at all. Cron was the scheduler. The wrapper was just a loop.&lt;/p&gt;
&lt;p&gt;Restarting snac was harmless. The wrapper would call &lt;code&gt;snac note_unlisted&lt;/code&gt; per city, and if snac happened to be unavailable for a moment, that single call might fail, but the loop kept moving and snac was usually back within seconds. Snac itself was not what held the cycle together.&lt;/p&gt;
&lt;p&gt;What held the cycle together was the wrapper process. And the wrapper process lived inside the jail.&lt;/p&gt;
&lt;p&gt;If the FreeBSD jail was restarted while the wrapper was running, the loop stopped wherever it happened to be. The cron schedule did not care. Six hours later, the next cron tick started a new cycle from the first city, and the cities that had been about to be processed at the moment of the restart were simply skipped for that window. For the United States, this could mean several hundred cities going without an update.&lt;/p&gt;
&lt;p&gt;There was a worse case, and it took me longer than it should have to recognize it. If the host happened to be rebooting in &lt;em&gt;exactly&lt;/em&gt; the minute when cron should have fired, cron simply did not fire. There was no daemon waiting to pick up the missed tick. The cycle never even started. Six hours of forecasts would be lost, in silence, with nothing in any log to suggest anything had gone wrong.&lt;/p&gt;
&lt;p&gt;I lived with this for a long time. Reboots were rare, the impact was limited, and adding state was the kind of thing I always meant to do "next week."&lt;/p&gt;
&lt;p&gt;What finally changed it was not a dramatic incident. It was the slow accumulation of small ones. A scheduled VPS reboot. A jail restart after an upgrade. Each one on its own was nothing. Together, they were a steady drip of missed cycles.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;So I wrote a daemon&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The crontab entries for the bot went away. There is now a long-running process inside the jail, started at boot, and it does the scheduling itself. The schedule is a list of hours and a minute, read from a JSON config. The daemon wakes up once a minute, checks whether it is time to start a cycle, and either starts one or waits.&lt;/p&gt;
&lt;p&gt;The interesting part is the state file.&lt;/p&gt;
&lt;p&gt;As the daemon walks through the cities file, it writes its position to a small JSON file: which cities file it is processing, and the index of the next city to handle. The write happens at the boundary between one city and the next, because that is the only place where resuming makes sense. If the daemon is interrupted mid-city, that city is retried on resume; no half-finished post escapes.&lt;/p&gt;
&lt;p&gt;When the daemon starts, it reads the state file. If it finds one matching the current cities file, it resumes from the saved index. If the cities file has changed since the state was written, the daemon starts fresh. The check is deliberately conservative: a renamed or modified cities file is treated as a different cycle, because the indices would otherwise be meaningless.&lt;/p&gt;
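&lt;p&gt;The save-and-resume logic described above can be sketched as follows. This is a hedged illustration, not the daemon's real code: the function names, the JSON keys, the use of a SHA-256 digest to detect a changed cities file, and the write-then-rename step are all assumptions.&lt;/p&gt;

```python
import hashlib
import json
from pathlib import Path


def cities_fingerprint(cities_file: Path) -> str:
    """Identify a cities file by its path and content (assumed scheme)."""
    digest = hashlib.sha256(cities_file.read_bytes()).hexdigest()
    return f"{cities_file.resolve()}:{digest}"


def save_position(state_file: Path, cities_file: Path, next_index: int) -> None:
    """Record the next city to process, called between two cities."""
    state = {"cities": cities_fingerprint(cities_file), "next_index": next_index}
    # Write to a temporary file and rename it, so an interruption never
    # leaves a half-written state file behind.
    temporary = state_file.with_name(state_file.name + ".tmp")
    temporary.write_text(json.dumps(state))
    temporary.replace(state_file)


def resume_index(state_file: Path, cities_file: Path) -> int:
    """Return the saved index, or 0 when the state is absent or stale."""
    try:
        state = json.loads(state_file.read_text())
    except (OSError, ValueError):
        return 0
    if not isinstance(state, dict):
        return 0
    # A renamed or modified cities file counts as a different cycle.
    if state.get("cities") != cities_fingerprint(cities_file):
        return 0
    return int(state.get("next_index", 0))
```

&lt;p&gt;Because the index is saved only after a city completes, an interruption in the middle of a city means that city is simply processed again on resume.&lt;/p&gt;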
&lt;p&gt;The result is the behavior I should have had from the start. If the host reboots while the United States cycle is running, the daemon comes back up with the jail, reads the state, and continues from where it left off. Every city still gets its update, just with a small gap corresponding to the reboot itself. The cycle finishes. The state file is reset. Life goes on.&lt;/p&gt;
&lt;p&gt;And the worst case from the cron days is gone. The daemon does not need anyone to fire it. As long as the jail is running, the daemon is running, and the next scheduled cycle will happen when its hour comes, regardless of what was happening at any specific minute.&lt;/p&gt;
&lt;p&gt;Of all the changes I have made to the project, this is the one I like most. It is not exciting work. It is the kind of thing that earns no applause because, when it works, it produces no visible event. But it removes a whole class of small daily annoyances, and it makes a slow process robust against the boring kind of failure: the kind nobody plans for, but that always eventually happens.&lt;/p&gt;
&lt;p&gt;The current bot does considerably more than the original Italian script. It handles per-city timezones, three weather providers with automatic fallback, unit conversion for temperature, wind, and precipitation, optional air quality, pressure trend indicators when the provider supplies pressure data, a simplified English text snapshot for future use, a coordinate cache that can be patched by hand, secret redaction at multiple layers, a heartbeat that adapts to whichever HTTP client is installed on the host, and a scheduler-and-resume daemon that survives reboots.&lt;/p&gt;
&lt;p&gt;But from the outside, almost nothing has changed.&lt;/p&gt;
&lt;p&gt;The European country configs work the same way they always did. The wrapper scripts are unchanged. The snac integration is the same one-line pipe. The HAProxy layer in front does not know or care that the bot was rewritten. The homepage cron that counts followers and regenerates the static page works exactly as before.&lt;/p&gt;
&lt;p&gt;The original Italian script does not exist as a file anymore, but it survives as a default. A country config with &lt;code&gt;timezone&lt;/code&gt; set to &lt;code&gt;Europe/Rome&lt;/code&gt; and no special options behaves, today, exactly as the first version of the bot would have. Everything else is opt-in.&lt;/p&gt;
&lt;p&gt;I like this kind of work.&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Stefano Marinelli</dc:creator><pubDate>Mon, 25 May 2026 09:14:00 +0000</pubDate><guid isPermaLink="false">https://it-notes.dragas.net/2026/05/25/fedimeteo-timezones-and-the-art-of-not-breaking-what-already-works/</guid><category>server</category><category>networking</category><category>fediverse</category><category>snac</category><category>jail</category><category>ownyourdata</category><category>snac2</category><category>web</category><category>social</category><category>fedimeteo</category></item><item><title>FediMeteo, HAProxy, and the art of not wasting snac threads</title><link>https://it-notes.dragas.net/2026/05/18/fedimeteo-haproxy-and-the-art-of-not-wasting-snac-threads/</link><description>&lt;p&gt;&lt;img src="https://unsplash.com/photos/ZVhm6rEKEX8/download?ixid=M3wxMjA3fDB8MXxhbGx8fHx8fHx8fHwxNzQwNTEzNjE5fA&amp;force=true&amp;w=640" alt="FediMeteo, HAProxy, and the art of not wasting snac threads"&gt;&lt;/p&gt;&lt;p&gt;When I wrote about &lt;a href="https://it-notes.dragas.net/2025/02/26/fedimeteo-how-a-tiny-freebsd-vps-became-a-global-weather-service-for-thousands/"&gt;FediMeteo&lt;/a&gt; for the first time, I told the story from the beginning: the idea born almost by chance while checking the weather for a holiday, the memory of my grandfather, who for years had been my personal meteorologist, the decision to build something small and useful, and then the surprise of seeing people actually use it. What began as a personal experiment quickly became a small global service, still running with the same philosophy: FreeBSD, jails, simple scripts, snac, text, emoji, and a lot of small pieces doing their work quietly.&lt;/p&gt;
&lt;p&gt;That article was mostly about the birth and growth of the project. This one is about one of the less romantic parts of the same story, although I have to admit that I find a certain beauty in it too: keeping the service light as it grows.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://fedimeteo.com"&gt;FediMeteo&lt;/a&gt; is still intentionally simple from the outside. A homepage, some numbers, a list of countries, and many ActivityPub accounts publishing weather forecasts. The posts are text and emoji. There is no JavaScript requirement to read the pages, no heavy frontend, no unnecessary media attached to every forecast, and no dynamic homepage recalculated at every visit just to show the same numbers. This is not accidental. It is the way I wanted the service to behave from the beginning.&lt;/p&gt;
&lt;p&gt;But the more the service is used, the more the small details matter. A request that looks harmless when there are ten followers may become a repeated request when there are thousands of followers, remote instances, crawlers, previews, and other servers fetching the same public objects. In the Fediverse, the same small thing can be asked many times by many different places, each one with a perfectly legitimate reason. The backend doesn't care: it just needs to deal with the requests.&lt;/p&gt;
&lt;p&gt;And in FediMeteo, the backend is &lt;a href="https://codeberg.org/grunfink/snac2"&gt;snac&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I like snac very much precisely because it is small, clear, and efficient. It is not a giant application that tries to be everything. It does a focused job and does it well. But this also means that I want to respect its shape. I do not want to waste its threads on work that the reverse proxy can safely do. A snac thread serving the same public avatar again and again is not a tragedy, but it is still a waste. A snac thread answering the same public ActivityPub object several times in the same minute is doing real work, but often not necessary work.&lt;/p&gt;
&lt;p&gt;This is the reason behind the &lt;a href="https://www.haproxy.org"&gt;HAProxy&lt;/a&gt; tuning I am currently using in front of FediMeteo.&lt;/p&gt;
&lt;p&gt;It is not about making the configuration look clever. It is about keeping snac quiet.&lt;/p&gt;
&lt;h2&gt;A continuation of the same idea&lt;/h2&gt;
&lt;p&gt;I had already explored the same problem with snac and nginx in two previous posts: &lt;a href="https://it-notes.dragas.net/2025/01/29/improving-snac-performance-with-nginx-proxy-cache/"&gt;Improving snac Performance with Nginx Proxy Cache&lt;/a&gt; and &lt;a href="https://it-notes.dragas.net/2025/02/08/caching-snac-proxied-media-with-nginx/"&gt;Caching snac Proxied Media with Nginx&lt;/a&gt;. In both cases, the idea was that the reverse proxy should absorb repeated public requests instead of letting them consume snac resources.&lt;/p&gt;
&lt;p&gt;This is especially important because snac uses a limited number of threads. I like that. Limits are healthy. They force us to understand what the service is doing, and they prevent a small program from pretending to be an infinite resource. But limits also make waste visible. If a few threads are busy serving files that could have been served from cache, those threads are not available for something more useful.&lt;/p&gt;
&lt;p&gt;With FediMeteo the implementation is different because the reverse proxy is HAProxy, but the reasoning is the same. I have many small snac instances, each one in its own FreeBSD (&lt;a href="https://github.com/BastilleBSD/bastille"&gt;Bastille&lt;/a&gt;) jail, and one public entry point that has to route, terminate TLS, compress, cache, and generally remove as much repetitive work as possible from the backends.&lt;/p&gt;
&lt;p&gt;This is, in a way, the natural continuation of the original FediMeteo design. In the first article I wrote that I wanted to manage everything according to the Unix philosophy: small pieces working together. This is another piece of that same puzzle. HAProxy does the edge work. snac does the ActivityPub work. Scripts generate forecasts. cron launches updates. ZFS gives me snapshots. FreeBSD jails keep countries separated. Nothing is particularly heroic by itself, but the whole system becomes pleasant because each part has a clear responsibility.&lt;/p&gt;
&lt;h2&gt;Why there is almost no media&lt;/h2&gt;
&lt;p&gt;Before talking about HAProxy, it is worth mentioning one of the most important optimizations, which is not in the proxy configuration at all.&lt;/p&gt;
&lt;p&gt;FediMeteo does not use media in its forecasts.&lt;/p&gt;
&lt;p&gt;No images attached to the posts, no generated weather cards, no maps for each city, no decorative banners. The forecasts are text and emoji. This was a deliberate decision. Weather information does not become more useful just because it is put inside an image, and every media file used by the service would become something to store, serve, cache, federate, expire, back up, and occasionally debug.&lt;/p&gt;
&lt;p&gt;Text and emoji are enough. They are accessible, light, readable in text browsers, friendly to timelines, and understandable even when someone does not know the local language perfectly. This was one of the original design principles of FediMeteo, and it also helps the infrastructure. Less media means less work, fewer cache entries, fewer repeated fetches, fewer surprises.&lt;/p&gt;
&lt;p&gt;There is one exception: the avatar.&lt;/p&gt;
&lt;p&gt;All FediMeteo accounts use the same avatar, and this is also intentional. I could have used a different avatar for each country, or for each city, or created something visually richer. It would have been nicer in some screenshots, perhaps. It would also have been operationally worse.&lt;/p&gt;
&lt;p&gt;With one shared avatar, the reverse proxy has one very useful object to cache. It is public, identical for everyone, small, requested often, and therefore almost always hot in cache. HAProxy can serve it directly instead of asking each snac instance to return the same file. Since avatars are requested by remote instances, browsers, profile previews, and all sorts of federation-related fetches, this single decision removes a surprising amount of pointless backend traffic.&lt;/p&gt;
&lt;p&gt;So the avatar is not only a visual identity. It is part of the architecture.&lt;/p&gt;
&lt;p&gt;This is the kind of optimization I like most, because it starts before the software. It starts with deciding not to create a problem.&lt;/p&gt;
&lt;h2&gt;The homepage is static because it can be static&lt;/h2&gt;
&lt;p&gt;The main homepage follows the same logic.&lt;/p&gt;
&lt;p&gt;It is a static HTML page generated from a template. Once per hour, a cron script updates the numbers and statistics. It counts the data I want to show, regenerates the page, and then the page remains static until the next run.&lt;/p&gt;
&lt;p&gt;This is not because I cannot make a dynamic page. It is because I do not need one. Boring is good.&lt;/p&gt;
&lt;p&gt;The homepage does not need to query all the country instances on every visit. It does not need a database request for each user who opens it. It does not need to ask snac anything in real time. The numbers are useful, but they do not need to be updated every second. Once per hour is enough, and it also fits the spirit of the whole project: do the work when it is needed, then serve the result cheaply.&lt;/p&gt;
&lt;p&gt;I have seen too many small services become heavy because the first implementation was convenient rather than appropriate. A cron job and a template are not fashionable, but they are often exactly what a page like this needs.&lt;/p&gt;
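&lt;p&gt;As a rough illustration of the cron-plus-template idea, here is a minimal Python sketch. It is not the script FediMeteo actually runs: the template text, the field names, and the write-then-rename step are all assumptions made for the example.&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-python"&gt;import os
import tempfile
from string import Template

# Hypothetical template: the real page is HTML, plain text keeps the sketch short.
PAGE_TEMPLATE = Template('Countries: $countries - Cities: $cities - Updated: $updated')


def render_homepage(stats):
    # substitute() raises KeyError when a placeholder has no value,
    # so a half-filled page is never produced.
    return PAGE_TEMPLATE.substitute(stats)


def publish_atomically(content, target_path):
    # Write to a temporary file in the same directory, then rename it over
    # the old page: visitors get either the old page or the new one.
    directory = os.path.dirname(os.path.abspath(target_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, 'w', encoding='utf-8') as handle:
        handle.write(content)
    # mkstemp creates private files; make the page readable by the web server.
    os.chmod(tmp_path, 0o644)
    os.replace(tmp_path, target_path)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;An hourly cron entry would call &lt;code&gt;render_homepage&lt;/code&gt; with freshly counted numbers and pass the result to &lt;code&gt;publish_atomically&lt;/code&gt;. Between runs, nothing is computed at all.&lt;/p&gt;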
&lt;h2&gt;Many countries, one entry point&lt;/h2&gt;
&lt;p&gt;FediMeteo is made of many country instances. Each one runs in its own jail and listens on its own internal address and port. From the outside, however, they all live under the same domain structure:&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-text"&gt;fedimeteo.com
www.fedimeteo.com
it.fedimeteo.com
uk.fedimeteo.com
jp.fedimeteo.com
us.fedimeteo.com
usa.fedimeteo.com
can.fedimeteo.com
canada.fedimeteo.com
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;And many more.&lt;/p&gt;
&lt;p&gt;At the beginning, it is always tempting to write one ACL after another in the HAProxy frontend. It is quick, it is explicit, and for five hostnames it is perfectly fine. But FediMeteo did not remain at five hostnames. As countries and aliases grew, a long chain of ACLs would have turned the frontend into a list of names instead of a description of how the proxy behaves.&lt;/p&gt;
&lt;p&gt;So I moved the hostname-to-backend mapping into a map file:&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-text"&gt;fedimeteo.com        backend_fedimeteo
www.fedimeteo.com    backend_fedimeteo
it.fedimeteo.com     backend_it
uk.fedimeteo.com     backend_uk
jp.fedimeteo.com     backend_jp
us.fedimeteo.com     backend_us
usa.fedimeteo.com    backend_us
can.fedimeteo.com    backend_ca
canada.fedimeteo.com backend_ca
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The frontend then needs only one rule:&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-haproxy"&gt;use_backend %[req.hdr(host),field(1,:),lower,map(/usr/local/etc/fedimeteo.map,backend_fedimeteo)]
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This reads the  &lt;code&gt;Host&lt;/code&gt;  header, removes the port if present, lowercases the result, and looks it up in  &lt;code&gt;/usr/local/etc/fedimeteo.map&lt;/code&gt;. If nothing matches, it falls back to the main FediMeteo backend.&lt;/p&gt;
&lt;p&gt;I like this because it keeps the configuration honest. The frontend contains the policy. The map contains the data. Adding a country means adding an entry to the map and defining a backend. I do not need to make the frontend more complicated every time the service grows.&lt;/p&gt;
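&lt;p&gt;To make the lookup order concrete, the following Python sketch mimics the same chain outside HAProxy: take the first field before the colon, lowercase it, look it up, and fall back to a default. It is only a model for reasoning about map entries; the function names are invented for this example.&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-python"&gt;DEFAULT_BACKEND = 'backend_fedimeteo'


def parse_map(text):
    # One 'hostname backend' pair per line; blank lines and
    # lines starting with '#' are skipped.
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        parts = line.split(None, 1)
        if len(parts) == 2:
            table[parts[0]] = parts[1].strip()
    return table


def pick_backend(host_header, table, default=DEFAULT_BACKEND):
    # Same order as the rule: strip the port, lowercase, look up, fall back.
    host = host_header.split(':', 1)[0].lower()
    return table.get(host, default)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;With the map shown earlier, &lt;code&gt;pick_backend('USA.fedimeteo.com:443', table)&lt;/code&gt; resolves to &lt;code&gt;backend_us&lt;/code&gt;, and an unknown hostname resolves to &lt;code&gt;backend_fedimeteo&lt;/code&gt;.&lt;/p&gt;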
&lt;h2&gt;Backends as small compartments&lt;/h2&gt;
&lt;p&gt;The country backends are deliberately plain:&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-haproxy"&gt;backend backend_it
    mode http
    http-reuse safe
    server srv1 10.0.0.2:8001 maxconn 30

backend backend_uk
    mode http
    http-reuse safe
    server srv1 10.0.0.7:8001 maxconn 30

backend backend_jp
    mode http
    http-reuse safe
    server srv1 10.0.0.32:8001 maxconn 30
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;One backend, one jail, one snac instance. This is exactly the same organizational principle as the rest of the project. If I need to reason about Italy, I look at the Italian jail. If I need to reason about the United Kingdom, I look at the UK jail. If one day I need to move a country elsewhere, the separation is already there.&lt;/p&gt;
&lt;p&gt;The  &lt;code&gt;maxconn 30&lt;/code&gt;  value is not a magic number. It is a ceiling. I want each small backend to have a visible limit in front of it. If something starts hammering a country instance, I prefer the pressure to appear at the HAProxy layer instead of becoming unlimited concurrent work inside snac.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;http-reuse safe&lt;/code&gt;  lets HAProxy reuse backend connections where appropriate. This is another small reduction in unnecessary work. Opening connections repeatedly is not the biggest problem in the world, but avoiding it is still better, especially when many small services sit behind the same proxy.&lt;/p&gt;
&lt;h2&gt;The front door&lt;/h2&gt;
&lt;p&gt;The HTTPS frontend listens on IPv4 and IPv6 and offers both HTTP/2 and HTTP/1.1:&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-haproxy"&gt;frontend https_in
    bind :::443 v4v6 ssl crt /usr/local/etc/certs/ alpn h2,http/1.1
    mode http
    option http-keep-alive
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;TLS defaults are set globally:&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-haproxy"&gt;ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
ssl-default-bind-options no-sslv3 no-tlsv10 no-tlsv11 no-tls-tickets
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Port 80 only redirects to HTTPS, except for Let's Encrypt challenges:&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-haproxy"&gt;acl letsencrypt-acl path_beg /.well-known/acme-challenge/
http-request redirect scheme https code 301 unless letsencrypt-acl
use_backend letsencrypt-backend if letsencrypt-acl
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In the HTTPS frontend I also set the usual forwarding headers:&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-haproxy"&gt;http-request set-header X-Real-IP %[src]
http-request set-header X-Forwarded-Proto https
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;And I add HSTS:&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-haproxy"&gt;http-response set-header Strict-Transport-Security &amp;quot;max-age=31536000; includeSubDomains; preload&amp;quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;None of this is unusual, and that is fine. The interesting parts of an infrastructure are not always the parts that should be unusual.&lt;/p&gt;
&lt;h2&gt;Two caches, because the requests are different&lt;/h2&gt;
&lt;p&gt;The HAProxy configuration defines two caches:&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-haproxy"&gt;cache mediacache
  total-max-size 128
  max-object-size 10000000
  max-age 3600
  process-vary on
  max-secondary-entries 12

cache jsoncache
  total-max-size 16
  max-object-size 1000000
  max-age 60
  process-vary on
  max-secondary-entries 12
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;I keep media and ActivityPub JSON separate because they are not the same kind of traffic.&lt;/p&gt;
&lt;p&gt;The media cache is larger and has a longer maximum age. In FediMeteo, this mostly means the shared avatar and a few static-looking objects. Since there is intentionally almost no media, the important cached object is requested very often and remains warm.&lt;/p&gt;
&lt;p&gt;The JSON cache is smaller and short-lived. It is there for public ActivityPub GET requests, not to store federation state forever. A 60 second cache is enough to collapse many repeated requests that arrive close together in time, without pretending that ActivityPub responses should be treated like immutable files.&lt;/p&gt;
&lt;p&gt;This distinction is important. Caching is not one decision. It is a set of small decisions about what a response means, who can see it, how often it changes, and what happens if it is served again.&lt;/p&gt;
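&lt;p&gt;The effect of the two maximum ages can be modelled in a few lines of Python. This is a toy model, not how HAProxy stores objects: the class name, the injectable clock, and the HIT/MISS strings are assumptions of the sketch.&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-python"&gt;import time


class MicroCache:
    def __init__(self, max_age, clock=time.monotonic):
        self.max_age = max_age  # whole seconds, e.g. 60 or 3600
        self.clock = clock      # injectable, so tests do not need to sleep
        self.entries = {}
        self.backend_calls = 0

    def fetch(self, key, backend):
        now = self.clock()
        entry = self.entries.get(key)
        if entry is not None:
            stored_at, body = entry
            # Fresh while the age in whole seconds falls in [0, max_age).
            if int(now - stored_at) in range(self.max_age):
                return body, 'HIT'
        # Missing or expired: ask the backend once and remember the answer.
        body = backend(key)
        self.backend_calls += 1
        self.entries[key] = (now, body)
        return body, 'MISS'
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;With &lt;code&gt;MicroCache(60)&lt;/code&gt;, a burst of identical requests inside the same minute costs one backend call. With &lt;code&gt;MicroCache(3600)&lt;/code&gt;, the same object stays warm for an hour. The two policies differ only in a number, but that number encodes what the response means.&lt;/p&gt;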
&lt;h2&gt;Recognizing media&lt;/h2&gt;
&lt;p&gt;For media, the ACL is based on file extensions:&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-haproxy"&gt;acl is_media path_end -i .jpg .jpeg .png .gif .webp .svg .ico .mp4 .webm .mp3 .ogg .wav .flac .mov .avi .mkv .m4v
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Then I store the result in a transaction variable:&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-haproxy"&gt;http-request set-var(txn.is_media) bool(true) if is_media
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The cache lookup is straightforward:&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-haproxy"&gt;http-request cache-use mediacache if { var(txn.is_media) -m bool true }
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;And on the response side:&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-haproxy"&gt;http-response set-header Cache-Control &amp;quot;max-age=3600, public&amp;quot; if { var(txn.is_media) -m bool true }
http-response del-header Set-Cookie if { var(txn.is_media) -m bool true }
http-response del-header Vary if { var(txn.is_media) -m bool true }
http-response cache-store mediacache if { var(txn.is_media) -m bool true }
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The &lt;code&gt;Cache-Control&lt;/code&gt; header makes the intent explicit. &lt;code&gt;Set-Cookie&lt;/code&gt; is removed because a public media object should not carry session information. &lt;code&gt;Vary&lt;/code&gt; is removed so that the same avatar does not fragment into many cache entries over harmless header differences.&lt;/p&gt;
&lt;p&gt;This is aggressive only if removed from its context. In this service, with this media policy, it is a reasonable choice. FediMeteo is not serving private media under these paths. It is mostly serving the same public avatar over and over.&lt;/p&gt;
&lt;p&gt;For the same reason, I clean the request before it reaches the backend:&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-haproxy"&gt;http-request del-header Authorization if { var(txn.is_media) -m bool true }
http-request del-header Cookie        if { var(txn.is_media) -m bool true }
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;I would not do this globally. I do it after deciding that the request is media. Scope is what makes these rules safe.&lt;/p&gt;
&lt;p&gt;The result is exactly what I want: the shared avatar becomes an almost perfect cache object. Small, public, repeatedly requested, and served by HAProxy instead of snac.&lt;/p&gt;
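&lt;p&gt;The media rules above can be restated as a small Python model, which is handy for checking which paths and headers they would touch. It mirrors the extension list and the header names from the configuration; the function names are hypothetical and the code is not part of the proxy.&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-python"&gt;MEDIA_EXTENSIONS = (
    '.jpg', '.jpeg', '.png', '.gif', '.webp', '.svg', '.ico', '.mp4', '.webm',
    '.mp3', '.ogg', '.wav', '.flac', '.mov', '.avi', '.mkv', '.m4v',
)


def is_media(path):
    # Case-insensitive suffix match on the path, like path_end -i.
    return path.lower().endswith(MEDIA_EXTENSIONS)


def scrub_request_headers(path, headers):
    # Only media requests lose their credentials; everything else is untouched.
    if not is_media(path):
        return dict(headers)
    dropped = ('authorization', 'cookie')
    return {n: v for n, v in headers.items() if n.lower() not in dropped}


def scrub_response_headers(path, headers):
    if not is_media(path):
        return dict(headers)
    # Drop per-user and variation headers, then state the caching intent.
    dropped = ('set-cookie', 'vary', 'cache-control')
    cleaned = {n: v for n, v in headers.items() if n.lower() not in dropped}
    cleaned['Cache-Control'] = 'max-age=3600, public'
    return cleaned
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The early return for non-media paths is the point of the sketch: the cleanup is scoped, never global.&lt;/p&gt;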
&lt;h2&gt;ActivityPub JSON microcaching&lt;/h2&gt;
&lt;p&gt;The ActivityPub side starts from the  &lt;code&gt;Accept&lt;/code&gt;  header:&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-haproxy"&gt;acl is_ap_json   req.hdr(Accept),lower -m sub application/activity+json
acl is_ap_ldjson req.hdr(Accept),lower -m sub application/ld+json
acl is_outbox    path_end /outbox
acl is_get       method GET
acl has_auth     req.hdr(Authorization) -m found
acl has_cookie   req.hdr(Cookie) -m found
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This part matters because ActivityPub uses content negotiation. The same path may return HTML to a browser and JSON to a remote instance. If the proxy pretends that a URL is always one thing, it will eventually cache the wrong representation.&lt;/p&gt;
&lt;p&gt;So I only mark public ActivityPub GET requests as cacheable:&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-haproxy"&gt;http-request set-var(txn.is_activitypub) bool(true) if is_get !is_outbox is_ap_json !has_auth !has_cookie
http-request set-var(txn.is_activitypub) bool(true) if is_get !is_outbox is_ap_ldjson !has_auth !has_cookie
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;There are several decisions here, all important.&lt;/p&gt;
&lt;p&gt;It must be a  &lt;code&gt;GET&lt;/code&gt;, because I am not caching deliveries or anything that changes state. It must not be  &lt;code&gt;/outbox&lt;/code&gt;, because outbox collections are not the traffic I want to cache here. It must not have  &lt;code&gt;Authorization&lt;/code&gt;, and it must not have cookies, because authenticated or user-specific requests do not belong in a shared public cache.&lt;/p&gt;
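&lt;p&gt;Written as a single predicate, the decision looks like the Python sketch below. It is a model of the ACL combination for reasoning and testing, with a hypothetical function name, not something HAProxy runs.&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-python"&gt;def is_cacheable_activitypub(method, path, headers):
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    accept = lowered.get('accept', '').lower()
    wants_activitypub = (
        'application/activity+json' in accept
        or 'application/ld+json' in accept
    )
    return (
        method == 'GET'
        and not path.endswith('/outbox')
        and wants_activitypub
        and 'authorization' not in lowered
        and 'cookie' not in lowered
    )
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Every condition is a way to say no. A request is cacheable only when none of them objects.&lt;/p&gt;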
&lt;p&gt;Then the cache can be used and populated:&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-haproxy"&gt;http-request cache-use jsoncache if { var(txn.is_activitypub) -m bool true }

http-response set-header Cache-Control &amp;quot;max-age=60, public&amp;quot; if { var(txn.is_activitypub) -m bool true }
http-response cache-store jsoncache if { var(txn.is_activitypub) -m bool true }
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Sixty seconds is short, but useful. Federation often creates small clusters of identical requests. A remote server fetches an actor, another fetches the same actor, something asks for the same object, something retries. I do not need to cache these responses for hours. I only need HAProxy to answer the second and third identical request during the same small burst.&lt;/p&gt;
&lt;p&gt;This is microcaching in the most practical sense. It reduces repeated work without changing the nature of the service.&lt;/p&gt;
&lt;h2&gt;Static media paths&lt;/h2&gt;
&lt;p&gt;There is also a rule for static paths:&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-haproxy"&gt;acl is_short_path path_reg ^/[^/]+/s/
http-request cache-use mediacache if is_short_path
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This comes from the same observation that led me to cache snac media with nginx. snac uses static media paths, and those paths often represent the kind of public, repeatable traffic that should not consume backend threads if the proxy can serve it. I call them "short", not because they are, but because the first time I saw them, I thought the 's' stood for "short", not "static". The name just stuck.&lt;/p&gt;
&lt;p&gt;In FediMeteo this is less central than on a normal social instance, because I deliberately do not use media except for the avatar and basic static objects. Still, the rule fits the general policy: let HAProxy handle repeatable edge work, and let snac spend its threads where they are actually needed.&lt;/p&gt;
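&lt;p&gt;The regular expression is short enough to check by hand, but a quick Python test makes its boundaries explicit. The example paths are hypothetical; only the pattern comes from the configuration.&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-python"&gt;import re

# Same pattern as the ACL: one path segment, then the literal /s/ segment.
SHORT_PATH = re.compile(r'^/[^/]+/s/')


def is_short_path(path):
    return SHORT_PATH.search(path) is not None
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;A path such as &lt;code&gt;/roma/s/avatar.png&lt;/code&gt; matches, while &lt;code&gt;/s/avatar.png&lt;/code&gt; and &lt;code&gt;/roma/p/123&lt;/code&gt; do not, because the pattern requires exactly one segment before &lt;code&gt;/s/&lt;/code&gt;.&lt;/p&gt;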
&lt;h2&gt;&lt;code&gt;Vary&lt;/code&gt;, but not without limits&lt;/h2&gt;
&lt;p&gt;Both caches have:&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-haproxy"&gt;process-vary on
max-secondary-entries 12
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;I want HAProxy to process  &lt;code&gt;Vary&lt;/code&gt;, because content negotiation is real, especially when ActivityPub is involved. But I also want variation to be bounded. If every slightly different header creates another cache entry, the cache becomes a complicated way to miss.&lt;/p&gt;
&lt;p&gt;For media, I remove  &lt;code&gt;Vary&lt;/code&gt;  before storing the response. A shared avatar does not need to vary by  &lt;code&gt;Accept&lt;/code&gt;. For ActivityPub JSON, I am more careful because the representation matters.&lt;/p&gt;
&lt;p&gt;Again, the important thing is not the number itself. It is the decision to make variation explicit and limited.&lt;/p&gt;
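&lt;p&gt;The idea of bounded variation can be sketched in Python as a cache with a primary key (the URL) and a capped number of secondary entries per key. This is a generic model, not HAProxy's internals. In particular, refusing to store once the cap is reached is an assumption of the sketch, and the class and method names are invented.&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-python"&gt;class VaryBoundedCache:
    def __init__(self, max_secondary_entries=12):
        self.max_secondary_entries = max_secondary_entries
        # url mapped to the Vary header names and the stored variants
        self.primary = {}

    def _secondary_key(self, vary_names, request_headers):
        lowered = {n.lower(): v for n, v in request_headers.items()}
        return tuple(lowered.get(name, '') for name in vary_names)

    def lookup(self, url, request_headers):
        slot = self.primary.get(url)
        if slot is None:
            return None
        key = self._secondary_key(slot['vary'], request_headers)
        return slot['variants'].get(key)

    def store(self, url, request_headers, vary_header, body):
        names = (n.strip().lower() for n in vary_header.split(','))
        vary_names = tuple(sorted(n for n in names if n))
        slot = self.primary.setdefault(url, {'vary': vary_names, 'variants': {}})
        key = self._secondary_key(slot['vary'], request_headers)
        variants = slot['variants']
        if key not in variants and len(variants) == self.max_secondary_entries:
            return False  # cap reached: this variant is not stored
        variants[key] = body
        return True
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Without the cap, a client sending endlessly different header values could grow one URL into an unbounded number of entries. With it, the worst case is known in advance.&lt;/p&gt;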
&lt;h2&gt;Seeing whether it works&lt;/h2&gt;
&lt;p&gt;During rollout, I like to expose a very small diagnostic header:&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-haproxy"&gt;http-response set-header X-Cache-Status HIT if !{ srv_id -m found }
http-response set-header X-Cache-Status MISS if { srv_id -m found }
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This is intentionally simple. If HAProxy selected a backend server, I call it a miss. If no backend server was selected, the response came from cache, so I call it a hit. It is not a complete observability system, but it is enough to answer the first question I usually have after changing a cache rule.&lt;/p&gt;
&lt;p&gt;Did this request reach snac?&lt;/p&gt;
&lt;p&gt;A test can be as simple as:&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-sh"&gt;curl -I https://it.fedimeteo.com/path/to/avatar.png
curl -I https://it.fedimeteo.com/path/to/avatar.png
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The second request should be a hit.&lt;/p&gt;
&lt;p&gt;For ActivityPub JSON, the test must use the right  &lt;code&gt;Accept&lt;/code&gt;  header:&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-sh"&gt;curl -I \
  -H 'Accept: application/activity+json' \
  https://it.fedimeteo.com/some/activitypub/object
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;And I also want to verify that cookies and authorization prevent public caching:&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-sh"&gt;curl -I \
  -H 'Cookie: test=value' \
  -H 'Accept: application/activity+json' \
  https://it.fedimeteo.com/some/activitypub/object

curl -I \
  -H 'Authorization: Bearer fake' \
  -H 'Accept: application/activity+json' \
  https://it.fedimeteo.com/some/activitypub/object
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;A cache that works should be visible. A cache that is invisible can be correct, but it can also be silently wrong. I prefer to know.&lt;/p&gt;
&lt;h2&gt;Compression and operational paths&lt;/h2&gt;
&lt;p&gt;HAProxy also handles gzip compression:&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-haproxy"&gt;filter compression
compression algo gzip
compression type text/css text/html text/javascript application/javascript text/plain text/xml application/json application/activity+json
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This keeps another common responsibility at the edge. The country instances can stay focused on snac and the forecast data, while HAProxy deals with client-facing compression for HTML, JSON, and ActivityPub responses.&lt;/p&gt;
&lt;p&gt;There is also a local Prometheus exporter:&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-haproxy"&gt;frontend prometheus
  bind 127.0.0.1:8405
  mode http
  http-request use-service prometheus-exporter
  no log
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;I also handle internal operational paths, such as statistics and Grafana, before the hostname map. These are small details, but ordering matters. Special paths should be explicit and early. The hostname map is for FediMeteo routing, not for every internal tool I happen to expose behind the same proxy.&lt;/p&gt;
&lt;h2&gt;What this changes in practice&lt;/h2&gt;
&lt;p&gt;The nice thing about this configuration is that none of its parts is particularly surprising.&lt;/p&gt;
&lt;p&gt;The map keeps hostname routing manageable. The backend definitions keep each country isolated and limited. The static homepage avoids dynamic work for something that changes once per hour. The shared avatar gives HAProxy one very hot media object to serve directly. The media cache keeps public files away from snac. The JSON microcache absorbs short ActivityPub bursts. Header cleanup prevents useless variation. Connection reuse avoids unnecessary backend connection churn.&lt;/p&gt;
&lt;p&gt;But all of this is only a longer way of saying one thing:&lt;/p&gt;
&lt;p&gt;&lt;em&gt;fewer requests reach snac&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;That is the metric I care about here.&lt;/p&gt;
&lt;p&gt;Not because snac is slow. If anything, FediMeteo exists in its current form because snac is efficient enough to make this kind of project possible on a very small VPS. But precisely because the whole architecture is small and pleasant, I do not want to waste resources where there is no need.&lt;/p&gt;
&lt;p&gt;This is also consistent with the rest of the project. Forecasts are serialized by scripts. Updates happen every six hours. The homepage is regenerated hourly. Countries live in separate jails. Snapshots and backups are handled outside the application. No single component tries to be the entire system.&lt;/p&gt;
&lt;p&gt;HAProxy is just another small piece, but it sits in the right place to remove a lot of repeated work.&lt;/p&gt;
&lt;h2&gt;Caveats&lt;/h2&gt;
&lt;p&gt;This configuration is not a universal HAProxy recipe for ActivityPub services.&lt;/p&gt;
&lt;p&gt;It matches FediMeteo as it is now: almost no media, one shared avatar, static homepage, public forecasts, many small snac instances, and ActivityPub traffic that can benefit from a short public cache when there are no cookies or authorization headers.&lt;/p&gt;
&lt;p&gt;If I decide one day to use media in forecasts, the media cache rules will need to be reviewed. If I use different avatars for each city or country, the cache will still work, but I will lose the very nice property of one shared, always-hot avatar. If ActivityPub responses become actor-dependent, public JSON caching must be reconsidered. If one country grows a very different traffic pattern from the others, it may deserve a different limit or policy.&lt;/p&gt;
&lt;p&gt;This is why I do not like presenting configurations as magic. A good configuration is a written form of the assumptions behind a service. When the assumptions change, the configuration must change too.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;FediMeteo started as a small idea and became larger than I expected, but I still want it to feel small in the right ways. Small does not mean fragile. Small means understandable. It means that each part has a reason to exist, and that unnecessary work is removed before it becomes a problem.&lt;/p&gt;
&lt;p&gt;The HAProxy layer follows this idea. It terminates TLS, routes hostnames through a map, reuses backend connections, serves the shared avatar from cache, microcaches public ActivityPub JSON, avoids authenticated and cookie-based traffic, and gives me a small diagnostic header to see what is happening.&lt;/p&gt;
&lt;p&gt;There is no single brilliant directive here. There is only the usual work of matching infrastructure to reality.&lt;/p&gt;
&lt;p&gt;FediMeteo publishes weather forecasts as text and emoji. The homepage is static HTML updated every hour. The accounts share the same avatar because it is enough, and because it is better for the cache. Each country has its own snac instance in its own FreeBSD jail. HAProxy stands in front of them and tries, quietly, not to bother them unless it has to.&lt;/p&gt;
&lt;p&gt;I like this kind of infrastructure.&lt;/p&gt;
&lt;p&gt;Not because it is invisible, but because when it works well, it leaves very little to say.&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Stefano Marinelli</dc:creator><pubDate>Mon, 18 May 2026 09:44:00 +0000</pubDate><guid isPermaLink="false">https://it-notes.dragas.net/2026/05/18/fedimeteo-haproxy-and-the-art-of-not-wasting-snac-threads/</guid><category>freebsd</category><category>haproxy</category><category>server</category><category>networking</category><category>hosting</category><category>fediverse</category><category>snac</category><category>jail</category><category>ownyourdata</category><category>snac2</category><category>web</category><category>social</category><category>fedimeteo</category></item><item><title>FediMeteo: How a Tiny €4 FreeBSD VPS Became a Global Weather Service for Thousands</title><link>https://it-notes.dragas.net/2025/02/26/fedimeteo-how-a-tiny-freebsd-vps-became-a-global-weather-service-for-thousands/</link><description>&lt;p&gt;&lt;img src="https://unsplash.com/photos/ZVhm6rEKEX8/download?ixid=M3wxMjA3fDB8MXxhbGx8fHx8fHx8fHwxNzQwNTEzNjE5fA&amp;force=true&amp;w=640" alt="FediMeteo: How a Tiny €4 FreeBSD VPS Became a Global Weather Service for Thousands"&gt;&lt;/p&gt;&lt;h2&gt;Personal Introduction&lt;/h2&gt;
&lt;p&gt;Weather has always significantly influenced my life. When I was a young athlete, knowing the forecast in advance would have allowed me to better plan my training sessions. As I grew older, I could choose whether to go to school on my motorcycle or, for safety reasons, have my grandfather drive me. And it was him, my grandfather, who was my go-to meteorologist. He followed all weather patterns and forecasts, a remnant of his childhood in the countryside and his life on the move. It's to him that I dedicate &lt;a href="https://fedimeteo.com"&gt;FediMeteo&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The idea for &lt;a href="https://fedimeteo.com"&gt;FediMeteo&lt;/a&gt; started almost by chance while I was checking the holiday weather forecast to plan an outing. Suddenly, I thought how nice it would be to receive regular weather updates for my city directly in my timeline. After reflecting for a few minutes, I registered a domain and started planning.&lt;/p&gt;
&lt;h2&gt;Design Principles&lt;/h2&gt;
&lt;p&gt;The choice of operating system was almost automatic. The idea was to separate instances by country, and FreeBSD jails are one of the most useful tools for this purpose.&lt;/p&gt;
&lt;p&gt;I initially thought the project would generate little interest. I was wrong. After all, weather affects many of our lives, directly or indirectly. So I decided to structure everything in this way:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;I would use a test VPS to see how things would go. The VPS &lt;em&gt;was a small VM on a German provider with 4 shared cores, 4GB of RAM, 120GB of SSD disk space, and a 1Gbit/sec internet connection&lt;/em&gt;; it is now a 4 euro per month VPS in Milan, Italy, with 4 shared cores, 8GB of RAM, and 75GB of disk space.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;I would separate various countries into different instances, for both management and security reasons, as well as to have the possibility of relocating just some of them if needed.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Weather data would come from a reliable and open-source friendly source. I narrowed it down to two options: &lt;a href="https://wttr.in/"&gt;wttr.in&lt;/a&gt; and &lt;a href="https://open-meteo.com/"&gt;Open-Meteo&lt;/a&gt;, two solutions I know and that have always given me reliable results.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;I would pay close attention to accessibility: forecasts would be in local languages, consultable via text browsers, with emojis to give an idea even to those who don't speak local languages, and everything would be accessible without JavaScript or other requirements. One's mother tongue is always more "familiar" than a second language, even if you're fluent.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;I would manage everything according to Unix philosophy: small pieces working together. The more years pass, the more I understand how valuable this approach is.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The software chosen to manage the instances is &lt;a href="https://codeberg.org/grunfink/snac2"&gt;snac&lt;/a&gt;. Snac embodies my philosophy of minimal and effective software, perfect for this purpose. It provides clear web pages for those who want to consult the forecasts via the web, "speaks" the ActivityPub protocol perfectly, produces RSS feeds for each user (i.e., city), has extremely low RAM and CPU consumption, compiles in seconds, and is stable. The developer is an extremely helpful and positive person, and in my opinion, this carries as much weight as everything else.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;I would do it for myself. If there were no interest, I would keep it running anyway, without expanding it. So no anxiety or fear of failure.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Technical Implementation&lt;/h2&gt;
&lt;p&gt;I started setting up the first "pieces" during the days around Christmas 2024. The scheme was clear: each jail would handle everything internally. A Python script would download data, city by city, and produce markdown. The city coordinates would be calculated via the &lt;a href="https://geopy.readthedocs.io/en/stable/"&gt;geopy&lt;/a&gt; library and passed to &lt;a href="https://wttr.in/"&gt;wttr.in&lt;/a&gt; and &lt;a href="https://open-meteo.com/"&gt;Open-Meteo&lt;/a&gt;. No data would be stored locally. This approach makes it possible to process all cities together: just pass the city and country to the script, and the markdown would be served. At that point, snac comes into play: the "snac note" command allows posting from stdin by specifying the instance directory and the user to post from. There is no need to make API calls with external utilities or to manage API keys, permissions, and so on.&lt;/p&gt;
&lt;h3&gt;Setting Up for Italy&lt;/h3&gt;
&lt;p&gt;To simplify things, I first structured the jail for Italy. I made a list of the main cities, normalizing them. For example, La Spezia became la_spezia. Forlì, with an accent, became forli - this for maximum compatibility since each city would be a snac user. I then created a script that takes this list and creates snac users via "snac adduser." At that point, after creating all the users, the script would modify the JSON of each user to convert the city name to uppercase, insert the bio (a standard text), activate the "bot" flag, and set the avatar, which was the same for all users at the time. This script is also able to add a new city: just run the script with the (normalized) name of the city, and it will add it - also adding it to the "cities.txt" file, so it will be updated in the next weather update cycle.&lt;/p&gt;
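&lt;p&gt;The normalization step can be sketched in a few lines of Python. This is one possible way to obtain the two results mentioned above, not necessarily the one used by FediMeteo; the function name is hypothetical, and names with apostrophes or other punctuation would need extra rules.&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-python"&gt;import unicodedata


def normalize_city(name):
    # Split accented letters into base letter plus combining mark,
    # then drop everything that is not plain ASCII.
    decomposed = unicodedata.normalize('NFKD', name)
    ascii_only = decomposed.encode('ascii', 'ignore').decode('ascii')
    # Lowercase and join the words with underscores.
    return '_'.join(ascii_only.lower().split())
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;With this sketch, &lt;code&gt;normalize_city('La Spezia')&lt;/code&gt; gives &lt;code&gt;la_spezia&lt;/code&gt; and &lt;code&gt;normalize_city('Forlì')&lt;/code&gt; gives &lt;code&gt;forli&lt;/code&gt;.&lt;/p&gt;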
&lt;h3&gt;Core Application Development&lt;/h3&gt;
&lt;p&gt;I then created the heart of the service. A Python application (initially only in Italian, then multilingual, separating the operational part from the text) able to receive (via command line) the name of a city and a country code (corresponding to the file with texts in the local language). The script determines the coordinates and then, using API calls, requests the current weather conditions, those for the next 12 hours, and the next 7 days. I conducted experiments with both wttr.in and Open-Meteo, and both gave good results. However, I settled on Open-Meteo because, for my uses, it has always provided very reliable results. This application directly provides an output in Markdown since snac supports it, at least partially.&lt;/p&gt;
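&lt;p&gt;For readers unfamiliar with Open-Meteo, a request for current, hourly, and daily data can be built as in the Python sketch below. It is not the FediMeteo application: the function name and the choice of weather variables are assumptions, and the dedicated API key mentioned later in this article is deliberately left out.&lt;/p&gt;
&lt;pre class="highlight"&gt;&lt;code class="language-python"&gt;from urllib.parse import urlencode

OPEN_METEO_FORECAST = 'https://api.open-meteo.com/v1/forecast'


def build_forecast_url(latitude, longitude, fahrenheit=False):
    # The variables listed here are an illustrative selection.
    params = {
        'latitude': latitude,
        'longitude': longitude,
        'current': 'temperature_2m,weather_code',
        'hourly': 'temperature_2m,weather_code',
        'daily': 'temperature_2m_max,temperature_2m_min,weather_code',
        'forecast_days': 7,
        # Let the API answer in the local timezone of the coordinates.
        'timezone': 'auto',
    }
    if fahrenheit:
        params['temperature_unit'] = 'fahrenheit'
    return OPEN_METEO_FORECAST + '?' + urlencode(params)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The coordinates would come from the geocoding step. The next 12 hours can then be taken as the first entries of the hourly series that follow the current time.&lt;/p&gt;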
&lt;p&gt;The cities.txt file is also crucial for updates. I created a script, post.sh, in pure sh, that loops through all cities and, for each one, launches the FediMeteo application and publishes its output using snac directly via command line. Once the job is finished, it makes a call to my instance of &lt;a href="https://it-notes.dragas.net/2024/07/22/install-uptime-kuma-freebsd-jail/"&gt;Uptime-Kuma&lt;/a&gt;, which keeps an eye on the situation. In case of failure, the monitoring will alert me that there have been no recent updates, and I can check.&lt;/p&gt;
&lt;p&gt;At this point, the system cron takes care of launching post.sh every 6 hours. The requests are serialized, so the cities will update one at a time, and the posts will be sent to followers.&lt;/p&gt;
&lt;h2&gt;Growth and Unexpected Success&lt;/h2&gt;
&lt;p&gt;After listing all Italian provincial capitals, I started testing everything. It worked perfectly. Of course, I had to make some adjustments at all levels. For example, one of the problems I encountered was that snac did not set the language of the posts, so some users could have missed them. The developer was very quick and, as soon as I reported the problem, modified the program so that posts would carry the system language, set as an environment variable in the sh script.&lt;/p&gt;
&lt;p&gt;After two days, I decided to start adding other countries and announce the project. The announcement was unexpectedly well received: there were many boosts, and people started asking me to add their cities or countries. I did what I could within the limits of my physical condition, as in those days I had a flu that kept me at home with a fever for several days. I started adding many countries in the heart of Europe, translating the main indications into local languages but keeping the emojis so that everything would be understandable even to those who don't speak the local language. Users reported some small problems. One of them: not all weather conditions had been translated, so sometimes they appeared in Italian, and there were also some errors. In bilingual countries, I tried to include all local languages. Unfortunately, I sometimes made mistakes, as I encountered dynamics unknown to me or difficult to interpret. For example, in Ireland, forecasts were published in Irish, but it was pointed out to me that not everyone speaks it, so I switched to publishing in English.&lt;/p&gt;
&lt;h3&gt;A Turning Point&lt;/h3&gt;
&lt;p&gt;The turning point was when FediFollows (&lt;a href="https://social.growyourown.services/@FediFollows"&gt;@FediFollows@social.growyourown.services&lt;/a&gt; - who also manages the site &lt;a href="https://fedi.directory/"&gt;Fedi Directory&lt;/a&gt;) started publishing the list of countries and cities, highlighting the project. Many people became aware of FediMeteo and started following the various accounts, the various cities. And from here came requests to add new countries and some new information, such as wind speed. Moreover, I was asked (rightly, to avoid flooding timelines) to publish posts as unlisted - this way, followers would see the posts, but they wouldn't fill local timelines. Snac didn't support this, but again, the snac dev came to my rescue in a few hours.&lt;/p&gt;
&lt;h2&gt;Scaling Challenges&lt;/h2&gt;
&lt;p&gt;But with new countries came new challenges. For example, in my original implementation, all units of measurement were metric, with temperatures in Celsius, which doesn't suit countries like the USA. Moreover, since I had been focusing on Europe, almost every country sat in a single timezone, while for larger countries (such as Australia, the USA, and Canada) this is not the case at all. So I started developing a more complete, global version and, in the meantime, added almost all of Europe. The new version had to be backward compatible, take into account the timezone of each city, support different units (e.g., degrees C and F) and, the part that initially seemed hardest, distinguish cities with the same name by state or province. I had already seen a similar problem when implementing support for Germany, so it had to be addressed properly.&lt;/p&gt;
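&lt;p&gt;As an illustration of per-city timezones and per-country units, here is a minimal sketch that builds the query string for a forecast request. The parameter names (&lt;code&gt;timezone&lt;/code&gt;, &lt;code&gt;temperature_unit&lt;/code&gt;, &lt;code&gt;wind_speed_unit&lt;/code&gt;, &lt;code&gt;daily&lt;/code&gt;) come from Open-Meteo's public forecast API; the &lt;code&gt;COUNTRY_UNITS&lt;/code&gt; table and the function name are hypothetical, and whether the real program passes these parameters or converts values itself is an assumption.&lt;/p&gt;

```python
from urllib.parse import urlencode

# Hypothetical per-country settings; the real program's configuration is not shown.
COUNTRY_UNITS = {
    "it": {"temperature_unit": "celsius", "wind_speed_unit": "kmh"},
    "us": {"temperature_unit": "fahrenheit", "wind_speed_unit": "mph"},
}


def build_forecast_query(country, latitude, longitude, timezone):
    """Return the query string for one city, using its own timezone and units."""
    params = {
        "latitude": latitude,
        "longitude": longitude,
        "daily": "temperature_2m_max,temperature_2m_min",
        "timezone": timezone,  # IANA name, for example "America/New_York"
    }
    # Units depend on the country, the timezone on the individual city.
    params.update(COUNTRY_UNITS[country])
    return urlencode(params)


print(build_forecast_query("us", 40.71, -74.01, "America/New_York"))
```

&lt;p&gt;Keeping the units in a per-country table and the timezone per city means existing European instances keep their old behaviour, which is one simple way to stay backward compatible.&lt;/p&gt;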
&lt;p&gt;The original goal was to have a VPS for each continent, but I soon realized that, thanks to the quality of snac's code and FreeBSD's efficient resource management, the load didn't increase much even with each country in a separate jail. So I decided to challenge myself and the limits of the cheap 4 euros per month VPS: to add as much as possible until I found out what the limits were. Limits that, to date, I have not yet reached. I was also about to exhaust the API calls available to Open-Meteo's free accounts, so I contacted the team and explained everything. I was pleasantly surprised to read that they appreciated the project, and they provided me with a dedicated API key.&lt;/p&gt;
&lt;p&gt;As my free time allowed, I managed to complete the richer and more complete version of my Python program. I'm not a professional dev, I'm more oriented towards systems, so the code is probably quite poor in the eyes of an expert dev. But, in the end, it just needs to take an input and give me an output. It's not a daemon, and it's not a service that responds on the network. snac takes care of that.&lt;/p&gt;
&lt;h2&gt;Expansion to North America&lt;/h2&gt;
&lt;p&gt;So I decided to start with a very important launch: the USA and Canada. A non-trivial part was identifying the main cities in order to cover the whole territory, state by state. In the end, I identified more than 1200 cities, a number that, by itself, exceeded the sum of all other countries (at that time). The program is now able to take an input with a separator (two underscores: __) between city and state. This makes it possible to tell city and state apart unambiguously: new_york__new_york is an example I like to give, but there are many.&lt;/p&gt;
&lt;p&gt;The launch of the USA was interesting: despite the many earlier requests, the reception was initially quite lukewarm, to my great surprise. Within a few hours, the number of followers in Canada far exceeded that of the USA. The country with the most followers (more than 1000 in a few days), however, was Germany, followed by the UK, which I had expected to come first.&lt;/p&gt;
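&lt;p&gt;The separator convention can be sketched in a few lines. This is only an illustration: the function name and the way names are prettified for display are assumptions, not the actual code of the program.&lt;/p&gt;

```python
SEPARATOR = "__"  # two underscores between city and state, as described above


def parse_location(slug):
    """Split 'city__state' into display names; state is None when absent."""
    city, found, state = slug.partition(SEPARATOR)

    def pretty(part):
        # Single underscores stand for spaces inside a name.
        return part.replace("_", " ").title()

    return pretty(city), (pretty(state) if found else None)


print(parse_location("new_york__new_york"))  # ('New York', 'New York')
print(parse_location("rome"))                # ('Rome', None)
```

&lt;p&gt;Because a single underscore is still allowed inside a name, inputs without the double underscore keep working as before, which is what backward compatibility requires here.&lt;/p&gt;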
&lt;h2&gt;System Performance&lt;/h2&gt;
&lt;p&gt;The VPS held up well. Except for the moments when FediFollows announced the project (after I fixed some FreeBSD tuning, the service slowed slightly but didn't crash), the load remained extremely low. So I continued to expand: Japan, Australia, New Zealand, etc.&lt;/p&gt;
&lt;h2&gt;Current Status&lt;/h2&gt;
&lt;p&gt;At the time of the last update of this article (21 May 2026), the supported countries are 42: Argentina, Australia, Austria, Belgium, Brazil, Bulgaria, Canada, Cyprus, Croatia, Czechia, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, India, Indonesia, Ireland, Italy, Japan, Latvia, Lithuania, Malta, Mexico, Netherlands, New Zealand, Norway, Poland, Portugal, Romania, Slovakia, Slovenia, South Africa, Spain, Sweden, Switzerland, Taiwan, Turkey, the United Kingdom, and the United States of America (with more regions coming soon!).&lt;/p&gt;
&lt;p&gt;Direct followers in the Fediverse are around 8,889 and growing daily, excluding those who follow hashtags or cities via RSS, whose number I can't estimate. However, a quick look at the logs suggests there are many more.&lt;/p&gt;
&lt;p&gt;The cities currently covered are 3602 - growing based on new countries and requests.&lt;/p&gt;
&lt;h2&gt;Challenges Encountered&lt;/h2&gt;
&lt;p&gt;There have been some problems. The most serious, and entirely my fault, was the API key leak: I had left some debug code enabled and, the first time Open-Meteo had problems, the published error message included the full API call - including the API key. Some users reported it to me (others just mocked me). I fixed the code and immediately reported everything to the Open-Meteo team, who kindly gave me a new API key and deactivated the old one.&lt;/p&gt;
&lt;p&gt;A further problem was related to geopy, which calls Nominatim to determine coordinates. On one occasion Nominatim didn't respond, so my program couldn't determine the position and failed. I solved this by introducing coordinate caching: the first time the program encounters a city, it requests and saves the coordinates. On later runs, the saved coordinates are used without making a new request via geopy. This is lighter on their servers, and faster and safer for us.&lt;/p&gt;
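&lt;p&gt;One way to guard against this kind of leak is to scrub sensitive query parameters from any URL before it ends up in a log or an error message. The sketch below is not the fix actually applied to FediMeteo, and the parameter names in &lt;code&gt;SENSITIVE_PARAMS&lt;/code&gt; are assumptions.&lt;/p&gt;

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed names of query parameters that must never be printed.
SENSITIVE_PARAMS = {"apikey", "api_key", "key", "token"}


def redact_url(url):
    """Return the URL with the values of sensitive query parameters masked."""
    parts = urlsplit(url)
    query = [
        (name, "REDACTED" if name.lower() in SENSITIVE_PARAMS else value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


print(redact_url("https://example.invalid/forecast?apikey=SECRET"))
# https://example.invalid/forecast?apikey=REDACTED
```

&lt;p&gt;Calling something like this at the single place where error messages are built means a forgotten debug statement can still expose a URL, but not the secret inside it.&lt;/p&gt;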
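&lt;p&gt;A minimal sketch of such a cache, assuming the coordinates are stored in a JSON file. The file layout and function names are hypothetical, and the geocoder is passed in as a plain function so the example runs without geopy; in practice &lt;code&gt;lookup&lt;/code&gt; could wrap a geopy Nominatim call.&lt;/p&gt;

```python
import json
from pathlib import Path


def get_coordinates(city, cache_file, lookup):
    """Return (latitude, longitude), calling lookup only for unknown cities."""
    cache_path = Path(cache_file)
    if cache_path.exists():
        cache = json.loads(cache_path.read_text(encoding="utf-8"))
    else:
        cache = {}

    if city in cache:
        # Known city: no network request at all.
        return tuple(cache[city])

    # First encounter: ask the geocoder once, then remember the answer.
    latitude, longitude = lookup(city)
    cache[city] = [latitude, longitude]
    cache_path.write_text(json.dumps(cache, indent=2), encoding="utf-8")
    return (latitude, longitude)
```

&lt;p&gt;With this shape, a geocoder outage only affects cities that have never been seen before; every city already in the file keeps working.&lt;/p&gt;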
&lt;h2&gt;Infrastructure Details&lt;/h2&gt;
&lt;p&gt;And the VPS? It has no problems and is surprisingly fast and effective. It runs FreeBSD 15.0-RELEASE, with BastilleBSD managing the jails. Currently, there are 43 jails - one for haproxy, the &lt;a href="https://fedimeteo.com"&gt;FediMeteo website&lt;/a&gt; (so nginx), and the snac instance for &lt;a href="https://fedimeteo.com/fedi/admin"&gt;FediMeteo announcements and support&lt;/a&gt; - and the other 41 for the individual instances. Each jail, therefore, has its own ZFS dataset. Every 15 minutes, a local snapshot of all datasets is taken. Every hour, the homepage is regenerated: a small script calculates the number of followers by counting, instance by instance, the followers of the individual cities. I only publish the aggregate figure, to avoid possible triangulation and leaks of users' private information.&lt;/p&gt;
&lt;p&gt;Every hour, an external backup is also made via &lt;a href="https://it-notes.dragas.net/2022/05/30/how-we-are-migrating-many-of-our-servers-from-linux-to-freebsd-part-2/"&gt;zfs-autobackup&lt;/a&gt; (to a dataset encrypted at rest), and once a day a further backup is made in my datacenter, on disks encrypted with geli.&lt;/p&gt;
&lt;p&gt;RAM usage is 501 MB (yes, exactly 501 MB), and it rises slightly while updates are in progress. Updates normally occur every 6 hours. I have tried, as much as possible, to space them out to avoid overloading timelines (or the server itself). Only for the USA, I added a sleep of 5 seconds between one city and the next, to give snac the chance to better organize the sending of messages. It probably isn't necessary with the current numbers, but better safe than sorry. This way, the USA is processed in about two and a half hours, while the other jails (and thus countries) keep working autonomously and sending their updates.&lt;/p&gt;
&lt;p&gt;The average load of the VPS (taking as reference both the last 24 hours and the last two weeks) is about 25%, because it rises to 70-75% when updates run for the larger instances (such as the USA) or when the project is announced by FediFollows. Otherwise, it is on average below 10%. So the VPS still has a huge margin, and new instances, for new countries, will still be hosted on it.&lt;/p&gt;
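&lt;p&gt;The pacing between cities can be sketched as a simple loop. The function names are hypothetical and &lt;code&gt;publish&lt;/code&gt; stands in for whatever hands a post to snac; the second helper only shows how much of a run is spent sleeping for a given number of cities.&lt;/p&gt;

```python
import time


def publish_all(cities, publish, pause_seconds=5.0, sleep=time.sleep):
    """Publish each city in turn, pausing between consecutive cities."""
    for index, city in enumerate(cities):
        if index:  # no pause before the first city
            sleep(pause_seconds)
        publish(city)


def minimum_pause_minutes(city_count, pause_seconds=5.0):
    """Minutes spent only sleeping, ignoring fetch and publish time."""
    return max(city_count - 1, 0) * pause_seconds / 60


# Illustrative figure only: 1201 cities means 1200 pauses of 5 seconds.
print(minimum_pause_minutes(1201))  # 100.0
```

&lt;p&gt;The pauses alone add up quickly as the city list grows, which is why a run of this kind is measured in hours rather than minutes, while the jails of other countries are unaffected because each one runs its own loop.&lt;/p&gt;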
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;This article, although very conversational in places, aims to demonstrate that it's possible to build solid, valid, and efficient solutions without expensive and complex services. It also shows that you can have your own online presence without putting your data in the hands of third parties and without necessarily resorting to complex stacks. Sometimes, less is more.&lt;/p&gt;
&lt;p&gt;The success of this project demonstrates, once again, that my grandfather was right: weather forecasts interest everyone. He worried about my health and, thanks to his concerns, we spent time together. In the same way, I see many followers and friends talking to me, or among themselves, about the weather, their experiences, and what is happening around them. Once again in my life, weather forecasts have helped bring people together.&lt;/p&gt;
&lt;p&gt;Thank you, Grandpa.&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Stefano Marinelli</dc:creator><pubDate>Wed, 26 Feb 2025 07:00:00 +0100</pubDate><guid isPermaLink="false">https://it-notes.dragas.net/2025/02/26/fedimeteo-how-a-tiny-freebsd-vps-became-a-global-weather-service-for-thousands/</guid><category>fediverse</category><category>snac</category><category>snac2</category><category>hosting</category><category>server</category><category>freebsd</category><category>networking</category><category>web</category><category>social</category><category>fedimeteo</category></item></channel></rss>