I am writing today lying on the couch, both a little bit sick and with a sore knee (best to double these things up to save on recovery time, right?). This does present me with an opportunity to write about my budget running timing gate project.
At large running, cycling and probably countless other events, participants are usually given a bib with their name and race number. Inside this bib is a little UHF chip/tag which transmits as it crosses a gate, such as the start or finish line. The gates themselves are usually mats with the antenna contained within. Many races have multiple gates, and recently events have provided websites for monitoring the progress of participants throughout the event.
Prior to messing up my knee, then falling off my bike… again, I regularly took part in a run club near me. The run follows the river and then a short city section back to the start, totalling about 7.5km. It’s a really lovely run with a small group of people. At one point we joked that people cutting one of the corners should be penalised. It got me thinking - could we build a low budget gate system for fun? Even +/- 10 seconds would be fine for this.
So when we think about these gates there are three main components: the gate, the tag, and the server/site. I wanted this to be super low budget, so for the tag I decided to pick something that we all already had - smart watches. In this case specifically Garmin, however I think this approach could work for others as well. The idea here is to set up the watch to broadcast heart rate over Bluetooth. The gate monitors Bluetooth signal strength and records a crossing when the signal is strongest.
Some quick tests showed that this approach was possible. The above diagram shows my signal strength (in blue) and, in yellow, when I pressed the lap button as I crossed the gate. Someone else (red) happened to be running at the same time and was accidentally included in the experiment - I guess they had heart rate transmission turned on.
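To make the crossing detection concrete, here’s a minimal sketch of the idea - pick the timestamp of the strongest smoothed signal. This is my own toy version, not the gate’s actual firmware; the function name and smoothing window are assumptions.

```python
def crossing_time(samples, window=5):
    """Estimate when a runner crossed the gate from (timestamp, rssi) samples.

    RSSI is noisy, so smooth with a simple moving average before taking
    the time of the strongest reading. Hypothetical sketch only.
    """
    if len(samples) < window:
        return None
    best_time, best_rssi = None, float("-inf")
    for i in range(len(samples) - window + 1):
        chunk = samples[i:i + window]
        avg = sum(rssi for _, rssi in chunk) / window
        if avg > best_rssi:
            # report the middle sample's timestamp for this window
            best_time, best_rssi = chunk[window // 2][0], avg
    return best_time
```

In practice the gate would also need a timeout to decide the runner has passed before reporting, but the shape of the problem is the same.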
For the gate, the constraints are even harder. It needs to be installed in a public space, ideally without asking for permission; not get in the way (e.g. no mat); be able to be left unattended (so not expensive if stolen or lost); and have some form of live communication. My first idea was cheap smartphones. However I had concerns about them being stolen and the cost of mobile plans. My next thought was LoRaWAN devices. Originally I went with a LILYGO TTGO but switched to a T-Beam.
The T-Beam gives me GPS (good for time!), WiFi/Bluetooth, a LoRa transmitter, an 18650 battery, and a little display for diagnostics. These are used a lot by Meshtastic folk, so there are also lots of 3D printed case designs.
I wasn’t sure if I wanted to hide these or make them look like they belong there.
In terms of communication I’ve been using The Things Network. The Helium network would likely provide more coverage, but fuck crypto. As a user crosses the gate it records the time (this is why GPS was handy) and the watch’s Bluetooth MAC address. After a waiting period it transmits the seen MAC address and the time of the strongest signal. The Things Network receives this and triggers a webhook for my API.
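The uplink only needs to carry a MAC address and a crossing time, so it fits easily within LoRaWAN payload limits. A hypothetical sketch of the packing - the field layout here is my own illustration, not the project’s actual wire format:

```python
import struct

def pack_crossing(mac: str, epoch_seconds: int) -> bytes:
    """Pack a gate crossing as 6 bytes of MAC + 4 bytes of unix time.

    10 bytes total - small enough for even the slowest LoRaWAN data
    rates. Hypothetical layout, not the project's actual format.
    """
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    return struct.pack(">6sI", mac_bytes, epoch_seconds)
```

Keeping the payload this small matters because LoRaWAN duty-cycle and airtime limits punish chatty devices.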
Fun little side note here: my backend API doesn’t have any real code. There’s no Lambda function or container running. It’s just API Gateway mapping the request into a DynamoDB request.
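For anyone curious about the pattern: it’s an API Gateway AWS-service integration pointing straight at DynamoDB’s PutItem action, with a request mapping template doing the translation. A rough sketch of such a template - the table and attribute names here are made up for illustration, not my actual schema:

```
{
  "TableName": "gate-crossings",
  "Item": {
    "mac": { "S": "$input.path('$.mac')" },
    "crossed_at": { "N": "$input.path('$.crossed_at')" },
    "gate": { "S": "$input.path('$.gate')" }
  }
}
```

No compute to run, patch, or pay for per-invocation beyond the API Gateway request itself.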
Before we get too far I wanted to talk about privacy. This system doesn’t really have an opt-in function, apart from turning Bluetooth on or off on your smart watch. Any Garmin MAC address detected will get forwarded over The Things Network to my backend. The Things Network traffic is encrypted, so other people sniffing around won’t be able to see that data. I’ve had a think about how this might be improved, and the best I can come up with is a sort of registration process, where a user signs up and registers their MAC address with the service ahead of time - possibly even requiring them to register at a specific gate. Because this is just a fun little project with friends I haven’t implemented any of that, but I thought it was worth mentioning in case anyone wanted to build this into a bigger system.
So I built four of these little gates. I think four is about the minimum you can get away with for this system. You need one for the start, and unless your start is also the finish you need one for the finish. Then you need one fairly close to the start so you have an initial pace estimate (I always wondered why there was a gate really close to the start at Run Melbourne). Which leaves one more for a halfway point.
I haven’t had much luck doing practical tests with these, but on the weekend I got the opportunity. Alex donated her time to help me, and I even did two laps of the test course myself.
So this is what the website looks like after the run. While running, the red marker indicates the estimated runner position, and lap times are updated as they are received. The route is programmed in via a small JSON config file.
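A hypothetical sketch of what such a config might contain - the field names and values here are illustrative, not the actual schema:

```
{
  "name": "River loop",
  "distance_km": 7.5,
  "gates": [
    { "id": "gate-start",  "distance_km": 0.0 },
    { "id": "gate-pace",   "distance_km": 0.4 },
    { "id": "gate-half",   "distance_km": 3.7 },
    { "id": "gate-finish", "distance_km": 7.5 }
  ]
}
```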
And a GeoJSON file is used to provide the route for the map.
As you can see in the screenshot, a number of gates were missed. One gate had a failed battery when we went to do the test, so we could only use three gates. Another gate’s position didn’t have any The Things Network coverage, so it never reported crossings. And just to make things worse, someone parked a car right where the start/finish gate was located. This stress tested the algorithm used for lining up gate crossings with the route - something that has taken a lot of thinking. Handling missing gate crossings is tricky to get right.
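To give a feel for the matching problem: it’s essentially aligning an observed sequence of crossings against the expected gate order while tolerating gaps. Here’s a greatly simplified toy version of that idea (my own sketch, not the actual algorithm, which also has to cope with out-of-order reports and multiple laps):

```python
def align_crossings(expected_gates, crossings):
    """Match observed (gate_id, time) crossings against the expected gate
    order, tolerating missed gates.

    Greedy toy version: walk the expected sequence, consume the next
    observed crossing when its gate matches, and record None for gates
    that were never seen (flat battery, no coverage, parked car...).
    """
    result = []
    i = 0  # index into observed crossings
    for gate in expected_gates:
        if i < len(crossings) and crossings[i][0] == gate:
            result.append((gate, crossings[i][1]))
            i += 1
        else:
            result.append((gate, None))  # gate missed
    return result
```

Even this toy version shows why missing data is awkward: a gap is only detectable once a later gate reports, so live position estimates have to be revised after the fact.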
Even though not many data points were recorded in this test, it was still a good outcome, showing that the system can work. The coverage issue is somewhat known: I was running a gateway recently but didn’t like the overall setup, so I’m in the process of rebuilding it, which should improve coverage on that specific test track. I’m looking forward to trying this on a much longer track when I’m up for long runs again.
I usually don’t write blog posts about things I’ve been doing at work, however AWS Backup is just so truly bad that I want to warn people of some of the gotchas should you want to perform your own migration to it. Let’s start with what AWS Backup thinks it is. (Emphasis by me)
AWS Backup is a fully managed service that centralizes and automates data protection across AWS services and hybrid workloads. It provides core data protection features, ransomware recovery capabilities, and compliance insights and analytics for data protection policies and operations. AWS Backup offers a cost-effective, policy-based service with features that simplify data protection at exabyte scale across your AWS estate.
Now before I tear AWS Backup apart let me start with the parts that I think are good and why you’ll probably use it.
Automated and fairly straightforward copy jobs to other AWS regions and accounts
Very easy to demonstrate to auditors that your backups are happening and working
Can backup resources based on filters - no need to create a job for each resource
Various methods of protecting the backups from tampering
Continuous backup modes*
Cost effective
The AWS product page mentions the cost-effectiveness of AWS Backup, and this might be partly true depending on how you are currently doing your backups or what resources you are backing up. For example, say you had a script that copied from one S3 bucket to another as your backup. In us-east-1 you would pay $0.023 per GB for that backup. Now that same backup using AWS Backup would cost $0.05 per GB… wait, what. You are paying effectively 2.2 times more than S3 storage to back up using AWS Backup. What. DocumentDB, Redshift and Aurora are all cheaper to back up using AWS Backup than S3 is. In the Sydney region, EBS volumes are cheaper to back up than S3. This makes no sense to me.
But wait, there’s more. When you perform the initial backup for S3 (and for changes), AWS Backup performs the requests to S3 like a normal client. This means you get stung with KMS, CloudTrail, S3 request and GuardDuty charges. There is no way to filter AWS Backup out of CloudTrail, so if you have a lot of objects you could be up for thousands of dollars for that initial sync. The AWS Backup team’s solution is to “turn off CloudTrail” or “don’t use AWS Backup”. Amazing. GuardDuty isn’t even listed in the support docs as a possible cost.
Now, before we leave the cost effectiveness section of this blog, we should talk about budgeting. It is impossible to estimate how much AWS Backup will cost. I lodged two tickets to get an idea of how continuous backup mode works and how much it would cost to back up RDS and Aurora instances. The responses were both unhelpful and misleading. Copy jobs for Aurora and RDS are snapshots only, not PITR - this is not made clear in the docs. Trying to estimate this cost is near impossible, because not even AWS knows how to do it.
Fully managed
There are a bunch of fundamentals that are just missing from AWS Backup compared to pretty much any other backup solution. For example, want to test that a backup plan works correctly? Guess what: you can’t trigger a manual start. You have to wait for the scheduled run time. Want to get a list of jobs that are not successful? The UI only lets you filter by one state, and there are something like four different failed states.
But it’s ok, because you have useful AWS features like sending SNS notifications when backups fail… except you can’t send that failure notification to an SNS topic in another AWS account, for some unknown reason…
Restore testing sounds great: you schedule restore tests, and you can even run your own validation scripts as Lambda functions. However it’s clearly a hacked-on feature. For S3 you have to restore the entire bucket - have a large bucket? That’s going to cost you. For Aurora and DocumentDB the restore test doesn’t even start an instance; it just creates a cluster. What’s being tested? Then to top it all off, S3 buckets linger around for days, because to clean up the bucket AWS Backup uses a lifecycle rule to delete all the objects (I know this is a good move for cost effectiveness, but that’s an internal AWS thing!).
If you are configuring restore testing, much like backups, there’s no way to trigger a test on demand. Hope your IAM, Lambda functions and EventBridge rules are perfect. Oh, and by the way, since you have to create resources for DocumentDB and Aurora, hope you can handle waiting longer than the Lambda timeout to run the test. Restore testing doesn’t even try to restore with the same configuration, so you have to manually define the VPC and security groups for the databases as well - otherwise it will try to use the default VPC and security group.
And how do you define those parameters for setting the security group and VPC? As a JSON object? As a string? Nope: a JSON object as a string. There’s no validation on this, so if you send the wrong object, or send an array instead, the UI explodes.
It’s ok, restore testing will clean up your resources once the test is done… nope. If you created instances, you need to delete them yourself before it will clean up the cluster.
Simplify data protection
Here’s an incomplete list of gotchas for AWS Backup:
Remember to exclude the CloudTrail logging bucket so you don’t make loops
Remember to exclude the server access logging bucket so you don’t make loops
Don’t remove S3 EventBridge notifications on buckets that have been configured with AWS Backup, otherwise the backup has to start again (Terraform does this by default when you have a notification policy configured)
If the bucket is empty, or has no files that will be backed up, the job will be marked as failed
Don’t configure AWS Backup with the same backup window as the RDS backup window, otherwise backups will branch, creating conflicting backups
You can’t create overlapping schedules for continuous backups - so be careful with your resource selection
The restore testing UI doesn’t display which resource it restored either, just the restore point ARN. This makes it hard to demonstrate to an auditor that you tested restoring that specific resource.
Compliance insights
So after all of this you think: at least I have audit frameworks. Except if you are doing restore testing you quickly find that the restore test resources are included in the report - suggesting you should back up your restores. And there isn’t a good way to exclude them, because it’s really just AWS Config in a trench coat.
Project Horus is a high altitude balloon project. My first memory of Project Horus was when they launched Tux into near space at Linux Conf Au 2011 - my first Linux Conf Au. It would be a while before I started working on high altitude balloon related things. In that time Project Horus moved from using RTTY (radio teletype) for payload telemetry to a 4FSK mode called horusbinary. Today horusbinary is used by many different amateur high altitude balloon payloads and supports additional custom fields.
Typically, horusbinary payloads are decoded using the “horus-gui” app by connecting a radio to a laptop or computer via a soundcard interface.
Mark’s done a great job packaging this up to make it easy for receiving stations to get up and running quickly.
And apparently I made the python to c wrapper using clib… I don’t even remember this.
For more casual users, or people on phones and tablets, could we build something more accessible? Recently I played around with making a web frontend for freeselcall using WebAssembly, which prompted Mark to ask the question:
but yeah, this is interesting, and suggests a horus decoder is probably possible
since i think you wrapped fsk_demod in a similar way for freeselcall as you did for horusdemodlib?
Introducing webhorus
webhorus is a browser based recreation of horus-gui, allowing anyone to decode horusbinary balloon payloads from a browser - a “no install” method that is usable from mobile phones. It’s even possible to decode telemetry by holding a mobile phone up to the speaker of a radio.
webhorus is fully functional and is fairly close in feature parity to horus-gui.
How it was built
So horus-gui uses the Python library horusdemodlib, which in turn uses the horus_api C library. That means if we build a website we’ll end up with the trifecta of Python, C and JavaScript in a single project.
horusdemodlib
I decided that I wanted to make minimal changes to the existing C and Python libraries, so I included horusdemodlib as a git submodule. I had to request some changes via PRs to fix some bugs and make things easier to work with, but it was fairly minimal.
One thing I didn’t want to use, however, was ctypes. The existing module uses ctypes, and it has some drawbacks. The two biggest for me are that it’s very easy to get things wrong with ctypes, leading to memory corruption, and that the current approach requires a dynamically linked library that’s built beside the module, not with it.
Since I’m packaging this for the web I decided to use cffi. cffi lets you define a C source file - usually you can just use your .h file1 - which it uses to build a compiled Python module.
```c
struct horus *horus_open_advanced_sample_rate(int mode, int Rs, int tx_tone_spacing, int Fs);
void horus_set_verbose(struct horus *hstates, int verbose);
int horus_get_version(void);
int horus_get_mode(struct horus *hstates);
int horus_get_Fs(struct horus *hstates);
....
```
horusdemodlib is simple enough that I could replicate the CMake process inside FFI().set_source():
```python
ffibuilder.set_source("_horus_api_cffi",
    """
    #include "horus_api.h"  // the C header of the library
    """,
    sources=[
        "./horusdemodlib/src/fsk.c",
        "./horusdemodlib/src/kiss_fft.c",
        "./horusdemodlib/src/kiss_fftr.c",
        "./horusdemodlib/src/mpdecode_core.c",
        "./horusdemodlib/src/H_256_768_22.c",
        "./horusdemodlib/src/H_128_384_23.c",
        "./horusdemodlib/src/golay23.c",
        "./horusdemodlib/src/phi0.c",
        "./horusdemodlib/src/horus_api.c",
        "./horusdemodlib/src/horus_l2.c",
    ],
    include_dirs=["./horusdemodlib/src"],
    extra_compile_args=["-DHORUS_L2_RX", "-DINTERLEAVER", "-DSCRAMBLER", "-DRUN_TIME_TABLES"],
)  # library name, for the linker
```
Using this approach along with setuptools (for me, via poetry) means that the C library is built as part of your normal pip install or package build.
I wrote a very small wrapper around the cffi library
This is considerably easier than ctypes, where you typically have to define the argument and return types yourself. Any mistake there leads to, at best, a segfault and, at worst, undefined behaviour.
I utilised existing helper functions from the horusdemodlib Python library to ensure easy maintainability and decoder parity.
From here I can perform a pip install ./ and obtain a functional modem. So we’ve replicated what the horusdemodlib Python library could already do, using horusdemodlib, and in Python… wait, what were we trying to do again? Right, we needed a statically linked Python library - check.
```python
from webhorus import demod

horus_demod = demod.Demod()
frame = horus_demod.demodulate(data)
```
pyodide
The next step is getting this from a compiled Python library into WebAssembly so it can be called from the browser’s JavaScript. Since we have Python building the C library for us, pyodide pretty much takes care of the entire process.
Every time I’ve done this I’ve found a handful of compiler issues to clean up. Nothing major - in this case it was just ensuring that one of the functions always had a return path.
The output of this build process is a Python whl file compiled as wasm32. I always think this step will be the hardest, but it’s actually been the easiest so far.
Python in the browser
From the JavaScript side, with Pyodide installed2, we can start accessing Python. I opted for putting the Python code in its own file so that my IDE stopped getting confused. One important note here is that the pyodide library’s “.wasm” file needs to be served with the MIME type “application/wasm”.
```javascript
const pyodide = await loadPyodide();
// we could load up micropip and install our packages and dependencies via pip -
// great for development, however for production I didn't want to rely on a third party server
pyodide.loadPackage("./assets/crc-7.1.0-py3-none-any.whl")
...
pyodide.loadPackage("./assets/webhorus-0.1.0-cp312-cp312-pyodide_2024_0_wasm32.whl")
pyodide.runPython(await (await fetch("/py/main.py")).text());
```
What’s really neat is just how good the integration between Python and JavaScript is. The Pyodide team have done an amazing job with this. Using things like getElementById from within Python feels so wrong - but works so well.
The last piece of this puzzle is audio. We need to capture microphone input to feed into the modem. This was very challenging to get working. Jasper St. Pierre wrote a good blog post, “I don’t know who the Web Audio API is designed for”, and I definitely felt that frustration on this project.
Now there are some extra frustrations with what we are doing. We need audio that is as clean and unprocessed as possible - I found out the hard way that Chrome tries to optimise audio for video calls. But before we get to that, we need to get access to the audio device. This must happen through user interaction, so we force the user to hit a “Start” button, then create an AudioContext and eventually perform a getUserMedia. On most browsers this will pop up a consent dialog requesting access to the microphone. Prior to this we can’t even list the audio devices.
Now that we have consent and we’ve opened an audio device, we can list what audio devices the user has. To get unprocessed audio we have to perform another getUserMedia with constraints set to remove any filters. The ones we care about are echoCancellation, autoGainControl and noiseSuppression. However, if we blindly request these constraints turned off we’ll break Safari, so we first do a getCapabilities on the device to see what we can actually set.
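The capability check can be kept as a small pure function. A sketch under my assumptions - the helper name is mine, not webhorus’s actual code; getCapabilities() reports supported boolean settings as an array like [true, false]:

```javascript
// Build getUserMedia constraints that disable audio processing filters,
// but only request the ones this device actually reports supporting -
// blindly requesting unsupported constraints can break Safari.
function buildAudioConstraints(capabilities) {
  const filters = ["echoCancellation", "autoGainControl", "noiseSuppression"];
  const audio = {};
  for (const name of filters) {
    if (Array.isArray(capabilities[name])) {
      audio[name] = false; // supported: ask the browser to turn it off
    }
  }
  return { audio };
}

// In the browser, roughly:
//   const track = stream.getAudioTracks()[0];
//   const caps = track.getCapabilities ? track.getCapabilities() : {};
//   await navigator.mediaDevices.getUserMedia(buildAudioConstraints(caps));
```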
It’s all a bit of a hassle, and getting the order of operations right so that each browser is happy is frustrating. Opening the microphone was only half the battle; we also need to obtain PCM samples to feed into the modem. The only3 non-deprecated way to do this seems to be via an AudioWorkletNode. This runs a bit of JavaScript in a node / sandbox / thread. That actually sounds ideal - I could run the modem in its own thread and just send back decoded messages. However the sandbox is so limited that loading the modem into that environment is just too hard. Instead we buffer up audio samples and, when we have enough, use this.port.postMessage to send the buffered data back to the client. Oh, and because this is the Web Audio API, we need to convert -1 to 1 floats to shorts. What fun.
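The float-to-short conversion at the end is small but easy to get wrong - clamping matters. A sketch, assuming the usual Web Audio -1..1 float range (function name is my own):

```javascript
// Convert Web Audio float samples (-1..1) to signed 16-bit PCM for the modem.
// Clamp first: browsers can hand you samples slightly outside the range.
function floatToShorts(floats) {
  const shorts = new Int16Array(floats.length);
  for (let i = 0; i < floats.length; i++) {
    const clamped = Math.max(-1, Math.min(1, floats[i]));
    shorts[i] = Math.round(clamped * 32767);
  }
  return shorts;
}
```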
Anyway, that’s all the tricky parts glued together. The other 80% was building out a UI with Bootstrap, Plotly and Leaflet, plus figuring out how to get Vite to build it all into a solution. The code is up on GitHub if you’d like to check it out, including a GitHub Actions workflow so you can see the entire build process.
cffi doesn’t exactly have a preprocessor, so many of the things you’d expect to be able to do in a .h file won’t work, and some manual manipulation is required. ↩︎
something that is trivial in plain JS but very frustrating with a bundler ↩︎
AnalyserNode does seem to allow grabbing PCM samples via the getByteTimeDomainData() method, however I’m not sure you can guarantee that you obtain all the samples or keep them in sync. ↩︎