webhorus - inflating Javascript big and round

Project Horus is a high altitude balloon project. My first memory of Project Horus was when they launched Tux into near space at Linux Conf Au 2011 - my first Linux Conf Au. It would be a while before I started working on high altitude balloon related things. In that time Project Horus moved from using RTTY (radio teletype) for payload telemetry to a 4FSK mode called horusbinary. Today horusbinary is used by many different amateur high altitude balloon payloads and supports additional custom fields.

An amateur weather balloon being filled

Typically horusbinary payloads are decoded using the “horus-gui” app by connecting a radio to a laptop or computer via a soundcard interface.

Horus-gui user interface showing configuration options for the modem and graphs of the signal

Mark’s done a great job packaging this up to make it easy for receiving stations to get up and running quickly.

Authors

Written by: Python Library - Mark Jessop vk5qi@rfhead.net

FSK Modem - David Rowe

FSK Modem Python Wrapper - XSSFox

And apparently I made the python to c wrapper using clib… I don’t even remember this.

For more casual users, or people on phones and tablets, could we build something more accessible? Recently I played around with making a web frontend for freeselcall using WebAssembly, which prompted Mark to ask the question:

but yeah, this is interesting, and suggests a horus decoder is probably possible

since i think you wrapped fsk_demod in a similar way for freeselcall as you did for horusdemodlib?

Introducing webhorus

webhorus being used to receive telemetry

webhorus is a browser-based recreation of horus-gui that lets anyone decode horusbinary balloon payloads from a browser - a “no install” method that’s usable from mobile phones. It’s even possible to decode telemetry by holding a mobile phone up to the speaker of a radio.

webhorus is fully functional and is fairly close in feature parity to horus-gui.

How it was built

So horus-gui uses the Python library horusdemodlib, which in turn uses the horus_api C library. That means if we build a website we’ll end up with the trifecta of Python, C and JavaScript in a single project.

horusdemodlib

I wanted to make minimal changes to the existing C and Python libraries, so I included horusdemodlib as a git submodule. I had to request some changes via PRs to fix some bugs and make things easier to work with, but it was fairly minimal.

One thing I didn’t want to use however was ctypes. The existing module uses ctypes and it has some drawbacks. The two biggest for me are that it’s very easy to get things wrong with ctypes, leading to memory corruption, and that the current approach requires a dynamically linked library that’s built beside the module, not with it.

Since I’m packaging this for the web I decided to use cffi. cffi lets you define a C source file - usually you can just use your .h file1 - that it uses to build a compiled Python module.

struct        horus *horus_open_advanced_sample_rate (int mode, int Rs, int tx_tone_spacing, int Fs);
void          horus_set_verbose(struct horus *hstates, int verbose);
int           horus_get_version              (void);
int           horus_get_mode                 (struct horus *hstates);
int           horus_get_Fs                   (struct horus *hstates);
....

horusdemodlib is simple enough that I could replicate the cmake process inside FFI().set_source().

ffibuilder.set_source(
    "_horus_api_cffi",  # module name, for the linker
    """
    #include "horus_api.h"   // the C header of the library
    """,
    sources=[
        "./horusdemodlib/src/fsk.c",
        "./horusdemodlib/src/kiss_fft.c",
        "./horusdemodlib/src/kiss_fftr.c",
        "./horusdemodlib/src/mpdecode_core.c",
        "./horusdemodlib/src/H_256_768_22.c",
        "./horusdemodlib/src/H_128_384_23.c",
        "./horusdemodlib/src/golay23.c",
        "./horusdemodlib/src/phi0.c",
        "./horusdemodlib/src/horus_api.c",
        "./horusdemodlib/src/horus_l2.c",
    ],
    include_dirs=["./horusdemodlib/src"],
    extra_compile_args=["-DHORUS_L2_RX", "-DINTERLEAVER", "-DSCRAMBLER", "-DRUN_TIME_TABLES"],
)

Using this approach along with setuptools (for me, via poetry) means that the C library is built as part of your normal pip install or package build.

I wrote a very small wrapper around the cffi library:

import _horus_api_cffi

.....

self.hstates = horus_api.horus_open_advanced_sample_rate(
    mode, rate, tone_spacing, sample_rate
)
horus_api.horus_set_freq_est_limits(self.hstates, 100, 4000)
horus_api.horus_set_verbose(self.hstates, verbose)
.....

This is considerably easier than ctypes, where you typically have to define the argument and return types for every function by hand. Any mistake will lead to at best a segfault and at worst undefined behaviour.

I utilised the existing helper functions from the horusdemodlib Python library to ensure easy maintainability and decoder parity.

From here I can perform a pip install ./ and obtain a functional modem. So we’ve replicated what the horusdemodlib Python library could do, using horusdemodlib, in Python…. wait, what were we trying to do again… right, we needed a statically linked Python library - check.

from webhorus import demod
horus_demod = demod.Demod()
frame = horus_demod.demodulate(data)

pyodide

The next step is getting this from a compiled Python library into WebAssembly so it can be called from browser JavaScript. Since we have Python building the C library for us, pyodide pretty much takes care of the entire process.

Once pyodide is installed and activated it’s literally just a case of doing:

pyodide build 

Every time I’ve performed this I’ve found a handful of compiler issues to clean up. Nothing major. In this case it was just ensuring that one of the functions always had a return path. The output of this build process is a Python whl file compiled as wasm32. I always think this step will be the hardest but it’s actually been the easiest so far.

Python in the browser

From the JavaScript side, with Pyodide installed2, we can start accessing Python. I opted to put the Python code in its own file so that my IDE stopped getting confused. One important note here is that the pyodide library “.wasm” file needs to be served with mime type “application/wasm”.

const pyodide = await loadPyodide();

// we could load up micropip and install our packages and dependencies via pip - great for development, however for production I didn't want to rely on a third party server, so we load local wheels instead
pyodide.loadPackage("./assets/crc-7.1.0-py3-none-any.whl")
...
pyodide.loadPackage("./assets/webhorus-0.1.0-cp312-cp312-pyodide_2024_0_wasm32.whl")
pyodide.runPython(await (await fetch("/py/main.py")).text());
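
On that mime type note - if you serve the build output with a bare static file server, the wasm type often needs wiring up by hand (Vite’s dev server already handles it). A hypothetical minimal Node server, not part of webhorus, might look like:

// serve.mjs - hypothetical static server that maps .wasm correctly
import http from "node:http";
import { readFile } from "node:fs/promises";
import { extname } from "node:path";

const types = {
  ".html": "text/html",
  ".js": "text/javascript",
  ".wasm": "application/wasm", // without this the browser won't stream-compile the module
};

http.createServer(async (req, res) => {
  const path = new URL(req.url, "http://localhost").pathname;
  try {
    const body = await readFile("." + path);
    res.writeHead(200, { "Content-Type": types[extname(path)] ?? "application/octet-stream" });
    res.end(body);
  } catch {
    res.writeHead(404);
    res.end();
  }
}).listen(8080);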

What’s really neat is just how good the integration between Python and JavaScript is. The Pyodide folks have done an amazing job with this. Using things like getElementById from within Python just feels so wrong - but works so well.

horus_demod = demod.Demod(tone_spacing=int(document.getElementById("tone_spacing").value), sample_rate=sample_rate)

Adding audio

The last piece of this puzzle is audio. We need to capture microphone input to feed into the modem. This was so challenging to get working. Jasper St. Pierre wrote a good blog post, “I don’t know who the Web Audio API is designed for”, and I definitely felt that frustration on this project.

Now there are some extra frustrations with what we are doing. We need audio that’s as clean and unprocessed as possible. I found out the hard way that Chrome tries to optimise audio for video calls. But before we get to that we need access to the audio device. This must happen through user interaction - so we force the user to hit a “Start” button, create an AudioContext and eventually perform a getUserMedia. On most browsers this will pop up a consent dialog requesting access to the microphone. Prior to this we can’t even list the audio devices.

Now that we have consent and we’ve opened an audio device, we can list what audio devices the user has. To get unprocessed audio we have to perform another getUserMedia with constraints set to remove any filters. The ones we care about are echoCancellation, autoGainControl and noiseSuppression. However, if we blindly request these constraints turned off we’ll break Safari. So we do a getCapabilities on the device first to see what we can actually set.
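
Roughly, that dance looks like this - a sketch with my own function name, not verbatim webhorus code:

async function openCleanAudio(deviceId, consentStream) {
  // Check what the track actually supports before asking,
  // otherwise Safari will choke on constraints it doesn't understand.
  const track = consentStream.getAudioTracks()[0];
  const caps = track.getCapabilities ? track.getCapabilities() : {};
  const audio = { deviceId: { exact: deviceId } };
  for (const filter of ["echoCancellation", "autoGainControl", "noiseSuppression"]) {
    if (filter in caps) {
      audio[filter] = { exact: false }; // we want raw, unprocessed audio
    }
  }
  return navigator.mediaDevices.getUserMedia({ audio });
}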

It’s all a bit of a hassle, and getting the order of operations correct so that each browser is happy is a bit frustrating. Opening the microphone was only half the battle; we also need to obtain PCM samples to put into the modem. The only3 non-deprecated way to do this seems to be via an AudioWorkletNode. This runs a bit of JavaScript in a node / sandbox / thread. This actually sounds ideal - I could run the modem in its own thread and just send back decoded messages. However the sandbox is so limited that loading the modem into that environment is just too hard. Instead we just buffer up audio samples and, when we have enough, use this.port.postMessage to send the buffered data back to the client. Oh, and because this is the Web Audio API we need to convert -1 to 1 floats to shorts. What fun.

Comic: So how do I process the audio? You don’t, you use a AudioWorkletNode. Ok I have a AudioWorkletNode, How do I get the PCM samples? You write a map function to multiply each value by 32767. Did you just tell me to go fuck myself? I believe I did bob.
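
In practice the worklet side ends up looking something like this sketch (the class name and buffer size are mine, not webhorus’s):

// pcm-grabber.js - loaded via audioContext.audioWorklet.addModule()
class PCMGrabber extends AudioWorkletProcessor {
  constructor() {
    super();
    this.buffer = [];
  }
  process(inputs) {
    const samples = inputs[0][0]; // Float32Array of samples in the -1..1 range
    if (samples) {
      for (const s of samples) {
        // the dreaded multiply-by-32767 step: floats to shorts
        this.buffer.push(Math.max(-32768, Math.min(32767, Math.round(s * 32767))));
      }
    }
    if (this.buffer.length >= 4096) {
      // ship a chunk of PCM back to the main thread for the modem
      this.port.postMessage(Int16Array.from(this.buffer));
      this.buffer = [];
    }
    return true; // keep the node alive
  }
}
registerProcessor("pcm-grabber", PCMGrabber);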

Anyway, that’s all the tricky parts glued together. The other 80% is building out a UI with Bootstrap, Plotly, and Leaflet + figuring out how to get Vite to build it into a solution. The code is up on GitHub if you’d like to check it out, including a GitHub Actions workflow so you can see the entire build process.


  1. cffi doesn’t exactly have a preprocessor, so many of the things you expect to be able to do in a .h file won’t work, and some manual manipulation is required. ↩︎

  2. something that is trivial in plain JS but very frustrating with a bundler ↩︎

  3. AnalyserNode does seem to allow grabbing PCM samples via the getByteTimeDomainData() method, however I’m not sure you can ensure that you obtain all the samples or sync it up? ↩︎


Bad Smart Watch Auth

(tl;dr - If you are using a cheap smart watch, probably don’t. If you want to skip to the juicy bit jump to “Authentication”)

As with most complicated things in my life, it started with a simple question.

Do you know how to get the data off this watch?

Surely there’s an export feature in the app to a common file format like GPX, TCX or FIT. That’s pretty standard. Yet we couldn’t find a way. The only share and export functionality just provided an image with some stats on it.

Also, given Android’s lockdown of the file system (to the point that you can no longer back up your phone!) there was no way to access the DB to obtain the data (without root, anyway).

Investigating the Ryze Android app I quickly discovered that this is a white label of an IDO smart watch.

Screenshot of the IDO and Ryze website showing practically the same watch

Screenshot of the IDO and Ryze website showing the same smart band

It got me thinking - could I pick up one of these watches for a lot cheaper to play with? Eventually I landed on searching Amazon. The hardest part was working out if a watch was actually an IDO based device. What worked well was looking for watches that use the “VeryFit” app, which is IDO’s stock app. There are actually a bunch of watches that use this app. I found one for the low price of $39 (AUD) with free next day shipping.

Telegram message : So why did you buy a knockoff apple watch from amazon?
My partner however was a little confused. The watch came in some shiny packaging without mentioning its brand. To be honest I’m pretty impressed with what you can get for $39 these days.

Box with shiny logo that says Smart Watch on my desk

The watch on my wrist

Hilariously, I paired my IDO watch with the Ryze app and it just worked. I wasn’t actually expecting that to work but it seemed ok with it.

Would I buy one? Yes - as a Bluetooth Low Energy reverse engineering example device. Otherwise, no. Onwards to reverse engineering!

To the cloud

The first thing I looked at was syncing data from the cloud. This was pretty easy and basically just required following the same request flow as the phone. It has some drawbacks however: you still need to use the app, and you need to have an account. I didn’t progress any further with this - but it is a viable way forward within those limitations.

Getting data off watch directly

I’m going to skip past dealing with authentication for the moment as that gets a little spicy and we don’t want to ruin the standard storytelling plot framework.

Once connected and authenticated, the device presents a Bluetooth service at 0AF0 which provides several characteristics. The important ones to us are 0AF7, which we subscribe to - this is where data from the watch to the phone is sent - while 0AF6 allows us to send commands.

adb logcat and BlueSee.app were lifesavers here. The app logs a lot of information about what it’s doing and how it’s working. You can pretty much work out everything by using the app with adb logcat running, then emulating that flow with BlueSee.app. It also helps that the developers seem to have scattered some of their documentation and source code on GitHub, and some of the other vendors shipping these have done the same. Not all the source code is available, which would have made reverse engineering a little easier, but it’s a good start.

So we send commands by writing to 0AF6. The easiest first command is 0x0201, which provides some basic device info. The device will write back to 0AF7 with the returned data. For these 0x02.... commands the return data is structured like:

02 - Header
01 - CMD (get basic info)
.. - Reply data structure (device)
.. -                      (firmware version )
..
..

Data is little endian.
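
Since my downloader ended up browser based, here’s roughly what that exchange looks like with the Web Bluetooth API - a sketch, and browser support for Web Bluetooth is its own adventure:

// expand the 16-bit aliases above to full UUIDs
const SERVICE = BluetoothUUID.canonicalUUID(0x0af0);
const NOTIFY  = BluetoothUUID.canonicalUUID(0x0af7);
const WRITE   = BluetoothUUID.canonicalUUID(0x0af6);

const device = await navigator.bluetooth.requestDevice({
  acceptAllDevices: true,
  optionalServices: [SERVICE],
});
const gatt    = await device.gatt.connect();
const service = await gatt.getPrimaryService(SERVICE);

// subscribe to replies on 0AF7 first...
const notify = await service.getCharacteristic(NOTIFY);
notify.addEventListener("characteristicvaluechanged", (ev) => {
  const v = ev.target.value; // DataView
  const data = new Uint8Array(v.buffer, v.byteOffset, v.byteLength);
  console.log("reply:", data); // 02 01 ... basic device info, little endian
});
await notify.startNotifications();

// ...then write the "get basic info" command to 0AF6
const write = await service.getCharacteristic(WRITE);
await write.writeValue(new Uint8Array([0x02, 0x01]));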

V3 protocol

To get activity data we need to use the “v3” protocol. These commands have 0x33 as their first byte - “3” in ASCII.

The v3 protocol can have data split over multiple transmissions - a requirement of using this style of BLE. So that the watch and phone know a new command has started, rather than a continuation, there’s a preamble DAADDAAD. This is just an utterly weird design choice given all the other features in BLE that could have been used. There’s then a service selector-ish thing, length, command and sequence fields, followed by data and eventually a CRC.

To send a request to get the last activity I use the following command:

0x33 - '3' version protocol
0xDA - this is preamble to detect new command midway through
0xAD - preamble
0xDA - preamble
0xAD - preamble
0x01 - health?
0x10 - length (without '3' and checksum (last two bytes) - aka minus 3 on single packets)
0x00 - length
0x04 - cmd
0x00
0x0B - sequence - can just pick anything
0x01 - sequence
0x00 - start / stop
0x04 - cmd
0x00
0x00
0x00 - checksum
0x00 - checksum
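
Reusing the write characteristic from the sketch earlier, building that request looks something like this - the checksum stays zeroed since (on my watch at least) it’s never checked:

// "get last activity" request, byte for byte from the layout above
const getActivity = new Uint8Array([
  0x33,                   // '3' - v3 protocol
  0xda, 0xad, 0xda, 0xad, // preamble
  0x01,                   // service selector-ish thing (health?)
  0x10, 0x00,             // length, little endian
  0x04, 0x00,             // cmd
  0x0b, 0x01,             // sequence - can just pick anything
  0x00,                   // start / stop
  0x04, 0x00, 0x00,       // cmd
  0x00, 0x00,             // checksum (unchecked, so zeros work)
]);
await write.writeValue(getActivity);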

Checksum calculation

I think it’s just some sort of CRC16, however I’m not sure which variant, because the device I have never checks it - so I never bothered working it out.

V3 return data

The reply contains much of the same header. You can use the sequence numbers to line things up when you have to handle queueing of data and multiple requests.

Activity data is nearly always split over multiple packets, so just keep reading from the subscription until the data received matches the length provided in the header, or until you see another preamble.

Then it’s a matter of unpacking the data, noting that you’ll have arrays of varying lengths.
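
A sketch of that reassembly - note the "+ 3" to add back the bytes the length field excludes is my reading of the layout above, and unpackActivity is a hypothetical placeholder:

let pending = null;

function onNotification(bytes) { // Uint8Array from the 0AF7 subscription
  const isNewFrame = bytes[0] === 0x33 &&
    bytes[1] === 0xda && bytes[2] === 0xad &&
    bytes[3] === 0xda && bytes[4] === 0xad;
  if (isNewFrame) {
    // length is little endian and excludes the '3' and the two checksum bytes
    const length = bytes[6] | (bytes[7] << 8);
    pending = { expected: length + 3, chunks: [], received: 0 };
  }
  if (!pending) return;
  pending.chunks.push(bytes);
  pending.received += bytes.length;
  if (pending.received >= pending.expected) {
    // stitch the chunks together and unpack the fields
    const frame = new Uint8Array(pending.received);
    let offset = 0;
    for (const c of pending.chunks) { frame.set(c, offset); offset += c.length; }
    pending = null;
    unpackActivity(frame); // hypothetical - field layout varies by command
  }
}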

You might be thinking that this protocol would be stripped down to save transfer time; however, it sends stuff like pace in several different units.

Online activity downloader

I built a really rough and dirty website for talking to my device and getting the health data off it. You can play around with it at https://sprocketfox.io/ido/. It’s happy-path’d for my $39 crappy watch so it probably won’t work for most people. It generates a TCX file that can be used on Strava or Garmin.

Screenshot showing an activity being downloaded from the watch

Authentication

When I originally paired the device I scanned a QR code on the watch. It turns out this probably just had the MAC address in it or something, because after several hours of getting device info and pulling activity data I realised that I had never performed any authentication step. Resetting the app and re-pairing also revealed that there's no pairing code. Nothing. The device isn't locked or secured once connected. Anyone can connect to the device at any time and start issuing commands.

You can pull activity data, sleep data, heart rate data, menstrual data. You can perform firmware updates. I don’t have GPS in my version of the watch but I bet you can access that as well. Whatever you want you can get it - there’s no protection. I can only test with my watch, but I suspect this applies to most devices that have been white labeled.

The watch I bought off Amazon has 5000+ ratings, and the VeryFit app has over 5 million downloads. There are 10k Ryze Connect app downloads according to Google Play. From what I can tell Cove, BoAT and bfit Move are also using the same app/protocol - I imagine there are a lot more as well. These are sold in retail stores like JB Hi-Fi and Harvey Norman. There are even kids versions of these watches, which adds a creepiness factor.

I pondered how to disclose this and decided to go with the “fuck it” approach. The reasoning is that it’s going to be hard to find all the different vendors to let them know, and I suspect they won’t care. I just don’t have the time to go talk to AI chat bots, lodge support forms, have support people not understand my concern, get multiple generic replies saying “it’s not an issue, our vendor said it’s secure” etc… Is “JJMJKJ” (yes, that’s who sells it on Amazon) going to do anything? No. Not a chance. So rather than go through that entire process, get burnt out, and still be left with no actual device security, I’m taking the approach of letting people know as soon as possible.

If you’re in Australia you can request a refund as the device isn’t fit for purpose or of acceptable quality.

Git

Details about this can be found at https://github.com/xssfox/idowatch

Unexplored security fun

  • File transfer / bluetooth serial endpoints: There are some file upload and firmware features that could be interesting
  • Hijacking the app by pretending to be a watch: Perform attacks the reverse way, pretending to be someone else’s watch. Maybe you can make phone calls or take pictures.
  • Overflows: There are lots of places that smell like overflows waiting to happen.

The National Electricity Market: This post will probably leave you more confused

Unless you actively work across all layers of Australia’s National Electricity Market (NEM), there’s a high likelihood that this blog post will make you less confident in your understanding of the dynamics of the NEM.

I’ve worked at an aluminium smelter (one of the biggest energy loads you can find on the grid), an energy distributor and a power station. My first introduction was the aluminium smelter, which wanted to change its load based on the spot power price. This dabbling in the NEM led me to investigating the NEM’s open data, and I eventually set up a website to visualise it. For a long time it was one of the only free and accessible ways of seeing the spot price, generation / demand and station generation. The website’s long shut down now. I often had requests from people thinking I was from AEMO/the NEM, which was concerning…

Even with this knowledge I still made bad assumptions and mistakes when talking about the NEM. From a high level it seems simple: you have loads and generators, people bid, you get a power price. There are soooooo many more variables to it than that though. It’s highly likely that I’ll still make mistakes in this post. Also, the rules, requirements and dynamics of the market change frequently.

Complication 1 - Interconnectors

The NEM currently has the following regions: VIC, QLD, NSW, TAS and SA. I say currently as Snowy Hydro used to be its own region. Each interconnector has its own limitations based on its technology (AC or HVDC).

It’s easy to simplify this as each region being its own market, with the interconnectors just being generators and loads. That only gets you so far. Where it starts to break down is that the interconnectors pass power through them, not cash. All going well, power flows from the low price region to the high price region. However, there are a variety of factors that can cause the fiscal model of the NEM not to match the physical model - for example, data errors could cause an interconnector to fiscally run the opposite way to its physical flow. This makes people sad.

Bonus complication

Interconnectors aren’t necessarily a single physical interconnector and can be made up of multiple connectors. This is called a notional interconnector. A directional interconnector is all the interconnectors flowing in a single direction to another region.

So you have to consider not just the limits of a single interconnector, but all the interconnectors between the regions.

Complication 2 - Regional losses

Let’s make a simplified version of the NEM with a single generator and a single load. The load will be in Melbourne and the generator will be in Townsville. The power lost in the cabling between them will be significant.

The NEM “solves” this by using regional reference nodes and marginal loss factors (MLFs) to calculate how much the power is worth. Roughly speaking, if the regional price is $100/MWh and a generator’s MLF is 0.95, its energy is settled at $100 × 0.95 = $95/MWh.

Complication 3 - Grid constraints

Every piece of equipment has limits: the amount of current it can handle, how much it can handle at a given temperature, and so on. Best case, when equipment hits its limit it will trip; worst case, it will be damaged.

Given the risk, it’s important that these limits are modelled so that operating the grid can’t put a single piece of equipment at risk of exceeding them.

What does this mean? Well, even if we have plenty of generation, if it’s in the wrong place or other equipment has faulted, AEMO might need to request changes to ensure power system security - which includes things like voltage and frequency.

Check Out Constraints

The AEMO publish a constraint library you can check out. It might take a while to get through. (Correction: the link provided here is for the WEM, however it gives you an idea of the types of constraints and how many there can be.)

Complication 4 - FCAS

Frequency Control Ancillary Services. FCAS. A very important part of keeping the grid at 50 Hz. This is an extra set of rules that are followed to bring the grid back to 50 Hz. More recently it was discovered that we needed even faster recovery from faults, so we also have Very Fast FCAS.

Bonus complication

Understanding AEMO’s 3D FCAS trapezoid

A 3d line chart of the FCAS trapezoid. It’s very confusing at first glance with lines going everywhere
(it’s actually not as bad as it looks at first glance…. but still)

Complication 5 - RERT

Reliability and Emergency Reserve Trader. This is effectively AEMO having their own contracts with companies to provide generation or load reduction for when the market isn’t able to respond.

Complication 6 - Time scales

So far I haven’t talked much about bidding. Bidding happens every 5 minutes. So we have Very Fast FCAS and FCAS that act on timescales of seconds to a minute, the spot price every 5 minutes, and RERT that happens over much longer periods. Forecasting also happens at different time scales. At any one time there are a lot of different processes happening to determine who pays what and how much.

Watt Clarity has a good overview.

Bonus complication

You can nag the ACCC to make the 5 minute rule not apply to you.

Complication 7 - Demand

There’s a lot of demand we don’t see. There are a lot of contracts we don’t see. It’s hard to know why a generator is bidding at a specific price.

Demand from loads often changes based on the spot price as well.

Funnily enough, while doing some research here I found Watt Clarity’s explainer on demand, which includes many more complexities!

Bonus Complications - Demand and generation in other contracts

Many contracts existed before the NEM. See also Long-term Energy Service Agreements.

Complication 8 - Administered Pricing

When the sum of spot prices over a rolling window exceeds a fixed threshold, administered pricing kicks in. This is a damage control system to prevent the spot price from bankrupting everyone.

Complication 9 - Outages are complex

A single unplanned outage can introduce a whole bunch of new constraints and limits. Suddenly the model has to reconfigure itself and make the network safe by imposing the correct constraints. Often outages are multi-faceted - e.g. a storm impacting multiple transmission lines, generators and loads. Or an under frequency / over current event could cause generators to trip in parallel unexpectedly. You might lose six units at a power station due to a workers’ strike.

Complication - Closing….

The NEM is complex. I’ve only just scratched the surface here. Like the very very very surface. If you want to learn more, much of the NEM is documented by AEMO and published online. Watt Clarity also has very good learning resources and explains interesting events.

Bonus complication

Understanding Infoserver DB format