Identity in Australia is a mess and it's putting people at risk

This post is going to be broken up into 3 parts. Why you shouldn’t listen to me. A brief history and current status of identity in Australia. Why the current state is a problem and some possible ways out of this mess.

Part 1. Don’t listen to me.

This topic is so far outside my typical domain that I’m going to be missing a bunch of issues, concerns and subtle quirks. This is an opinion piece and shouldn’t be taken as fact. I wasn’t even born in 1985 (this is important later) so I can only go by information passed down to me.

Where I do have experience is that I have had the joy of updating all my identity documents. I have been a victim of identity theft. And I have had the joy of myID (formerly myGovID) not working. I also work in tech and have often consulted on design decisions regarding verification and identity platforms.

Part 2. WTF happened and why is this like this.

For non-Australians reading this blog (Australians can probably skip this paragraph): we have a system for authenticating to government services called “myID” (I’m going to ignore just how horrible this name is). myID is an app that can be installed on a phone and is tied to an email address. A government website (and soon non-government ones) can use myID to authenticate a user. A user can have one of three levels of identity strength: Basic, Standard, and Strong. Basic is practically useless - it means you installed the app and filled in some forms. Standard means you have validated some documents like a drivers licence, passport or Medicare card. Strong is where your photo is validated. The thing about myID is that many government services don’t require it. You can often sign up for an account without a myID.

Now when you set up a new phone with myID you have to validate all those documents again. Why? Surely I could just sign into the account again. Or maybe another question is - why is myID even a different account from myGov?

My assumption here is that we need to take a trip all the way back to 1985, before the concept of a digital ID was even considered. The “Australia Card” idea was floated. The idea of the Australia Card was to replace several government identity solutions and create a single solution that worked across both federal and state services. A big part of the Australia Card system was to crack down on fraud, tax evasion, terrorism and illegal immigrants. Effectively this was meant to be a magical cure for society. Every transaction would need an Australia Card number.

Now I should be clear here, I do NOT support the Australia Card proposal. The amount of tracking and overreach with that solution is immense, and I’m glad the proposal was eventually dropped. However it wasn’t without consequence.

The tax file system was extended, and every government service went along its merry way using its own identity system, often with soft links to others.

At common law an adult may assume any surname by using such name and becoming known by it. A surname is not a matter of law but a matter of repute … The law of this country allows any person to assume and use any name, provided its use is not calculated to deceive and to inflict pecuniary loss.

New South Wales Law Reform Commission, Names: Registration and Certification of Births and Deaths, Report No 61 (1988)

This leads to an interesting outcome in Australia. We don’t really have a concept of a “legal name”. It’s a bit more like “whatever the service you’re trying to use is willing to accept”, and since federal and state government services don’t have a central identity system, you instead have an identity at a service level. Your name might be different (either intentionally, through a bookkeeping error, or through a system design issue) on all of these:

  • State transport department (eg, VicRoads)
  • State services (Service Vic, Service NSW)
  • Medicare
  • myGov
  • Tax office
  • ACMA
  • CASA
  • Local council
  • Births / Deaths / Marriages
  • Passport Office
  • Many many many more

And this is before we even get to private companies trying to validate ID.

Part 3. Why is this a problem

A large part of the Australia Card (valid) opposition is that data tracking is an invasion of privacy. Additionally, the threat of cutting off services to those who refuse to use it or don’t have access to it is hostile. It risks cutting off communities from services they need to live.

At the end of the day though, even without the Australia Card, the government still kind of “won” and fucked everything up. Sure, not every transaction was tracked, but we have tax file numbers and business transactions are heavily tracked. Banks are required to report suspicious transactions, as low as $10,000. Every service we interact with today asks for a birth certificate number, or drivers licence. Our laws require companies to store KYC (know your customer) data for long periods of time. The government will happily destroy hundreds of thousands of lives through shitty data matching - regardless of any sort of Australia Card.

When my identity was stolen in 2018 and used for phone toll fraud I was advised to update my drivers licence number (side note here, if you don’t have a drivers licence - identity becomes hard mode in Australia. Fuck cars.). This is good advice - however - the transport departments create drivers licences to identify who has a drivers licence. Yes, a bit of a tautology, but the key point here is that the transport department is not designing drivers licences to be a generic identity document. Their use case is for police to check if you are allowed to drive. This has likely changed since my identity theft incident due to more recent data breaches, but at the time QLD did not allow changing the drivers licence number (CRN) unless a police report recommended it - and police in Australia do not investigate or write reports about fraud that has happened from overseas (amazing I know). I was unable to change my CRN. From the point of view of Transport and Main Roads QLD, my drivers licence could still be used for the purpose it was designed for. They would happily reissue me a new drivers licence with the same number…

So this is why myID asks for a bunch of documents to validate your identity. It’s trying to be the Australia Card scheme, without being the Australia Card scheme. A trojan horse of central identity. But it falls short because of our mess.

  • Often fails to link identities because the data is slightly wrong or different between services
  • Name changes can only be verified in NT, SA, TAS and ACT. lol.
  • Even name changes in those states don’t work if you don’t have a name change certificate (often the case for people who have updated their birth certificate during gender transition)
  • You used different names between services
  • We’ve created a link between all the accounts, thus defeating the point of opposing Australia Card
  • Fraudulent users can just choose not to use myID in many cases and fall back to uploading legit looking documents
  • myID’s approach is a security nightmare in its core design
  • myID branding and usage in gov services is confusing at best. There’s no way for a user to learn what is safe and normal
  • DID YOU KNOW THAT MYID ISN’T EVEN THE ONLY POSSIBLE PROVIDER??!? That’s right, we might see more. Because free market and such. Australians need to know that they aren’t just looking for “myID” but also the “Australia’s Digital ID System” tick. Make sure that you’re signing into an “Australia’s Digital ID System” provider by checking this completely unhackable image. I swear to god.
Digital ID accreditation badge

So today what happens when a user is trying to access government services is a mishmash of identity verification methods, sometimes myID, sometimes optional, sometimes custom built. Users are trained to just send pictures and numbers of their identity documents to sites and services without much thought. Every state has built its own poorly implemented digital drivers licence.

Data matching is rife. And fuck ups are common.

Then we have the private sector.

Hotels will photocopy, scan, and in some cases save into a public file share your drivers licence when you check in. The same identity document that can be used to access all other gov and private services. When identities get compromised we can’t even trace back to where they were stolen from.

With the government imposed social media ban for under 16s more and more Australians are having to validate their age online - often using the same identity documents that allow all access to their life. Each week we receive a “privacy and security of our users is a top priority.” email and wonder what our identities will be used for this time.

The state today is:

  • The government still doesn’t care about our privacy and tracks us anyway
  • All our identities are linked whether we like it or not
  • For many people, proving our identities is hard or painful
  • The digital ID solutions today don’t provide a secure way of verifying our identity or anonymously (both parties) proving our age to third parties
  • Companies are expected to keep copies of our identity documents

The political nightmare that was the Australia Card has naturally tainted any possible fix to this problem, but I think we can fix this.

  1. (the big ask) Government repairs its social contract, stops treating its citizens like criminals and actually provides services to citizens. Provide consent models and allow people to opt out of things. A digital ID should be optional. Using one should be privacy preserving and require consent.
  2. myID needs to be the one and only way of proving identity online, both for gov and private sector
  3. myID should provide a consent model of what each service can see. In most cases myID should only allow verify, and not exchange (at the moment myID can provide data to every gov service)
  4. myID should not provide any mechanism that could allow data matching of identities between services and should minimise the ability to identify users when performing verification tasks like age checks.
  5. Private sector should be legally required to provide myID verification option. (there’s some caveats and concerns around this. The intention here is to provide a known secure way for people to verify identity while ensuring the private sector doesn’t store documents)
  6. Drop drivers licences as a form of identity and allow myID to be a real identity, not just a meta identity.

I think it’s possible to build digital ID systems that help users be more secure and more private online, but it also relies on the government not fucking it up - which I know is a big ask.

Part 100 point check

The 100 point check “system” is the dumbest concept. Let’s compare some 100 point checks across services:

Renting - seemingly different per agent
Passport: 30
Drivers licence: 40
Birth Cert: 10 ????
Bank statement: 20

NSW Health, AFP
Passport: 70
Drivers licence: 40
Birth Cert: 70
Bank statement: 35

AUSTRAC / Banks
Passport: 70
Drivers licence: 70
Birth Cert: 70
Bank statement: 25

Equifax
Passport: 70
Drivers licence: 40
Birth Cert: 70
Bank statement: 25

Why is this like tennis scoring? You could normalise all these values down. No point system has an identity document that is worth “1” point. Further to this, nearly all the 100 point systems require $x category A documents and $y category B documents. The numbers don’t even matter at that point!
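
To show what normalising down looks like, here is a rough Python sketch that divides each scheme's values (and the 100 point target) by their greatest common divisor. It only uses the four documents listed above, so treat the output as illustrative: real schemes include other documents whose values could change the divisor. The `SCHEMES` and `normalise` names are made up for this example.

```python
from functools import reduce
from math import gcd

# Point values copied from the lists above. Only these four documents are
# included, which is an assumption: real schemes list more document types.
SCHEMES = {
    "Renting": {"Passport": 30, "Drivers licence": 40, "Birth cert": 10, "Bank statement": 20},
    "NSW Health, AFP": {"Passport": 70, "Drivers licence": 40, "Birth cert": 70, "Bank statement": 35},
    "AUSTRAC / Banks": {"Passport": 70, "Drivers licence": 70, "Birth cert": 70, "Bank statement": 25},
    "Equifax": {"Passport": 70, "Drivers licence": 40, "Birth cert": 70, "Bank statement": 25},
}
TARGET = 100  # every scheme asks for 100 points


def normalise(points, target=TARGET):
    """Divide every value and the target by their greatest common divisor."""
    divisor = reduce(gcd, points.values(), target)
    scaled = {doc: value // divisor for doc, value in points.items()}
    return scaled, target // divisor


for name, points in SCHEMES.items():
    scaled, target = normalise(points)
    print(f"{name}: need {target} points from {scaled}")
```

With just these four documents, the renting check collapses to a 10 point check and the other three collapse to 20 point checks.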


YaCy - The search engine I thought I would love

Do you want a search engine that works like old Google? Distributed and run by the people? Free and open source? No ads or tracking? Mastodon for search engines????

I’m going to introduce you to YaCy, which is exactly that. Then break your heart and tell you why it doesn’t work.

YaCy is a distributed search engine and crawler which uses similar tech to torrents (distributed hash tables). It allows for anyone to start crawling websites and running a search frontend. The peer to peer nature of it means that while your node might not have crawled a specific site or information, another node might have - allowing for searches to be distributed across the network and hopefully returning a result for your query.
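
As a rough illustration of the distributed hash table idea (this is a generic consistent-hashing toy, not YaCy's actual code or protocol), each word is hashed onto a ring and the peer sitting at or just after that position is treated as responsible for the word's index entries. The peer names, the 32-bit ring size and the function names are all assumptions made up for this sketch.

```python
import hashlib

RING = 2 ** 32  # size of the toy hash ring (an arbitrary choice for this sketch)


def key_for(text: str) -> int:
    """Map a string to a position on the ring using SHA-256."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def responsible_peer(word: str, peers: list[str]) -> str:
    """Pick the first peer at or after the word's position, wrapping around."""
    word_key = key_for(word)
    # Clockwise distance from the word's position to each peer's position.
    return min(peers, key=lambda peer: (key_for(peer) - word_key) % RING)


peers = ["peer-a", "peer-b", "peer-c", "peer-d"]  # hypothetical node names
for word in ["bicycle", "identity", "search"]:
    print(word, "->", responsible_peer(word, peers))
```

Any node can run the same calculation, so a node that has never crawled a page can still work out which peer to ask about a given word.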

Some time ago I wondered, on modern internet connections and today’s cheap storage, how hard could it be to run your own search engine? YaCy popped up in my searches and I thought I would give it a go. Estimates for Google’s index size vary from 30 billion to 400 billion pages. Those might seem like big numbers but for computers these are tiny. Also, to make things easier we can limit ourselves to purely text documents, and if we avoid indexing useless or low value websites we can bring that number right down. My finger in the air estimate is that a search engine with 3 billion indexed high quality pages is going to be just as useful for most people.

Number of documents indexed, showing 300k indexed in the last day

With that in mind I bought a beefy box from OVH, configured YaCy and set it crawling the web. In less than a month it’s indexed 24 million pages, while the entire YaCy network has around 2.4 billion indexed documents. My node can easily index 1000 documents a minute if given the chance. One of the neat things about YaCy is the concept that a website can run their own node to provide search results for their site search and any other remote queries on the network. For example Wikipedia could run their own official Wikipedia YaCy instance.
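
For a sense of scale, here is some back-of-envelope arithmetic using only the figures above. It assumes the node could sustain the 1000 documents a minute peak around the clock, which is optimistic, and the constant and function names are just for this sketch.

```python
DOCS_PER_MINUTE = 1_000        # peak rate quoted above, assumed sustained 24/7
INDEXED_SO_FAR = 24_000_000    # pages this node has indexed
TARGET_PAGES = 3_000_000_000   # the finger in the air target

MINUTES_PER_DAY = 60 * 24


def days_to_index(pages, docs_per_minute=DOCS_PER_MINUTE):
    """Days of non-stop crawling needed to index the given number of pages."""
    return pages / (docs_per_minute * MINUTES_PER_DAY)


print(f"24 million pages: {days_to_index(INDEXED_SO_FAR):.1f} days")
print(f"3 billion pages: {days_to_index(TARGET_PAGES) / 365:.1f} years")
```

At that rate 24 million pages is about 17 days of crawling, while 3 billion would take a single node a bit under 6 years, which is why sharing the crawl across a network of peers matters.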

It all falls apart when you try to use it however.

Search results from YaCy. Many of which are random zip and tarball files

Search results are often lacking useful content and more often than not unofficial/low quality pages will rank higher than official or trusted sites. Often I just get random FTP servers, tarballs and zips. Now there’s probably some settings I can tweak to make site ranking a bit better - but it’s not a good start.

Slight tangent here. While playing around with YaCy I also found SearXNG. It’s an open source metasearch engine. You configure multiple search backends and when you search it performs that search across all of them. So you can get Google, Bing, YaCy, DDG all in one. If using YaCy I suggest setting this up. I’d love for the results from other search engines to then get fed back into YaCy to index.

SearXNG results which shows many search engines results

This brings me towards why YaCy isn’t really usable today (or probably ever). Google Search from yesteryear just wouldn’t work today. A lot of content is behind walled gardens - such as Facebook and X. But even ignoring these areas, modern webpages are JavaScript heavy - often empty pages whose content is loaded through API requests.

I hear you screaming “what about SEO?!?”. Apart from SEO being silly, it’s now GEO, Google Engine Optimisation. Displaying or rendering content specifically for Google. If you attempt to scrape websites using the YaCy user agent you are often left with disappointment. If you think you can just switch to a Googlebot user agent, you’re left being blocked by WAFs and Cloudflare for not coming from the right IP / AS number, or by other types of fingerprinting. Places like StackOverflow try very hard to not have their content scraped as it would destroy their business model.

Today we have a new problem, AI scraping. YaCy practically appears no different to an AI scraper when using a Googlebot user agent. The AI scraping shitstorm has effectively stopped another search engine crawler from existing.

Regardless of the crawling issues, the way we use search engines, and the quality they provide has also shifted. YaCy’s basic search algorithm just isn’t suitable. The key component to this is having the search engine understand not just the word but the context of the word. For example “monitor” could mean a computer monitor, or to watch something. The search engine should use the other words in the query to determine which pages relate to the type of monitor. This is especially important given the rise of keyword stuffing and AI generated slop.

Side note about AI slop. One of the things I have found surprising is that YaCy has very rarely given me results for AI slop. I’m not sure if this is because AI slop is less indexed, or the SEO optimisation that AI slop performs isn’t effective with YaCy, but it is an interesting observation.

We also expect additional features, for example I often search for “time in $x” and “weather” to get quick previews. Maybe for these I should move to tools outside a search engine.

YaCy itself is dated. It’s a fairly old project and development has slowed down. Slow development itself isn’t a problem but its design and architecture leave a lot to be desired. I get the vibe that there’s probably some security issues hidden in the old code base waiting to be discovered. The other part is that I don’t think there’s been enough attention to privacy and moderation. The controls they have today kind of work, but it’s not something I’d suggest using if you want your queries to be kept private. It’s very easy for crappy spam to end up indexed as well.

My other concern is that if everyone was to rush out and install this software, we’d have a ton of people scraping popular (or even non popular) websites like Wikipedia unnecessarily. How to balance freedom and coordination here is a little tricky.

That doesn’t mean YaCy is all bad. It can run in several modes - the one I’ve been talking about is the “community based web search”. While I haven’t tested it out yet, there is also “Intranet indexing” which you may find useful for indexing your local file server.

YaCy remains a project I want to succeed and work well. The dream and concept is great. Reality unfortunately places it in the not very useful category.


Around the Bay 2025 100km

If you follow my partner’s YouTube channel, Drop Table Adventures, you’ll be well aware that they have been entering the “Around the Bay” cycling event for several years now. Specifically “The Classic” route. This is a 220km circuit….. around the (Melbourne) bay.

This year on a bit of a whim, and with 5 days until the event, I decided to buy a ticket for the 100km route. Then I checked the weather. I should have checked the weather first.

Around the bay 100km map showing the path from Albert park to Williamstown and out to Mordialloc

The 100km route leaves Albert Park, over the West Gate bridge to Williamstown, back across West Gate, follows the bay towards Mordialloc before returning to Albert Park.

A 100km ride is big for me. The last time I rode 100km was in high school. Made even more challenging in that I hadn’t trained for this. But I felt strong when I purchased the ticket.

Challenges:

  • Scoliosis - riding long distances has historically been extremely hard on my back
  • No training - I did a 40km social and slow ride about a month prior, and a 20km about 2 weeks prior
  • Still somewhat recovering from my last bike crash
  • I don’t really have much experience in group riding
  • 3 days prior I managed to give myself food poisoning (stomach related fun seems to be a recurring problem when I enter events)
  • The 100km route requires crossing the West Gate twice - so a bit of elevation change compared to just riding the bay trail

On my side:

  • Have been keeping fairly fit, swimming, short runs
  • Competing in the commuter cup 2-3 days a week

Prep

To give myself the best shot of making the distance I stripped my bike of the lock, front and rear racks. Given the weather I left the mud guards on (more of a courtesy to other riders but seems I was alone here). With no racks, I had a small saddle bag which carried a toolkit, bacon strips, tyre levers, a tube, my phone and a pump.

For hydration and fuel I carried… an AARNet water bottle. I decided to rely on the rest stops for fuelling needs. In hindsight I probably should have packed at least a single gel.

Go

Since I had traveled with Droppy and Alex to the event, I was very early for the 100km start. This meant I found myself at the start of the pack. It’s not a race so it doesn’t matter so much, but it certainly grouped me with some riders that were probably a little bit fitter than me.

Leaving the start line I was filled with “oh god what have I gotten myself into” as I kept up with some very expensive bicycles being ridden by people that at the very least looked the part. I on the other hand was wearing a loose fitting running top and running tights with my budget gravel bike. There was some slight drizzle but it was otherwise not bad conditions to ride in. Moving speed was around 30km/hr.

Riding up the West Gate Bridge

Hitting the West Gate though, that speed dropped right back to 16-20km/hr, but I was still keeping up with the main pack. The pack was cruising at closer to 28km/hr after the bridge which was a bit closer to what I could handle.

At 18km we hit the first rest stop - I decided to keep riding through. The pack thinned at this point but I clearly wasn’t the only one continuing through. Back over the bridge and on the other side I felt that my legs were a bit more like Jell-O™ now (40km in). Unfortunately the pack split up a bit due to traffic lights and eventually I was practically alone. At this point I was considering skipping the next stop and continue straight onto Mordialloc (70km in).

Riding along beach road, waving at camera

However at 45km logic prevailed and I realised I needed fuel and a top up of my water. So when the 50km Elwood rest stop came up I entered. At this point I found myself in an interesting situation - I was too fast to be in the casual riding groups and too slow to keep up with the serious groups. At Elwood rest stop I entered while they were practically still setting up. I think I saw two other cyclists. I was still able to grab a banana and some water. Was hoping that there would be some Hydralyte. Onwards!

This next section between Elwood and Mordialloc was horrible, the worst even. I’m not sure if it was just because I was low on energy or the way the elevation changes occur, but speed dropped significantly on the slight uphills. The downhills didn’t feel like they lasted for long. A couple of packs overtook me here, I tried to keep up for a bit but just couldn’t. Regardless, I was still averaging around 25km/hr. I was riding alone for pretty much all of this section.

Arriving at Mordialloc (70km) I was sore. I knew I had to spend some time at this stop. I grabbed some crisps, trail mix, and a gel. While eating I spent some time sitting and standing to give the muscles that keep my back in check a break. Legs were feeling ok, but not strong. During this time I overheard that rain would be coming at 11am. At this point I decided that maybe, just maybe I could make it back dry.

While the Elwood/Mordialloc stretch was the hardest, the Mordialloc to finish stretch was the sorest. At 80km my back just said no. I considered stopping for a break. I felt I was close enough that I could just suffer a bit more.

Riding back towards the start, it’s looking a lot more wet

I noticed on my bike computer that my average speed was around 25.3km/hr. This shocked me a little as I went into this thinking that my speed would be closer to 20km/hr. My immediate thought was “maybe I can get 100km in under 4 hrs”. Could I keep up my pace?

No. By 90km it had lowered to 25.0km/hr. All was not lost though, the elevation flattens out and I was able to keep my speed up. Just one tiny problem. The much more casual 50km riders were now mixed in, combined with more traffic lights. The average speed kept slipping lower and lower, the rain started, and my focus became just making it back. I’m glad I made this decision - the traffic was just too busy and the interruptions too great to make it.

Riding through one of the finish archways

Coming in to Albert Park for the finishing lap, it was wet. I was cold from waiting for traffic lights. My Garmin read 99.3km. This was going to be unsatisfying. So instead of ending the ride, I quickly funneled through the finish then rode to our car. Making up the missing 700m + giving me shelter from the rain.

The pain

Surprisingly my back recovered fairly quickly. Just being off the bike was enough. My right leg however was not. Any slight twist on it when load bearing was extremely painful. I was extremely concerned that I had undone a bunch of recovery.

The next day was a relief. While there was some amount of “something’s not right here” feeling, the issue had mostly gone away. Hopefully this trend continues. I generally feel like my body should feel a lot worse than it does today.

The bike

So how did the bike hold up? It was a dream with one tiny hiccup. Shifting became unreliable about 50km in. I think this is just a cable tension issue and is easily fixed, but I didn’t want to risk making it worse during the day. I figured I could just double shift then back one when it didn’t change. Could be as simple as the wheel sitting slightly differently after being reassembled post transport.

Again?

I won’t say no, but it’s unlikely. This really pushed the limits of my back, just a bit more than I feel comfortable with.

The event was extremely enjoyable. Maybe I’ll sign up for the 50km instead.

The stats

100km Time 4:07:13
Average speed 24.2 km/hr
Ascent 538 m
Passing Vehicles 204
Pedal strokes 17,992

One more thing

Remember that desire to get under 4 hours? Well, since I get to set the rules… if you look at Garmin’s recorded moving time (aka, removing all the traffic light stops) we get:

Moving time 3:59:18