This is a pretty great long-form post about the structure of Bluesky, and how it’s largely just pretending to be decentralized at the moment. I’m not trying to take a dig at it. I’ve enjoyed the platform myself for a while, but it’s good to learn more about how it actually works.
This article was shared on Mastodon by its author.


I saw a comment about this the other day saying you’d need over 4 terabytes of storage to run a Bluesky instance of your own, and that it’s growing every day. That’s fucking insane.
That’s addressed in the blog post. She was saying it was currently 5TB and growing. So anyone wanting to set up a server would need to pay for that space, and that’s not cheap.
It’s also not, like, unattainable
But it’s definitely well beyond what any hobbyist is going to set up on a whim
Meh, homelab storage and FTTH are reasonably cheap. Or rented iron like Hetzner.
I thought it takes that much storage to run a relay, not an instance. (Which Bluesky calls a “Personal Data Store.”)
Maybe this is just my ignorance showing, but this seems like a really archaic way to design something like this in 2024. Dump all the data into a central repository and then have clients pull from that?
Bluesky (well, atproto; Bluesky is the Twitter clone running on atproto as a demo app) doesn’t actually have instances in the Mastodon sense. It’s a more modular design for better scaling (because it was designed from the start to replace Twitter).
Here’s a good article with illustrations: https://atproto.com/articles/atproto-for-distsys-engineers
It’s not exactly decentralized if you use only the official relay, just distributed, which is a different concept entirely.