Please Help Me With Activity Pub
Blender Dumbass
June 08, 2025
#activitypub #fediverse #mastodon #bdserver #python #programming #webdev #federation #API
Please excuse me for not recording a voice version.
This article is published on a website which is powered by
BDServer. And I'm trying to make this website support ActivityPub, so you could for example, subscribe to me from your
Mastodon account. Yet it is easier said than done.
If you have any experience with ActivityPub, web-development or Python, please consider helping me. We have
BDServer Matrix Chatroom.
Now let's go over the stuff I understand / did so far.
How do computers communicate?
Contrary to the beliefs of most internet users these days, the internet does not live in the cloud. When you go to a website, you are not downloading it from the ether, or the air, or an actual cloud. You are not communicating with space. You are calling another computer.
You know how phones work? ... Wait... Everybody has a cell-phone these days, damn!
You know how a string-telephone works? The thing where you take two cups and string them together using a rope. And when it's tight enough, you can speak into one of the cups, and on the other side a different person can hear you?...
So this works because sound is a wiggle of air molecules. It's a vibration. It's a movement of air, that the ear-drums in your ears are tuned to feel.
So when you have a tight string, and something that can focus those vibrations onto the string ( aka a cup ), the string will start wiggling, roughly in the same way as the air does. And then the other cup will wiggle with the string, and so it will wiggle the air around it, and therefore produce sound. Which sounds very similar to the sound on the other end of this telephone.
A normal electric telephone is doing the same thing, but instead of a string it uses an electrical wire. And instead of cups it uses a microphone ( which converts movement of air into electrical charge ) and a speaker, which is a membrane that moves air, powered by electricity.
Internet is two computers using a telephone.
Of course sound is pretty much useless for computers. So instead they encode data directly into electrical signal. They send the ones and zeros. For example 1 could be slightly more voltage, and a 0, slightly less voltage. And so on. And using zeros and ones you can communicate numbers. For example you can agree that you communicate in chunks of 8 distinct bits. Each is either a 0 or a 1. And then some combination of those bits is some number.
Actually you can do even better: you can use the binary number system, which makes a hell of a lot of sense. You know how you count to 9 and then you add a digit in the beginning to make it 10? So you count 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 and then there is no single symbol for ten. You have to use 1 and 0 to write 10. Because we only have ten distinct symbols for it. But we could have only nine symbols and the same logic would still apply. For example 0, 1, 2, 3, 4, 5, 6, 7, 8 and because we have no "9" we write it as 10. And then 18 goes to 20 and 28 goes to 30. And so on and so forth. So we could have any number of symbols. Say we have only 0, 1, and 2. Well let's count: 0, 1, 2, 10, 11, 12, 20, 21, 22, 100... you get the point. With binary we only have 0 and 1. So we count: 0, 1, 10, 11, 100, 101, 110 and so on and so forth.
Using this system we can make the computers agree that any number between 00000000 and 11111111 is a number between 0 and 255. Because if you do the math, 255 ends up being 11111111 in binary, or for example 00000101 would be 5. And then we can assign text characters to those numbers. So A could be 1 and B could be 2 and so on. Actually the real way to get text through requires you to know the text encoding. In ASCII encoding A is 65, because they decided that some common control functions ( like a new line ) should be a part of it. And things like ! ( which is 33 ) come before A.
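If you want to poke at this yourself, here is a small Python sketch of the same idea. The function names are made up for this example; only the built-ins ( format, int, ord ) are real.

```python
# Convert between decimal numbers, 8-bit binary strings and ASCII characters.
# The helper names below are invented for illustration.

def to_bits(number: int) -> str:
    """Return the number as a string of 8 binary digits."""
    return format(number, "08b")


def from_bits(bits: str) -> int:
    """Read a string of binary digits back into a number."""
    return int(bits, 2)


def text_to_bits(text: str) -> list:
    """Encode ASCII text as a list of 8-bit chunks, one per character."""
    return [to_bits(byte) for byte in text.encode("ascii")]


print(to_bits(5))             # 00000101
print(from_bits("11111111"))  # 255
print(ord("A"), ord("!"))     # 65 33
print(text_to_bits("AB"))     # ['01000001', '01000010']
```

So the letters "AB" travel over the wire as sixteen ones and zeros, and the computer on the other side turns them back into letters using the same table.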
In any case, using this method we can use electrical signals and a few wires to connect multiple computers together. And so if they have the software that knows to listen to what's coming in on the wire and to respond to it, you have yourself a server.
So when you opened this article in your browser, your browser software told your computer to send my computer a request, kind of like on a telephone, but way faster, because computers are not as slow as humans with responses. My computer knows to respond to this request by sending you back this article. And then your browser knows to display it for you so you could read it.
What is Activity Pub?
Mastodon uses ActivityPub to communicate with other Mastodons. I guess... ah... well. I will need to explain that too.
Mastodon is a social network. Kind of. You can say that really the social network is ActivityPub, or more broadly the Fediverse. Mastodon is a part of it.
With regular social media, say Ex-Twitter, when you write a Tweet ( I don't know if they are still called "tweets" ), you are communicating with a server ( a computer ) at the Ex-Twitter company headquarters somewhere ( maybe, if they don't rent a computer somewhere else ). Sorry for my pedantry, but there might be some reader that doesn't know the nuances.
Basically all the Tweets are stored on one computer ( or like at one place, it is probably more than one computer ). Tweets from you, from other people, from Elon Musk. Are all stored in one place. And Elon Musk can mess with the files on that server thingie. So maybe next time people load Ex-Twitter in their browser, your Tweet doesn't look the way you intended, or it isn't there at all because Elon Musk ( or anybody else who has this access really ) deleted the wrong folder, or intentionally messed something up.
ActivityPub is trying to solve this by making it so you can control your side of the bargain, while other people can communicate with you, like on a normal social media.
Mastodon alone has something like
almost 9 million users as of now, but that is not all in one place. Those 9 million users are spread over almost 10 thousand servers. Each is operated by somebody else. For example
mastodon.social and
mastodon.online have reported 90,849 users in their
2023 annual report. Which are not 9 million. The other people are coming from other servers, operated by other organizations. Mastodon developers even operate a website that shows you
all those servers so you can know what is going on.
With this platform there is not only the communication between you and the server. But also between one server and another server. I have an
account on mastodon.online but if you go to my followers, you can see people from all over the place. You will see places like
mastodon.social and
universeodon.com and
social.firesidefedi.live ( which is by the way not a Mastodon at all ).
And so when I post something to my account on
mastodon.online the server of
mastodon.online sends this message over to all the other mastodons. Using a predefined thing called an ActivityPub protocol.
You know how I talked about binary and how two computers can agree that 65 is A in the Latin alphabet? Well, ActivityPub is a set of rules to make this kind of interaction between servers possible, not just to communicate simple text messages, but to communicate social media concepts. Like a post, or a like or a user account. And stuff like that.
This is how somebody from
social.firesidefedi.live which is not even a mastodon can follow me on
mastodon.online. The software they are using is called
GoToSocial and it is a different implementation of something that uses ActivityPub. And because both mastodon and gotosocial understand the language of ActivityPub, people from both of those things can interact with one another.
So I thought it would be great to add this ActivityPub thingie to blenderdumbass.org so you could, say, comment on this article, from your mastodon account, or something.
Actual Activity Pub
Trying to read the
ActivityPub technical manual struck me as extremely unhelpful. They don't do what I did in this article. They don't explain anything. It is a bit more helpful now, when I understood the fundamentals, I suppose. But it is not a good place to start.
After looking at a few places, one being a very well put together
Tutorial by the Mastodon CEO himself, I started understanding it somewhat.
The state of it ( on this website as of now ) is this. When you try looking up
@blenderdumbass@blenderdumbass.org
from Mastodon. You will not find it there. But I do see proper Activity Pub related requests to my server.
On the other hand trying both
lemmy and
misskey, I can find this account. The user name and bio renders normally. From misskey I can even follow this account. And the follow goes through and gets accepted. If you go to
@blenderdumbass on this website, you would see the test followers from misskey.
I see it is trying to request the list of followers and the list of people I follow. And based on what I know so far, I should have implemented those 2 correctly. But that list doesn't show up in misskey. The same with my outbox, which is the list of articles I made.
The source code that I use for this website's ActivityPub could be found using this link:
https://codeberg.org/blenderdumbass/BDServer/src/branch/main/modules/ActivityPub.py
JSON
JSON is a very simple text-based data format. Let me explain JSON to you very quickly.
{"hello":"world"}
Here is a very basic JSON database. It has only one key ( as in one variable ) called "hello" and the data of this "hello" key is text that says "world".
Here is a bit more interesting example. Let's make 2 JSON databases each representing some data about a person.
Here is person A.
{
"name":"Steven Spielberg",
"profession":"Film Director",
"films":["ET", "Saving Private Ryan", "Schindler's List"]
}
And here is a person B.
{
"name":"Richard Stallman",
"profession":"Programmer",
"programs":["Emacs", "GCC", "GNU"]
}
You can see those are similarly composed data things. Each having keys like
name
and
profession
and then some data that comes with it. Now in this example, I did something that would work but that is kind of a bad idea. Steven Spielberg got a list of
films
and Richard Stallman got a list of
programs
. Instead we could generalize and have it be
works
. Like so:
{
"name":"Richard Stallman",
"profession":"Programmer",
"works":["Emacs", "GCC", "GNU"]
}
This way, a program that will read this file, will need to do less work. It doesn't need to know, based on the professions what kind of key it needs to access. It just knows that there should be a data-point called
works
and that's it.
Here are some general things you can do with JSON. You can stuff databases inside of databases like so.
{
"name":"Blender Dumbass",
"likes_a_lot":{
"name":"Steven Spielberg",
"profession":"Film Director"
}
}
You can do lists like so:
{
"name":"Blender Dumbass",
"likes_a_lot":[
"Steven Spielberg",
"Richard Stallman"
]
}
Or you can do lists of data...
{
"list_of_people":[
{"name":"Blender Dumbass"},
{"name":"Steven Spielberg"},
{"name":"Richard Stallman"}
]
}
And you have other types of data, than just text, like so:
{
"name":"Steven Spielberg",
"film-director":true,
"films":{
"name":"ET",
"year":1982
},
"programs":null
}
Numbers you can write without the "" quotes, and you can also use either true or false for data that could be either yes or no. And null for missing data.
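Since BDServer is written in Python, here is a quick sketch of how Python reads this kind of data with its built-in json module. The variable names are just for this example.

```python
import json

# The same kind of data as above, as a piece of text.
raw = """
{
    "name": "Steven Spielberg",
    "film-director": true,
    "films": {"name": "ET", "year": 1982},
    "programs": null
}
"""

# json.loads turns JSON text into normal Python dictionaries and lists.
person = json.loads(raw)

print(person["name"])                    # text becomes str
print(person["films"]["year"] + 1)       # numbers become int, so math works
print(person["film-director"] is True)   # true / false become True / False
print(person["programs"] is None)        # null becomes None

# json.dumps goes the other way: from Python data back to JSON text.
print(json.dumps({"hello": "world"}))
```

A nice side effect: json.loads refuses broken JSON ( a missing or an extra comma, for example ) and raises an error, so it doubles as a quick way to check that a file is valid.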
Okay...
Problem 1: Everything is Type-Obsessed.
At first I didn't understand why doing this kind of web-finger thing to things on mastodon gets me the links that it gets me. Like you can access
https://mastodon.online/.well-known/webfinger?resource=acct:blenderdumbass@mastodon.online
but the
activity+json
link is just a link to my mastodon profile. Until I understood that I have to specifically request a JSON version of the page in order for the JSON data to be rendered.
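Here is a rough Python sketch of that lookup. The two function names are made up, and the sample answer is hand-written by me with an example.org placeholder account, so treat it as an illustration of the general WebFinger shape and not as a copy of what any real server returns.

```python
from typing import Optional


def webfinger_url(handle: str) -> str:
    """Build the WebFinger lookup URL for a handle like '@user@example.org'."""
    user, domain = handle.lstrip("@").split("@")
    return f"https://{domain}/.well-known/webfinger?resource=acct:{user}@{domain}"


def find_actor_link(webfinger_doc: dict) -> Optional[str]:
    """Pick the link that is marked as the ActivityPub version of the account."""
    for link in webfinger_doc.get("links", []):
        if link.get("rel") == "self" and link.get("type") == "application/activity+json":
            return link.get("href")
    return None


# Hand-written sample answer ( an assumption, with placeholder URLs ).
sample = {
    "subject": "acct:alice@example.org",
    "links": [
        {"rel": "http://webfinger.net/rel/profile-page",
         "type": "text/html",
         "href": "https://example.org/@alice"},
        {"rel": "self",
         "type": "application/activity+json",
         "href": "https://example.org/users/alice"},
    ],
}

print(webfinger_url("@blenderdumbass@mastodon.online"))
print(find_actor_link(sample))
```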
Mastodon provides a stupidly simple way to do this in a browser. You can just add
.json
in the end of the url and you are done. So say going to
https://mastodon.online/@blenderdumbass.json
will give you ActivityPub data about my mastodon account. But that is kind of just nice that they did this. Other ActivityPub related projects only care to provide the data in JSON if I ask it through the headers. And some ( like GoToSocial ) want me to properly sign my request with an ActivityPub account.
So for this to work, you need to specifically add an HTTP header of the type
Accept:
which will specifically tell the server that you are looking for
application/activity+json
.
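In Python this could look roughly like the sketch below. It only prepares the request and does not send anything; the function name is invented for the example, and whether a given server answers without a signature is a separate question.

```python
import urllib.request


def activitypub_request(url: str) -> urllib.request.Request:
    """Prepare a GET request that asks for ActivityPub JSON instead of HTML."""
    return urllib.request.Request(
        url,
        headers={"Accept": "application/activity+json"},
    )


req = activitypub_request("https://mastodon.online/@blenderdumbass")
print(req.get_method())          # GET
print(req.get_header("Accept"))  # application/activity+json

# To actually send it ( needs network access ):
#   with urllib.request.urlopen(req) as response:
#       data = json.load(response)
```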
HTTP and headers
Ah... If we are already doing it, let's do it. So HTTP.
When you load this website a number of things happen. First there is DNS, which you can think of as a phone book. Every server is on some IP address. Those used to be just numbers. Now with the newer version, IPv6, they are written with letters and numbers ( hexadecimal ). But that's beside the point. It is a long string that is hard to remember. DNS is a server that you can ask: What's the IP of blenderdumbass.org? And it will return to you the IP address.
So as soon as your computer knows the IP address to access my website it connects through it to my computer. Well technically it connects to a server operated by
@Madiator2011 and that server connects to my computer via Tor. And so on and so forth. But that's beside the point.
Your computer establishes a TCP connection to my computer. And then my computer is waiting for your computer to send over some text, which is written in a specific language called HTTP.
Here is an example.
GET /articles/please_help_me_with_activitypub HTTP/1.1
host: blenderdumbass.org
accept: text/html
date: Fri, 06 Jun 2025 10:17:55 GMT
user-agent: Mozilla/5.0 (Android 4.4; Mobile; rv:41.0) Gecko/41.0 Firefox/41.0
First it starts with the type of request. In this case you want to get something from me, so the type is
GET
. And then the page you are looking for. You are also sending other types of data, like
host:
which is the website name you are trying to connect,
date:
which is the date and time of the connection, and
user-agent
which is the thing that lets my server know how to render the page, so it will work on your device. As you can see in this example it is sent from Firefox on Android.
But then there is also
accept:
which tells the server what kind of data you are looking for. In this case you want to get a page on a website. Which is usually done in HTML code. So you would accept an HTML code back.
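To make it less abstract, here is a tiny Python sketch that takes such a request as text and splits it into its pieces. A real server does much more ( BDServer surely does it differently ), and the function name is made up.

```python
def parse_request(raw: str):
    """Split raw HTTP request text into method, path and a dict of headers."""
    lines = raw.strip().splitlines()

    # First line: the type of request and the page, e.g. "GET /articles/..."
    method, path = lines[0].split()[:2]

    headers = {}
    for line in lines[1:]:
        if not line.strip():
            break  # an empty line ends the headers
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, path, headers


raw = (
    "GET /articles/please_help_me_with_activitypub HTTP/1.1\n"
    "host: blenderdumbass.org\n"
    "accept: text/html\n"
    "date: Fri, 06 Jun 2025 10:17:55 GMT\n"
)

method, path, headers = parse_request(raw)
print(method, path)
print(headers["accept"])
```

Notice that only the first colon on a line separates the name from the value. That matters for the date header, which has colons inside the time.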
Back to ActivityPub
But if the request came in with
accept:
saying
application/activity+json
instead. It means what they are looking for is not a page, but rather the ActivityPub protocol data written into JSON format.
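On the server side that decision could be sketched like this. It is a simplified guess at the logic, with an invented function name. The ActivityPub specification also mentions application/ld+json ( with an activitystreams profile ) as an accepted type, so the sketch lets that one through too, without checking the profile.

```python
def wants_activitypub(accept_header: str) -> bool:
    """Guess from the Accept header whether the client wants ActivityPub JSON."""
    # An Accept header can list several types separated by commas,
    # each with optional parameters after a semicolon.
    media_types = [
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    ]
    return (
        "application/activity+json" in media_types
        # Simplification: any ld+json counts, the profile is not checked.
        or "application/ld+json" in media_types
    )


print(wants_activitypub("text/html"))                  # False
print(wants_activitypub("application/activity+json"))  # True
print(wants_activitypub("*/*"))                        # False
```

With a check like this one URL can serve both the page for humans and the JSON for other servers.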
This is why it is not easy to do that from a normal web-browser, because a normal web-browser always wants HTML files as responses. Good that at least Mastodon thought about poor people like me that might want to figure it all out and include an option to just add
.json
in the end of the request.
But because it is understood that both the ActivityPub data and the page itself are technically one URL but with a different
Accept:
header, some platforms don't even think there could be any other way to implement it. That is why if you go to that link in the webfinger of my account on this website, you will be redirected to the page itself. I had to make it this way because some websites put the data link as if it is a link to the page.
In any case, I figured it out and made my website spit out a JSON thing for the user data. That made some websites find that user, while others still can't find it for some reason. Next I wanted to make following work.
ActivityPub security
Here is something really stupid about Activity Pub that perhaps makes a lot of sense to some people. But not to me. I think it is absolutely insane that they did it like so.
So here is a problem they were trying to solve. When you click to follow my account. Your server sends a
POST
request to my server (
POST
is similar to
GET
in HTTP, but it means that there is also payload attached. It could be anything really. In this case it is some additional data written in JSON ).
But my server needs to know for sure that this request came from your server. And not only that. But that it came from you specifically. Because anybody technically can make a
POST
request to my server and pretend to be you.
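For reference, the payload of such a follow request looks roughly like what the sketch below builds. The field names come from the ActivityStreams vocabulary; the function names and all the example.org and example.com URLs are placeholders I made up.

```python
import json

CONTEXT = "https://www.w3.org/ns/activitystreams"


def make_follow(follower: str, target: str, activity_id: str) -> dict:
    """A Follow activity: 'actor' wants to follow 'object'."""
    return {
        "@context": CONTEXT,
        "id": activity_id,
        "type": "Follow",
        "actor": follower,
        "object": target,
    }


def make_accept(follow: dict, activity_id: str) -> dict:
    """An Accept activity sent back by the followed account, wrapping the Follow."""
    return {
        "@context": CONTEXT,
        "id": activity_id,
        "type": "Accept",
        "actor": follow["object"],
        "object": follow,
    }


follow = make_follow(
    follower="https://example.com/users/alice",   # placeholder
    target="https://example.org/users/bob",       # placeholder
    activity_id="https://example.com/follows/1",  # placeholder
)
accept = make_accept(follow, "https://example.org/accepts/1")

# This text is what travels in the body of the POST request.
print(json.dumps(follow, indent=4))
```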
So they are using RSA signatures. Which is encryption territory. Basically they are using math, something to do with prime number factorization, which is a math problem that computers have a hard time doing, to generate two blobs of text. One is called a private key and another is a public key. Using some more math that I personally don't really know much about, you can sign some data with the private key and verify that signature with the public key. ( Encryption goes the other way around: anybody can encrypt with the public key and only the private key can decrypt. )
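To get a feeling for the sign and verify idea, here is a toy Python sketch with the tiny textbook primes 61 and 53. It is absolutely not secure and it is not what ActivityPub servers literally run: real keys are thousands of bits long, and real software signs a padded hash of the data through a proper cryptography library. The function names are mine.

```python
# Toy RSA with tiny textbook numbers. For illustration only, NOT secure.
P, Q = 61, 53        # two secret primes
N = P * Q            # 3233, shared by both keys
E = 17               # public exponent  -> public key is (N, E)
D = 2753             # private exponent -> private key is (N, D)


def sign(number: int) -> int:
    """Only the holder of the private exponent D can compute this."""
    return pow(number, D, N)


def verify(number: int, signature: int) -> bool:
    """Anybody who knows the public key (N, E) can check the signature."""
    return pow(signature, E, N) == number


message = 65                       # "A" in ASCII
signature = sign(message)

print(verify(message, signature))  # True: signed by the matching private key
print(verify(66, signature))       # False: the message was changed
```

Finding D from the public numbers requires knowing the two primes behind N. With 3233 that is trivial; with numbers hundreds of digits long it is the hard problem mentioned above.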
Because HTTPS ( HTTP with security ) already encrypts requests, we only care about the signature. The verification part. Basically the server that you are on signs some data with that private key which it knows about. And then tells everybody what is the public key. So my server can get it and download it. Actually if you look into
https://mastodon.online/@blenderdumbass.json
you will see the public key of my mastodon account is there in the JSON.
Using that public key I can cryptographically verify that the message was signed by the private key from the same private-public pair. Basically if the verification algorithm tells me that it's verified, somebody who knows your private key has signed it. And if you don't allow willy-nilly access to files on your server to Elons and Musks I can bet that this message came from you.
By the way Email verification is done kind of in the same way. You write your email, but you want to make sure people on the other side can trust that you actually wrote it. So you publish your Public Key, and sign your email with your Private Key. Which adds a little blob of seemingly random characters to the end of the email, with which the receiver can verify it is you who sent it.
This is by the way why
Free Software Foundation Staff have links to their public keys under their names on the staff page. You can go and click on those and see them for yourself.
In any case, to verify a message from you what I need is 3 things:
- The message itself.
- The signature.
- The public key.
The public key is simple. You already know how to get it if you paid attention. It is in the ActivityPub data of the account. So this is not a big deal.
The signature is also kind of not a big deal. It is stuffed into the headers.
Here is an example from an HTTP request from misskey:
host: blenderdumbass.org
accept: */*
accept-encoding: gzip, deflate, br
content-type: application/activity+json
date: Fri, 06 Jun 2025 10:17:55 GMT
digest: SHA-256=+86edz+P4M8ZLhYN2d+pUIzbzO2fRMAhsEHzX4BgHnU=
signature: keyId="https://shark.madiator.com/users/a8nizbgfpor600c4,algorithm="rsa-sha256",headers="(request-target) date host digest",signature="N5EMd0tUvFJZaGKW5e8h6TvzwOlV3hQhBA7N59cXQkpjBdv01dkZg8/CQY1N7Bh6k/tPSCQIZ4QaDbwmMSLv4TMTQstxABMTQyTxbZ15DhTtmlJvubhHQ9uhtWRUFBGBquFyyHftDCq4xQCEet+7X3/GvShmwRQbekglzYV4rHAcJrOqojqmG5SqiSR7wvLMJrGW/oe5vUIVeVTm7uDQkWdNWti184tXnNPwhT34XqjWb0qzxtgK0DQ8wwJ9GbIEoqv8OPTVxI17iSP0Sj09L//B7sURIos+6QOlayD0DBw1/Pvbm+4Lq3wUNPs05KFDeExWEHGM7ezJHbR1Ogp50w=="
user-agent: Misskey/2025.4.2 (https://shark.madiator.com)
You can see it doesn't care what it accepts because it sent me basically just a message that somebody wants to follow me. The
content-type
is
application/activity+json
so I can accept a properly formatted JSON data. Then there is
digest
which we will talk about. And finally the
signature
which links to where I can get the public key for it, how useful, then it talks about the specific algorithm. I hope nobody actually signs stuff with more than just one algorithm. And then there is a signature itself.
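Pulling that header apart could be sketched in Python like this. The function name is invented, and the sample value is a shortened placeholder ( example.org key, fake signature ), not a real request.

```python
import re


def parse_signature_header(value: str) -> dict:
    """Split a Signature header of the form key="value",key="value" into a dict."""
    fields = dict(re.findall(r'(\w+)="([^"]*)"', value))
    # The 'headers' field is a space separated list, so make it a real list.
    fields["headers"] = fields.get("headers", "").split()
    return fields


# Placeholder header, NOT taken from a real request.
sample_header = (
    'keyId="https://example.org/users/alice#main-key",'
    'algorithm="rsa-sha256",'
    'headers="(request-target) date host digest",'
    'signature="c2lnbmF0dXJl"'
)

fields = parse_signature_header(sample_header)
print(fields["keyId"])      # where to get the public key
print(fields["algorithm"])  # which algorithm was used
print(fields["headers"])    # which parts of the request were signed, in order
```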
So we have the payload ( or the JSON data ), the signature and the public key. But if you try to verify it, it will fail the verification. This is because JSON data is not what was signed. In fact nothing in this request is what was signed. Your server needs to recreate what was signed based on the context of this whole thing.
Fucking insanity!
So I figured it out. And here is what I found. The
digest
is a base64 encoding of the SHA256 hash of the JSON payload. If you do a SHA256 hash of the JSON payload on the server side, convert the hash to base64, and the result matches the digest string you got exactly, you know that the JSON stuff was not tampered with on the way to your server. But how do you know that the
digest
is not tampered with?
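The digest check itself is short in Python. This is a sketch with made-up function names, using only the standard hashlib and base64 modules.

```python
import base64
import hashlib
import hmac


def make_digest(body: bytes) -> str:
    """SHA-256 hash of the payload, base64 encoded, in the Digest header format."""
    hashed = hashlib.sha256(body).digest()
    return "SHA-256=" + base64.b64encode(hashed).decode("ascii")


def digest_matches(body: bytes, digest_header: str) -> bool:
    """Check that the payload we received is the payload the header describes."""
    return hmac.compare_digest(make_digest(body), digest_header)


body = b'{"type":"Follow"}'   # placeholder payload
header = make_digest(body)

print(header)
print(digest_matches(body, header))         # True
print(digest_matches(body + b" ", header))  # False, even one extra space breaks it
```

One thing to watch: the hash has to be made from the exact bytes that came in. If you first parse the JSON and then write it back out, the spacing or key order can change and the digest will not match anymore.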
Well I overlooked a little thing from the
signature
header. It is called
headers
. And as you can see it says
(request-target) date host digest
.
What it wants me to do, is to take the headers and generate this string from them:
(request-target): post /activitypub/inbox/blenderdumbass
date: Fri, 06 Jun 2025 10:17:55 GMT
host: blenderdumbass.org
digest: SHA-256=+86edz+P4M8ZLhYN2d+pUIzbzO2fRMAhsEHzX4BgHnU=
As you can see, it is in the order that it told me.
(request-target)
means get the first line of the HTTP request please. Which is usually
GET
and then the page. But in this case they are including a payload, so it is
POST
and then the page. And in this case the page is
/activitypub/inbox/blenderdumbass
because they are trying to post a Follow to my account specifically.
Then it wants me to add the
date
header, then the
host
header and finally the
digest
header. Which we know is telling us that JSON was not tampered with. And then THIS IS WHAT THEY SIGNED!!!
If even 1 character is not correct in this recreation, the signature will not be verified. This is some insane bullshit!
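Rebuilding that string can be sketched in Python like so. The function name is invented and the headers are assumed to be already collected into a dictionary with lowercase names. Note the colon after (request-target): the HTTP Signatures draft writes that pseudo-header the same way as the normal ones, and the method goes in lowercase.

```python
def build_signing_string(method: str, path: str, headers: dict, names: list) -> str:
    """Recreate the text that the other server signed, in the order it listed."""
    lines = []
    for name in names:
        if name == "(request-target)":
            # Special pseudo-header: lowercase method and the requested path.
            lines.append(f"(request-target): {method.lower()} {path}")
        else:
            lines.append(f"{name}: {headers[name]}")
    # Lines are joined with a newline, with no newline at the very end.
    return "\n".join(lines)


headers = {
    "host": "blenderdumbass.org",
    "date": "Fri, 06 Jun 2025 10:17:55 GMT",
    "digest": "SHA-256=+86edz+P4M8ZLhYN2d+pUIzbzO2fRMAhsEHzX4BgHnU=",
}
names = ["(request-target)", "date", "host", "digest"]

signing_string = build_signing_string(
    "POST", "/activitypub/inbox/blenderdumbass", headers, names
)
print(signing_string)
```

These bytes, together with the base64-decoded signature and the public key, are what then goes into an RSA-SHA256 verification from a cryptography library.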
I mean you could include a signature of the JSON payload in the header. I don't know. Something about the way they did it rubs me the wrong way...
GoToSocial security
Now funny enough, I was trying to get data from various ActivityPub accounts, just so they could be rendered nicely. I tried myself, it worked, I tried
@ozoned@social.ozoned.net and it gave me an error...
It gave me an error! I looked at the
documentation of GoToSocial and apparently they want me to sign ( you know the crazy way that I just explained ) absolutely any request that I make to them through ActivityPub. But, because there is no payload on
GET
I should not include a
digest
in the headers and should therefore not include
digest
in the headers list hint thingie in the signature.
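As a sketch of the difference, here is how the header list and the outgoing Signature header could be put together in Python for a request with and without a payload. Everything here is an assumption for illustration: the function names, the example.org key URL and the fake signature. The actual signing with the private key is left out.

```python
def signed_header_names(has_body: bool) -> list:
    """Which headers go into the signature: 'digest' only when there is a payload."""
    names = ["(request-target)", "host", "date"]
    if has_body:
        names.append("digest")
    return names


def format_signature_header(key_id: str, names: list, signature_b64: str) -> str:
    """Assemble the value of the Signature header for an outgoing request."""
    header_list = " ".join(names)
    return (
        f'keyId="{key_id}",algorithm="rsa-sha256",'
        f'headers="{header_list}",signature="{signature_b64}"'
    )


key_id = "https://example.org/users/alice#main-key"  # placeholder key URL
fake_signature = "c2lnbmF0dXJl"                      # placeholder, not a real signature

get_header = format_signature_header(key_id, signed_header_names(False), fake_signature)
post_header = format_signature_header(key_id, signed_header_names(True), fake_signature)

print(get_header)
print(post_header)
```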
On the page about
HTTP Signatures they literally show a different way to set it all up just to make a basic request for data.
Now okay, fine, whatever. They are trying to make it resilient against some bots. You have to have a proper domain and a server to make such a request. It makes some sense. What I was afraid of, was the fact that BDServer was not multi-threaded.
Say
@ozoned@social.ozoned.net follows
@blenderdumbass on this website. If you click on the account you should therefore see Ozoned's account in the list of followers. To render that I need to know what the name is and what the icon is and so on and so forth. For that my server needs to get this data from his server.
Now all of this is happening as my server is preparing the page for you who clicked on my account. So if the server is single threaded it will not be able to receive any other requests until the page is sent to you.
So let's think about this. You open the page, on the page there is a new reference to
@ozoned@social.ozoned.net which needs to be rendered. My website signs a
GET
request to social.ozoned.net and sends it over. But ozoned's server needs to verify my request. So it sends a request to me to get my account data with the public key in it. But it can't until I send you the page first, because it is just one thread. So it fails verification. I fail to get the data and you see some generic text instead of the fully rendered account link.
This prompted me to actually multi-thread my server. This made everything a lot faster. But I need to worry about thread-safety now.
Thread-Safety
Multi-threading is a complex problem in computers. It is very nice to separate processes that can be separated, so they are done at the same time. It is good if I can answer your request and somebody else's request at the same time. It makes everything faster.
But there could be a problem with that. For example, when you load a page, the counter of the views updates by one. This means the server opens a JSON file somewhere on my computer, changes the data in that file, and writes it back to the disk.
Now consider a possibility of 2 people loading the same page at the same time. If I have enough traffic, that is a possibility that might happen. Two requests that happen simultaneously. Both read a file, change something in it and write it back. What if I start writing one version of the file and then another thread starts writing over it, while the first one is not done yet? There could be a possibility of this file getting all corrupted and weird looking. And therefore it will no longer function as intended.
So just in case something like this happens, I had to go over the entire BDServer source code and introduce a Thread-Lock anywhere where I write a file. A Thread-Lock is basically a thing that makes it so no two threads can execute the same part of the code at the same time.
It's like, the first thread that gets to the thread-lock says: Hey it's mine! I do it.
And then until the thread releases the lock, all other threads ( that want to do the same thing ) wait in line basically.
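Maybe code explains it better. Here is a small self-contained Python sketch of the view counter case with a lock from the standard threading module. It is not the actual BDServer code; the function names and the file layout are made up for the example.

```python
import json
import os
import tempfile
import threading

# One lock shared by every thread that wants to touch the file.
write_lock = threading.Lock()


def count_view(path: str) -> None:
    """Read the counter, add one, write it back. Only one thread at a time."""
    with write_lock:  # other threads wait here until the lock is released
        with open(path) as f:
            data = json.load(f)
        data["views"] += 1
        with open(path, "w") as f:
            json.dump(data, f)


def simulate(visitors: int) -> int:
    """Pretend that many people load the page at the same time."""
    path = os.path.join(tempfile.mkdtemp(), "article.json")
    with open(path, "w") as f:
        json.dump({"views": 0}, f)

    threads = [threading.Thread(target=count_view, args=(path,))
               for _ in range(visitors)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    with open(path) as f:
        return json.load(f)["views"]


print(simulate(20))  # every view is counted: 20
```

The lock has to cover the whole read, change and write sequence. If it only covered the write, two threads could still read the same old number and one view would get lost.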
I don't know if I explained it well...