Making a Discord library: part 1
Created 2023-7-10
Related to 1 project.
I've been focusing on https://github.com/fleuralice/discord-docs recently. It's time to put that to the test.
I believe that a set of JSON schemas can help easily create a basic Discord library, especially for those who have a good idea of what they want to make. Today, I'm going to work on my own library (bloom) and use the following components for it:
- attrs for data models cause I absolutely love attrs!
- anyio for concurrency as while I would prefer trio, people prefer asyncio :(
- discord-docs for the schemas I will use, for convenience's sake.
- my own websocket mini-library, which uses wsproto and httpx under the hood
- cattrs for converting things from the gateway into my data models
- isort, black, ssort, pyupgrade for making generated files nice to look at (and make them valid in the process)
My goal is to connect to Discord's gateway and log any seen messages. HTTP routes aren't in discord-docs yet and ratelimiting is complex: I want to make a post all about ratelimiting!
I already have a basic module structure with _websocket.py and a couple irrelevant modules. Let's start!
Step 1: connect to Discord
In my console I get this:
{"t":null,"s":null,"op":10,"d":{"heartbeat_interval":41250,"_trace":["[\"[trimmed]\",{\"micros\":0.0}]"]}}
Essentially, I just connected my websocket to wss://gateway.discord.gg/?v=10&encoding=json and printed any messages I got out from it. Here's the pertinent code:
async def connect(client: AsyncClient):
    async with ws_connect("https://gateway.discord.gg/?v=10&encoding=json", client) as ws:
        async for message in ws.read_messages():
            print(message)
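The Hello message printed above is what drives the next step, so here is a minimal sketch of pulling the heartbeat interval out of it. The helper name heartbeat_interval_seconds is made up for illustration and is not part of bloom; the sample payload is the one from the console with the _trace field left out.

```python
import json


def heartbeat_interval_seconds(raw: str) -> float:
    """Return the heartbeat interval from a raw Hello (opcode 10) message, in seconds."""
    message = json.loads(raw)
    if message["op"] != 10:
        raise ValueError(f"expected opcode 10 (Hello), got {message['op']}")
    # The gateway sends the interval in milliseconds.
    return message["d"]["heartbeat_interval"] / 1000


# Hello payload as seen in the console, minus "_trace".
hello = '{"t":null,"s":null,"op":10,"d":{"heartbeat_interval":41250}}'
print(heartbeat_interval_seconds(hello))  # 41.25
```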
Step 2: heartbeating
Essentially, I am just going through this flowchart:
This means the next step is to start heartbeating (with some jitter) the moment I get the opcode 10 event. Can do! (NOTE: Discord's docs note "In the first heartbeat, jitter is an offset value between 0 and heartbeat_interval" which is just wrong. Don't trust the documentation!!)
While I'm waiting for my heartbeat code to actually send something, I believe it works. As such, let me write up this part now!
The relevant bits were:
- creating an anyio.TaskGroup to spawn off the heartbeating task
- spawning off the heartbeating task via tg.start_soon(heartbeat_loop, ws, msg["d"]["heartbeat_interval"] / 1000)
- handling if the opcode I receive is 1
- the actual heartbeating loop, as below:
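import json
import random
import anyio

async def heartbeat_loop(ws: WebsocketConnection, interval: float) -> None:
    # first heartbeat goes out after a random fraction of the interval (the jitter)
    await anyio.sleep(interval * random.random())
    while True:
        await ws.write(json.dumps({"op": 1, "d": None}))
        await anyio.sleep(interval)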
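To make the timing of that loop easier to see (and to test without a live websocket), the sleeps can be pulled out into a plain generator. This is only a sketch: heartbeat_delays and its jitter parameter are hypothetical names, and the loop above does not actually use them.

```python
import random
from itertools import islice
from typing import Iterator, Optional


def heartbeat_delays(interval: float, jitter: Optional[float] = None) -> Iterator[float]:
    """Yield the sleep before each heartbeat: one jittered delay, then the full interval forever."""
    if jitter is None:
        # Random fraction in [0, 1), the same thing the loop above uses.
        jitter = random.random()
    yield interval * jitter
    while True:
        yield interval


# Pin the jitter so the output is predictable.
print(list(islice(heartbeat_delays(41.25, jitter=0.5), 3)))  # [20.625, 41.25, 41.25]
```

Passing a fixed jitter is just for testing; in real use you would leave it as None so every connection starts at a different offset.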
By the time I've finished writing this, I can see the following in the console:
{'t': None, 's': None, 'op': 10, 'd': {'heartbeat_interval': 41250, '_trace': ['["[trimmed]",{"micros":0.0}]']}}
{'t': None, 's': None, 'op': 11, 'd': None}
{'t': None, 's': None, 'op': 11, 'd': None}
{'t': None, 's': None, 'op': 11, 'd': None}
{'t': None, 's': None, 'op': 11, 'd': None}
Step 3: identifying
This part's mechanical enough: just write to the websocket right after you spawn off the heartbeating task and add a few more parameters to your function.
Here's what I did:
await ws.write(json.dumps({
    "op": 2,
    "d": {
        "token": token,
        "properties": {
            "os": platform.system(),
            "browser": "doll",
            "device": "bloom"
        },
        "intents": intents
    }
}))
I like the quirk of differing "browser" and "device" where the "browser" is the codename for the specific gateway implementation and the "device" is the actual name for the library. This isn't actually what the documentation suggests doing, but I like it enough I do it. Totally up to you!
At this point, I took a break because I knew how annoying the next part would be!
... I ran my gateway during this break and it didn't break. Shocking. Still, onwards.
Step 4: sharing state
I want to push state into a shared class, meaning that I can get some heartbeating task <-> gateway receive communication going on.
So, I just defined this Shard class:
@attrs.define()
class Shard:
    ws: WebsocketConnection
    _seq: Optional[int]
    _heartbeat_acknowledged: bool
Next, I handled opcode 0 events by setting _seq to their "s" key and then handled opcode 11 events by setting _heartbeat_acknowledged to True.
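Roughly, the receive side of that state handling could look like the sketch below. It uses a stdlib dataclass as a stand-in for the attrs class so it runs on its own, and ShardState, handle_message and heartbeat_payload are illustrative names rather than bloom's real ones. It also assumes the stored sequence number ends up in the heartbeat's "d" field, which is what the gateway expects once events have been seen.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ShardState:
    # Stand-in for the attrs-based Shard above, minus the websocket.
    seq: Optional[int] = None
    heartbeat_acknowledged: bool = True


def handle_message(state: ShardState, msg: Dict[str, Any]) -> None:
    """Update shared state from one decoded gateway message."""
    if msg["op"] == 0:
        # Dispatch events carry the sequence number in "s".
        state.seq = msg["s"]
    elif msg["op"] == 11:
        # Heartbeat ACK.
        state.heartbeat_acknowledged = True


def heartbeat_payload(state: ShardState) -> Dict[str, Any]:
    """Build a heartbeat carrying the last seen sequence number (None before any event)."""
    return {"op": 1, "d": state.seq}


state = ShardState(heartbeat_acknowledged=False)
handle_message(state, {"op": 0, "s": 3, "t": "READY", "d": {}})
handle_message(state, {"op": 11, "s": None, "t": None, "d": None})
print(state, heartbeat_payload(state))
```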
Step 5: resuming
This is a pretty annoying step as resumes only really get sent to you every 2 hours or so, last I remember. However, a very simple trick is just to comment out the part of your code that tells the heartbeat task a heartbeat was acknowledged, so the connection looks dead and gets closed.
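The trick relies on the heartbeat task checking for an acknowledgement before it sends the next heartbeat. Here is a minimal sketch of that check, with hypothetical names (HeartbeatState, may_send_heartbeat) and with the actual closing of the websocket left out:

```python
from dataclasses import dataclass


@dataclass
class HeartbeatState:
    # Nothing is outstanding before the first heartbeat, so start as acknowledged.
    acknowledged: bool = True


def may_send_heartbeat(state: HeartbeatState) -> bool:
    """Return True if another heartbeat can go out, False if the last one was never ACKed."""
    if not state.acknowledged:
        # No opcode 11 since the previous heartbeat: treat the connection as dead.
        return False
    state.acknowledged = False
    return True


state = HeartbeatState()
print(may_send_heartbeat(state))  # True: first heartbeat
print(may_send_heartbeat(state))  # False: no ACK arrived in between
```

If the code that flips acknowledged back to True is commented out, the second check fails every time, which forces the reconnect path without waiting for Discord to do it.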
After quite some extra indentation levels, I think this works. I have a while True loop around a try statement that catches when the websocket raises due to it being closed. It's a bit hacky, but it works. It doesn't actually end up successfully resuming but I'm pretty sure that's just an artifact of the whole close-the-connection strategy?
Actually, I don't think that's right.
Aha!! Turns out, you need to append your normal gateway URL parameters to resume_gateway_url. That's annoying. But now my code works. I had to change too much and I don't think there's any specific thing to show... Though I guess here's my try statement:
try:
    url = shard._resume_url.replace("wss://", "https://") if resume else "https://gateway.discord.gg"
    url += "?v=10&encoding=json"
    shard._heartbeat_acknowledged = True
    # reset the identify counter once 30 minutes have passed since the last identify
    if anyio.current_time() - last_identify >= 30 * 60:
        identifies = 0
    if not resume:
        identifies += 1
        last_identify = anyio.current_time()
    if identifies > 5:
        print("too many identifies too quickly!")
        break
    ...
except DeadWSConnection as e:
    resume = shard._resume_url and shard._session_id and e.code in {3000, 4000, 4001, 4002, 4008}
except DeadConnection as e:
    resume = True
finally:
    shard._ws = None
    if not resume:
        await anyio.sleep(5)
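The two decisions buried in that try statement (whether to resume, and which URL to connect to) can be split into small pure functions, which makes them easy to check in isolation. This is a sketch, not bloom's code: the function names are made up, the close-code set is copied from the snippet above, and the resume URL in the example is a placeholder rather than a real Discord host.

```python
from typing import Optional

# Close codes treated as resumable, copied from the try statement above.
RESUMABLE_CLOSE_CODES = frozenset({3000, 4000, 4001, 4002, 4008})
DEFAULT_GATEWAY = "https://gateway.discord.gg"
GATEWAY_PARAMS = "?v=10&encoding=json"


def should_resume(resume_url: Optional[str], session_id: Optional[str], close_code: int) -> bool:
    """Resume only with a resume URL, a session ID and a resumable close code."""
    return bool(resume_url and session_id and close_code in RESUMABLE_CLOSE_CODES)


def gateway_url(resume_url: Optional[str], resume: bool) -> str:
    """Pick the base URL and append the usual gateway parameters to it."""
    if resume and resume_url:
        # https:// instead of wss://, mirroring the replace() in the snippet above.
        base = resume_url.replace("wss://", "https://")
    else:
        base = DEFAULT_GATEWAY
    return base + GATEWAY_PARAMS


# Placeholder resume URL, for illustration only.
print(gateway_url("wss://resume.example.invalid", resume=True))
print(gateway_url(None, resume=False))
```

Wrapping the condition in bool() also keeps resume a real boolean instead of whichever string or None the and-chain happens to return.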
Now then, that works. Let's get immediately to the next step.
Step 6: log messages
Ensure that your intents are GUILD_MESSAGES and MESSAGE_CONTENT: the magic number to look for is 33280. Then, just add something in the main gateway loop. I'll go ahead and do that for myself now...
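For reference, the magic number is just the two intent bits OR'd together. A small sketch with an IntFlag (the Intents class here is illustrative and only lists the two flags used in this post):

```python
from enum import IntFlag


class Intents(IntFlag):
    GUILD_MESSAGES = 1 << 9    # 512
    MESSAGE_CONTENT = 1 << 15  # 32768


intents = Intents.GUILD_MESSAGES | Intents.MESSAGE_CONTENT
print(int(intents))  # 33280
```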
Here's what I did:
elif msg["t"] == "MESSAGE_CREATE" and not msg["d"]["author"].get("bot"):
    print(f'{msg["d"]["author"]["global_name"]}: {msg["d"]["content"]}')
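The same logic as a standalone function, as a sketch. format_message is a hypothetical helper, and the fallback to "username" is an addition that assumes "global_name" can be null for some users while "username" is always present; the payloads below are trimmed-down examples, not real gateway data.

```python
from typing import Any, Dict, Optional


def format_message(data: Dict[str, Any]) -> Optional[str]:
    """Format the "d" of a MESSAGE_CREATE event as a log line, or return None for bots."""
    author = data["author"]
    # "bot" is optional on the author object, hence .get().
    if author.get("bot"):
        return None
    # Assumption: fall back to the username when no display name is set.
    name = author.get("global_name") or author["username"]
    return f"{name}: {data['content']}"


print(format_message({"author": {"username": "a5rocks", "global_name": "A5rocks"}, "content": "ur a bot"}))
print(format_message({"author": {"username": "somebot", "bot": True}, "content": "beep"}))
```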
Now, let me go ahead and talk to some people!
A5rocks: forgot `"bot"` was an optional thing lol
A5rocks: but it works now 🙏
ibx: "bot"
A5rocks: ur a bot
This all works and could be the end of it. Now, let's finally get into where discord-docs comes into play. I hope it's obvious by this point that setting up a basic gateway connection, while not necessarily easy (especially around restructuring required for resuming), is possible without trying too hard.
Step 7: autogenerate the models
... Aaaand, time:
$ mypy .\bloom\autogenerated.py
Success: no issues found in 1 source file
Now I have a 30,000 line autogenerated file. Great. That took me maybe 4-5 hours (I lost track of time) so I saved probably 10 hours or so, but more importantly that was really REALLY fun. The code is awful, I can still very much clean up the output models... but it works. As far as I know, it works. Step 8 will validate this.
Also, a bit of that time was spent fixing discord-docs itself :^). The most important things to keep in mind, IMO, are the interactions allOf has. While this is an extremely powerful attribute, discord-docs uses it in 2 main ways:
- inheritance (i.e. allOf: [{$ref: ...}])
- spreading a base class all over a gigantic enum
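For the inheritance case, a generator mostly needs to turn the $ref entries inside allOf into base class names. Here is a rough sketch of that one step; the schema is invented for illustration (it is not copied from discord-docs), and it assumes each $ref ends in the name of the referenced definition.

```python
from typing import Any, Dict, List


def base_class_names(schema: Dict[str, Any]) -> List[str]:
    """Collect the names referenced by $ref entries inside a schema's allOf."""
    names = []
    for entry in schema.get("allOf", []):
        ref = entry.get("$ref")
        if ref is not None:
            # Assumption: the last path segment of the reference is the class name.
            names.append(ref.rsplit("/", 1)[-1])
    return names


# Made-up schema in the allOf: [{$ref: ...}] shape.
schema = {
    "allOf": [{"$ref": "#/$defs/BaseChannel"}],
    "properties": {"topic": {"type": "string"}},
}
print(base_class_names(schema))  # ['BaseChannel']
```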
Additionally, I had a whole bunch of errors about definition order but rather than fix them, I used from __future__ import annotations (forward annotations) and ssort. These were massive helps and prevented me from overengineering my own ordering.
Step 8: deserialize into the models
Luckily I'm using cattrs. It shouldn't be too hard to codegen deserialization code (or make your own slower dynamic library for such) but this is minimal effort!
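To give a feel for what the "slower dynamic library" route would involve, here is a deliberately tiny structuring function built on stdlib dataclasses. It is a stand-in for illustration only, not how cattrs works internally and not bloom's code: it handles just Optional[X], List[X] and nested dataclasses, and the User and Message models are simplified examples rather than the autogenerated ones.

```python
from dataclasses import dataclass, field, fields, is_dataclass
from typing import Any, List, Optional, Union, get_args, get_origin


def structure(data: Any, tp: Any) -> Any:
    """Recursively convert decoded JSON into instances of the given type."""
    origin = get_origin(tp)
    if origin is Union:
        # Only Optional[X] is handled in this sketch.
        if data is None:
            return None
        inner = [arg for arg in get_args(tp) if arg is not type(None)]
        return structure(data, inner[0])
    if origin is list:
        (item_type,) = get_args(tp)
        return [structure(item, item_type) for item in data]
    if is_dataclass(tp):
        # Unknown keys are ignored; missing keys fall back to field defaults.
        kwargs = {
            f.name: structure(data[f.name], f.type)
            for f in fields(tp)
            if f.name in data
        }
        return tp(**kwargs)
    # Plain JSON values (str, int, bool, ...) pass through unchanged.
    return data


@dataclass
class User:
    id: str
    username: str
    global_name: Optional[str] = None
    bot: bool = False


@dataclass
class Message:
    id: str
    content: str
    author: User
    mentions: List[User] = field(default_factory=list)


payload = {
    "id": "2",
    "content": "ur a bot",
    "author": {"id": "1", "username": "a5rocks", "global_name": "A5rocks"},
    "mentions": [{"id": "3", "username": "ibx", "global_name": None}],
    "unknown_field": 123,
}
message = structure(payload, Message)
print(message.author.global_name, message.mentions[0].username)
```

A real implementation also needs enums, unions of several models, renamed fields and so on, which is exactly the work cattrs saves.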
Yeah, that wasn't too hard. I had to fix a bit more and add a couple things to the autogenerated file but overall pretty easy thing to add. And now, I'm kinda done here. There's a lot more any good library needs, but I've got something that connects to Discord and gets things as they happen. That's really really cool!!
I'll be back when discord-docs describes the REST routes!
This work was done as part of my so-far-unpublished Discord library.