Some thoughts on Feedbot, and a few more on IRC bots in general

April 30, 2020 by Lucian Mogosanu

In case you're using Feedbot and you're wondering what's new, I'd like to let you know that not much has changed since the winter update. Or rather, not much was expected to change, until very recently, when some Freenode hang/disconnect weirdness, which until then had been occurring about once a month, started occurring daily, give or take.

So in case you're using Feedbot and you're annoyed by this, I'd like to apologize for the shoddy service that you've been seeing lately. I'd also like you to know that I'm looking into the matter, my looking into the matter being what prompted this here article. However, this article will go a tad deeper, so as to reexamine a few mistakes in my thinking and to fix them.

If you remember, when I started growing Feedbot into what it is now, I split it into three pieces: the checker, the announcer and the IRC bot that folds the first two items together neatly into a single program. Well, I realized I've been looking at it the wrong way the whole time: Feedbot is not a program -- it's two of 'em[1]!

Let's revisit the "what is a feed bot" matter for a second. Looking at it plainly, it's a distributed system integrating two elements: a feed checker and an IRC notifier. Or, in more words, it's:

a. a program that maintains a user-feed data store: it provides a mechanism for mapping RSS feeds to IRC users; and it periodically checks feeds for "new" "content" (entries it hasn't identified before), that it then records and marks accordingly, by storing "content"-user mappings ("messages") in a "mailbox"/queue database.

Our feed bot also comprises b. a program that connects to an IRC server and joins a bunch of channels; then it b.1. responds to IRC commands as per the manual; b.2. periodically checks the aforementioned "mailbox" queue for new messages; b.3. for each message, for each associated user, delivers the message to the user; and b.4. it marks delivered messages as "old" or it otherwise deletes them from the queue.

That's about it, regardless of what flavour of computing tools one fancies at any given time. Whether the feed bot comprises two Unix programs that communicate through a pipe or through a MySQL database; or whether it's a Lisp program running two threads in the same Unix address space; or whether it's implemented some other way altogether, that is unimportant as far as the feed bot is concerned. Moreover, observe how the sole point of intersection between the two pieces is that message database thing.
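
To make that single point of intersection concrete, here is a minimal sketch in Python of the two halves meeting only at the message store. Everything in it is an assumption made for illustration: the SQLite backing, the table layout and the function names (`open_mailbox`, `subscribe`, `record_entries`, `pending`, `mark_delivered`) are hypothetical and are not Feedbot's actual schema or code.

```python
import sqlite3


def open_mailbox(path=":memory:"):
    """Open the shared store; the schema is hypothetical, for illustration only."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS subscriptions (user TEXT, feed TEXT, UNIQUE (user, feed))")
    db.execute("CREATE TABLE IF NOT EXISTS seen (feed TEXT, entry TEXT, UNIQUE (feed, entry))")
    db.execute(
        "CREATE TABLE IF NOT EXISTS mailbox ("
        "id INTEGER PRIMARY KEY, user TEXT, entry TEXT, delivered INTEGER DEFAULT 0)"
    )
    return db


def subscribe(db, user, feed):
    """Map a feed to an IRC user (the user-feed data store from item a)."""
    db.execute("INSERT OR IGNORE INTO subscriptions VALUES (?, ?)", (user, feed))
    db.commit()


def record_entries(db, feed, entries):
    """Checker side (a): queue one message per subscriber for each unseen entry."""
    queued = 0
    for entry in entries:
        known = db.execute(
            "SELECT 1 FROM seen WHERE feed = ? AND entry = ?", (feed, entry)
        ).fetchone()
        if known:
            continue  # entry was already identified on an earlier check
        db.execute("INSERT INTO seen VALUES (?, ?)", (feed, entry))
        users = db.execute(
            "SELECT user FROM subscriptions WHERE feed = ?", (feed,)
        ).fetchall()
        for (user,) in users:
            db.execute("INSERT INTO mailbox (user, entry) VALUES (?, ?)", (user, entry))
            queued += 1
    db.commit()
    return queued


def pending(db):
    """Bot side (b.2): list messages that have not been delivered yet."""
    return db.execute(
        "SELECT id, user, entry FROM mailbox WHERE delivered = 0 ORDER BY id"
    ).fetchall()


def mark_delivered(db, message_id):
    """Bot side (b.4): mark a message as old once it has been sent."""
    db.execute("UPDATE mailbox SET delivered = 1 WHERE id = ?", (message_id,))
    db.commit()
```

The checker only ever calls `record_entries`, while the bot only ever calls `pending` and `mark_delivered`; neither needs to know how the other is implemented.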

As for my own implementation, it is, fortunately enough for me, not that far off from the spec; for one, the feed checker remains precisely as it was conceived on day one. The more iffy part remains the IRC bot; believe it or not, "where the fuck am I to put the announcer" was precisely what I was asking myself this morning, freshly awoken, the sun casting its powerful mid-Spring ray upon my face, as I was sitting my ass on a chair, breathing the fresh air and admiring the green and the birdsong.

As the current Feedbot lies -- or rather, as it lay before I started fiddling with it -- the so-called "announcer" was a small program of its own, intricately tied between the checker on one hand and the IRC bot on the other: while it didn't do b.1, it did do some of b.2-4; and especially as b.3 goes, it got integrated with the IRC piece more than I would have initially liked it to be, as it's a separate thread that also interfaces with IRC. Not only that, but it also makes the bot piece itself more complicated, because that one is also going to have to do some of b.2-4 in addition to b.1, since it needs to check that users are online before sending PM notifications.

So my question was, in a deeper sense, how to cut this b.2-4 thing more precisely; which I expect would prove to be a futile quest in the world of overengineering. Instead of that, I'll just admit that the IRC bot and the "announcer" are the one and same thing, which makes this a question of what that "periodically" means in b.2: for some stupid reason, I decided to check and announce new RSS entries on a separate thread, asynchronously, which looks nice if you ignore the shitty nature of IRC.

Which brings us to the shitty nature of IRC, which we'll deal with shortly: for some stupid reason, IRC was built on top of TCP, which splits the world into "clients" and "servers", thus putting weight on the "server" to maintain multiple "connections" to itself. The IRC server achieves this by periodically sending special PING (IRC) messages, to which the client must answer with PONG during some pre-established time frame. This, again, looks nice on paper, except servers often don't honor their end of the deal, and so the client is stuck disconnecting and reconnecting and what the hell, why is it so hard to send some messages from one IP address to another?
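
For reference, the client's half of that deal is small. The following Python sketch answers a raw server line with the matching PONG and decides when a silent server should be treated as gone. The function names and the 300-second limit are assumptions made for illustration; nothing here is taken from Feedbot or Ircbot.

```python
def pong_for(line):
    """Return the PONG reply for a raw IRC line, or None if it is not a PING."""
    line = line.rstrip("\r\n")
    if line.startswith(":"):
        # Drop the optional ":prefix " that may precede the command.
        _, _, line = line.partition(" ")
    command, _, params = line.partition(" ")
    if command.upper() != "PING":
        return None
    # The reply echoes the PING parameters back unchanged.
    return "PONG " + params + "\r\n"


def connection_is_stale(last_ping, now, limit=300.0):
    """Guess that the link is dead if no PING arrived within `limit` seconds.

    The limit is an arbitrary illustrative value, not a protocol constant.
    """
    return (now - last_ping) > limit
```

A client that sees `connection_is_stale` turn true has no choice but to tear the connection down and reconnect, which is exactly the part that gets messy once more than one thread is talking to the socket.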

As discussed above, this asynchronous announcer thing has to do some IRC, which means that it needs to be aware of whether the IRC is still connected, at which point the whole cleanup-and-reconnect thing becomes a whole mire of complexity, which leads us to where we are today; I'll spare you the rest of the details. Stan said it multiple times and I ignored it, and... well.

This whole thing, as described above, has the following implications. For one, there is going to be no more asynchronous "multi-threaded" execution in the bot-part of Feedbot, so if I find out that some functionality relies on it, I'll cut it without asking too many questions. As for the other, there is a way to gracefully handle announcements, by doing them on the PING-PONG loop of the bot. This means that announcements are now entirely dependent on this ping-pong; hell, they were dependent on it in the first place and there's nowhere else to put them. Just keep in mind that if Freenode decides to send PINGs once every four minutes, it's not my fault for the delays in notification. This also means that the "ircbot-pinger" thread in Ircbot is by now long gone and replaced with a simpler mechanism that merely "handles" pings coming from the server[2].
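
The shape of such a loop can be sketched in a few lines of Python. This is only an illustration of announcing on the ping-pong tick under stated assumptions: `run_bot` and its callback parameters (`send`, `fetch_pending`, `mark_delivered`, `is_online`) are hypothetical names, and the real bot is a Lisp program built on CL-IRC, not this.

```python
def run_bot(lines, send, fetch_pending, mark_delivered, is_online):
    """Single-threaded loop: answer each server PING, then use it as the tick
    on which queued messages are announced. No second thread is involved."""
    for raw in lines:
        command, _, params = raw.rstrip("\r\n").partition(" ")
        if command != "PING":
            continue  # handling of user commands (b.1) would go here
        send("PONG " + params + "\r\n")
        # b.2-b.4: check the mailbox, deliver, and mark as delivered.
        for message_id, user, text in fetch_pending():
            if not is_online(user):
                continue  # stays queued until a later PING
            send("PRIVMSG %s :%s\r\n" % (user, text))
            mark_delivered(message_id)
```

Here `lines` can be any iterable of raw server lines, for instance a file object wrapped around the socket, and `fetch_pending` is assumed to return a list of `(id, user, text)` tuples. Since delivery happens only when a PING arrives, the notification delay is bounded by whatever PING interval the server chooses.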

As for the user side of things, this is all still work-in-progress at the moment, so expect to see some hiccups as I adapt the code to the new spec -- all this emerging from those precious bouts of clarity yet again.

  1. Had you asked me yesterday whether a mere RSS reader would lead me towards posing fundamental questions about IRC, the protocol, I would have laughed. How unwise of me!

    No problem at the end of the day, though, since this is how I stumbled upon the good opportunity to re-spec some shit. 

  2. This isn't published anywhere to my knowledge; I posted it in a paste at some point and I suppose it got lost meanwhile. I know Whaack was looking into it, while Trinque is, as far as I know, maintaining his own tree.

    This is one of the main disadvantages of not having a Res Publica, by the way: that people are no longer expected to publish the shit they own, and that there's a general lack of convergence in the texts that make machinery do its job. On one hand this is beneficial, because it's protocols and/or interfaces between machines/programs that are supposed to be standardized, not the implementations themselves. On the other hand, this puts on my shoulders the weight of maintaining a Feedbot Linux distribution, including SBCL, usocket, CL-IRC and all those dependencies that I've listed numerous times here and elsewhere. You might not believe me when I say that I've actually run into the need to patch SBCL recently, but... well, I did, what can I do.

Filed under: computing.