A solution to the largest-of-seven FFACalc puzzle

06a January 19, 2018 -- (tech)

The solution itself is short and quite straightforward, so I'm not sure it's qualified for a blog post of its own. Instead, I'm gonna use this occasion to embellish it1 with a discussion of FFA (as far as I know it at the time of writing) and a brief description of FFACalc, only then followed by the solution to the puzzle given at the end of the fourth chapter of Stanislav's series.

FFA is, I quote:

[...] the Finite Field Arithmetic library. FFA differs from the typical "Open Sores" abomination, in that -- rather than trusting the author blindly with their lives -- prospective users are expected to read and fully understand every single line. In exactly the same manner that you would understand and pack your own parachute. The reader will assemble and test a working FFA with his own hands, and at the same time grasp the purpose of each moving part therein.

That is, a library for performing arithmetic operations on arbitrarily large numbers, in turn useful for things such as sane implementations of crypto algorithms. "But wait, Lucian, there are implementations of crypto out there already!" Yes, and none of them are sane, and yet you insist that "there are", even though you haven't read the code2.

Anyway, by "sane" I mean very specifically items that you would run on your car or in your nuclear plant (or any other safety-critical system); and thus by "arbitrarily" I don't mean "however large", but "as large as the operator specifies3, within constraints imposed by the iron". Or, as seen in the logs around 2016:

asciilifeform: sooooooo in other nyooz here's a small preview of things to come:
asciilifeform: i discovered that mainstream finite field libs are as complicated as they are largely because they insist on growable - and, ergo, heap-allocated - nums.
mircea_popescu: tbh, the heap should go away entirely.
mircea_popescu: wtf is it even a thing for.
asciilifeform: for items whose size you ~do not know~ in advance.
asciilifeform: which - notably - integers used in crypto ARE NOT.
mircea_popescu: this is not programming.
asciilifeform: anyway pre-allocated-buffers-for-motherfucking-everything is sop in 'this must work' (e.g., avionics) world.
asciilifeform: what is interesting is that - as far as i can tell - no one has ever published example of finite field system from such.
mircea_popescu: hm.
asciilifeform: 'there is a first time for everything'

So how can one achieve such a thing? For one, the compiler and whatever pieces of high-level language run-time there exist need to behave deterministically, and provably so; this excludes every language with garbage collected memory management4. Assembly works, but it's not portable; a subset of C should work, but unfortunately the C language specification is a mess, and sane compilers are getting scarcer by the day. Fortunately, other ALGOLs do a much better job at this, which is how Stanislav chose (a very restricted subset of) Ada as the implementation tool.

As previously mentioned, the core functionality of the FFA library doesn't do much more besides addition, subtraction, etc.; its main benefit being that it allows the programmer to cross the 64 bit limit, or whatever the machine register size is. To keep things simple, FFA contains a user-facing component, the small program called FFACalc. In addition to being a small program, FFACalc is also a small stack-based context-free5 language that can be used to "talk to" the FFA arithmetron.

The FFACalc in Chapter 4 implements the following constructs:

Now that we have the basics in place, let's take a look at the puzzle:

Write a FFACalc tape that will take seven numbers, presumed to be on the top of the stack, and return the largest.

Your answer should work with any legal WIDTH, and any stack HEIGHT large enough to hold the working set.

For instance, suppose file numbers.txt were to contain:

.9.1.7.5.1.1.0

Then the following example invocation:

cd ffacalc
gprbuild
cat numbers.txt youranswer.txt |  ./bin/ffa_calc 256 16

... should produce the output:

0000000000000000000000000000000000000000000000000000000000000009

... and similarly for any other seven numbers.

For our convenience, let's first specify the solution in a pseudo-code (or quasi-natural) language. We have a stack off which we can pop items and compare them. The algorithm would look something along the lines of:

0. let S be our stack
1. pop S into Max
2. while S has elements
2. a. pop S into Candidate
2. b. if Candidate > Max then
2. c. | Max becomes Candidate
3. return Max

Notice that we a. have a pure stack machine ("Candidate" and "Max" are pseudo-variables that reside on the stack) and b. lack the means to loop "while" or "until". However, we know that at step 0, the stack contains precisely seven elements; thus it should be popped exactly seven times, including the operation at step 1. Thus we can unroll our "while" loop into precisely six comparisons.

Given the FFACalc stack machine, steps 2.a-c can be expressed as: compare the current two topmost elements on the stack and keep the higher of the two. Let's assume for now that we push exactly two numbers on the stack. Given that comparison and arithmetic operations are destructive, we need to operate on copies of the two numbers, so the first thing we need to do is "over" them (we also print the contents of the stack):

.2.1``####
0000000000000000000000000000000000000000000000000000000000000001
0000000000000000000000000000000000000000000000000000000000000002
0000000000000000000000000000000000000000000000000000000000000001
0000000000000000000000000000000000000000000000000000000000000002

So the next step would be to compare them using (conveniently) <:

.2.1``<###
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000001
0000000000000000000000000000000000000000000000000000000000000002

or

.1.2``<###
0000000000000000000000000000000000000000000000000000000000000001
0000000000000000000000000000000000000000000000000000000000000002
0000000000000000000000000000000000000000000000000000000000000001

Now there are two approaches that we can take. Let's take the obvious one first: based on the result of < at the top of the stack, we can execute conditionally:

.2.1``<{[a: ]#}{[b: ]#}##
b: 0000000000000000000000000000000000000000000000000000000000000001
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000002
.1.2``<{[a: ]#}{[b: ]#}##
a: 0000000000000000000000000000000000000000000000000000000000000002
0000000000000000000000000000000000000000000000000000000000000001
0000000000000000000000000000000000000000000000000000000000000001

Notice that the second printed value is the conditional, and that it gets printed after exiting the branch6. So we just drop it:

.2.1``<{[a: ]#}{[b: ]#}_#
b: 0000000000000000000000000000000000000000000000000000000000000001
0000000000000000000000000000000000000000000000000000000000000002
.1.2``<{[a: ]#}{[b: ]#}_#
a: 0000000000000000000000000000000000000000000000000000000000000002
0000000000000000000000000000000000000000000000000000000000000001

So we notice that on the "if" branch (second execution), the largest number is on the top of the stack, while on the "else" branch it's the other way around. Thus, the correct reasoning is: if S(SP - 1) < S(SP), swap the two values and drop the first (i.e. the smaller value), else just drop the first. That is:

.2.1``<{'_}{_}_#
0000000000000000000000000000000000000000000000000000000000000002
.1.2``<{'_}{_}_#
0000000000000000000000000000000000000000000000000000000000000002

Et voila! Extending this to a seven-number stack, we get:

.9.1.7.5.1.1.0
``<{'_}{_}_ ``<{'_}{_}_ ``<{'_}{_}_ ``<{'_}{_}_ ``<{'_}{_}_ ``<{'_}{_}_
#
0000000000000000000000000000000000000000000000000000000000000009

This solution works, but the branches aren't "balanced", i.e. a different number of operations are executed on each branch. We can solve this quite elegantly by employing the mux (U) operation, whose functionality is left as an exercise for the reader. The solution for two numbers is:

.2.1``<U#
0000000000000000000000000000000000000000000000000000000000000002
.1.2``<U#
0000000000000000000000000000000000000000000000000000000000000002

And this about sums up our exercise. An early text version of the solution can be downloaded separately for the curious. All in all, quite an elegant and down-to-earth approach to big number arithmetic; and who knows what else awaits us in the next chapters!


  1. This write-up may turn out to be not such a useless endeavour, and for a few reasons.

    First, my break from writing (anything else other than academic hogwash) has left a visible negative mark on the quality thereof, so writing anything at all stemming from my own mental preputium is welcome. I don't care if the reader agrees or not, it's why I have a blog in the first place.

    Second, if you're still reading this, it's very possible that you haven't heard about Loper OS or FFA, or for that matter anything related to TMSR. Third, the exercise of describing programs in a literate manner is never useless; so who knows, maybe by the end I'll have understood this myself!

    Note that this article is not meant to replace the actual thing. If you want to understand FFA properly, go read FFA. And just so that we understand each other, that DeGrasse Tyson guy is an imbecile.

  2. And if you have, then how are you any better than Koch, the OpenSSL freaks et al.?

  3. Before Microsoft's idiotic "please restart your computer so we can probe you anally with upgrades", computing was used for things that need to run, preferrably for years, or at least for months, without any interruption. That includes your hypothetical nuclear plant, but also your heater and the plane you sometimes use to go to Rome and Paris to take all those selfies.

    If any of those items were designed even remotely similarly to Windows, they would blow up within hours. Do you know why they don't? Right, because engineers design them not to. And to do that, engineers need to ensure that the behaviour of each of those items is deterministic, which means that no, "please wait, inserting anal probe" is not an option.

  4. Unless the behaviour of said garbage collected memory management can be measured and observed to be deterministic. This leaves all software implementations out, and any hardware implementation would leave "multitasking" and any other such nonsense out. But we're already way over our pay grade here.

  5. I haven't verified that it is context-free, but I am making an educated guess based on the presence of the stack and other nested constructs, e.g. comments and quotations, and moreover, based on the lack of loops.

    Here's an exercise for the aspiring computer scientist: write down the pushdown automaton that describes FFACalc.

  6. I can't emphasize this enough: read the code to see exactly why and how!