Article (archived). Quoth the abstract:
Computer programming is a novel cognitive tool that has transformed modern society. What cognitive and neural mechanisms support this skill? Here, we used functional magnetic resonance imaging to investigate two candidate brain systems: the multiple demand (MD) system, typically recruited during math, logic, problem solving, and executive tasks, and the language system, typically recruited during linguistic processing. We examined MD and language system responses to code written in Python, a text-based programming language (Experiment 1) and in ScratchJr, a graphical programming language (Experiment 2); for both, we contrasted responses to code problems with responses to content-matched sentence problems. We found that the MD system exhibited strong bilateral responses to code in both experiments, whereas the language system responded strongly to sentence problems, but weakly or not at all to code problems. Thus, the MD system supports the use of novel cognitive tools even when the input is structurally similar to natural language.
So, in short, the authors attempt, and claim to have succeeded in, finding out which (physical and functional) areas of the brain are used in reading and understanding code, as an activity distinct from understanding and solving problems using computers. To be more precise, they say the following:
[i] The mapping of brain areas to primary functions is largely known by now (e.g. Duncan, 2010, Hagoort et al., 2004); the question is, which functional units are activated by specific activities? This is investigated using known and new techniques, i.e. functional MRI, together with control tests for "determining the locations" of the MD and language systems in each individual -- at least those fMRI spots pertaining to the experiments described in the paper.
[ii] The authors devise two experiments involving "code": instances of textual and graphical representations of computer programs written in Python and ScratchJr, respectively. Said programs are presented to individuals with tested proficiency in "code comprehension", i.e.:
By code comprehension, we refer to a set of cognitive processes that allow programmers to interpret individual program tokens (such as keywords, variables, and function names), combine them to extract the meaning of program statements, and, finally, combine the statements into a mental representation of the entire program. It is important to note that code comprehension may be cognitively and neurally separable from cognitive operations required to process program content, that is, the actual operations described by code. For instance, to predict the output of the program that sums the first three elements of an array, the programmer should identify the relevant elements and then mentally perform the summation. Most of the time, processing program content recruits a range of cognitive processes known as computational thinking (Wing, 2006; Wing, 2011), which include algorithm identification, pattern generalization/abstraction, and recursive reasoning (e.g., Kao, 2010). These cognitive operations are notably different from code comprehension per se and may not require programming knowledge at all (Guzdial, 2008). Thus, research studies where people read computer programs should account for the fact that interpreting a computer program involves two separate cognitive phenomena: processing computer code that comprises the program (i.e., code comprehension) and mentally simulating the procedures described in the program (i.e., processing problem content).
So, in short, the authors regard the (mental) manipulation of code and the reasoning about any problems said code might solve as two distinct activities -- in other words, specification is the bottleneck.
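To make the distinction concrete, here is a hypothetical stimulus in the spirit of the paper's own "sum the first three elements of an array" example; the snippet and its variable names are mine, not taken from the study's materials, and the comments mark which part of the task is code comprehension and which is mental simulation of the content:

```python
# Hypothetical stimulus in the spirit of the paper's example; not an actual
# item from the study's materials.
numbers = [4, 7, 1, 9, 2]
total = 0
for n in numbers[:3]:
    total = total + n
print(total)

# Code comprehension: recognizing 'numbers' as a list, 'numbers[:3]' as its
# first three elements, and the loop as an accumulation into 'total'.
# Processing problem content: mentally executing that accumulation,
# i.e. computing 4 + 7 + 1 = 12, which is the output the subject reports.
```

Per the abstract, the control condition presents the same problem content as a content-matched sentence problem, so that (ideally) only the code-comprehension component differs between the two conditions.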
[iii] The experiment works as follows: the test subject is placed in an fMRI scanner; a stimulus is provided (e.g. a program written in Python); the subject is asked to interpret the provided content, i.e. to comprehend what is "said", either in the control setting (the meaning of a sentence) or in the experimental one (the result of executing a computer program); fMRI data are recorded while the task is performed; the control results are then used to determine the MD and language system responses; and finally, the experimental results are interpreted.
[iv] The authors determined that the MD system consistently responds to code comprehension tasks. Moreover, the MD response is broadly distributed: MD-related areas in both hemispheres respond to such tasks in both experiments, although the problem-related response is left-lateralized. The language system response, by contrast, is "weak and inconsistent": a. the measured correlation between language knowledge (English/Japanese, in the experiments) and comprehension of language-specific tokens in code was very low; b. the Python responses are driven by the underlying problem, so that the language network is recruited to process the stimulus (e.g. the word "width") rather than the content (e.g. the fact that "width" is the name of a variable); c. no consistent evidence was found of regions outside the MD/language systems responding to code comprehension problems; d. the results are consistent with Liu et al., 2020.
[v] Code comprehension engages domain-general processing resources in the brain. It's not clear, however, what provokes the language response in Python tasks: prior work (e.g. Fedorenko et al., 2011) finds no evidence of linguistic resources being employed in syntactic processing; on the other hand, meaningless identifiers (in the Japanese experiment) evoke the same response as identifiers carrying meaning to the subject. These issues are discussed, but not resolved, in the paper.
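The point about identifiers in [iv] and [v] is easier to see with a toy contrast; the snippets below are my own illustration, not the study's stimuli. The two programs are computationally identical, and only the surface form of the identifier differs:

```python
# Illustrative contrast (mine, not the study's stimuli): identical program
# content, different surface form of the identifier.

# Identifier that is also an English word -- the language network may respond
# to "width" as a lexical item, independently of its role as a variable name.
width = 5
area = width * width

# Semantically empty identifier -- same computation, no lexical cue.
x1 = 5
y1 = x1 * x1
```

Nothing hinges on the particular values; the contrast is only meant to make the "stimulus versus content" distinction above concrete.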
I'm far from an expert in the field, so the reader may wish to consult the original text for further clarification. We're not told very much; I blame the overly simplistic experimental approach for this sadness[1], as well as the statistical methods used to disentangle what the authors call "responses to problem content" from code comprehension. Besides, the whole classification stands on somewhat shaky ground, if we consider that fMRI measurement is a relatively new method in the field, despite its sexiness. Nevertheless, the results are intriguing, as is the MD/language response identification method. Perhaps there's really something there, or maybe there isn't; regardless, we're not "scientists"[2], so we're allowed to read into it and speculate, wildly even. We begin with the following cautionary inference from the authors:
Of course, the lack of consistent language system engagement in code comprehension does not mean that the mechanisms underlying language and code processing are completely different. It is possible that both language and MD regions have similarly organized neural circuits that allow them to process combinatorial input or map between a symbol and the concept it refers to. However, the fact that we observed code-evoked activity primarily in the MD regions indicates that code comprehension does not load on the same neural circuits as language and needs to use domain-general MD circuits instead.
I for one judge the reasoning to be broadly correct from a technical point of view: the simple fact that code is processed in a different area of the brain is not evidence that the circuits processing it are structurally different from language circuits -- sure, on Intel PCs you have floating-point units both in the form of the 8087 coprocessor and in GPUs, so it's reasonable to believe that there could be some similar redundancy in the brain.
Still, this abstract, architectural viewpoint does not tell us anything about the nature of the processing units under scrutiny. I'm not educated on which part of the linguistic content the so-called "language network" is supposed to support, since there's no clear indication of whether it is semantics or syntax that activates said region. On the other hand, drawing from the vast corpus of work referenced in the paper, as well as papers found in the wild (e.g. Lu et al., 2017), I dare say that the MD system is the brain's very own abstraction mechanism! In other words, it is the functional unit which deals with the forming of representations of things. In other other words, anything pertaining to shaped patterns, visual or otherwise, and their recurrence in space and/or time, will go through the MD system[3].
Furthermore, I dare state that the authors' cautionary statement is wrong, in that it rests on false premises: despite what the AI ideologues are spouting, there is actually no reason to suspect that linguistic resources are involved in code comprehension, nor in computer programming, nor in solving any other particular computation-related problem. Linguistic processing and, conversely, linguistic comprehension are needed to internalize the specification of some problem stated in a clear language -- which is why, yes, specification is the bottleneck! -- but other than that, the same neural resources are needed to internalize computing machinery as any other geometrical[4] item. This is supported by the authors' experiments, in that the human brain must have a very fine sieve for setting abstract nonsense, i.e. computation, apart from language, which is why, e.g., people with aphasia have no trouble performing spatial tasks (Bek et al., 2010). This may further be considered from an evolutionary point of view: for millions of years, human individuals may have needed to describe a course through some mountains to their peers, but the actual projection of that course was always a distinct process, and one possibly all the more necessary in times of danger.
Looking at it from this point of view, I can't help but wonder how the relation between the human brain and computers reflects upon that between humans themselves and the darned machines. We reason here that computers are machines, but the human individual is not rational. In his irrationality, he somehow gets a drive from some Pinocchio syndrome, believing that he can "bring life" to the machine; and maybe in some limited sense he even does it! But still, the damned thing is a lifeless soon-to-be piece of junk, and yet our archetypal human is still drawn to the shiny, even if substanceless, item.
Well, the story goes like this: some dude in the '80s, while studying the human brain, saw the structural nature of grammar and, upon this, developed a theory of linguistic computation, also known as "universal grammar". In this theory of his, "the rules of grammar" and those of automata become equivalent under the proposed classification; however, it's highly doubtful that his interpretation of "grammar" has anything to do with actual grammar, on account of entire classes of natural ambiguities that formal syntax and semantics are unable to handle. At its best, universal grammar is an imprecise tool used mainly in geometrical modelling; at worst, it's a story that fooled the naïve into believing that "artificial intelligence" and "natural language processing" are attainable using discrete computers as we know them. No, we don't "talk" to computers, which means that the objects in question are not slaves at all -- they're mere tools without a specific purpose, yet inherently constrained within their design parameters.
And so is language! Except we don't have a precise model of how language was designed, if indeed it was designed in any way at all; and had we begun to build this model, I bet we'd all agree that our current model of computation wouldn't suffice, if only because we don't fully understand the underlying genetics, which is indeed deeply computational in nature, and upon which language is formed as a layer of software -- how much of it is really soft-ware, though?
So then, my concerns of many decades, programming and code comprehension, are not "communicating with the machine", and code is not text. But-but yes, there is something deeply instructional in the interaction between human and sequential machine, which is why al-Khwarizmi's method thrives in computing; and it surely makes sense to make up metaphors to express the problem in a symbolic way, such that it can be understood at the same time by the machine and the programmer; I'm not arguing against clarity, quite the contrary. I am, however, arguing against the commonly-held belief that the linguistic metaphor, convenient as it finds itself, is in any way universal. There are many ways to represent and perform computation in this world, old Wolfram saw as much, aided perhaps by Feynman -- the masses aren't aware, for example, of the old analog computers and the hole they leave; who knows what, e.g., GPUs would bring to the table if they steered that way? The only child highly praised for having shown some promise is "quantum", and even that...
To summarize, I guess this experiment, if valid, tells us that the human brain contains two distinct tools that individuals use to reason about tasks: one is language, i.e. talking about them, and the other is abstract representation; and while the two may share some similarities and may even be employed at the same time, there is a divide in the inputs processed by the two. Makes you wonder how this all works in the interpretation of legal texts -- can legalese be reduced to Python (or some self-contradictory variation thereof) by this measure?
As far as the field of computing goes, I'm sure it will continue to be plagued by useless and overly-expensive "AI" trinkets, with all their overly-attached issues, probably for the next decade at least. But once that's down the proverbial drain, and if any technology at all is still standing at the time, then the fields will be ripe again and perhaps something, e.g. a breakthrough in bioinformatics, will emerge, if it isn't emerging already. Who knows, maybe "software as a living being" isn't quite what we've imagined...
[1] How do they know they've covered the entire space of possible "code comprehension" tasks? Granted, they admit to this limitation, but unfortunately this is but one of the things that lowers the quality of the paper below that of a breakthrough. A Heisenberg or (later on) a Bagdasar would have laughed their asses off at this. ↩
[2] At least I'm not, not anymore. ↩
[3] Building upon this principle of similarity, this is why, e.g., musical improvisation activates MD resources: because generating notes and rhythms is a process very much akin to certain computations. But of course, this is only one aspect of musical interpretation, let alone of music. ↩
[4] To be clear, I use the term "geometrical" very broadly, so that it encompasses all model thinking, regardless of how broken said thinking might be. Structured thought ain't for everyone, and JBLC's garden is a very big place. ↩