I have very recently been thinking about the question of partiality vs totality in programming languages, a perennial topic in PL’s that every generation thinks it discovers for itself. And this got me to remembering an old theorem that, it seems, hardly anyone knows ever existed in the first place. What I like about the theorem is that it says something specific and technically accurate about the sizes of programs in total languages compared to those in partial languages. The theorem provides some context for discussion that does not just amount to opinion or attitude (and attitude alway seems to abound when this topic arises).
The advantage of a total programming language such as Goedel’s T is that it ensures, by type checking, that every program terminates, and that every function is total. There is simply no way to have a well-typed program that goes into an infinite loop. This may seem appealing, until one considers that the upper bound on the time to termination can be quite large, so large that some terminating programs might just as well diverge as far as we humans are concerned. But never mind that, let us grant that it is a virtue of T that it precludes divergence.
Why, then, bother with a language such as PCF that does not rule out divergence? After all, infinite loops are invariably bugs, so why not rule them out by type checking? (Don’t be fooled by glib arguments about useful programs, such as operating systems, that “run forever”. After all, infinite streams are programmable in the language M of inductive and coinductive types in which all functions terminate. Computing infinitely does not mean running forever, it just means “for as long as one wishes, without bound.”) The notion does seem appealing until one actually tries to write a program in a language such as T.
Consider computing the greatest common divisor (GCD) of two natural numbers. This can be easily programmed in PCF by solving the following equations using general recursion:
The type of defined in this manner has partial function type
, which suggests that it may not terminate for some inputs. But we may prove by induction on the sum of the pair of arguments that it is, in fact, a total function.
Now consider programming this function in T. It is, in fact, programmable using only primitive recursion, but the code to do it is rather painful (try it!). One way to see the problem is that in T the only form of looping is one that reduces a natural number by one on each recursive call; it is not (directly) possible to make a recursive call on a smaller number other than the immediate predecessor. In fact one may code up more general patterns of terminating recursion using only primitive recursion as a primitive, but if you examine the details, you will see that doing so comes at a significant price in performance and program complexity. Program complexity can be mitigated by building libraries that codify standard patterns of reasoning whose cost of development should be amortized over all programs, not just one in particular. But there is still the problem of performance. Indeed, the encoding of more general forms of recursion into primitive recursion means that, deep within the encoding, there must be “timer” that “goes down by ones” to ensure that the program terminates. The result will be that programs written with such libraries will not be nearly as fast as they ought to be. (It is actually quite fun to derive “course of values” recursion from primitive recursion, and then to observe with horror what is actually going on, computationally, when using this derived notion.)
But, one may argue, T is simply not a serious language. A more serious total programming language would admit sophisticated patterns of control without performance penalty. Indeed, one could easily envision representing the natural numbers in binary, rather than unary, and allowing recursive calls to be made by halving to achieve logarithmic complexity. This is surely possible, as are numerous other such techniques. Could we not then have a practical language that rules out divergence?
We can, but at a cost. One limitation of total programming languages is that they are not universal: you cannot write an interpreter for T within T (see Chapter 9 of PFPL for a proof). More importantly, this limitation extends to any total language whatever. If this limitation does not seem important, then consider the Blum Size Theorem (BST) (from 1967), which places a very different limitation on total languages. Fix any total language, L, that permits writing functions on the natural numbers. Pick any blowup factor, say , or however expansive you wish to be. The BST states that there is a total function on the natural numbers that is programmable in L, but whose shortest program in L is larger by the given blowup factor than its shortest program in PCF!
The underlying idea of the proof is that in a total language the proof of termination of a program must be baked into the code itself, whereas in a partial language the termination proof is an external verification condition left to the programmer. Roughly speaking, there are, and always will be, programs whose termination proof is rather complicated to express, if you fix in advance the means by which it may be proved total. (In T it was primitive recursion, but one can be more ambitious, yet still get caught by the BST.) But if you leave room for ingenuity, then programs can be short, precisely because they do not have to embed the proof of their termination in their own running code.
There are ways around the BST, of course, and I am not saying otherwise. For example, the BST merely guarantees the existence of a bad case, so one can always argue that such a case will never arise in practice. Could be, but I did mention the GCD in T problem for a reason: there are natural problems that are difficult to express in a language such as T. By fixing the possible termination arguments in advance, one is tempting fate, for there are many problems, such as the Collatz Conjecture, for which the termination proof of a very simple piece of code has been an open problem for decades, and has resisted at least some serious attempts on it. One could argue that such a function is of no practical use. I agree, but I point out the example not to say that it is useful, but to say that it is likely that its eventual termination proof will be quite nasty, and that this will have to be reflected in the program itself if you are limited to a T-like language (rendering it, once again, useless). For another example, there is no inherent reason why termination need be assured by means similar to that used in T. We got around this issue in NuPRL by separating the code from the proof, using a type theory based on a partial programming language, not a total one. The proof of termination is still required for typing in the core theory (but not in the theory with “bar types” for embracing partiality). But it’s not baked into the code itself, affecting its run-time; it is “off to the side”, large though it may be).
Updates: word smithing, fixed bad link, corrected gcd, removed erroneous parenthetical reference to Coq, fixed LaTeX problems.