Harmful GOTOs, Premature Optimizations, and Programming Myths are the Root of all Evil
What follows is the presentation I gave in the inaugural meeting of Papers We Love Madrid. The subject of the talk was the paper Structured Programming with Go To Statements by Donald Knuth.
The biggest problem we have as human beings is that we confuse our beliefs with reality” - Alan Kay.
The question is whether we should ban it [GOTO], or educate against it. –– Donald Knuth
In this article I would like to talk about myths. Google defines myth as “a widely held but false belief or idea”, while dictionary.com tell us: “an unproved or false collective belief that is used to justify a social institution”. The question is, do we, XXI century people believe in myths?
Umberto Eco on The Book of Legendary Lands proposes a simple exercise to show us how widespread some myths are in our societies, so far in fact that sometimes we don’t even realize we are talking about things that are not true. He proposed to ask a person what it was that Christopher Columbus tried to demonstrate when he wanted to reach the Indies by traveling west, and what was the common held belief at the time by the doctors of Salamanca. Many people will answer that Columbus wanted to prove that the Earth was round, opposing the common held belief that the Earth was flat. The thing is, as Eco demonstrates in his book, people in the Middle Ages knew all too well that the Earth wasn’t flat. There were even calculations about the actual diameter of the Earth to see how long Columbus would have to travel to reach the Indies. The real reason the establishment opposed Columbus’ plans was in fact based on this knowledge: they were concerned it would demand too many resources because of the distance.
How did such a myth become so widespread? There’s a book called A History of the Life and Voyages of Christopher Columbus by Washington Irving, who, while trying to document Columbus’ life, also added some spice to the story in order improve the entertaining qualities of his work. At the same time as Irving’s book was making the rounds, science was trying to bring forth the Theory of Evolution as opposed to Creationism. If the Church Fathers were wrong about the shape of the Earth, perhaps these secular thinkers from the XIX century could show as well how the Church was wrong about Creationism. They unearthed the theories of a certain Lactantius and also Cosmas Indicopleustes who claimed back in the 3rd century that the Earth was flat. So we have on one side a writer trying to improve the story on his book by adding some facts here and there, and some secular thinkers trying to bring forth the Theory of Evolution, giving birth, or adding fuel, to the myth of the flat Earth.
We have a case of a widespread myth that despite facts to the contrary, still survives. A myth propagated by people whose intentions were to bring reason forward. It looks like a case of the end justifies the means. It’s interesting to see how some pseudo-fact get published on a book, and then scholars just repeat that fact without checking its veracity, until it snowballs into the present. It is most probable that the Myth of the Flat Earth doesn’t affect us at all today, after more than four hundred years have passed since Columbus voyages, but the question is, what about our field, computer science. Do we have similar cases in Software Engineering? Myths that persist in present days, but for which we don’t even know the origin or the context in which they were initially created, and whose truth we actually ignore? I think we do, but more in the shape of certain maxims that we all should follow no matter what. Which brings us to the paper I would like to discuss today, Structured Programming with Go To Statements by Donald Knuth.
How did I get interested in this paper? It all started with a simple question posted as a Github issue for a PHP library. The question is as follows:
May I ask if there is a reason why you prefer using goto instead of function recursion?
The user was referring to the goto statement that appears on line 17 of the following program:
The author of the library proceeded to give a deep explanation of how the PHP interpreter works, giving the reasons why he favored the use of GOTO instead of recursion as suggested by the user. Of course, this being the internet, a flamewar was just about to start. One user not only was against the use of a GOTO statement, but he also linked to the famous paper by Dijkstra Go To Statement Considered Harmful. Seeing so many different opinions on the issue, I wanted to know more.
A simple Google search lead me to the Stack Overflow discussion called GOTO still considered harmful? which in turn contrasts Dijkstra’s paper with the one by Donald Knuth, which I have already mentioned, and which is the main subject of this talk.
Dijkstra’s paper advocates for the removal of GOTO statements, since they make understanding programs that use them difficult. It’s interesting to note some of the story around that paper. Dijkstra had submitted it as A case against the goto statement, but in order to speed up its publication, the editor published it under “letters to the editor” with the new, more interesting, title. The editor was Niklaus Wirth, creator of Pascal.
It seems the paper created a lot of agitation at the time, not only against people that were misusing GOTO but also against people that were writing well structured programs, but which happened to use GOTOs. These people were offended because now they were also treated as bad programmers.
Before finding Knuth’s Structured Programming with Go To Statements paper, I had the idea that Columbus was trying to prove that the Earth was flat, which is to say, that Dijkstra was against all uses of the GOTO statement. A few paragraphs into Knuth’s paper and we start finding some gems from private communications between Knuth and Dijkstra. According to Knuth, Dijkstra wrote:
Please don’t fall into the trap of believing that I am terribly dogmatical about [the go to statement]. I have the uncomfortable feeling that others are making a religion out of it, as if the conceptual problems of programming could be solved by a single trick, by a simple form of coding discipline!
Interesting choice of words, of which I would like to underline religion and single trick. For some reason we developers are always in search of Silver Bullets, looking for Holy Grails that will solve all our problems. For example, more often than not we see blog posts by people claiming to have beaten the CAP Theorem. A mathematical theorem… beaten; but in software development silver bullets don’t exist, as Frederick Brooks Jr. discusses on his paper called No Silver Bullet.
There is no single development, in either technology or management technique, which by itself promises even one order of magnitude [tenfold] improvement within a decade in productivity, in reliability, in simplicity.
Despite all evidence, we still are in the look for the magical formula that will save us from ourselves.
Back to the paper. After the intro to his paper, Knuth starts to do an exhaustive historical revision of who has opposed GOTOs in the history of programming, mentioning that this goes back as afar as 1959.
Then the paper starts the first of its main sections: “Elimination of
go to statements”. Here Knuth presents a variety of algorithms that
involve either for loops, or while loops that use GOTO statements here
and there and then Knuth proceed to discuss how to remove them. What’s
interesting here besides the programs themselves, is that they are made
only of the most basic program constructs, like
there are no procedures for example, let alone classes. So I started
to wonder, in what era of programming are we in while we read this
The paper was published in 1974. Here’s the second code snippet, called Example 1a:
What’s interesting about this is not the algorithm per se, but what follows:
The and operation used here stands for McCarthy’s sequential conjunction operation.
So in this paper Knuth has to stop and explain what is the
that today we take for granted in most programming languages. Let’s
take a diversion for a moment and see what was going on in the
computing circles of the time this paper was published.
I’ve mention already that the examples in the paper are about basic
programming constructs, like
while. This is 1974 and
Barbara Liskov just published her seminal paper on
Programming with Abstract Data Types. In
a recent talk at
Liskov said that procedures were the way of doing abstraction, people
didn’t even know what modules were.
Logic Programming is just starting to be a thing. PLANNER was designed in 1969 and in Europe we get Prolog in 1972.
Back in 1964
was created, and in 1968 we get ALGOL, the language that introduced
end as block delimiters.
In 1973, UNIX is released to the public, the same year Ethernet was created. Also the ACTOR model was developed by Carl Hewitt.
Another characteristic of Knuth’s paper is that it has the old way of calculating the complexity of an algorithm by counting its operations. It will take until 1976 for Knuth to introduce Big O notation into the programming circles as a way of analyzing algorithm complexity, a concept that was part of mathematics since 1894. To bring some more context, 1976 is the same year the Apple I was released.
On the security side we have that in 1977 the RSA algorithm was presented to the public.
Next year, in 1978 Robin Milner publishes A Theory of Type Polymorphism in Programming. In the same year Leslie Lamport publishes the seminal paper on distributed systems called Time, Clocks, and the Ordering of Events in a Distributed System
So we have a time where personal computers were far from becoming
common place, where public-key encryption as we know it today was
about to appear, and where programming techniques in common use today,
like the use of Abstract Data Types were just starting to be discussed
in academia. Having that in mind we can see why the examples in
Knuth’s paper are all about how to improve loops that use for
statements or GOTOs; or why Knuth has to explain what
In this context of scarce computer resources we find why Knuth is interested about extracting the maximum performance out of his programs. He writes that “most of the running time in non-IO-bound programs is concentrated in about 3% of the source text”, that there’s always an “inner-loop” which if we speed it up by 10% then everything gets the same gains.
The problem for Knuth is that programmers spend time trying to speed up the wrong parts of the program, as he said in his turing award lecture, Computer Programming as an Art:
Experience indicates that nearly everybody has the wrong idea about the real bottlenecks in his programs
So for him, “we should forget about small efficiencies, say about 97% of the time” and continues with the famous quote:
premature optimization is the root of all evil
That last sentence has been quoted most of the time out of context, creating a myth on its own, a complete religion on what to do and what not to do with regards to program optimization. Is like the Godwin Law of optimization discussions. The problem is we usually omit what comes next in the paper:
Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified
The key words on that sentence being “only after that code [the critical one] has been identified”; and then Knuth reminds us:
the universal experience of programmers who have been using measurement tools has been that their intuitive guesses fail.
It’s clear that Knuth is not forbidding us from performing optimizations, we shall do them, but only when we know they are going to help, and of course the only way of knowing is measuring, running benchmarks on our programs.
The problem with the root of all evil quote being cited out of context is that it seems to be used to forbid all kind of optimizations, like a sword of Damocles preventing us from even trying to improve some code.
As the paper progresses Knuth keeps showing us examples of program where GOTOs can be removed, but all the while the resulting program keeps having the same or very similar performance characteristics as the original one. At the same time, readability and the understandability of the program are key factors for Knuth when deciding to refactor the program one way or another. One might wonder why Knuth is so concerned with the efficiency of his programs. Of course he has the answer:
My books emphasize efficiency because they deal with algorithms that are used repeatedly as building blocks in a large variety of applications
This can help us put into context the information we are receiving from this paper. If we think about it, perhaps most of the basic algorithms in use today have been influenced one way or another by what Knuth wrote on his TAOCP book.
Later in the paper Knuth writes:
We shouldn’t merely remove go to statements because it’s the fashionable thing to do; the presence or absence of go to statements is not really the issue.
“The biggest problem we have as human beings is that we confuse our beliefs with reality” - Alan Kay. Programming and Scaling
I’ve presented only two cases of sentences or even paper titles that have out grown their surroundings to become greater than the papers where they were originally published, but I think they are enough as examples of myths that we carry with us.
On an interview, Alan Kay discussed why he think some languages with arguably more technical merits than others go unnoticed, while some other programming languages become more popular. He understand that we have a problem with education, with our own education as programmers:
[in the 70s] computing spread out much, much faster than educating unsophisticated people can happen
He said that this created a sort of pop-culture of programming where we started ignoring the fundamentals of our discipline. He adds:
So I think the lack of a real computer science today, and the lack of real software engineering today, is partly due to this pop culture.
And this brings us to two phenomena of today’s programming culture: Twitter and Hacker News. Discussions that happen in those websites would lead us to believe that people is forming opinions based in just reading the story titles. We live in such clickbait times, than there are even Twitter accounts whose only purpose is to dismantle those baits.
Once I joked that if a “thought doesn’t fit in 140 characters, then it’s not a thought worth having”, to which a colleague responded: did you know there’s a conference for computer science research whose papers fit in 140 characters or less? Yep, here it is:
I think this trend is taking us away from computer science. We no longer know what happened in the past. We reinvent methodologies and give them new names: Agile, TDD, offline-first, to name a few.
Dijkstra knew all too well about this problem:
In 1968, the Communications of the ACM published a text of mine under the title “The goto statement considered harmful”, which in later years would be most frequently referenced, regrettably, however, often by authors who had seen no more of it than its title
His paper more than anything has been “godwin-lawed” at innumerable discussions, often by just naming its title, ignoring completely the context, or the remarks made afterwards about it by Dijkstra himself.
This treatment of knowledge reminds me of a paragraph from Fahrenheit 451, when the chief fireman explains to Mr. Montag what was the reason why they banned books:
Classics cut to fit fifteen-minute radio shows, then cut again to fill a two-minute book column, winding up at last as a ten- or twelve-line dictionary resume.
Why is it that we prefer to spread only these digests of great papers and ideas from the past, to form these myths? I believe there’s a simple answer and a more complex, related one. The simple answer has to do with laziness and the need to appear smart. On an agitated discussions is better to drop a paper title than to be beaten by other people’s arguments. Of course I’m being sarcastic.
I believe the more complex answer is related to memes as explained by Richard Dawkins on his book The Selfish Gene. All these one-sentence programming maxims are easy to repeat and to follow, than to actually learning the whole methodology. As Knuth says on his paper:
[…] there has been far too much emphasis on go to elimination instead of on the really important issues; people have a natural tendency to set up all easily understood quantitative goal like the abolition of jumps, instead of working directly for a qualitative goal like good program structure
It’s easier to count lines of code produced by an programmer that the actual quality of the work they produce.
At the end of the day we should focus on what matters, we should internalize that we are always making choices between different kind of tradeoffs, but in our case, programs that are easier to understand should triumph over complicated ones:
we should strive most of all for a program that is easy to understand and almost sure to work
Finally with regards to our own discipline, let’s close by reminding us of Dijkstra’s words:
“we had better see to it that the computer industry does not kill computing science” from Dijkstra on Computing Science: Achievements and Challenges
I would like to thank Miguel Pastor and Félix López for inviting me to present at Papers We Love Madrid.
At the same time I would like to thank Leif Walsh for helping me by proof reading an early draft of this article. I have to add that whatever errors or typos that remain in this paper are entirely my fault.
- Myth of the Flat Earth
- A History of the Life and Voyages of Christopher Columbus
- Structured Programming with Go To Statements
- Retry library, GOTO Usage
- Explanation of how the PHP compiler works
- Go To Statement Considered Harmful
- GOTO still considered harmful?
- A Case against the Go To Statement
- Beating the CAP Theorem
- No Silver Bullet
- Short-Circuit Operator
- Programming with Abstract Data Types
- The Power of Abstraction, B. Liskov InfoQ Keynote
- PLANNER: A Language for Manipulating Models and Proving Theorems in a Robot
- The birth of Prolog
- Ethernet: Distributed Packet Switching for Local Computer Networks
- A Universal Modular ACTOR Formalism for Artificial Intelligence
- BIG OMICRON AND BIG OMEGA AND BIG THETA
- Die analytische Zahlentheorie
- RSA Algorithm Patent
- A Theory of Type Polymorphism in Programming
- Time, Clocks, and the Ordering of Events in a Distributed System
- Computer Programming as an Art
- Godwin Law
- Programming and Scaling, Talk by Alan Kay
- Cargo Cult Programming
- A Conversation with Alan Kay
- HN just reading the story titles
- Saved You A Click
- Tiny ToCS
- Thinking Forth
- You Won’t Believe How Old TDD Is
- Ficus Replication
- What led to “Notes on Structured Programming”
- The Selfish Gene
- Computing Science: Achievements and Challenges