The Wit and Wisdom of Trygve Haavelmo

I was talking some time ago with my friend Enno about Merijn Knibbe’s series of articles on the disconnect between the variables used in economic models and the corresponding variables in the national accounts.1 Enno mentioned Trygve Haavelmo’s 1944 article The Probability Approach in Econometrics; he thought Haavelmo’s distinction between “theoretical variables,” “true variables,” and “observable variables” could be a useful way of thinking about the slippages between economic reality, economic data and economic theory.

I finally picked up the Haavelmo article, and it turns out to be a deep and insightful piece, for the reason Enno mentioned but also more broadly on how to think about empirical economics. It’s especially interesting coming from someone who won the Nobel Prize for his foundational work in econometrics. It is another piece of evidence that orthodox economists in the mid-20th century thought more deeply and critically about the nature of their project than their successors do today.

It’s a long piece, with a lot of mathematical illustrations that someone reading it today can safely skip. The central argument comes down to three overlapping points. First, economic models are tools, developed to solve specific problems. Second, economic theories have content only insofar as they’re associated with specific procedures for measurement. Third, we have positive economic knowledge only insofar as we can make unconditional predictions about the distribution of observable variables.

The first point: We study economics in order to “become master of the happenings of real life.” This is on some level obvious, or vacuous, but it’s important; it functions as a kind of “he who has ears, let him hear.” It marks the line between those who come to economics as a means to some other end (a political commitment, for many of us; but it could just as well come from a role in business or policy) and those for whom economic theory is an end in itself. Economics education must, obviously, be organized on the latter principle. As soon as you walk into an economics classroom, the purpose of your being there is to learn economics. But you can’t, from within the classroom, make any judgement about what is useful or interesting for the world outside. Or as Hayek put it, “One who is only an economist, cannot be a good economist.”2

Here is what Haavelmo says:

Theoretical models are necessary tools in our attempts to understand and explain events in real life. … Whatever be the “explanations” we prefer, it is not to be forgotten that they are all our own artificial inventions in a search for an understanding of real life; they are not hidden truths to be “discovered.”

It’s an interesting question, which we don’t have to answer here, whether or to what extent this applies to the physical sciences as well. Haavelmo thinks this pragmatic view of scientific laws applies across the board:

The phrase “In the natural sciences we have laws” means not much more and not much less than this: The natural sciences have chosen fruitful ways of looking upon physical reality.

Whatever one concludes about the physical sciences, this is certainly the right way to look at economic models, in particular the models we construct in econometrics. The “data generating process” is not an object existing out in the world. It is a construct you have created for one or both of these reasons: It is an efficient description of the structure of a specific matrix of observed data; it allows you to make predictions about some specific yet-to-be-observed outcome. The idea of a data-generating process is obviously very useful in thinking about the logic of different statistical techniques. It may be useful to do econometrics as if there were a certain data generating process. It is dangerously wrong to believe there really is one.

Speaking of observation brings us to Haavelmo’s second theme: the meaninglessness of economic theory except in the context of a specific procedure for observation. It might naively seem, he says, that

since the facts we want to study present themselves in the form of numerical measurement, we shall have to choose our models from … the field of mathematics. But the concepts of mathematics obtain their quantitative meaning implicitly through the system of logical operations we impose. In pure mathematics there really is no such problem as quantitative definition of a concept per se …

When economists talk about the problem of quantitative definitions of economic variables, they must have something in mind which has to do with real economic phenomena. More precisely, they want to give exact rules how to measure certain phenomena of real life.

Anyone who got a B+ in real analysis will have no problem with the first part of this statement. For the rest, this is the point: economic quantities come into existence only through some concrete human activity that involves someone writing down a number. You can ignore this, most of the time; but you should not ignore it all of the time. Because without that concrete activity there’s no link between economic theory and the social reality it hopes to help us master or make sense of.

Haavelmo has some sharp observations on the kind of economics that ignores the concrete activity that generates its data, which seem just as relevant to economic practice today:

Does a system of questions become less mathematical and more economic in character just by calling x “consumption,” y “price,” etc.? There are certainly many examples of studies to be found that do not go very much further than this, as far as economic significance is concerned.

There certainly are!

An equation, Haavelmo continues,

does not become an economic theory just by using economic terminology to name the variables involved. It becomes an economic theory when associated with the rule of actual measurement of economic variables.

I’ve seen plenty of papers where the thought process seems to have been something like, “I think this phenomenon is cyclical. Here is a set of difference equations that produce a cycle. I’ll label the variables with names of parts of the phenomenon. Now I have a theory of it!” With no discussion of how to measure the variables or in what sense the objects they describe exist in the external world.

What makes a piece of mathematical economics not only mathematics but also economics is this: When we set up a system of theoretical relationships and use economic names for the otherwise purely theoretical variables involved, we have in mind some actual experiment, or some design of an experiment, which we could at least imagine arranging, in order to measure those quantities in real economic life that we think might obey the laws imposed on their theoretical namesakes.

Right. A model has positive content only insofar as we can describe the concrete set of procedures that gets us to it from the directly accessible evidence of our senses. In my experience this comes through very clearly if you talk to someone who actually works in the physical sciences. A large part of their time is spent close to the interface with concrete reality: capturing that lizard, calibrating that laser. The practice of science isn’t simply constructing a formal analog of physical reality, a model trainset. It’s actively pushing against unknown reality and seeing how it pushes back.

Haavelmo:

When considering a theoretical setup … it is common to ask about the actual meaning of this or that variable. But this question has no sense within the theoretical model. And if the question applies to reality it has no precise answer … we will always need some willingness among our fellow research workers to agree “for practical purposes” on questions of definitions and measurement … A design of experiments … is an essential appendix to any quantitative theory.

With respect to macroeconomics, the “design of experiments” means, in the first instance, the design of the national accounts. Needless to say, national accounting concepts cannot be treated as direct observations of the corresponding terms in economic theory, even if they have been reconstructed with that theory in mind. Cynamon and Fazzari’s paper on the measurement of household spending gives some perfect examples of this. There can’t be many contexts in which Medicare payments to hospitals, for example, are what people have in mind when they construct models of household consumption. But nonetheless that’s what they’re measuring, when they use consumption data from the national accounts.

I think there’s an important sense in which the actual question of any empirical macroeconomics work has to be: What concrete social process led the people working at the statistics office to enter these particular values in the accounts?

Or as Haavelmo puts it:

There is hardly an economist who feels really happy about identifying the current series of “national income,” “consumptions,” etc. with the variables by those names in his theories. Or, conversely, he would think it too complicated or perhaps uninteresting to try to build models … [whose] variables would correspond to those actually given by current economic statistics. … The practical conclusion … is the advice that economists hardly ever fail to give, but that few actually follow, that one should study very carefully the actual series considered and the conditions under which they were produced, before identifying them with the variables of a particular theoretical model.

Good advice! And, as he says, hardly ever followed.

I want to go back to the question of the “meaning” of a variable, because this point is so easy to miss. Within a model, the variables have no meaning; we simply have a set of mathematical relationships that are either tautologous, arbitrary, or false. The variables only acquire meaning insofar as we can connect them to concrete social phenomena. It may be unclear to you, as a blog reader, why I’m banging on this point so insistently. Go to an economics conference and you’ll see.

The third central point of the piece is that meaningful explanation requires being able to identify a few causal links as decisive, so that all the other possible ones can be ignored.

Think back to that Paul Romer piece on what’s wrong with modern macroeconomics. One of the most interesting parts of it, to me, was its insistent Humean skepticism about the possibility of a purely inductive economics, or for that matter science of any kind. Paraphrasing Romer: suppose we have n variables, any of which may potentially influence the others. Well then, we have n equations, one for each variable, and n² parameters (counting intercepts). In general, we are not going to be able to estimate this system based on data alone. We have to restrict the possible parameter space either on the basis of theory, or by “experiments” (natural or otherwise) that let us set most of the parameters to zero on the grounds that there is no independent variation in those variables between observations. I’m not sure that Romer fully engages with this point, whose implications go well beyond the failings of real business cycle theory. But it’s a central concern for Haavelmo:

A theoretical model may be said to be simply a restriction upon the joint variations of a system of quantities … which otherwise might have any value. … Our hope in economic theory and research is that it may be possible to establish constant and relatively simple relations between dependent variables … and a relatively small number of independent variables. … We hope that for each variable y to be explained, there is a relatively small number of explaining factors the variations of which are practically decisive in determining the variations of y. … If we are trying to explain a certain observable variable, y, by a system of causal factors, there is, in general, no limit to the number of such factors that might have a potential influence upon y. But Nature may limit the number of factors that have a nonnegligible factual influence to a relatively small number. Our hope for simple laws in economics rests upon the assumption that we may proceed as if such natural limitations of the number of relevant factors exist.
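
To make the arithmetic in Romer's argument concrete, here is a minimal sketch of the parameter count, assuming a linear system in which each variable gets one equation with an intercept and a slope on every other variable. The function names and the choice of three explaining factors per variable are my own illustrative assumptions, not anything taken from Romer or Haavelmo.

```python
def unrestricted_parameter_count(n: int) -> int:
    """Parameters in n linear equations, each with an intercept
    and a slope on every one of the other n - 1 variables."""
    return n * ((n - 1) + 1)


def restricted_parameter_count(n: int, k: int) -> int:
    """Parameters left if each equation keeps only k explaining
    factors (plus its intercept) and every other slope is set to zero."""
    if not 0 <= k <= n - 1:
        raise ValueError("k must be between 0 and n - 1")
    return n * (k + 1)


# k = 3 is an arbitrary illustrative choice for "a relatively small number".
for n in (5, 20, 100):
    print(n, unrestricted_parameter_count(n), restricted_parameter_count(n, k=3))
```

The unrestricted count grows with the square of the number of variables, while the restricted count grows only in proportion to it. That gap is the work that the a priori restrictions are being asked to do.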

One way or another, to do empirical economics, we have to ignore most of the logically possible relationships between our variables. Our goal, after all, is to explain variation in the dependent variable. Meaningful explanation is possible only if the number of relevant causal factors is small. If someone asks “why is unemployment high”, a meaningful answer is going to involve at most two or three causes. If you say, “I have no idea, but all else equal wage regulations are making it higher,” then you haven’t given an answer at all. To be masters of the happenings of real life, we need to focus on causes of effects, not effects of causes.

In other words, ceteris paribus knowledge isn’t knowledge at all. Only unconditional claims count, though they don’t have to be predictions of a single variable; they can be claims about the joint distribution of several. But in any case we have positive knowledge only to the extent we can unconditionally say that future observations will fall entirely in a certain part of the state space. This fails if we have a ceteris paribus condition, or if our empirical work “corrects” for factors whose distribution and the nature of whose influence we have not investigated.3 Applied science is useful because it gives us knowledge of the kind, “If I don’t turn the key, the car will not start; if I do turn the key, it will, or if it doesn’t there is a short list of possible reasons why not.” It doesn’t give us knowledge like “All else equal, the car is more likely to start when the key is turned than when it isn’t.”4

If probability distributions are simply tools for making unconditional claims about specific events, then it doesn’t make sense to think of them as existing out in the world. They are, as Keynes also emphasized, simply ways of describing our own subjective state of belief:

We might interpret “probability” simply as a measure of our a priori confidence in the occurrence of a certain event. Then the theoretical notion of a probability distribution serves us chiefly as a tool for deriving statements that have a very high probability of being true.

Another way of looking at this. Research in economics is generally framed in terms of uncovering universal laws, for which the particular phenomenon being studied merely serves as a case study.5 But in the real world, it’s more often the other way: We are interested in some specific case, often the outcome of some specific action we are considering. Or as Haavelmo puts it,

As a rule we are not particularly interested in making statements about a large number of observations. Usually, we are interested in a relatively small number of observation points; or perhaps even more frequently, we are interested in a practical statement about just one single new observation.

We want economics to answer questions like, “what will happen if the US imposes tariffs on China?” The question of what effects tariffs have on trade in the abstract is, itself, uninteresting and unanswerable.

What do we take from this? How, according to Haavelmo, should empirical economics be?

First, the goal of empirical work is to explain concrete phenomena — what happened, or will happen, in some particular case.

Second, the content of a theory is inseparable from the procedures for measuring the variables in it.

Third, empirical work requires restrictions on the logically possible space of parameters, some of which have to be imposed a priori.

Finally, prediction (the goal) means making unconditional claims about the joint distribution of one or more variables. “Everything else equal” means “I don’t know.”

All of this is based on the idea that we study economics not as an end in itself, but in response to the problems forced on us by the world.