By Nate Meyvis
Feb 16, 2022

On Bryan Caplan's spreadsheets

Bryan Caplan is complaining that nobody looked at the spreadsheets on which he based The Case Against Education. Tyler Cowen comments here.

I also did not look through Prof. Caplan's spreadsheets, and I'm probably one of the likeliest people in the world to have done so. I'm a computer programmer, quantitatively trained, and I care a lot about the subject.

So why didn't I read through the spreadsheets? Yes, I'm going to repeat some of the Excel complaints (see the comments), but with more context.

It's almost impossible to take seriously someone who says they care deeply about the accuracy of their results while doing calculations this intricate in Excel. If you really care about maintaining a complicated set of data relationships, you write readable, testable code somewhere.

Put another way: when you care enough that something like this is right, you test it with computers, not just people.
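To make that concrete, here is a minimal sketch of what "testable" means in practice. The earnings_premium function and its numbers are hypothetical illustrations of mine, not anything taken from Caplan's actual spreadsheets:

    # A tiny, named, machine-checked data relationship -- the kind of
    # thing an Excel grid leaves implicit. Hypothetical example.
    def earnings_premium(with_degree: float, without_degree: float) -> float:
        """Fraction by which earnings with a degree exceed earnings without."""
        if without_degree <= 0:
            raise ValueError("baseline earnings must be positive")
        return with_degree / without_degree - 1

    def test_earnings_premium():
        # 50,000 vs. 40,000 should be a 25% premium.
        assert abs(earnings_premium(50_000, 40_000) - 0.25) < 1e-9

Run pytest against a file like this and the relationship is re-verified every time anything changes. The spreadsheet equivalent gives you neither the names nor the automated re-check.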

I know Bryan is aware of this complaint. He says:

P.S. Using Excel does not count as "demonstrable error." It’s my comparative advantage.

Comparative advantage at what? There are literally hundreds of thousands of people who are better than he is at maintaining quantitative infrastructure with computers, so I don't think he's saying "I have a comparative advantage at maintaining quantitative infrastructure with computers, and this is how I do it." He might be implying that there's some aspect of this reasoning that is better done with Excel than with other tools, and that he is uniquely trained to do it--but I can't think of what that would be.

(By the way, I spent more of 2020-21 looking at work in Excel than I would have preferred to, and I suspect Bryan's Excel skills do not stack up as well as he thinks they do. But, happily, I am not really in a position to judge.)

A few further points:

When you say "I have put tons of effort into ensuring the accuracy of this Excel sheet," knowledgeable people will not infer that your work is less likely to contain errors. They will assume that you are fighting quadratic growth and losing.
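(A back-of-the-envelope way to see the quadratic growth, under the simplifying assumption that any formula cell may reference any other: the pairwise dependencies a reviewer has to rule in or out grow with the square of the cell count.)

    # Possible pairwise dependencies among n formula cells, assuming
    # any cell may reference any other: n*(n-1)/2, i.e., O(n^2).
    def potential_dependencies(n_cells: int) -> int:
        return n_cells * (n_cells - 1) // 2

    for n in (10, 100, 1000):
        print(n, potential_dependencies(n))  # 45, then 4950, then 499500

Growing a sheet tenfold multiplies the relationships to audit by roughly a hundred; human review effort does not scale with that.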

Compare: "I have spent X hours fixing my car myself this year. I have asked many neighbors to look at the car. I've issued an open invitation to the Internet also, allowing any respondent into my garage."

For large X:

  1. Do you think that car is safe? Would you ride in it?
  2. How much do you think that person really cares about car safety?
  3. Are you surprised that more people did not respond to the open invitation?
  4. Would you expect those who did respond to be the very best mechanics?

Crucially, how do your answers change as X goes from "large" to "very large"? I conclude that Bryan is not giving evidence for what he thinks he's giving evidence for when he talks about how much effort (of a certain sort) went into his spreadsheets.

Finally, remember that the attention of people who can best assess Bryan's work is very scarce. (Surely Bryan knows what this labor market is like?) A lot of them care very much about Bryan's work and conclusions, but the world is full of interesting data and programs to look at and make. To spend scarce time giving a professional look at these spreadsheets, I'd have to care not just more but so much more about this than about any project in the world that uses more modern tools. Elsewhere, I could use my attention at least an order of magnitude more efficiently.

Like Bryan, I conclude with regret about the state of the art in professionalized quantitative social science, but I'm much more optimistic than he seems to be about the future of quantitative work more generally. (The ironies of Prof. Caplan, in this context, citing his professional training to justify his methods, are too obvious to elaborate on--but they are amusing nonetheless.)
