Dear Analyst #38: Breaking down an Excel error that led to a $6.2B loss at JPMorgan Chase

You blink a few times at the screen and realize what you’re seeing is not a typo. $6.2B has left your bank because some rogue trader made untimely bets on the market. That’s B as in billion. You call up the modeler who was supposed to make sure this never happens to your bank. The modeler takes a closer look at his model and realizes he made a fundamental error in how one value is calculated, and that error caused the dominoes to fall. This is the story of the “London Whale” at JPMorgan Chase in 2012 who cost the bank $6.2B, and a breakdown of the Excel error that may have caused the whole thing. This is the Google Sheet if you want to follow along with the Excel error.

Derivative of a derivative

I’m not going to pretend like I know the intricacies of all the financial products involved here, so you can read the Wikipedia article if you want the full details. In 2012, there was a CDS (credit default swap) product called CDX IG 9 that the trader at JPMorgan may have made large bets on, ending up on the wrong side of the bet. The London trader’s name is Bruno Iksil, and it was a classic scenario of a gambler trying to get out of his losses by doubling down on black at the roulette table.

Source: The Fiscal Times

Multiple investigations were conducted by authorities in the U.S. and U.K., and the investigations show that a variety of institutional failures may have facilitated the large bets made by the London Whale. This HBR article by Ben Heineman, Jr. provides a nice summary of all the key players:

  • London traders – The traders simply didn’t understand the complexity of the derivative products they were buying and selling
  • Chief Investment Office (CIO) – The head of the CIO didn’t monitor the trading strategies and put in the proper controls for the portfolio of products the office was buying. The Value at Risk (VaR) model was flawed (see more below).
  • Firm-wide Leaders – Not enough oversight by the CFO and CEO (Jamie Dimon)
  • Board and Risk Policy Committee – The committee was told that everything was fine with the CIO, and didn’t get an accurate picture of how risk officers really felt about the risky trades being made.

Appendix of the Task Force Report by JPMorgan

There is a 130-page report created by JPMorgan Chase in 2012 which details what happened internally that led to this debacle. In my opinion, the juicy stuff starts in the appendix on page 121 of the report. I read off some parts of this appendix in this episode, but the appendix basically details issues with the VaR models created by one of the quantitative modelers at JPMorgan to more accurately value the complex trades that were happening. Or at least they thought the model was more accurate.

At the very end of the appendix, there’s a section called “Discovery of Problems with the New VaR Model and Discontinuance” where the report details the Excel error that contributed to the large inaccuracies in how the model valued risk.

The $6.2B Excel error

This is how the error is described in the report (emphasis mine):

Following that decision, further errors were discovered in the Basel II.5 model, including, most significantly, an operational error in the calculation of the relative changes in hazard rates and correlation estimates. Specifically, after subtracting the old rate from the new rate, the spreadsheet divided by their sum instead of their average, as the modeler had intended.

Note: I don’t have domain expertise in VaR models, synthetic credit derivatives, or trading in general. The following example is my over-simplification of the error based on what’s written in the report.
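To make the arithmetic concrete, here’s a minimal sketch in Python of the error as the report describes it, using made-up hazard rates (not JPMorgan’s actual numbers):

    # Two fictitious hazard rates (made-up values for illustration only)
    old_rate = 0.0150  # yesterday's hazard rate
    new_rate = 0.0165  # today's hazard rate

    diff = new_rate - old_rate

    # What the modeler intended: divide the change by the average of the two rates
    intended = diff / ((old_rate + new_rate) / 2)

    # What the spreadsheet actually did: divide the change by their sum
    actual = diff / (old_rate + new_rate)

    print(f"Intended (divide by average): {intended:.4%}")  # 9.5238%
    print(f"Actual (divide by sum):       {actual:.4%}")    # 4.7619%
    print(f"Intended / actual:            {intended / actual:.1f}")  # always 2.0

Since the sum of two numbers is exactly twice their average, dividing by the sum cuts every computed change in half, making rate moves (and the risk derived from them) look smaller than they really were.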

The report talks about hazard rates (which I assume relate to defaults on corporate loans in this case) and how the changes in the hazard rates were improperly calculated. Here’s a simple table from the Google Sheet showing fictitious dates, hazard rates, and the change in rates:

Now here’s what happens when you apply a SUM vs. an AVERAGE to the “Change in %” column:

This is hitting the border of my knowledge of growth rates and time periods, but the sum of the changes will always be 5X the average of the changes, since the sum of n values is always n times their average and there are 5 values here.
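A quick Python sketch with made-up numbers shows the relationship:

    # Five fictitious daily "Change in %" values (made-up for illustration)
    changes = [0.012, -0.008, 0.015, -0.005, 0.001]

    total = sum(changes)
    average = total / len(changes)

    print(f"SUM:           {total:.4f}")    # 0.0150
    print(f"AVERAGE:       {average:.4f}")  # 0.0030
    print(f"SUM / AVERAGE: {total / average:.1f}")  # 5.0, i.e. len(changes)

The SUM / AVERAGE ratio equals n for any series of n values (as long as the sum isn’t zero), so the ratio alone can’t tell you whether the underlying formula is right.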

The difficulty with detecting this type of formula error

The 5X ratio between the SUM and the AVERAGE is not what I think is interesting, but rather the absolute difference between the two. Here is a chart plotting the same data:

Based on this chart, can you estimate what the average of the Change in % is? It looks like something around 0%, but 3% doesn’t feel that far off either. The point I’m trying to make is that unless you are monitoring the SUM and AVERAGE consistently over time to detect any outliers, it will be difficult to know whether you made the formula mistake in the first place. The presence of outliers makes it clearer that you might have an error in your model. Here’s the other table from the Google Sheet, with intentionally skewed hazard rates:

Here we see the ratio is still 5X, but the absolute difference is much wider. This would prompt an analyst to look deeper into the model and try to figure out why there is such a large discrepancy. But that’s only because these hazard rates are fictitious and intentionally skewed. In the case of JPMorgan Chase, my hunch is that the gap between the lower and upper bounds of the daily hazard rates was really narrow, so detecting a change like this would’ve been very difficult without the proper controls in place.
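To illustrate what such a control could look like, here’s a naive sketch in Python (my own illustration, not anything JPMorgan actually used) that flags any daily change sitting more than a few standard deviations from the mean of the series:

    import statistics

    def flag_outliers(changes, z_threshold=3.0):
        """Return (index, value) pairs for changes more than
        z_threshold standard deviations from the series mean."""
        mean = statistics.mean(changes)
        stdev = statistics.stdev(changes)
        return [
            (i, c) for i, c in enumerate(changes)
            if stdev > 0 and abs(c - mean) > z_threshold * stdev
        ]

    # Narrow, "pleasantly tepid" series of changes: nothing gets flagged
    print(flag_outliers([0.012, -0.008, 0.015, -0.005, 0.001]))  # []

    # Intentionally skewed series: the spike stands out
    print(flag_outliers([0.012, -0.008, 0.015, -0.005, 0.45], z_threshold=1.5))
    # [(4, 0.45)]

With a narrow band of daily hazard rates, a check like this stays silent, which is exactly the boiling-frog problem described next.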

This reminds me of the tale of the boiling frog:

Urban myth has it that if you put a frog in a pot of boiling water it will instantly leap out. But if you put it in a pot filled with pleasantly tepid water and gradually heat it, the frog will remain in the water until it boils to death. (Source)

Without a really hot pot of boiling water, JPMorgan detected that something was wrong with the CDS trades far too late, and the proverbial frog boiled to death.

Hanlon’s Razor

One frame for this egregious Excel error is Hanlon’s Razor:

“Never attribute to malice that which is adequately explained by stupidity”, known in several other forms. It is a philosophical razor which suggests a way of eliminating unlikely explanations for human behavior. (Source)

Perhaps the modeler cannot be blamed for his Excel error because it was an error he had no way of knowing about or predicting. I’m not trying to remove blame from the modeler, but it’s an interesting frame for analyzing the problem, because this is a spreadsheet error that is difficult to prevent unless you have other models and risk controls that are able to catch this type of error in advance. There are many other cases of Excel errors that led to false calculations costing firms millions of dollars, and it’s hard to say whether one can blame the modeler for “malice” or plain stupidity.

New intermediate Excel class on Skillshare

Quick plug for a new Excel class I just launched today on Skillshare. It’s an intermediate Excel class for cleaning and analyzing data.

Other Podcasts & Blog Posts

In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting: