Dear Analyst #41: How to do a VLOOKUP to the “left” without INDEX/MATCH with TikTok data
Podcast: Play in new window | Download
Subscribe: Google Podcasts | Spotify | Stitcher | TuneIn | RSS
Since TikTok is in the news right now about who is going to buy them, I thought using some fake-ish TikTok acquisition data would be relevant for this episode. A classic Excel/Google Sheets challenge: how to do a
VLOOKUP to the “left” e.g. your lookup column is not the first column in your lookup table. There are all sorts of strategies to overcome this issue with how your data is structured. Notably, the
INDEX/MATCH strategy is the most commonly-cited strategy when good ‘ol
VLOOKUP is not at your disposal. In this episode I walk through a strategy that allows you to use
VLOOKUP: array formulas. Skip to strategy #3 below if you want to see the answer. Associated Google Sheet for this episode if you want to follow along.
See the video below if you want to jump straight to the tutorial:
Why the VLOOKUP won’t work
If you are new to why
VLOOKUP won’t work in this scenario (see Google Sheet), take a look at the data data structure below:
ID in column A and we want to find
Company Name and
Market Cap in columns C and D, respectively, for these
ID in column A is the unique identifier for the row, and we need to do a lookup to
Company ID in column I.
While you can eyeball the result for the first row (“Triller” is the company for
ID 3), we want to find a scalable solution using formulas.
As you start writing the
VLOOKUP formula in column C, you’ll start to notice the problem: the
Company ID column is not the first column in your table to lookup the
ID value in column A:
Here are a few strategies for solving this problem (#3 is probably the one you haven’t seen before).
Strategy #1: Move the lookup column to the first column position
This is not the most ideal solution, but you could just simply cut and paste the
Company ID column and move it to the left-most “first” column of your lookup table. In Excel you would have to do a cut and paste, but in Google Sheets you can just drag and drop the column into the proper position:
Company Name will work correctly since
Company ID is the first column in your lookup table:
I don’t like this strategy because it involves some manual cutting and pasting of columns. If your lookup table isn’t static (e.g. might be sales data that gets added daily), then you might be ruining the “structure” of your data on subsequent updates. Let’s see what else we can do.
Strategy #2: Make copies of the columns to the right of the lookup column
Also not an ideal solution, but it works in one-off cases where your data is static and you don’t care about showing your back-end work to a colleague. It looks like data is duplicated, but you’re basically referencing existing columns in your table so that those columns appear to the “right” of your lookup column:
Now you can do a
VLOOKUP for columns I to K to get the
Company Name and
Market Cap values to show up in columns C and D:
Strategy #3 (preferred): Use array formulas
A relatively unknown feature in Google Sheets is you can create your own “tables” using array formulas. An array is simply a range of cells, and you can separate different range of cells using a semicolon. To create an array, you put curly brackets around your ranges. Here’s how an array of columns F and G would look like:
What’s the result? You simply get a reference to the two ranges after you enter the formula:
The key here is that you can create any order of range references in the array formula. We could’ve put G2:G6 first and F2:F6 second, and you would’ve seen the values in
Website first followed by
Company Name after entering the formula.
Knowing this, we can create our own lookup “table” using the array formula syntax like so:
Notice how the second argument in the
VLOOKUP formula is no longer a table, but rather an array of column I followed by columns F to H. In this array, the second “column” is
Company Name since we are saying column F is the second range of cells after column I.
Market Cap is now the fourth column in this array:
In order to fill this formula down, we need to turn the range references in the array formula into absolute references as shown above.
Strategy #4 (most common): INDEX/MATCH
As mentioned at the beginning of this post, this is the most common method for looking up values to the left. I won’t give a detailed explanation of how
INDEX/MATCH works, but here’s how you would get the
Company Name given the data structure:
Which strategy should you use?
I’m a little torn between strategies #3 and #4 since
INDEX/MATCH is the go-to method for looking up data to the left, and is also more performant than
VLOOKUP on large data sets. The fact that the array formula in strategy #3 doesn’t involve a nested formula makes it potentially easier to debug in complicated spreadsheets. I haven’t used an array formula in many
VLOOKUP situations since I learned
INDEX/MATCH such a long time ago, but I may try this strategy in the future.
Of course, this all becomes irrelevant if you have the
XLOOKUP function at your disposal which became available to certain Office 365 subscribers about a year ago (September 2019). This video is a fun poke at
XLOOKUP, but also holds some truth for the
VLOOKUP purists out there (start watching at 1:19):
A little Kant and poker
I talk about this in the 2nd half of the episode, but thought it would be worth sharing a passage from The Critique of Pure Reason as it relates to betting on your convictions. Listen to the Knowledge Project episode for the full background:
The usual touchstone, whether that which someone asserts is merely his persThe usual touchstone, whether that which someone asserts is merely his persuasion — or at least his subjective conviction, that is, his firm belief — is betting. It often happens that someone propounds his views with such positive and uncompromising assurance that he seems to have entirely set aside all thought of possible error. A bet disconcerts him. Sometimes it turns out that he has a conviction which can be estimated at a value of one ducat, but not of ten. For he is very willing to venture one ducat, but when it is a question of ten he becomes aware, as he had not previously been, that it may very well be that he is in error. If, in a given case, we represent ourselves as staking the happiness of our whole life, the triumphant tone of our judgment is greatly abated; we become extremely diffident, and discover for the first time that our belief does not reach so far. Thus pragmatic belief always exists in some specific degree, which, according to differences in the interests at stake, may be large or may be small.
Other Podcasts & Blog Posts
In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting:
- The ShopTalk Show #424: Web Components, Frameworks vs Vanilla, Accessible Numbers, and SVG Memory Usage
- The Knowledge Project #89: Maria Konnikova: Less Certainty, More Inquiry
[…] you learn the VLOOKUP formula, your world opens up in terms of being able to analyze and manipulate data. There are hundreds if […]