SPACs (Special Purpose Acquisition Companies) or “blank check” companies have been in the news recently, so I used some real SPAC data for this episode. Your spreadsheet has empty cells in column A, and these empty cells should be filled with values. Your task is to fill values down up until you find another cell with a value, at which point you need to fill that value down. This episode walks through how to do this programmatically with a script in Google Apps Script (for Google Sheets) and VBA (for Excel). This is the Google Sheet associated with the episode. The Google App Script is here and VBA script is here. See a quick example of what the issue is in the gif below and how the script “fills in” the values for you.
See the video below if you want to jump straight to the tutorial:
Why is this data structure a problem?
You’ve inherited a spreadsheet and the data structure looks like this:
It’s a list of data but there are empty cells in column A. This is usually a category or dimension in your data set that needs to be “filled down” so that the data set is complete. In the Google Sheet, each row represents one person that is associated with a given SPAC, but the
SPAC Ticker column is incomplete. You’ll usually get this type of data structure through the following:
- Data was manually created by someone who didn’t fill down the values in column A since they thought it was a “category”
- You are working a data set that originally came from a PivotTable but you only have the “values” from the PivotTable, not the PivotTable itself
This data structure is a problem because if you want to do any type of analysis on this data, it will be extremely difficult since you have missing values in column A. Sorting, filtering, and PivotTables are all out of the question if your data set looks like that screenshot.
Solving this with keyboard shortcuts
Totally doable for this Google Sheet. This is what you could do:
All I’m doing above is the following (on PC):
- SHIFT+CONTROL+DOWN ARROW – Select all the empty cells from the current cell with a value up until the next cell with a value
- SHIFT+UP ARROW – Reduce the selection by one row
- CONTROL+D – Fill the value from the first cell in the selection down
- CONTROL+DOWN ARROW – Skip to the next value that needs to be filled down
The obvious tradeoff here is time vs. human error. Every time I have to do this task on a spreadsheet, I think about whether it was worth filling the values down “manually” using keyboard shortcuts or using a VBA script (in Excel) to do this programatically. It really depends on the number of rows. For the example SPAC Google Sheet, doing this with keyboard shortcuts takes 10 seconds tops. If this spreadsheet was 1,000,000 rows, then we have a problem.
Don’t worry, I got you. Here’s the script you can use to do this programmatically.
Using Google Apps Script in Google Sheets
First off, here’s the script you can use for Google Sheets (gist here). Just 14 lines of code and you’re good to go:
Never used macros or Google Apps Script before? It’s super simply. First go to Tools then Script Editor:
You may be asked to authenticate your Google account so just hit Yes to all those screens. Copy/paste the script into the editor:
Go to File and Save in order to save the script into the Google Apps Script project. Go back to Google Sheets and go to Tools, Macros, and click Import to import the
fillValuesDown function into Google Sheets. Now you can use this function as a macro in your Google Sheet:
You can close out the Google Apps Script editor and now click on Tools, Macros, and click on fillValuesDown to run the script on your dataset:
How does the script work?
The script utilizes the Spreadsheet service for Google Apps Script to access the data object for your Google Sheet (more on that below). The script is really only 12 lines long, and does the following in sequential order:
- Sets the
spreadsheetvariable so that we can use the active worksheet you’re on
- Sets the
currentRangevariable to start from A2 to the last row in the table
- Two more variables are set:
newRangeto store the new range of values we want to put into column A, and
newFillValuewhich is kind of like an intermediate variable used in the loop
- The script goes through all values in
currentRange(including the blank ones) and adds all the correct values to the
currentRangeis then set equal to
newRangeto get all the “correct” values into column A
On the backend, the
currentRange array looks like this:
[['HZAC'], , ['FST'], ,  , ...]
The purpose of
newRange is to create a new array that is a complete list of values:
[['HZAC'], ['HZAC'], ['FST'], ['FST'], ['FST'] , ['FST']...]
Recording macros vs. programming Google Sheets
When I first started learning macros, the first thing I did was record my keystrokes and break down what the backend “code” looked like. Here’s what recording a macro looks like:
When you open up the script editor, you’ll see this:
There’s a lot of
getCurrentCell() functions being called. You can then deconstruct all these keystrokes to build a script that accomplishes the task. But here’s the key difference between recording keystrokes versus working with the data object:
You are programming keystrokes instead of the Google Sheets application.
Other advantages of programming the application instead of the keystrokes:
- Utilizes less compute resources and runs faster
- Easier to debug
- Easier to adapt to more scenarios and use cases
In the keystroke world, you are literally telling Google Sheets to select cells, select ranges, and moving the cursor around which doesn’t seem like a big deal. When you are working with hundreds of thousands of rows, this could cause serious performance issues. Since Google Apps Script runs in the cloud, you may not see these performance deficiencies, but you’ll definitely see this in your Excel workbooks.
Speaking of Excel workbooks…
Using the VBA script for Excel
The structure of the VBA script is pretty similar to the Google Apps Script, but it’s just a little different syntax. I’m not going to walk through the tutorial of how to set this up since it’s pretty similar to Google Sheets. In the VBA script, you do end up doing some “cell selection” like in line 8. Most of the script, however, is working with the Excel data object model so the script should run pretty quickly regardless of the size of your Excel file.
Other Podcasts & Blog Posts
In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting:
- Developer Love #3: Developer Experience Teams with Peggy Rayzis of Apollo