Dear Analyst Episode 25: Structuring data challenge (denormalize data) with Get and Transform
Podcast: Play in new window | Download
Subscribe: Google Podcasts | Spotify | Stitcher | TuneIn | RSS
This episode is based on a video and Tweet posted by Mr. Excel (Bill Jelen). Bill discusses an Excel challenge someone emailed him about regarding how to “transform” a badly structured table of data into a structure that makes it easy to do PivotTables and other downstream analysis. Interestingly, I received a ticket from a user at work today about a very similar issue on denormalizing a table of data. Here is the Google Sheets file with the problem for you to follow along with.
The person is trying to convert this table of data (notice the quarters and employee sums across the columns):
Into a table like this where the employees are listed in a column with corresponding quarters of data:
Original Tweet and video below:
Get and Transform (Power Query) in Excel
The solution Bill walks through involves the Get and Transform feature in Excel. I’m on a Mac and don’t have Get and Transform since you need to be an Office 365 subscriber and my version of Excel is too old, but the method of transforming the data reminds me a lot of the transformations OpenRefine provides you. Major benefit of OpenRefine is that it’s free. After you unpivot the columns, Bill’s solution involves a bit of text manipulation to get your categories/dimensions to look the way you want.
Personally, I would’ve approached the problem like this user did with using the
MATCH formulas to pinpoint the row/column of data you need from the original “bad” dataset and getting the data to show up in your clean table:
I didn’t explore this solution entirely, but I prefer solving these data structuring problems with formulas vs. macros and “applied steps” in Get and Transform.
No-Code and keeping your identity small
In the second half of this episode, I mention an episode with Shawn Wang from Netlify on the Software Engineering Daily podcast, and highly recommend checking out a few blog posts related to the talkL:
- Keep Your Identity Small by Paul Graham’s
- No code is a lie by Shawn Wang
Other Podcasts & Blog Posts
In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting:
- Software Engineering Daily: No Code with Shawn Wang
No comments yet.