Dear Analyst #83: Cleaning up and doing an exploratory data analysis of all Friends episodes raw data in Excel

I was playing around with a dataset on TikTok recently to show a PivotTable trick. The raw data is a list of all Friends episodes that have aired starting in 1994. I noticed that the raw data requires some cleanup so decided to take the dataset and to some cleaning operations and also try out some exploratory data analysis. I also explored building a simple dashboard using PivotTable Slicers in Excel. This is the process I might follow with cleaning up any raw dataset. The data isn’t too complex so it might be a good dataset to practice on. If you want to use the sample data, download the raw data here. This episode might be better for watching than listening, so you can see the full YouTube video below as well.

A quick view of the data

The dataset is relatively straightforward. It’s just a list of all Friends episodes with various attributes associated with each episode:

Issues with the dataset

You’ll notice some issues with this raw data that will make it difficult to do any exploratory analysis:

  • The Episode column is in dates instead of integers
  • The Rating/Share column is one value and ideally they should be separated into two unique columns
  • The U.S. viewers column is actually a text column where the “number” is written out as “21.5 million”

There are some other formatting issues with the dataset but these three issues prevent us from putting this data into a PivotTable where we can filter and sort the data. We want numbers to be actual numbers so that we can apply a SUM or AVERAGE to the column when summarizing the column in a PivotTable. The tutorial walks through solving these issues with the MID(), FIND(), and LEFT() functions.

Simple dashboarding with PivotTable Slicers

In addition to doing some exploratory analysis, we also want to make this data usable for our colleagues. By adding some basic PivotTable Slicers, we can turn the analysis into an interactive dashboard that anyone can filter and sort without having to go into the PivotTable options. Slicers are also available in Google Sheets too!

Other Podcasts & Blog Posts

No other episodes mentioned!