Dear Analyst
A show made for analysts: data, data analysis, and software.

How to do a VLOOKUP with multiple conditions or criteria (3 methods)

Once you learn the VLOOKUP formula, your world opens up in terms of being able to analyze and manipulate data. There are hundreds if not thousands of tutorials on how to use the VLOOKUP formula since it’s such a powerful formula for finding the data you need in a long list. Comparable formulas include the combination of INDEX and MATCH and the new XLOOKUP formula which took the Excel world by storm. One common task you might need to do as an analyst is find data based on multiple conditions or criteria. VLOOKUP only allows you to lookup one specific value or criteria in a list of data. In this episode, I’ll describe three methods for doing a VLOOKUP when you have multiple conditions or criteria. These methods utilize more advanced formula features, and the third method is my favorite. Copy this Google Sheet to see the different methods in action.

Video tutorial of this episode: https://www.youtube.com/watch?v=OYQm7XHCZoA

Method #1: Creating a new array with ARRAYFORMULA and brackets

The first method is quite advanced and requires knowledge of the following:

  • Boolean logic
  • The ARRAYFORMULA function in Google Sheets
  • Brackets {} for creating arrays in Google Sheets

In Excel, you can do something similar to the ARRAYFORMULA function in Google Sheets by pressing CTRL+SHIFT+ENTER when entering a formula in a cell. It’s not the most intuitive way of entering a formula. If you have Office 365, you actually don’t have to know how to use this keyboard shortcut at all.
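To make that concrete, here's a rough sketch of what method #1 could look like in Excel, assuming the same column layout as the Google Sheet (Car_Name in column A, Year in B, Kms_Driven in D, Fuel_Type in E, and the three criteria in I5:I7). One common pattern uses CHOOSE to build the two-column array (confirm with CTRL+SHIFT+ENTER in older versions of Excel, or enter it normally in Office 365):

=VLOOKUP(1,CHOOSE({1,2},(A2:A302=I5)*(B2:B302=I6)*(D2:D302=I7),E2:E302),2,0)

The CHOOSE({1,2},...) part plays the same role as the curly brackets do in Google Sheets: it stitches the 0/1 array and the Fuel_Type column together into a two-column array for VLOOKUP to search.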

This is not my preferred method for doing a VLOOKUP with multiple conditions, but it is more scalable than my favorite method (method #3). In terms of the dataset, we have a list of cars and we want to find the Fuel_Type for a "ciaz" made in 2015 with 15,000 kilometers on it (see the highlighted yellow row in the screenshot above). We need to do a VLOOKUP with multiple conditions in this case because there are multiple rows with the car name "ciaz" made in different years.

Explaining the formula for method #1

Let's take a look at the formula and work from the inside out to see how it works:

=vlookup(1,{arrayformula((A2:A302=I5)*(B2:B302=I6)*(D2:D302=I7)),E2:E302},2,0)

The stuff inside the ARRAYFORMULA is a way to compare everything in the list to the Car_Name, Year, and Kms_Driven defined in cells I5:I7. The syntax is a bit weird since you're multiplying each condition together to get the row that matches all the conditions. In plain English, it reads something like this:

Find rows where the Car_Name is "ciaz" AND the Year is "2015" AND the Kms_Driven is "15,000"

The reason you need to wrap this in an ARRAYFORMULA is because you are telling Google Sheets to look at all cells in a column (an array) instead of one cell at a time. The “result” of the ARRAYFORMULA is a list of 0s and 1s where the 1s represent rows that meet all the conditions:
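To make the multiplication concrete, here's what it does for two hypothetical rows (the row numbers and results below are made up for illustration, not taken from the actual sheet):

(A2=I5) * (B2=I6) * (D2=I7) = TRUE * TRUE * FALSE = 1 * 1 * 0 = 0
(A9=I5) * (B9=I6) * (D9=I7) = TRUE * TRUE * TRUE = 1 * 1 * 1 = 1

Only a row where every condition is TRUE produces a 1.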

Now that we have a list of 0s and 1s, we need to join it with the actual list of values we want to return from our VLOOKUP function (the Fuel_Type column). This is where the left and right brackets come into play. Notice how there is a left bracket to the left of ARRAYFORMULA and a right bracket after the reference to E2:E302. That range is all the values in the Fuel_Type column. So what we get from this part of the formula {arrayformula((A2:A302=I5)*(B2:B302=I6)*(D2:D302=I7)),E2:E302} is a list of 0s and 1s on the left, and a list of fuel types on the right:
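A hypothetical slice of that fabricated two-column list might look something like this (the fuel types shown are illustrative, not values from the actual sheet):

{0, "Diesel";
 0, "Petrol";
 1, "Petrol";
 0, "CNG"}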

This is a list we are manually creating in Google Sheets to feed into the VLOOKUP formula. The VLOOKUP looks for the “1” in this 2-column list and returns the 2nd column from our fabricated list. In this case, the Fuel Type that meets all 3 conditions is “Petrol.”

Method #2: Using INDEX, MATCH, and INDEX

If you know advanced formulas in Excel/Google Sheets, you're probably familiar with INDEX and MATCH as an alternative to VLOOKUP. I think this method is slightly easier than the 1st method since it utilizes INDEX and MATCH in a way we are used to. This second method uses a combination of INDEX and MATCH to get the Fuel_Type from column E that matches all the conditions in column I:

=index(E2:E302,match(1,index((A2:A302=I13)*(B2:B302=I14)*(D2:D302=I15),0,1),0))

While this formula is still advanced, it doesn’t require the knowledge of ARRAYFORMULA. It’s just a clear use of the INDEX function since this function can accept arrays as a parameter.

Explaining the formula for method #2

The key part of this formula is the index((A2:A302=I13)*(B2:B302=I14)*(D2:D302=I15),0,1) portion. Similar to method #1, you multiply each condition that you want to filter your list on. Our conditions are in cells I13:I15. What is the purpose of wrapping this in the INDEX function? As stated earlier, the INDEX function can accept an array as the first parameter. So that first parameter is an array of 0s and 1s (just like we saw in method #1). Then the 2nd parameter "0" and 3rd parameter "1" simply tell Google Sheets to return the entire column back as the result. There is only one column in our array, so that's why we simply need that "1" as the 3rd parameter.

From here, the rest of the formula should be pretty easy to understand if you’ve used INDEX and MATCH as a substitute for VLOOKUP. The MATCH function looks for the number “1” in the list of 0s and 1s. The outer INDEX function looks at the Fuel_Type column (column E) and returns back the value based on the result of the MATCH function.
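As a side note, in Excel the same idea is often written without the inner INDEX wrapper as a classic array formula. Here's a sketch using the same ranges (confirm with CTRL+SHIFT+ENTER in older versions of Excel, or enter it normally in Office 365):

=INDEX(E2:E302,MATCH(1,(A2:A302=I13)*(B2:B302=I14)*(D2:D302=I15),0))

The MATCH scans the multiplied 0/1 array directly for the 1, and the outer INDEX returns the Fuel_Type in that position.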

Method #3 (Preferred): Combining multiple columns to create a unique “key” column

I like this method because it's the easiest to understand and implement. You've probably concatenated multiple columns before to create a unique key in a list. It's kind of a hack for creating a unique ID when no single column in your data has unique values. Some people might call this a hack, and it totally is! I'd say if your dataset and workflow match these three conditions, it's ok to use this method when you want to do a VLOOKUP with multiple criteria:

  • You have edit access to the Excel file or Google Sheet and can add additional columns
  • Your dataset doesn’t change much
  • The number of columns you are concatenating (e.g. the number of criteria) won't change much

When you’re doing one-off analysis, you’ll typically meet the above conditions and I think it’s “safe” to use this method.

=vlookup(J22&J23&J24,A2:H302,6,0)

Explaining the formula for method #3

There are multiple ways to optimize this formula. You could add named ranges, reference entire columns instead of fixed ranges, and more. Remember: we are doing a quick one-off analysis, so let's just find the easiest and quickest way to get our answer. All we have is a simple VLOOKUP formula and nothing else. The "key" is the first parameter in our formula.

In the screenshot above, notice how our conditions are now in column J. I added a helper column by inserting a blank new column in column A. That helper column is a combination of columns B, C, and E. These happen to be the columns we care about since they are the columns that contain our conditions. So in cell A2, the formula would be =B2&C2&E2. You could also do =CONCATENATE(B2,C2,E2) but I like the formula with the ampersand “&” because it’s just easier.

When you do the VLOOKUP, you’re simply looking for the combination of cells J22, J23, and J24 in column A. In theory, column A should be a unique list of “IDs” since you are combining three different columns together. The rest of the VLOOKUP formula should be simple to understand.
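One small tweak worth considering (it's not in the formula above): put a delimiter between the pieces of the key so that two different combinations of values can't accidentally concatenate into the same string. The helper column and the lookup would then look something like this:

=B2&"|"&C2&"|"&E2
=vlookup(J22&"|"&J23&"|"&J24,A2:H302,6,0)

Any character that never appears in your data works as the delimiter.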

Advanced Excel formulas and features class

In this episode, you learned about some advanced formulas for looking up multiple conditions. I recently published a class on Skillshare where I teach advanced formulas like the ones mentioned in this episode. The class is called Excel: Advanced Formulas & Features to Create Efficient Team Workflows. Take your Excel skills to the next level and help your team make better business decisions by taking this class! You’ll learn things like NPV, IRR, and how to use the advanced features in Excel like Goal Seek.

Other Podcasts & Blog Posts

In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting:

Dear Analyst #104: Creating a single source of truth by cleaning marketing analytics data with Austin Dowd

I'm always fascinated by how different people on the podcast find their way into data, and this episode is no exception. Austin Dowd has always enjoyed photography. He was associated with the American Marketing Association and was always curious about the metrics for his photos. After pestering the analytics person at the AMA about how to analyze his photo metrics, he eventually decided to make a career change into analytics and received a Nanodegree in data analytics through Udemy. This career change coincided with the onset of the pandemic as his photography business started slowing down. We talk about Austin's experiences going from a big conglomerate to a startup, working with messy data, and how photography is just like data visualization because you're telling a story.

Marketing analytics at a startup vs. a big conglomerate

As you can imagine, working for a big company comes with its pros and cons. You have a ton of resources to tackle large problems and projects, but the change management can be quite the process. Multiple teams and stakeholders are involved, and changes can take months to years depending on the type of change you are trying to make.

Source: xkcd

Austin worked at Cox Automotive and talked about how the data stack there was custom built years ago. That means even small changes to the system were very hard to make. Then you have the issue of clients wanting custom edits to their reporting. If you don't have a data engineering team that can build an infrastructure that allows analysts to edit reports on the fly, you'll end up becoming a consulting company providing custom data analytics solutions.

Austin moved to a startup called Blues Wireless where they built a robust data stack, but they didn’t necessarily have the marketing team in mind when building out the stack. Product usage analytics were top of mind for the small but budding analytics team. Austin was brought in to coordinate web analytics projects so that the marketing funnel–from a website visitor to a conversion–could be better quantified. Getting accurate data, however, is paramount to this project because you can’t make decisions on bad data.

Website data platforms fighting each other

Austin is currently enrolled in a data analytics masters program, and he talked about how cleaning "dirty" data is very different in the academic world vs. the real world. This is true for any discipline, I suppose. In the academic world, the problem space is confined and there are "right" answers, so to speak. In the real world, surprises and nuance are littered all over the place. In the marketing analytics world, you are working with whatever data format Google Analytics or some CRM platform forces upon you. Only in rare scenarios is cleaning data as simple as using Python to get rid of a bunch of NULL and N/A values.

Austin realized that page data was being counted twice. I'm sure all of you have dealt with double-counted data and had to come up with a system to de-duplicate it. The problem came down to Google Tag Manager fighting with Segment to report on page data for the website. Austin uses GA to view early marketing funnel activities while Segment pulls late marketing funnel activity. Once he got these two systems working together and producing clean data, he piped the data into Google Data Studio so that it's accessible for Blues' content creators and business stakeholders.

Source: Shopify

On the topic of de-duplicating data, I feel like all analysts have to go through solving this problem at some stage in their career. Finding the root cause of duplicate data requires a lot of spelunking through different systems, curiosity, and downright determination.

Austin's team came to a point where they couldn't de-duplicate the data anymore since the data simply wasn't reliable. With two conflicting systems, around the new year the team decided to create a "point of trust": a specific point in time after which the data coming from one of the systems is considered clean, and everyone trusts the data coming out of that system going forward.

How a background in photography helps with data storytelling

When Austin was working with businesses as a photographer, he talked about being a visual storyteller. His clients wanted to deliver a message with the help of Austin’s photos, so Austin’s job involved working with different people and departments to figure out their goals. He then came up with a strategy on how to capture the right photo to meet those goals. In his new world of data analytics, he says he takes the same approach. The only difference is that he’s working with data instead of photos.

In terms of translating his analytics work to business outcomes, content creators at his company want to know some basic metrics on what links are being clicked on. This way, content creators know which content resonates the most with site visitors. Austin created a dashboard for these content creators so that they always know what content is doing well on the company site.

Austin’s boss also wants to know what the ROI from their various marketing campaigns is. This is where setting up tracking on the entire marketing funnel is important. Through Google Analytics (what Austin uses to track top of funnel metrics), you only get anonymized data at an aggregate level. With this data, Austin can see which web sessions lead to a sale or conversion.

Austin’s company is a B2B company which sells a product to data or engineering teams. Customers don’t just go to the website once and decide to buy their product. It might take several visits to the homepage, a blog post, or a webinar before they finally convert. For these bottom of the funnel metrics, Google Data Studio helps Austin map out the flow from someone who visits the website to the eventual conversion. Austin also uses Tableau to visualize Segment data which tracks bottom of the funnel metrics as well. In marketing terms, you might hear this type of tracking called the “360 degree view of the customer.”

Adding visitor scoring to the marketing analytics stack

In my opinion, B2B content has been going through quite a transformation over the last 5 years. Long white papers, case studies, and webinars are being replaced with shorter TikTok-style content, podcasts, and Instagram Reels. There is still a world for this older type of content in some industries, but I’m seeing more content being created by employees at these B2B companies which allows for more personalized and “real” content. LinkedIn newsletters like this one, for example, allow you to reach a “business” audience that a traditional white paper gated behind an email signup would not provide.

Source: The Drum

Content creators at Austin's company are also creating B2B content. Are the people consuming the content interested in Blues Wireless as a hobby or would they actually become paying customers? Austin helped create a dashboard that scores a site visitor to help the content team differentiate between a hobbyist and a power user of their product. For instance, if the visitor read a blog post they would get 1 point, but signing up for a trial might be 5 points. Creating a scoring system like this requires constant tweaking. Austin realized that a lot of the points were "front-loaded," meaning people who consumed an educational resource on their website usually didn't go back to that resource. This resulted in scores dropping substantially a few quarters later.

What a perfect marketing analytics future looks like

I asked Austin what he would change, if he had a magic wand, to get the perfect marketing analytics system. He said that at a startup, the landscape changes every week. A few weeks ago, the sales team might have been ramping up, so he was doing research on which companies would be good to target and putting those analytics together for the sales team. Another week, he might be doing late-funnel marketing analytics work with Segment data. This sounds like the typical lifestyle for anyone who works at a startup :).

Source: Art Science Millennial – Substack

In the future, Austin hopes that the systems they’ve set up will stay fixed, and they can focus on doing more predictive analytics. Using machine learning, they might be able to figure out which visitors will most likely convert to a customer. Or which feature used in the product might lead to more usage and higher revenue in the future. At the end of the day, all this requires not just a solid system, but also clean data so that you can make solid predictions for the future.

Other Podcasts & Blog Posts

No other podcasts mentioned in this episode!

Dear Analyst #103: How to use one of the best features in PivotTables to filter your data (Slicers)

I used to create a monthly 30-slide report and each slide had a different table or chart that I copied and pasted from Excel. As a naive analyst, I literally filtered my list of data using regular dropdown filters on each column to get the numbers I needed. I would filter, sum or average the data, and then enter the data onto the slide. It was super manual work. One benefit was I got really good at using keyboard shortcuts to filter a list of data. I didn't realize that a PivotTable could easily automate my report. I could've built several PivotTables off of my raw data, stuck each PivotTable on an individual worksheet, filtered each PivotTable to the data I needed, and been done.

PivotTables continue to be one of the most important features in Excel, and in this episode I walk through how to use Slicers, one of the best features for filtering your data in PivotTables. You can download the Excel workbook used in this episode here. I also just launched a new Advanced PivotTable Dashboard class on Skillshare which I’ll talk about at the end.

Watch this episode to see a video tutorial on how to use Slicers in PivotTables: https://www.youtube.com/watch?v=HiyDGYs5EOM

How to create a PivotTable using YouTuber data

The raw data (download file for this episode) we are using for this episode is a list of the top 200 YouTubers and data associated with their channels. Interestingly, I don't recognize many of the channels on this list (or I just don't watch enough YouTube):

Let's create a PivotTable to better filter through this list of 200 YouTubers. It's not a huge list but maybe we'd like a quick way to see the top channels in a certain country, category, etc. In Excel for Mac, the easiest way is to go to the Insert menu, click on PivotTable, and then hit OK. You can keep all the settings in the menu as is since we want to create a PivotTable on a new worksheet:

If you don’t want to start from scratch with building a PivotTable, you can have Excel suggest a PivotTable for you. Instead of clicking on the PivotTable button, click on “Recommended PivotTables” and you’ll see a worksheet get created with the PivotTable fields filled out:

Building out the PivotTable with filters, rows, and columns

Let’s summarize our data in this PivotTable by finding the average number of Likes and Followers by Main Video Category and then Category. To set this up, your PivotTable fields should look like this:

The resulting PivotTable isn’t well formatted as you can see below:

A few things we can do to fix the formatting so this PivotTable is a bit more clear:

  1. Currently the Values show “Sum of Likes” and “Sum of followers,” and we want to change both of these Values to Average (since we want to find the average Followers and Likes)
  2. Give consistent formatting to the numbers so that there are commas in the thousands place and no decimals
  3. Remove the “(blank)” option
  4. Change the layout of the PivotTable to the “Classic PivotTable layout”

If you watch the video tutorial for this episode, you’ll see how to do all the above steps. The final PivotTable output from these formatting steps looks like this which is a little more usable than before:
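If you want to sanity-check a single value in the PivotTable against the raw data, an AVERAGEIFS formula does the same aggregation for one combination of categories. This is just a rough sketch; the column letters, row count, and the "Music" category below are assumptions, since the actual layout depends on the workbook you download:

=AVERAGEIFS(F2:F201,D2:D201,"Entertainment",E2:E201,"Music")

Here F would be the Likes column, D the Main Video Category column, and E the Category column.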

A note about the classic PivotTable layout: it's the best layout to learn from if you're just getting started with PivotTables. If you look at the default PivotTable layout, you'll notice it's actually really hard to differentiate between the "Main Video Category" and the "Category."

With the classic PivotTable layout, you can clearly see that you are first pivoting by the “Main Video Category,” and then by “Category.” The hierarchy is really clear. Setting the default PivotTable layout is so common that Mr. Excel himself wrote a blog post on how to set defaults for future PivotTables (applies to PC users only).

How to create an interactive Slicer for a PivotTable

If you want to filter the PivotTable to say the “Entertainment” Main Video Category, you could go through the dropdown menu in the header of the PivotTable like this:

If you're trying to create an interactive dashboard with various PivotTables and PivotCharts, this isn't a scalable solution and it's not very user friendly. More importantly, you usually want to filter all your PivotTables and charts on a dashboard using the same filter. A Slicer in Excel is the exact feature we need to filter our PivotTables and charts. It's like Slicers were built for dashboards in Excel.

To create a Slicer for our PivotTable for the “Main Video Category” column, we can follow these steps (for Mac Excel):

  1. Click on the PivotTable Analyze menu in the ribbon
  2. Click on Insert Slicer
  3. Then click on the column we want to build our Slicer for (“Main Video Category” in this case)

How to filter a PivotTable with a Slicer

Now that we have the Slicer, we can simply click on the value we want to filter the PivotTable by. If you click on “Entertainment,” you’ll see the PivotTable instantly filter to that Main Video Category:

You can also change the Slicer so that it allows you to select multiple values so that your PivotTable can be filtered by multiple options. You simply click on the checkbox icon in the top-right of the Slicer:

On the ribbon, there are a ton of different ways you can format your Slicer. One common formatting tip for Slicers is to put the selectable options into multiple columns to make the Slicer easier to use:

How to filter multiple PivotTables with one Slicer

The best part about Slicers is that you can filter multiple PivotTables and PivotCharts with just one Slicer. Like I said earlier, this is what makes Slicers the perfect feature for making an interactive dashboard. Your teammates and colleagues can now filter all the data and charts on a dashboard to the exact values they care about using Slicers.

The easiest way to have a Slicer filter multiple PivotTables is to simply copy and paste an existing PivotTable on the same worksheet. It's that simple. Just by doing the copy and paste, the existing Slicer on the worksheet will automatically get connected to the new PivotTable you created. Let's go ahead and copy the existing PivotTable we created earlier and move it to the right of the Slicer we created:

You’ll see we have a replica of our original PivotTable starting in column J. For the sake of this example, let’s remove the “Category” field from our new PivotTable and add in the “Main Topic” field instead. Your new PivotTable should now have “Main Video Category” and “Main Topic” in the Rows fields:

Now try clicking on various Main Video Categories in the Slicer and notice how it filters both PivotTables at the same time:

I inserted a PivotChart based off of the first PivotTable, and the Slicer also filters that PivotChart! Now you can see how powerful the Slicer is for making your dashboard more interactive:

How to disconnect a Slicer from filtering a PivotTable

A natural question to wonder about Slicers is can you disconnect a Slicer from a PivotTable? You absolutely can. On a big dashboard, you might have multiple Slicers that are filtering different parts of your dashboard.

When you right-click a Slicer, there's a "Report Connections" option which lets you connect or disconnect the Slicer from one or several PivotTables. In this case, we only have two PivotTables connected to our "Main Video Category" Slicer:

You’ll notice that the name of our two PivotTables in this example are “PivotTable1” and “PivotTable6.” When you’re copying and pasting PivotTables, Excel automatically assigns these generic names to your PivotTables. You should give your PivotTables unique names so that it’s clear which PivotTable you are connecting and disconnecting from a Slicer.

If you want to rename a PivotTable, click on the “PivotTable Analyze” menu in the ribbon, and in the left-most part of the menu, you’ll see a form to enter the PivotTable Name:

New advanced PivotTable class for creating a dashboard

As I mentioned earlier, I recently launched a brand new advanced Excel class on Skillshare called Advanced PivotTable Techniques for Creating a Cohesive Dashboard. You'll learn how to use features like Slicers to create an interactive dashboard that your teammates can use to make better business decisions. I had a lot of fun digging into some advanced PivotTable features and I hope they help you analyze and visualize data better!

Other Podcasts & Blog Posts

In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting:

Dear Analyst #102: Building a culture of experimentation on a data analytics team with Mel Restori, former Director of Analytics at Trove

Experimentation is a valuable activity in a variety of functions. A product team should be constantly experimenting with features to see which variant leads to the most engagement, sales, or some target metric. But what about on a data analytics team? Mel Restori is the former Director of Analytics & Analytics Engineering at Trove, a resale platform for some of the world’s top brands. Mel’s career has always been in tech, but she started as an industrial engineer. She discusses her path to leading the analytics team at Trove, how to build a culture of experimentation, and how to navigate a career in analytics.

Experimentation to drive decision making

While Mel was at Trove, her team led the charge on implementing an experimentation framework, including incorporating a third-party A/B testing tool. People at Trove were hungry for data analysis and wanted to know how to make better product decisions using a new testing framework. The end goal was to give all PMs and analysts an easy way to run thoughtful experiments. In addition to developing this testing framework, Mel also cared deeply about how people internally would learn about testing. At the end of the day, PMs and analysts should be able to make better decisions.

Testing what works on a brand’s website when it comes to reselling is quite different from a traditional e-commerce site. “Recommerce” provides one customer with an experience for trading in their existing goods and another customer with buying a resale item. The product team may not have the luxury of seeing what competitors are doing since there might only be one color or one size of a product.

Pursuing an analytics degree at university and pivoting into data analytics

Mel talked at length about how people like herself have stumbled into data analytics. For Mel, she thought this was the only way to do it since there wasn’t a data analytics “major” when she was in college. I completely agree with her on this point since the only data analytics experience I received was on the job.

Now you can major in specific areas of data analytics like data science and machine learning. If you major in data analytics, your career path is a little more straightforward. One thing Mel pointed out is that datasets are never as clean as what you see in school. When you’re working with real world data, most of the time you’re just cleansing data to get the data in a usable format.

Source: I-O Psych Memes on Twitter

Mel gave some great advice for those who want to learn a specific skill or tool like Python or Excel. Look for a larger company that has an entire analytics department built out where you can specialize. You probably won't have to do as much data cleaning compared to the data you might come across at a startup. A benefit of being at a startup, according to Mel, is that you can figure out where you would thrive by wearing multiple hats.

Other Podcasts & Blog Posts

No other podcasts mentioned in this episode!

Dear Analyst #101: How to invest in modern data startups with David Yakobovitch

Outside of Excel, you’ve seen and heard multiple data platforms on this newsletter and podcast. Everything from commercial data platforms to open-source platforms driven by communities. In this episode, you’ll hear the other side of the data platform ecosystem. David Yakobovitch is a general partner at DataPower Ventures, a venture capital firm that invests in early stage data science, applied AI, and machine learning startups. I don’t normally hear or read about the investor’s perspective in the data space, so this episode was quite the learning opportunity. You’ll also hear about some of the data startups David’s firm has invested in and what their unique value propositions are.

Mainframes, Tableau dashboards, and the modern data stack

David started his career in actuarial science and finance information systems. He originally worked at Aflac on their mainframes. At the time, the “modern” data stack included tools like Qlik, SAP Crystal Reports, and of course, SQL. David eventually moved to Tableau and was building dashboards for his team. After stints in the banking world at Citi and Deutsche, David moved to NYC and started working a lot with Python and R. He was a lead data science instructor at General Assembly and eventually landed at Galvanize as the data science team lead. I’ve been an instructor and have gone to events for both GA and Galvanize and encourage you to check out both organizations if you would like to up-level your data skills.

Galvanize co-working space. Source: ERIC LAIGNEL

David currently works full-time at SingleStore as a senior manager of technical enablement. Saying he works "full-time" for SingleStore is not an accurate characterization of what David does day-to-day since he wears many hats. He also runs a venture capital fund called DataPower Ventures and hosts an artificial intelligence podcast called HumAIn. As a believer in side hustles, I think David shows no side hustle is too small or big to take on!

Evolution of the modern data stack

David shared his perspective on the modern data stack and the key takeaway is (surprise surprise) Excel is not going anywhere. Old and new platforms still have integrations with Excel. David rattled off a few including Refinitiv, Quickbooks, and Bloomberg. As an analyst, you have so many tools in the data stack that allow you to work with data. With ETL or ELT, you can import/export your data tables and schemas into another tool (like Excel) to do the actual analysis. This is where tools like Fivetran and dbt really shine to help you get your data into the right destination. The data can be in a low-code tool where you drag-and-drop tables and schemas or even in a Jupyter notebook. Once the analysis is done, you have visualization tools like Power BI and Looker to help you communicate your findings.

Source: David Jayatillake

The above modern data stack diagram comes from David Jayatillake’s substack newsletter. To hear about other tools in the data stack, I’d recommend listening to David’s episode or this episode with Priyanka Somrah from Work-Bench.

David also brought up an interesting observation about how data analysts are viewed at different companies. For instance, McKinsey typically views data analysts as strategists who help solve customer problems. At Lyft, data analysts are treated more like data scientists where you’re scripting and building automations in your data workflows. What is the definition of a data analyst at your company?

Becoming a VC investor

When David first moved to NYC 7 years ago, he attended Founder Fridays. Founder Fridays is a meetup where a founder of a company has an open chat with the Founder Fridays community of other founders. I attended these meetups a few years ago and it’s refreshing to hear candid stories from founders about the ups and downs of running a startup. A lot of startup meetups just focus on how a founder is crushing it without giving air time to the parts of running a startup that suck.

David was meeting founders through Founder Fridays and Techstars and had the technical skills to help these founders on the data and tech side. The next logical step was to start coaching and investing in these startups. The average check size at DataPower Ventures is $250K and the fund helps bring startups from the accelerator phase to their Series A. DataPower also helps its portfolio companies with scaling data pipelines, hiring, and basically whatever is required to help the startups succeed. The portfolio consists of 30 companies in the AI and ML stack who are mostly based in NYC. We then chatted a bit about some of the companies in the portfolio David is excited about.

OpenAxis

OpenAxis is a no-code tool for building data visualizations. The problem data analysts face is that visualizations are challenging to create when you're building from scratch. This is typically the case when you're building visualizations for your team in Tableau and Looker. What's neat is the community that OpenAxis is building. The community can submit visualizations to the platform, so if you need a template to build something great, you can find something pre-built. They've started seeing some Substack writers include their visualizations in their newsletters.

Nomad Data

Nomad Data anonymously connects buyers and sellers of datasets. They typically work with quant funds and finance shops who are looking to get an edge in the market through unique datasets. Nomad Data's value proposition is for use cases where data is "sparse." Let's say you need a dataset on telecommunication providers and the dataset has 500 columns and 5 billion rows. Not every cell of this table will have data in it, which means you have incomplete and bad data. Through AI, machine learning, and human recommendations, your table will get filled in with high quality data.

I’m not sure why, but this reminded me of that episode from Billions in season 2 where Axelrod and Taylor are trying to figure out which microchip company Krakow is investing in. Taylor figures out that the Chinese microchip company is faking trucking activity into and out of their warehouse. That activity is captured by satellite images which hedge funds analyze to see how business is going for the company. Without high quality data, funds will make bad investment decisions even in this fictional example :).

What makes a good data startup?

We concluded the episode with what David and his team look for in data startups. Here are a few of the criteria:

  • At least one of the founders should be technical
  • The product must be commercially viable (e.g. it has to make money)
  • They don’t just invest in an algorithm
  • Tech has to have some real-world application
  • Founders are relentlessly curious

One could argue that most of these bullet points are what VC funds look for in any startup. I think the big difference is that the data industry is growing quicker compared to other industries and there is a lot of crossover with other industries like the cloud.

Speaking of the cloud, David mentions that data startups should be cloud first. Since customers are already on AWS, Azure, or GCP, you don’t want to force the customer to move off of their data stack. David believes that in the previous 20 years, people were building software, infrastructure, and developer tool companies. The next 20 years is all about building tools and technology for the data stack. In this new world where data is everywhere like The Minority Report, all that data will have to be stored and analyzed somewhere. And I guarantee in that world, Excel and Google Sheets will still be around.

Other Podcasts & Blog Posts

No other podcasts mentioned in this episode!

The post Dear Analyst #101: How to invest in modern data startups with David Yakobovitch appeared first on .

]]>
https://www.thekeycuts.com/dear-analyst-101-how-to-invest-in-modern-data-startups-with-david-yakobovitch/feed/ 0 Outside of Excel, you've seen and heard multiple data platforms on this newsletter and podcast. Everything from commercial data platforms to open-source platforms driven by communities. In this episode, you'll hear the other side of the data platform e... Outside of Excel, you've seen and heard multiple data platforms on this newsletter and podcast. Everything from commercial data platforms to open-source platforms driven by communities. In this episode, you'll hear the other side of the data platform ecosystem. David Yakobovitch is a general partner at DataPower Ventures, a venture capital firm that invests in early stage data science, applied AI, and machine learning startups. I don't normally hear or read about the investor's perspective in the data space, so this episode was quite the learning opportunity. You'll also hear about some of the data startups David's firm has invested in and what their unique value propositions are.







Mainframes, Tableau dashboards, and the modern data stack



David started his career in actuarial science and finance information systems. He originally worked at Aflac on their mainframes. At the time, the "modern" data stack included tools like Qlik, SAP Crystal Reports, and of course, SQL. David eventually moved to Tableau and was building dashboards for his team. After stints in the banking world at Citi and Deutsche, David moved to NYC and started working a lot with Python and R. He was a lead data science instructor at General Assembly and eventually landed at Galvanize as the data science team lead. I've been an instructor and have gone to events for both GA and Galvanize and encourage you to check out both organizations if you would like to up-level your data skills.



Galvanize co-working space. Source: ERIC LAIGNEL



David currently works full-time at SingleStore as a senior manager of technical enablement. Saying he works "full-time" for SingleStore doesn't quite capture what David does day-to-day since he wears many hats. He also runs a venture capital fund called DataPower Ventures and hosts an artificial intelligence podcast called HumAIn. As a believer in side hustles, I think David shows no side hustle is too small or big to take on!







Evolution of the modern data stack



David shared his perspective on the modern data stack and the key takeaway is (surprise surprise) Excel is not going anywhere. Old and new platforms still have integrations with Excel. David rattled off a few including Refinitiv, Quickbooks, and Bloomberg. As an analyst, you have so many tools in the data stack that allow you to work with data. With ETL or ELT, you can import/export your data tables and schemas into another tool (like Excel) to do the actual analysis. This is where tools like Fivetran and dbt really shine to help you get your data into the right destination. The data can be in a low-code tool where you drag-and-drop tables and schemas or even in a Jupyter notebook. Once the analysis is done, you have visualization tools like Power BI and Looker to help you communicate your findings.
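As a concrete (and heavily simplified) illustration of that last mile, here's a sketch of pulling a modeled table out of a warehouse and handing it to Excel for analysis. SQLite and pandas stand in for a real cloud warehouse and ELT tooling here, which is my own simplification rather than anything David described:

import sqlite3
import pandas as pd

# Stand-in for a warehouse table that an ELT tool (e.g. Fivetran + dbt) already loaded and modeled
conn = sqlite3.connect("warehouse.db")
conn.execute("CREATE TABLE IF NOT EXISTS fct_orders (order_id INTEGER, region TEXT, revenue REAL)")
conn.execute("INSERT INTO fct_orders VALUES (1, 'NA', 120.0), (2, 'EMEA', 80.5), (3, 'NA', 45.0)")

# Pull the modeled table into pandas for analysis...
orders = pd.read_sql("SELECT region, SUM(revenue) AS revenue FROM fct_orders GROUP BY region", conn)

# ...and export it to Excel, where the actual analysis often still happens (requires openpyxl)
orders.to_excel("orders_by_region.xlsx", index=False)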



Source: David Jayatillake



The above modern data stack diagram comes from David Jayatillake's substack newsletter. To hear about other tools in the data stack, I'd recommend listening to David's episode or this episode with Priyanka Somrah from Work-Bench.

Dear Analyst #100: Hitting a hunned! A look back at the top 5 episodes of Dear Analyst https://www.thekeycuts.com/dear-analyst-100-hitting-a-hunned-a-look-back-at-the-top-5-episodes-of-dear-analyst/ https://www.thekeycuts.com/dear-analyst-100-hitting-a-hunned-a-look-back-at-the-top-5-episodes-of-dear-analyst/#respond Mon, 11 Jul 2022 05:59:49 +0000 https://www.thekeycuts.com/?p=51950 Holy cannoli. Somehow we’ve hit 100 hundred episodes of the Dear Analyst podcast. This podcast started off as an experiment because I was lazy and got tired of writing. I figured speaking about Excel and data analytics would be easier than coming up with prose. Ironically, I’m still writing because all the show notes for […]

The post Dear Analyst #100: Hitting a hunned! A look back at the top 5 episodes of Dear Analyst appeared first on .

Holy cannoli. Somehow we've hit 100 episodes of the Dear Analyst podcast. This podcast started off as an experiment because I was lazy and got tired of writing. I figured speaking about Excel and data analytics would be easier than coming up with prose. Ironically, I'm still writing because all the show notes for each episode are summarized on this blog and in the LinkedIn newsletter.

Source: scaffmag

I started publishing episodes more regularly during the pandemic because I was always at home and what else was there to do? Perhaps I was also fulfilling my millennial obligation to have my own podcast. This episode is a look back at some of my favorite episodes, and I'll include snippets of these older episodes as well. To all you data nerds out there, thank you for listening to the podcast and reading the show notes. I could say something trite like "I'm doing this all for you." The reality is I do it for myself to explore topics I'm interested in :).

1. Episode #1: The inaugural episode about paste special values in Excel

Episode #1

The first episode was published on March 3rd, 2019. I had no idea what I was doing (I still don't). At the time, I thought talking about a relatively technical subject where most people learn via video would be a good idea. Was I ever this young and naive? I had been working at Coda for less than a year at that point and was creating content at work comparing Excel and Coda. Needless to say, I was questioning a lot of the common behaviors we fall back on in Excel, and paste special values was one of them. I pretty much just read the original blog post I wrote about this subject and framed it as "why" paste special values is not that great versus "how" to do paste special values.

Source: Super User

2. Episode #28: Filling values down to the last row of data

Explaining formulas and technical concepts over audio is tricky. As a listener, you're trying to visualize the Excel file I'm explaining. This episode is deep in the weeds of how to fill values down to the last row of your data. The reason this is one of my favorite episodes is that the blog post with the show notes has become one of the most trafficked posts on thekeycuts.com. I guess a lot of people are looking for ways to fill their values down? Prior to this episode/blog post, the post with the most views was this post on how to split costs with friends (published in 2014). From an SEO perspective, I didn't expect this blog post to get so much search traffic in such a short period of time.

3. Episode #50: First guest on Dear Analyst, Shawn Wang

Episode #50 was also a big milestone for the podcast. As such, I invited a guest onto the show instead of just me talking about some Excel or Google Sheets tip. This turned the podcast into more of a Q&A type of show rather than a “how-to” show. Now I have the privilege of interviewing and learning from some of the top professionals in the data analytics industry. Shawn (aka swyx on Twitter) is pretty well known in the ReactJS and Svelte community. I heard him speak on some other web dev podcasts like SyntaxFM and didn’t know he had a banking background where he wrote thousands of lines of VBA code for Excel. After I learned this fact about him, I knew I had to have him on the show.

4. Episode #91: All things Peloton

During the pandemic, I started riding Peloton more and got pretty addicted to the post-ride analytics from the platform. You see a big output number after a ride and you instinctively want to beat that number the next time around. Elena originally couldn't talk on the podcast because she was still a full-time employee at Peloton when we first started chatting. Due to NDA and PR/comms concerns, we agreed it wouldn't make sense for her to be on the podcast. Some time passed, and Elena moved to a different company. We got back in touch, and Elena was able to share more about her role as a product analyst since she had moved on from Peloton. I got to nerd out about Peloton's analytics and learn how the sausage is made in this episode with Elena.

5. Episode #38: The first “Excel mistake leads to catastrophe” episode

I published a few of these episodes and the reception has generally been pretty good. The story line goes something like this: an analyst makes a basic formatting or formula error in Excel, and that error ends up costing their company millions or even billions of dollars. Episode #38 was the first episode exploring one of these Excel errors: JPMorgan loses $6.2B because of an Excel error. While most news outlets just report the facts about what transpired, I thought it would be interesting to explore the actual Excel error in this episode and how one would fix it in the future.

When we read these click-baity headlines, we immediately tell ourselves: "I would never make this mistake in Excel." The reality is that these errors can happen to anyone, regardless of whether you're an entry-level analyst or an experienced Excel guru. These errors are easy to make, and most of the time they are the result of some default setting in Excel we overlook.

Thank you, thank you, thank you, you're far too kind

And that’s a wrap on 100 episodes! I appreciate all 10 or 100 of you who listen to this podcast. Maybe one day it will be 1,000. I don’t know how much longer I’ll do this thing and what direction it will go in, but that’s part of the journey ain’t it?

The post Dear Analyst #100: Hitting a hunned! A look back at the top 5 episodes of Dear Analyst appeared first on .

Dear Analyst #99: Hyperscaling at Uber Eats and using VBA to automate M&A models with Ryan Cunningham https://www.thekeycuts.com/dear-analyst-99-hyperscaling-at-uber-eats-and-using-vba-to-automate-ma-models-with-ryan-cunningham/ https://www.thekeycuts.com/dear-analyst-99-hyperscaling-at-uber-eats-and-using-vba-to-automate-ma-models-with-ryan-cunningham/#respond Tue, 05 Jul 2022 05:14:00 +0000 https://www.thekeycuts.com/?p=51659 MLOps, neuromorphic computing, next-generation enterprises. These are concepts you generally don’t hear about in the field of data analytics. This episode shows that you can parlay a career in data analysis into any field you want. That’s exactly what Ryan Cunningham did to become a Senior Associate Builder at AI Fund. Ryan studied finance and […]

The post Dear Analyst #99: Hyperscaling at Uber Eats and using VBA to automate M&A models with Ryan Cunningham appeared first on .

MLOps, neuromorphic computing, next-generation enterprises. These are concepts you generally don't hear about in the field of data analytics. This episode shows that you can parlay a career in data analysis into any field you want. That's exactly what Ryan Cunningham did to become a Senior Associate Builder at AI Fund. Ryan studied finance and computer science at Georgetown, and honed his modeling skills at Credit Suisse after graduation, focusing on the oil and gas sector. At Uber, Ryan helped scale the Uber Eats product and worked on various moonshot products like Uber Elevate and Uber JUMP. After a stint at an AI gaming company, Ryan eventually joined AI Fund, an AI startup studio founded by Andrew Ng (founder of Google Brain and co-founder of Coursera). This episode follows Ryan's journey writing VBA as a banker, challenging the status quo at Uber Eats, and diving into the world of AI.

VBA is the gateway drug to automation

If you’re in Excel long enough, you’ll eventually come across VBA macros. Either you use the ones your colleagues build or you dive into writing/recording the macro yourself. The reason I think VBA is the gateway to automation, generally, is because you are able to break Excel down into its building blocks. Furthermore, you can tie other Microsoft apps together using VBA. When I was a financial analyst, I automated the creation of reports in Excel and PowerPoint with the push of a button. That feeling was powerful and extremely gratifying. The next obvious question is: what else can be automated? You don’t realize it at the time, but the object-oriented programming skills you learn from VBA can be carried into web development, Python scripting, and more.
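The episode doesn't include Ryan's (or my) actual VBA, but to make the "one button builds the report" idea concrete, here's a hedged Python equivalent using openpyxl and python-pptx as stand-ins for the Excel-to-PowerPoint automation a macro would do (the file names, sheet, and cell are made up for illustration):

from openpyxl import load_workbook
from pptx import Presentation
from pptx.util import Inches

# Read a metric straight out of a workbook (hypothetical file/sheet/cell)
wb = load_workbook("monthly_sales.xlsx", data_only=True)
total_revenue = wb["Summary"]["B2"].value  # assumes B2 holds a number

# Drop the number onto a fresh slide, the way a VBA macro might drive PowerPoint
prs = Presentation()
slide = prs.slides.add_slide(prs.slide_layouts[5])  # "Title Only" layout in the default template
slide.shapes.title.text = "Monthly Revenue Report"
box = slide.shapes.add_textbox(Inches(1), Inches(2), Inches(8), Inches(1.5))
box.text_frame.text = f"Total revenue: ${total_revenue:,.0f}"
prs.save("monthly_report.pptx")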

Source: The Register

While at Georgetown, Ryan was introduced to VBA and quickly started using it in all of his classes since he was in Excel all the time. VBA is not the most popular programming language to learn. According to this source, its popularity has decreased 60% since 2004. Eventually, Microsoft's Office Scripts (a flavor of TypeScript) should, in theory, replace VBA. But for all those old Excel macros from the early 2000s, someone has to maintain or migrate those scripts.

In addition to VBA, Ryan taught himself Python in his free time while at university. Python wasn't in the regular curriculum at Georgetown, so Ryan learned through Codecademy. He built programs to scrape Twitter and calculate sentiment in tweets. When Dogecoin came onto the scene in 2013, Ryan built a mini supercomputer with four Raspberry Pis to mine Dogecoin. He wrote a Python script to tie everything together so that he would get a tweet when he successfully mined some Dogecoin.

Source: Coincu News

M&A modeling at Credit Suisse

An investment banking job right out of college used to be a coveted position since you get to learn a lot of analytical skills on the job. When Ryan first got into banking, he tried to get as much as he could out of the experience. He broke down his experience into three mental models:

  1. Perception – This is the part of the job where you’re doing the actual work. Modeling, analyzing financial statements, spreading comps, and putting together decks. You’re mostly being paid to execute, not to think too hard about it. This is the time to build your muscle memory.
  2. Cognition – At this stage, typically by the time you become a second year, you’ve got the fundamentals down and you can be trusted to think critically about the materials.
  3. Metacognition – This is the next step that many banking analysts don't get to, according to Ryan. This is the stage where you are thinking about thinking: how you can begin to scale yourself up through automation.

By automating a lot of his tasks with VBA, Ryan was able to get a lot of his weekends back. He was working with a VP at the time who was working on an M&A model. The model involved potential mergers between various satellite companies and took an hour and a half to produce updated debt and EBITDA numbers. The model was difficult to run for just two companies, and Ryan's VP wanted to run it on six companies. Every time they wanted to update a deck, it would take a few hours for the model to run because there were 30 pairs of company combinations the model had to crunch. Ryan first recorded a macro and went through the update process once in Excel. He then refactored the script to use relative references instead of absolute references, centralized the inputs, and made several other improvements to maximize efficiency. Eventually, the M&A model took only two minutes to update.

Scaling Uber Eats to 300 locations as a product analyst

Ryan joined Uber Eats in late 2016 right when Uber Eats had found product-market fit. As a product analyst, one of Ryan’s first projects was figuring out the ROI from the investments Uber Eats was making to acquire couriers and customers. At the time, Uber Eats’ modus operandi was “spend to win.” This means paid marketing, driver referrals, etc. Ryan noticed that there were diminishing returns with this strategy. Despite aggressive spending to acquire new couriers and promote eater growth, both new and existing couriers had lower throughput than expected. It was also difficult to get them to stick around for long. Multiple stakeholders were interested in Ryan’s analysis including the former CEO, Ryan Graves. Upon presenting his analysis to Ryan G., Ryan C. got called out for making a mistake.

Just like banking, numbers have to tie and balance out in the tech world too. If your numbers don’t tie, it’s really easy to lose your credibility. Ryan G. claimed Ryan C.’s numbers didn’t tie and that his findings were directional at best. Ryan C. made a mistake, was called out for it, and needed to do more homework.

The problem that Ryan was hinting at was that Uber Eats was investing a tremendous amount of money on supply acquisition, theorized at the time to directly correlate with demand growth, but wasn’t getting much of a return. Marginal demand was not enough to prevent diluting courier earnings with newly added courier supply. So utilization was low, and churn was high, creating a vicious spend cycle. Ryan’s hypothesis was that Uber Eats could reduce spend and improve courier utilization without sacrificing network reliability. Ultimately, he was successful in convincing Uber Eats leadership to execute on this hypothesis, and led a strike team across product, operations, and data science to resolve it. He referenced the Cold Start Problem (coined by Andrew Chen) when describing Uber Eats’ predicament at the time.

Understanding Uber’s data warehouse and other projects

Another one of Ryan’s projects at Uber Eats was understanding every single table in the data warehouse. He applied this knowledge of the data warehouse to other moonshot projects at Uber like Uber Elevate, Uber’s drone delivery and aerial ride-sharing initiative. This whitepaper from 2016 lays out Uber’s vision for Uber Elevate and it’s quite an interesting read. No tech moonshot would be complete without a demo video:

Check out the peons at 0:56 taking regular Uber cars. So lame.

Using SQL, Excel, and his scripting skills, Ryan helped Uber Elevate figure out its market operating conditions. He ultimately owned the operating model across many of Uber's other moonshots, including drone delivery and micromobility. By working with the data science team, he ran forward-looking simulations that required much more sophistication than just VBA or Google Apps Script. He even co-designed (and patented) an ambient noise model that the Uber Elevate team would use to plan its skyport network so flights would blend in with ambient noise conditions in dense cities and over highways while avoiding residential areas.

An interesting by-product of aerial ride-sharing is the ambient noise the air taxis create. The noise could blend in with the traffic if flying over a highway, but would introduce unwanted noise in a residential area. This may prevent the FAA from allowing you to land in certain areas.

The road to autonomous mobility

Aka self-driving cars. Ryan's interests shifted more toward AI, and he eventually joined Andrew Ng's startup studio, AI Fund. Unlike other venture funds, AI Fund provides AI companies with a co-founder or founder-in-residence. The fund's economic returns and incentives are a bit different since a founder-in-residence from the fund is working at a company in the fund's portfolio. The key insight AI Fund discovered is that there is a long tail of businesses that don't know how to apply AI to their workflows. This is one of the segments AI Fund is going after.

Ryan discusses how AI Fund re-uses existing frameworks like "data-centric AI." This framework was coined by Andrew Ng in the scope of one of his other companies, Landing AI. Most academics and researchers will simply swap different models in and out on the same dataset to fine-tune performance. If you look at the dataset, however, you'll notice there is a lot of bad and missing data. Going down this path gets you into the burgeoning world of MLOps. This part of the conversation was definitely outside of my expertise, so I'll just leave this quote below from the Landing AI website:

Instead of focusing on the code, companies should focus on developing systematic engineering practices for improving data in ways that are reliable, efficient, and systematic. In other words, companies need to move from a model-centric approach to a data-centric approach.

Andrew Ng

Podcasting for China Stories

Outside of AI Fund, Ryan is part of a volunteer group of native English speakers who also speak Mandarin and who read articles about Chinese tech and culture for China Stories. Ryan started learning Mandarin in July of 2020 and has had a lifelong interest in voice acting and performance. According to Ryan, there's a huge asymmetry in the news shared from the Chinese community. When research papers are published in the United States, researchers in China will devour these tomes. Western researchers, however, don't read as much published work from the Chinese community. Ryan cold DMed Kaiser Kuo, the editor-at-large at SupChina (the company that produces the China Stories podcast), and got the gig. Listen to one of Ryan's latest episodes here on Chinese gamers.

Source: SupChina

Key takeaways for analysts

At the end of the conversation, Ryan gave his advice on what frameworks and languages analysts should learn to up-level their skills and careers. No matter what skills you acquire in the data world, Ryan reiterated that storytelling (something we've heard over and over again on this podcast) is the most important skill to acquire.

Other Podcasts & Blog Posts

No other podcasts mentioned in this episode!

The post Dear Analyst #99: Hyperscaling at Uber Eats and using VBA to automate M&A models with Ryan Cunningham appeared first on .

Dear Analyst #98: How a career in graphic design helps you with data storytelling and visualizations with Wacarra Yeomans https://www.thekeycuts.com/dear-analyst-98-how-a-career-in-graphic-design-helps-you-with-data-storytelling-and-visualizations-with-wacarra-yeomans/ https://www.thekeycuts.com/dear-analyst-98-how-a-career-in-graphic-design-helps-you-with-data-storytelling-and-visualizations-with-wacarra-yeomans/#comments Tue, 21 Jun 2022 20:04:13 +0000 https://www.thekeycuts.com/?p=51636 When you’re working in a marketing or advertising agency, you’ll work with clients across various industries. The data you analyze will also vary client to client, giving you exposure to various datasets and the business logic that drives these datasets. Wacarra Yeomans started her career as a graphic designer in the agency world. She always […]

The post Dear Analyst #98: How a career in graphic design helps you with data storytelling and visualizations with Wacarra Yeomans appeared first on .

When you're working in a marketing or advertising agency, you'll work with clients across various industries. The data you analyze will also vary client to client, giving you exposure to various datasets and the business logic that drives these datasets. Wacarra Yeomans started her career as a graphic designer in the agency world. She always had a knack for building empathy for the end consumer. This empathy would drive how she approached her designs. People assumed Wacarra didn't "know" data given her design background. She quickly saw how her creative ideas could lead to clicks, engagement, and conversions in the classic marketing funnel. This led her to want to build up her data skills, use her design skills in her data visualizations, and eventually create a fun product for Excel nerds :).

Source: Emarsys

Building customer profiles with Excel, Google Sheets, and more

One of Wacarra's current projects is building customer profiles and journeys. Wacarra works at Showpad, a sales enablement platform. She's essentially looking at tons of Salesforce data to build out these customer journeys. A customer journey tells the story of how a potential customer interacts with your product or service. You analyze how much customers spend, where they work, etc. As Wacarra dug deeper into Salesforce and other data sources, she realized she could combine different data points about companies together.

Outside of Salesforce, Wacarra looked at NPS scores and product usage data. The unique identifier was simply the company name. Since there weren't hundreds of thousands of rows of data, Wacarra combined everything in Excel. From there, she built PivotTables with calculated fields to view the company data in different ways. She also mentioned a Salesforce Google Sheets add-on which allows you to pull Salesforce data directly into Google Sheets.
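If you wanted to do the same join-then-pivot workflow outside of Excel, here's a minimal pandas sketch of combining three sources on company name and pivoting the result (the column names and numbers are made up; Wacarra did this with PivotTables and calculated fields in Excel):

import pandas as pd

salesforce = pd.DataFrame({"company": ["Acme", "Globex"], "segment": ["Enterprise", "SMB"], "arr": [120000, 30000]})
nps = pd.DataFrame({"company": ["Acme", "Globex"], "nps": [42, 18]})
usage = pd.DataFrame({"company": ["Acme", "Globex"], "weekly_active_users": [310, 55]})

# Join everything on the shared key, like a chain of VLOOKUPs in Excel
profile = salesforce.merge(nps, on="company").merge(usage, on="company")

# Rough equivalent of a PivotTable with a calculated field (ARR per active user)
profile["arr_per_wau"] = profile["arr"] / profile["weekly_active_users"]
pivot = profile.pivot_table(index="segment", values=["arr", "nps", "arr_per_wau"], aggfunc="mean")
print(pivot)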

Once the data analysis was done in Excel, Wacarra’s design skills came into play. She created PDF versions of all the customer journeys and profiles to make the data visualization nicer. Being able to communicate the findings from the data is a huge part of the data analysis process. Wacarra loves the data storytelling aspect of her job because it blends her creative and data skills together.

What is the impact of these customer profiles and journeys? One data point she brought up during our conversation is that she was able to show all the different software and platforms customers are using. This helped the product team prioritize certain integrations in the Showpad platform. When new hires join the company, they can quickly understand who Showpad's target customers are through these visualizations. Data about the customers is just one side of the story. Customer profiles typically blend quantitative and qualitative data to tell a story about the customer.

Source: UX Planet

Data storytelling through Excel files with 250+ tabs for Abercrombie & Fitch

When Wacarra was in the agency world, one of her clients was Abercrombie & Fitch. As an “experience strategist,” her goal was to advise Abercrombie & Fitch on how to improve the customer experience. She was essentially creating a different form of user journeys for her client.

Abercrombie & Fitch was working with a research vendor who would send the raw data about Abercrombie & Fitch’s customers to Wacarra’s agency. The data was in Excel, and each file had over 250 tabs for Wacarra to analyze. Each tab/worksheet was a different research question with all the answers from customers on the worksheet. Wacarra was technically on the UX team, but she worked closely with the data analytics team at her agency. To analyze all the data, Wacarra thought she could do everything in Python. Alas, Excel ended up being the tool of choice.
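If you did want to chew through a 250-tab workbook in Python, pandas makes the first step fairly painless. A minimal sketch (the file name and the assumption that every tab shares a layout are mine, not details from the episode):

import pandas as pd

# sheet_name=None loads every tab into a dict of {sheet_name: DataFrame}
sheets = pd.read_excel("customer_research.xlsx", sheet_name=None)

# Stack all of the research questions into one long table for easier analysis
all_answers = pd.concat(
    (df.assign(question=name) for name, df in sheets.items()),
    ignore_index=True,
)
print(all_answers.groupby("question").size().head())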

Predicting customer actions with a driver analysis

In order to build out the customer journeys and predict how Abercrombie & Fitch’s customers might act in the future, Wacarra and her team conducted a driver analysis. A driver analysis looks at a variety of variables and factors to predict some outcome variable. It’s similar to NPS with the main difference being that an NPS analysis is usually at the customer service level. A driver analysis gives you a more complete picture of your customer and how they will act.
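The episode doesn't specify exactly which technique the team used, but a common way to run a key-driver analysis is to regress the outcome on standardized inputs and rank the coefficients. A rough sketch under that assumption (the survey columns and values are invented):

import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Hypothetical survey data: candidate drivers plus an outcome to predict
df = pd.DataFrame({
    "store_experience":  [4, 2, 5, 3, 4, 1],
    "price_perception":  [3, 2, 4, 5, 3, 2],
    "brand_affinity":    [5, 1, 4, 2, 5, 1],
    "repurchase_intent": [5, 1, 5, 3, 4, 1],
})

X = StandardScaler().fit_transform(df.drop(columns="repurchase_intent"))
y = df["repurchase_intent"]
model = LinearRegression().fit(X, y)

# Standardized coefficients act as rough "driver importance" scores
drivers = pd.Series(model.coef_, index=df.columns[:-1]).sort_values(ascending=False)
print(drivers)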

The ultimate deliverable to Abercrombie & Fitch was a data story showing the customer's experience from beginning to end. There were 9 customer segments, and the analysis showed Abercrombie & Fitch that they didn't know their customers as well as they thought they did.

For instance, Wacarra brought up a data point showing that moms were buying Hollister apparel for themselves (and not their teenage kids). During her interviews with Abercrombie & Fitch store managers, Wacarra found that a late-20s/early-30s Hispanic persona was shopping at the store. The quantitative data confirmed what the qualitative research suggested. It's always fun when the data matches up with reality.

Who remembers LFO?

Wacarra's team didn't know what to expect when sharing their findings with Abercrombie & Fitch. As a data steward, Wacarra's job is to simply present the data. If there were outliers in the data, Wacarra would bring them up with the client. You never know when an outlier may become the trend. Even if the business is focused on one strategy, the outliers in the data may help steer the ship in the right direction.

Building data skills through a free data science bootcamp

As Wacarra began to build up her data skills, she realized formal training with credentials would help her transition fully into the world of data. She enrolled in a free data science bootcamp called Correlation One. This data science bootcamp is unique in that they aim to create a more diverse data ecosystem by providing free training for underrepresented communities (e.g. Black, LatinX, female, LGBTQ+) through their Data Science for All program.

When Wacarra applied to the program, there were 24,000 applicants. 1,000 were admitted, 700 did a pre-bootcamp exercise in Python, and 600 students ultimately finished the program. Wacarra recalls meeting a diverse group of students including doctors, college students, and more.

Many bootcamps dive right into learning Python, SQL, or Excel. Wacarra remembers the first hour of lecture being about how to understand the problem. Oftentimes analysts will jump right into the analysis, but the final output answers the wrong or irrelevant questions. Once these softer skills are taught, the lectures dive into Python, SQL, and the Seaborn data viz library (this library was also mentioned in episode #34 with Sean Tibor and Kelly Schuster-Paredes).

Source: Seaborn website

Predicting the next Billboard hit with audio features from Spotify

During the data science bootcamp, Wacarra worked on a project to help predict if a song would be a “hit” using Spotify’s API and SDK. For each song in Spotify’s expansive library, the API gives you 12 key features about each song. Some of these features include danceability, energy, liveness, and more. To help predict if a song would be popular, Wacarra’s team used Billboard data. By uniting these two datasets, the team was able to predict with 70% accuracy whether a song would show up on the Billboard list.
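Here's a heavily simplified sketch of what that kind of hit-prediction model could look like with scikit-learn, assuming the Spotify audio features have already been joined to a "was it a Billboard hit" label (the features and data below are placeholders, not the team's actual pipeline):

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Placeholder audio features per track plus a label for "appeared on Billboard"
songs = pd.DataFrame({
    "danceability":  [0.81, 0.42, 0.77, 0.35, 0.66, 0.50, 0.90, 0.28],
    "energy":        [0.72, 0.38, 0.85, 0.30, 0.60, 0.45, 0.88, 0.25],
    "acousticness":  [0.05, 0.70, 0.10, 0.80, 0.20, 0.55, 0.03, 0.90],
    "billboard_hit": [1, 0, 1, 0, 1, 0, 1, 0],
})

X = songs.drop(columns="billboard_hit")
y = songs["billboard_hit"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, stratify=y, random_state=42)

model = LogisticRegression().fit(X_train, y_train)
print("Accuracy:", accuracy_score(y_test, model.predict(X_test)))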

The first step was cleaning the data from Spotify. For instance, many of the scores from Spotify's API had decimal points, so the team had to normalize all the scores across all songs. The team then ran t-tests to see whether each feature differed significantly between the two samples (hits versus non-hits). Wacarra recalls watching the StatQuest YouTube channel to learn how to apply statistics to her project.
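As a quick illustration of that kind of test (assuming the two samples are hits and non-hits, with made-up energy scores):

from scipy import stats

# Energy scores for two samples: Billboard hits vs. non-hits (illustrative values)
hits = [0.82, 0.75, 0.90, 0.68, 0.79]
non_hits = [0.41, 0.55, 0.38, 0.60, 0.47]

# Two-sample t-test: is the difference in mean energy statistically significant?
t_stat, p_value = stats.ttest_ind(hits, non_hits)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")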

Wacarra mentioned some interesting findings and learnings from her project:

  • Accuracy: Originally her team thought they could predict with 100% accuracy whether a song would show up on Billboard. The teaching assistants for the bootcamp told her this wasn’t possible. It turns out the Baby Shark song was skewing the doo doo data (sorry had to do it).
  • The Weeknd: Turns out The Weeknd is a hit machine. He produced hits every year in the dataset Wacarra analyzed.
  • Upbeatness & Energy: Certain features from Spotify's API had a high impact on whether a song became a hit. Upbeatness and Energy, unsurprisingly, are two features that correlate positively with being a hit. A feature that correlates negatively is the Acousticness of the song.

In addition to these findings, Wacarra created a Tableau dashboard to tell a story about the Spotify and Billboard data:

Advice for analysts: understand the problem

We ended the conversation discussing Wacarra's advice for analysts and what she's learned throughout her career. Reiterating the lessons from her data science bootcamp, she recommends that analysts understand who they are creating the analysis for and how that target audience will use it.

In a lot of roles, we think about “how to do it” versus “should we do it?”

This quote speaks to the urge to dive right into the data and start using the tools we love. Wacarra’s advice is to step back and question whether the analysis should be done in the first place. This softer side of data analytics has been brought up several times on the podcast (see episode #71 with Benn Stancil, founder of Mode).

You should also feel like you have a stake in the analysis you produce since you are the de facto voice of the customer. The findings may have an emotional impact on your target audience since you are ultimately telling a story, using data to drive the twists and turns in the plot. Wacarra suggests reading Ruined by Design by Mike Monteiro, which helps creatives understand the emotional impact products can have on people. Finally, to become better at creating data visualizations, you should consider taking a design class to understand colors, contrast, and more.

Source: Amazon

Spreadsheets and bed sheets

To show her love of Excel spreadsheets (and her design skills), Wacarra created a product called Spreadsheets Bedsheet. It’s exactly what you think it is: a bedsheet full of cells. The product is no longer being sold, but they sure do look classy:

Source: Facebook

Not to be confused with Spreadsheet to Bedsheets: Simplify your finances, Transform relationships, Dream with confidence. Who said Excel spreadsheets can’t have an emotional impact on you and your significant other?

Source: Walmart

Other Podcasts & Blog Posts

No other podcasts mentioned in this episode!

The post Dear Analyst #98: How a career in graphic design helps you with data storytelling and visualizations with Wacarra Yeomans appeared first on .

Dear Analyst #97: Becoming a data Swiss army knife for marketing, operations, and customer support data problems with Sarah Krasnik https://www.thekeycuts.com/dear-analyst-97-becoming-a-data-swiss-army-knife-for-marketing-operations-and-customer-support-data-problems-with-sarah-krasnik/ https://www.thekeycuts.com/dear-analyst-97-becoming-a-data-swiss-army-knife-for-marketing-operations-and-customer-support-data-problems-with-sarah-krasnik/#respond Mon, 13 Jun 2022 15:24:51 +0000 https://www.thekeycuts.com/?p=51695 As a flex player on a data team, you might play the role of a data scientist, data analyst, or data engineer. Sarah Krasnik is one of those people who has held all these roles. In this conversation, Sarah gets into the weeds of what most data analysts do: helping business partners make better decisions […]

The post Dear Analyst #97: Becoming a data Swiss army knife for marketing, operations, and customer support data problems with Sarah Krasnik appeared first on .

As a flex player on a data team, you might play the role of a data scientist, data analyst, or data engineer. Sarah Krasnik is one of those people who has held all these roles. In this conversation, Sarah gets into the weeds of what most data analysts do: helping business partners make better decisions with data. Prior to her current role as an independent consultant, she worked on different data challenges faced by operations, marketing, and customer support functions. Eventually, she managed a data engineering team focused on the data platform and infrastructure. Speaking with Sarah conjured up memories of working with bad data, manual data tasks, and playing the role of mediator for your business stakeholders. We also chat about a popular blog post Sarah wrote on SaaS debt.

Building and maintaining a homegrown data pipeline

Sarah's last role before striking out on her own was at Perpay, a financial services company focused on the buy now/pay later space. The company is a data-driven organization (as are most companies these days). The data Sarah's team was looking at was all marketing data, specifically data that influences customer conversion rates. The problem they were trying to solve was how the marketing team could send more personalized emails and messages to potential customers to get them to convert.

The marketing team originally used a tool called Iterable where you send customer data to the platform and the platform would know when to send the right customized email. For instance, abandoned cart e-mails are super effective at increasing conversions and Iterable could help with this task.

The data engineering team's goal was to figure out how to get data about the customer into Iterable. This is a classic data activation scenario. Over time, Sarah's team started building a solution in-house. The biggest challenge was getting the data out of the data warehouse and notifying Iterable's API. As the use case for Iterable and the in-house solution grew, the data engineering team had to constantly figure out what was in Iterable and check diffs (seeing what changed from the previous state to the current state) to debug issues. Eventually the team moved to a paid solution called Census to help with the movement of data from the data warehouse to Iterable. Sarah reflected on the evolution of the solution:

At a startup you have to be ruthless with prioritization. I realized that the data eng team was spending too much time maintaining this in-house solution. This stood out to me as a generic problem where you spend hours per month maintaining the system. When is the cost of the paid solution cheaper than the hours required for maintaining something in-house?
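That diff-checking step is the part most homegrown "reverse ETL" pipelines spend their time on. A minimal sketch of the idea in pandas (the column names are invented, and the final API call is only hinted at, not Iterable's real interface):

import pandas as pd

def diff_for_sync(warehouse: pd.DataFrame, last_synced: pd.DataFrame) -> pd.DataFrame:
    """Return rows that are new or changed since the last sync, keyed on user_id."""
    merged = warehouse.merge(last_synced, on="user_id", how="left", suffixes=("", "_prev"), indicator=True)
    changed = (merged["_merge"] == "left_only") | (merged["plan"] != merged["plan_prev"])
    return merged.loc[changed, ["user_id", "plan"]]

warehouse = pd.DataFrame({"user_id": [1, 2, 3], "plan": ["free", "paid", "paid"]})
last_synced = pd.DataFrame({"user_id": [1, 2], "plan": ["free", "free"]})

for record in diff_for_sync(warehouse, last_synced).to_dict("records"):
    # In the real pipeline, this is where you'd call the marketing platform's API
    print("would sync:", record)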

Automating a manual forecasting process with SQL scripts

Sarah was also a quantitative analyst at OneMain, a private lender in the fintech space. The affiliate marketing team was responsible for marketing loans so that they show up on sites like NerdWallet, Credit Karma, and Lending Club. The problem was how to increase conversions while reducing costs, another very common marketing problem that can be solved with better data. Sarah's team was in charge of forecasting metrics like cost per loan and cost per conversion for these affiliate marketing channels.

If anyone has ever built a manual forecast in an FP&A role, at some point you’re comparing “actuals” with the forecast. The goal is to get the two to line up closely. If they don’t, you have to figure out what led to the variance.

Source: Nashville Severe Weather

In OneMain's case, comparing the forecast to actuals was a super manual process. Sarah's goal was simply to reduce the time it took to pull the data and compare the actuals to the forecast. Through various SQL scripts and a dashboard in Looker, she was able to save ~20 hours/week of work across a variety of people. The solution Sarah built was also version-controlled so you could see the updates she made to the process over time (more on version control later).
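A bare-bones version of that actuals-vs-forecast comparison might look like this in pandas (the channels, numbers, and the 5% variance threshold are my own placeholders, not OneMain's):

import pandas as pd

actuals = pd.DataFrame({"channel": ["NerdWallet", "Credit Karma"], "cost_per_loan": [215.0, 185.0]})
forecast = pd.DataFrame({"channel": ["NerdWallet", "Credit Karma"], "cost_per_loan": [200.0, 190.0]})

report = actuals.merge(forecast, on="channel", suffixes=("_actual", "_forecast"))
report["variance_pct"] = (
    (report["cost_per_loan_actual"] - report["cost_per_loan_forecast"])
    / report["cost_per_loan_forecast"] * 100
)

# Flag channels where actuals drifted more than 5% from the forecast
print(report[report["variance_pct"].abs() > 5])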

Building consensus on metric definitions

DAU, MAU, ARPU. There are a ton of standardized metrics in the SaaS world. While these metrics are all great for showing your company’s performance to investors, there may not be agreement internally on what these metrics mean. How do you ensure all teams and stakeholders are on the same page on what a DAU even means?

Sarah was a data engineer at Slyce, a visual search API and SDK. A typical use case for their technology would be an integration with Macy's where a customer takes a picture of a dress and is shown similar dresses. Whether you upload an existing image, take a picture of an item, or take a picture of text, these are all considered searches. As you can imagine, the definition of a "search" can get quite ambiguous with all the different ways someone can do a "search."

Sarah met with the sales and product teams to ensure cross-functional alignment on the definition of a search. The way the sales team was reporting and communicating searches to customers was different from how the product team was defining them. The goal was to create a document that defined the various types of searches and a dashboard that the sales team could send to customers.

This type of “glue” work involves gathering requirements, understanding what’s important to different teams, and building consensus across teams. The data team stayed neutral during the process and simply acted as the mediator between the different teams.

In addition to getting alignment, Sarah’s team also streamlined the process for pulling data about searches. Before she got involved, the team was pulling data from three different data sources to get a complete picture of searches. The data engineering team unified everything into one system.

Nothing better than having a single source of truth for all teams to pull from. It's easier to debug if there's only one system causing problems. And if there's something wrong with the numbers, at least you know everyone's data is wrong. With multiple data sources, some people might be wrong and some people might be right, and it will just take the data team more time to diagnose the true problem.

Paying down your SaaS tool debt

One of the reasons I really wanted to have Sarah on the podcast was to have her discuss this blog post she wrote about “SaaS debt.” You often hear of technical debt, but analytics teams can also develop data debt, according to Sarah. What exactly is SaaS debt, data debt, or “process debt”? Consider this scenario:

  1. You work on a marketing team and everyone uses Mailchimp to send email campaigns.
  2. Person A on your team drafts the content to send the email.
  3. Person B manages the list of contacts to send the email campaign to.
  4. Person C pulls the latest list of customers from a database to give to Person B to upload into Mailchimp.
  5. Person C goes on vacation, and Person B is in charge of pulling the data and uploading it to Mailchimp. Person B makes a mistake but doesn't realize it.
  6. The e-mail goes out to the wrong customers, and the team is scrambling to figure out what went wrong in the upload process, the list of customers that was pulled from the database, etc.

This is a contrived example, but imagine this type of process happening across a bunch of SaaS tools your team uses. In the blog post, Sarah talks about automating things with Zapier and Google Sheets. These tools add more debt to the process where one simple mistake can be costly.

In the software world, there are more robust solutions for preventing incorrect code from being pushed to production. Namely, using Git and version control allows people to see what's going to be checked into the main branch before it gets pushed live. Sarah's argument is that some of these best practices from the software world kind of exist in the SaaS tools we use every day, but not really. The solution Sarah proposes is three-fold:

  1. Templating – Templates in the SaaS world are pretty rare. By using templates that have been tested, this reduces errors and redundancy.
  2. Testing – There are a lot of testing frameworks for analytics teams to use on their data. Our SaaS tools should also have similar tests that can be automated without affecting "live" data (see the sketch after this list).
  3. Versioning – In the Mailchimp example above, perhaps Person B could submit a change to the email campaign. Then other team members on the marketing team could review the changes and catch any bugs before the email campaign goes live.
Source: TECH NOTES
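Here's what that kind of pre-flight test could look like for the Mailchimp scenario above, written as a plain Python check (the column names and rules are hypothetical; Sarah's point is that the SaaS tools themselves should offer this):

import pandas as pd

def validate_contact_list(contacts: pd.DataFrame) -> list:
    """Run basic checks on a campaign list before it ever touches the email tool."""
    errors = []
    if contacts["email"].isna().any():
        errors.append("some contacts are missing an email address")
    if contacts["email"].duplicated().any():
        errors.append("duplicate email addresses found")
    if not contacts["opted_in"].all():
        errors.append("list contains contacts who have not opted in")
    return errors

contacts = pd.DataFrame({
    "email": ["a@example.com", "b@example.com", "b@example.com"],
    "opted_in": [True, True, False],
})

problems = validate_contact_list(contacts)
if problems:
    raise SystemExit("Blocking upload: " + "; ".join(problems))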

Sarah’s blog post got a ton of responses from people who work in marketing and revenue operations. These teams are typically using a lot of SaaS tools and sometimes stitching and integrating them together with Zapier or Google Apps Script. If our SaaS tools could implement some of these changes Sarah proposes, Sarah believes these tools could work a lot better for the end user.

Other Podcasts & Blog Posts

No other podcasts mentioned in this episode!

The post Dear Analyst #97: Becoming a data Swiss army knife for marketing, operations, and customer support data problems with Sarah Krasnik appeared first on .

Dear Analyst #96: Treating data as code and the new frontier for DBAs with Sean Scott https://www.thekeycuts.com/dear-analyst-96-treating-data-as-code-and-the-new-frontier-for-dbas-with-sean-scott/ https://www.thekeycuts.com/dear-analyst-96-treating-data-as-code-and-the-new-frontier-for-dbas-with-sean-scott/#respond Mon, 06 Jun 2022 05:19:00 +0000 https://www.thekeycuts.com/?p=51686 What did the developer say to the DBA? It doesn’t matter, the answer is “no.” I’ve never worked with a database administrator (DBA) before but know they play an important part in the data lifecycle at a company. Sean Scott stumbled into the DBA world and has been in this field for 25+ years. He […]

What did the developer say to the DBA? It doesn’t matter, the answer is “no.” I’ve never worked with a database administrator (DBA) before, but I know they play an important part in the data lifecycle at a company. Sean Scott stumbled into the DBA world and has been in the field for 25+ years. He started his career at a consumer electronics manufacturer, building his data chops as an inventory analyst before getting into the world of Oracle database migrations and application development. This episode explores a perspective on data we don’t normally see: the DBA’s. I think it’s important to understand this perspective since data and business analysts ultimately use the data that is transformed and formatted by DBAs.

I remember saying I will never be a DBA

Sean likes to poke fun at the DBA crowd and remembers telling someone at a party he would never become a DBA. Perhaps his tongue-in-cheek attitude towards DBAs is what makes him so successful as a DBA. He currently works at a company called Viscosity where he does database and application development.

In short, I solve puzzles.

Sean explains how his background in data analysis and DevOps has helped him in his career as a DBA. This corner of the data world is beyond my expertise, but Sean was able to relate it back to why it matters for data analysts. Despite being in a “technical” role, Sean discusses the other qualities that make a DBA (or anyone in a technical role) successful:

The best technical people I’ve met have had great people and business skills.

How data analysts can work with DBAs better

What does DBA stand for? “Don’t bother asking.” That’s Sean’s favorite DBA joke. From a data or business analyst perspective, Sean says DBAs are typically seen as people who restrict access to data. DBAs can sometimes be seen as barriers or just standing in the way. Sean’s advice to data analysts and the consumers of data in organizations is to help change the perception of what DBAs do. I love this extremely outdated video explaining what DBAs do:

https://www.youtube.com/watch?v=74j_foRlM5U

Sean says many DBAs fail to see the difference between data and databases. Many tend to mix the two together, but Sean believes these two concepts should be thought about and treated differently. Data analysts should seek to work with DBAs to understand where their data comes from. This leads to an important concept I hadn’t heard of before this conversation with Sean: data as code.

Data as code leading to a diversity of ideas

Sean says that DBAs may think of data as being very fragile and brittle. They have this perception that data needs to be restricted or else it might be deleted when it’s in the wrong hands. That’s because DBAs aren’t thinking about data the way they think about other parts of the DevOps process.

Infrastructure as code has become a well-known concept as DevOps engineers manage data centers in the cloud, so why can’t the same concept be applied to data? We can apply automation and configuration to the management of data. DevOps is typically concerned with storage and networking; the data lifecycle and pipeline can be added to this list to “harden” data for the enterprise.

Now the actual nuts and bolts of this stuff is way beyond my pay grade. The benefits to data analysts, according to Sean, are plentiful. Analysts typically just analyze the data but don’t have much experience managing the data on their own. With these configurable data pipeline processes, analysts and other non-traditional infrastructure professionals can build their own data environments.
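
As a rough illustration of what "analysts building their own data environments from configuration" could look like, here is a minimal sketch: a declarative spec for a sandbox environment plus a function that applies it, in the spirit of infrastructure-as-code tools. The spec format, field names, and apply step are invented for this example and don't correspond to any particular product.

# A declarative spec for a throwaway analyst environment; the format is invented here.
ENVIRONMENT_SPEC = {
    "name": "marketing_sandbox",
    "source_snapshot": "warehouse/orders@v42",   # versioned dataset artifact to load
    "transforms": ["strip_pii.sql", "dedupe_orders.sql"],
    "ttl_days": 7,                               # tear the sandbox down automatically
}

def apply(spec: dict) -> None:
    """Idempotently build the environment described by the spec.

    In a real setup each step would call warehouse or orchestrator APIs;
    here they are just logged so the control flow is visible.
    """
    print(f"Creating schema {spec['name']}")
    print(f"Loading snapshot {spec['source_snapshot']}")
    for transform in spec["transforms"]:
        print(f"Running transform {transform}")
    print(f"Scheduling teardown in {spec['ttl_days']} days")

if __name__ == "__main__":
    apply(ENVIRONMENT_SPEC)

The value isn't the code itself; it's that the environment is described in a reviewable, version-controlled file an analyst could edit, rather than a ticket to a data engineer.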

Whether or not analysts want to own this responsibility is another question for each data organization. I think what’s important is that analysts can be empowered to learn new skills and not rely on a data engineer to get the data they need. We saw this trend with Canva’s data engineering team in episode #58 as well. This trend of treating data as code can lead to more diverse ideas coming from all parts of the data organization.

Creating database artifacts

As I mentioned in the previous section, I’m getting way over my skis here :). Sean does an excellent job of digging into how data as code is important for analysts. I’ll try my best to summarize his thoughts below.

Docker opened up Sean’s eyes to treating data as code. He says database artifacts are like Docker images: nothing more than an application that performs some service you want it to do. The data in the database can be turned into an “artifact” as well. With this artifact, you can store it somewhere, version it, share it with people, and so on. The data is an asset that can go through various transformations, and you can write code to “fix” and transform the data. This data-lake-as-code repo looks like an all-in-one application that shows how a “data as code” architecture might look on AWS.
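
To make the artifact idea concrete, here's a hedged sketch of snapshotting a dataset, content-hashing it, and tagging it with a version so it can be stored, shared, and pulled later, much like a Docker image. The directory layout, function names, and manifest format are assumptions for illustration only.

import hashlib
import json
import shutil
from pathlib import Path

ARTIFACT_STORE = Path("artifact_store")  # stand-in for a registry or object store

def publish(dataset_path: str, name: str, tag: str) -> Path:
    """Copy a dataset snapshot into the store under name:tag, keyed by content hash."""
    data = Path(dataset_path).read_bytes()
    digest = hashlib.sha256(data).hexdigest()[:12]

    dest = ARTIFACT_STORE / name / tag
    dest.mkdir(parents=True, exist_ok=True)
    shutil.copy(dataset_path, dest / Path(dataset_path).name)

    # Record the hash so a later pull can verify it got exactly this version.
    (dest / "manifest.json").write_text(json.dumps({"name": name, "tag": tag, "sha256": digest}))
    return dest

def pull(name: str, tag: str) -> dict:
    """Fetch the manifest for name:tag, analogous to pulling an image by tag."""
    return json.loads((ARTIFACT_STORE / name / tag / "manifest.json").read_text())

# Example usage: publish("orders_2022_06.csv", "orders", "v42"); pull("orders", "v42")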

When you’re working with data in highly regulated industries or in an e-commerce environment, adopting this data-as-code framework is important because bad data is costly for the business. In e-commerce, for instance, transactional data is constantly streaming in and potentially changing, so you want to make sure the quality of the data is high. If the data infrastructure results in customers getting incorrect data about their order, that’s obviously a bad customer experience.

Ensuring high quality database upgrades

Another characteristic of this data-as-code framework is that you can “replay” the code to build the data asset. This is like taking packaged code in Terraform or Ansible and applying it to your infrastructure. You can do the same thing with a dataset and have that dataset “stored” in a repo somewhere. Now you have an asset like a Docker image.

Data analysts can pull this image of a dataset and do their analysis like they would with any other dataset. The key takeaway is that if you QA or run some process against this data, you are guaranteed a specific outcome. The end result is the same every time you run the code because you have set up a repeatable, version-controlled process. This ensures that “dirty” data gets fixed the right way each time.
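
A small sketch of that "replay" guarantee, under the assumption that the cleaning logic lives in version control: the same snapshot run through the same transform always produces the same output hash. The column names and cleaning rules are invented for the example.

import csv
import hashlib
import io

def clean(rows):
    """Deterministic 'fix' for dirty data: trim whitespace, lowercase emails, drop blanks.

    Because this logic is version-controlled, replaying it on the same
    snapshot always yields the same result.
    """
    cleaned = []
    for row in rows:
        email = row["email"].strip().lower()
        if email:
            cleaned.append({"email": email, "amount": row["amount"].strip()})
    return sorted(cleaned, key=lambda r: r["email"])  # stable ordering keeps output reproducible

def output_hash(rows) -> str:
    """Serialize the cleaned rows to CSV and hash them so replays can be compared."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["email", "amount"])
    writer.writeheader()
    writer.writerows(rows)
    return hashlib.sha256(buf.getvalue().encode()).hexdigest()

raw = [{"email": "  A@Example.com ", "amount": " 10"}, {"email": "", "amount": "5"}]
print(output_hash(clean(raw)))  # identical hash on every replay of this snapshot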

How to manage a database through Slack

Sean used to work on an operations team and described a fun project where his team had a direct impact on customers and helped change the perception of DBAs at his company.

Sean was working at an e-commerce company whose website used a legacy application for issuing coupon codes. These codes would get sent every Monday morning at 8AM, and an email went out to customers reminding them to log in at 8AM on Monday to get their coupon code. It’s 7:59AM on a given Monday and the team is ready to see a bunch of coupon codes get issued. The problem? The legacy application wasn’t issuing coupon codes correctly.

The marketing team started getting nervous as customers began complaining and filing tickets with the customer support team. The customer support team would re-assign the tickets to the developers, and the developers would re-assign them to the DBAs. Sean’s team was now on the hook to resolve these customer issues, and the whole process could take a few hours as tickets got passed around.

Sean’s team realized it didn’t make sense for a ticket to get passed from one team to another. Precious time was being wasted while the customer just sat there with no coupon code in hand.

The solution was a Slack channel just for the marketing team to keep track of the status of tickets. Sean’s team created a custom slash command for the marketing team to use: the marketing team could simply type /fix in the channel, and a script would run in the background to pull up the relevant details about a given customer from the customer database. This removed the need to contact the developers and DBAs.
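
For a sense of how little plumbing something like /fix needs, here's a hedged sketch of a slash command handler: a small Flask endpoint that receives Slack's form-encoded payload and looks up a customer in a hypothetical customers table. The table, columns, and lookup logic are invented; only the general shape of Slack's slash-command request and response is assumed.

import sqlite3
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/slack/fix", methods=["POST"])
def fix_command():
    # Slack sends slash-command payloads as form fields; "text" is whatever the
    # user typed after the command, e.g. "/fix jane@example.com".
    email = request.form.get("text", "").strip()

    conn = sqlite3.connect("customers.db")  # hypothetical operational database
    row = conn.execute(
        "SELECT id, status, last_coupon_code FROM customers WHERE email = ?",
        (email,),
    ).fetchone()
    conn.close()

    if row is None:
        message = f"No customer found for {email}"
    else:
        customer_id, status, coupon = row
        message = f"Customer {customer_id}: status={status}, last coupon={coupon}"

    # An "ephemeral" reply is only visible to the person who ran the command.
    return jsonify({"response_type": "ephemeral", "text": message})

if __name__ == "__main__":
    app.run(port=3000)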

Sean’s team realized that these Slack slash commands could handle other operations in the database as well. Maybe you want to update a customer’s record or pull their latest sales. You could even use these slash commands from your phone!

The end result: the DBA team empowered the marketing team to solve real customer issues without needing a DBA’s help. The DBA team also felt like they were able to solve real customer issues instead of being stuck in SQL land all day. This solution also led other teams to look at the DBAs differently since the DBAs created a solution that impacted the front lines of the business.

Go out and love your DBA. We do try to balance the safety and security of the database against change. We are protective over our babies. Be demanding in a nice way.

Other Podcasts & Blog Posts

No other podcasts mentioned in this episode!
