Dear Analyst https://www.thekeycuts.com/category/podcast/ A show made for analysts: data, data analysis, and software. Mon, 13 Jul 2020 03:48:04 +0000 en-US hourly 1 https://wordpress.org/?v=5.4.2 This is a podcast made by a lifelong analyst. I cover topics including Excel, data analysis, and tools for sharing data. In addition to data analysis topics, I may also cover topics related to software engineering and building applications. I also do a roundup of my favorite podcasts and episodes. KeyCuts clean episodic KeyCuts info@thekeycuts.com info@thekeycuts.com (KeyCuts) A show made for analysts: data, data analysis, and software. Dear Analyst https://www.thekeycuts.com/wp-content/uploads/2019/03/dear_analyst_logo-1.png https://www.thekeycuts.com/excel-blog/ TV-G New York, NY 50542147 Dear Analyst #36: What The Economist’s model for the 2020 presidential election can teach us about forecasting https://www.thekeycuts.com/dear-analyst-what-the-economists-model-for-the-2020-presidential-election-can-teach-us-about-forecasting/ https://www.thekeycuts.com/dear-analyst-what-the-economists-model-for-the-2020-presidential-election-can-teach-us-about-forecasting/#respond Mon, 13 Jul 2020 09:26:00 +0000 https://www.thekeycuts.com/?p=49183 On a recent episode of The Intelligence, The data editor at The Economist spoke about a U.S. presidential election forecast their publication is working on. I looked more into their model and discuss some of the features and parameters of their model and what makes their forecast unique. Some of the techniques used in The […]

On a recent episode of The Intelligence, the data editor at The Economist spoke about a U.S. presidential election forecast the publication is working on. I looked into the model and discuss some of its features and parameters and what makes the forecast unique. Some of the techniques used in The Economist’s model can be applied to your own forecasting use cases. To see a summary of The Economist’s model, see this page. Learn more about how the model works on this page.

Source: The Economist

Key takeaways and a caveat

The model uses machine learning and multiple data sources, and it’s easy to get caught up in the details. Here are the key takeaways as described by Dan Rosenheck, the data editor at The Economist:

  1. Machine learning is used to create equations to predict the 2020 presidential outcome
  2. Polls are not as reliable early on in the election cycle
  3. Partisan non-response bias can result in a supporter being more or less likely to respond to a pollster when there is extremely good or bad news about that supporter’s party or candidate

A caveat: The Economist’s model and the various forecasting techniques they use are definitely outside of my knowledge and skillset. Most of this episode is me learning more about the model and interpreting some of the results. You don’t have to be a statistics programmer or data science professional to appreciate what the data team has done at The Economist. If you are working with data in any capacity, pushing yourself to learn about subjects outside your comfort zone will only make you more knowledgeable about the data analysis process.

Fundamentals vs. early polling

One key finding from the model is that polls conducted in the first half of the year during the election cycle are a pretty weak predictor of results. On the other hand, fundamental measures like the president’s approval rating, GDP growth, and whether there is an incumbent running for re-election are much better predictors. This chart shows the difference between poll results and fundamentals for predicting the outcome in 1992:

Source: The Economist

The model primarily relies on these fundamental indicators, but over time the polls become a better indicator for predicting the outcome. In the last week leading up to the election in November, more weight is applied to the polls than the fundamentals.
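The shifting weight between fundamentals and polls can be sketched as a simple weighted average. This is only an illustration, not The Economist's actual method: the linear weighting schedule, the function names, the 300-day horizon, and the numbers are all assumptions made up for the example.

```python
def poll_weight(days_until_election: int, horizon_days: int = 300) -> float:
    """Hypothetical schedule: the weight on polls grows linearly from 0
    (horizon_days or more before the election) to 1 (election day)."""
    remaining = min(max(days_until_election, 0), horizon_days)
    return 1.0 - remaining / horizon_days


def blended_forecast(fundamentals: float, polls: float, days_until_election: int) -> float:
    """Weighted average of the fundamentals-based and poll-based vote shares."""
    w = poll_weight(days_until_election)
    return w * polls + (1.0 - w) * fundamentals


# Made-up vote shares: fundamentals say 52%, polls say 54%.
for days in (300, 150, 7, 0):
    print(days, round(blended_forecast(52.0, 54.0, days), 2))
```

Early in the cycle the blend sits on the fundamentals number, and it drifts toward the poll number as election day approaches. A real model would learn the weighting from historical data rather than fix it by hand.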

This visualization below shows that early polls tend to overestimate a party’s share of the vote (in this case the Democratic share) compared to fundamental indicators. As you get closer to election day, however, the polls start to become a better predictor:

Source: The Economist

Overfitting data

One downside The Economist points out with other models that try to forecast the presidential election is that their equations overfit to historical data points. Think about it: if you tried to create an equation to predict who would win the NBA championship in 2020 based on 1990s data, you may create an equation that leans heavily toward the Bulls. Unfortunately, Michael Jordan isn’t playing anymore and the 2020 NBA season is now being played in a bubble in Orlando.

Had to mention Jordan somewhere in this post 🙂

The Economist uses machine learning to better predict the outcome of the presidential election, relying on two techniques which I’ll try to explain in layman’s terms from reading the post:

  1. Elastic-net regularisation – Simplify the equation you’re using to predict the outcome
  2. Leave-one-out cross-validation – Hold out one data point, train the model on the rest, and check how well it predicts the held-out point; repeat for every data point

#2 is a pretty common technique I’ve seen used in finance. Take actual results and see if you can predict what would’ve happened if you applied your forecast to last quarter or last year.

In the context of the presidential election, let’s say the model is trying to predict what the outcome of the 1948 election would’ve been (the incumbent Harry Truman defeated Thomas Dewey). The model is trained on all the other years of data except for 1948. The learnings from these other years are then used to see which model was best at predicting the outcome in 1948.
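To make the two techniques concrete, here is a minimal sketch in plain Python. It is a simplification under stated assumptions, not The Economist's code: it uses a single predictor, made-up numbers, hypothetical function names, and a closed-form one-variable elastic net (a soft-thresholded, shrunken slope).

```python
from statistics import mean


def soft_threshold(value: float, threshold: float) -> float:
    """Shrink value toward zero by threshold (the L1 part of elastic net)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net_1d(xs, ys, l1, l2):
    """Fit y = intercept + slope * x by minimizing
    0.5 * sum(residual^2) + l1 * |slope| + 0.5 * l2 * slope^2."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    denominator = sxx + l2
    slope = soft_threshold(sxy, l1) / denominator if denominator else 0.0
    return y_bar - slope * x_bar, slope


def loocv_error(xs, ys, l1, l2):
    """Leave-one-out cross-validation: hold out each point in turn,
    fit on the rest, and average the squared prediction errors."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_elastic_net_1d(train_x, train_y, l1, l2)
        errors.append((ys[i] - (intercept + slope * xs[i])) ** 2)
    return mean(errors)


# Made-up data: approval rating vs. incumbent-party vote share.
approval = [40.0, 45.0, 50.0, 55.0, 60.0, 48.0]
vote_share = [46.0, 48.5, 51.0, 52.0, 55.5, 49.0]

# Hypothetical (l1, l2) penalty settings to compare.
candidates = [(0.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
best = min(candidates, key=lambda p: loocv_error(approval, vote_share, *p))
print("penalties with the lowest held-out error:", best)
```

The penalty setting that predicts the held-out years best is the one you keep, which is how cross-validation guards against an equation that only fits the years it was trained on.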

State polling

The model also looks at state-level polling data. What’s interesting about the state model is how it uses demographic data like population density and the share of voters that are white evangelical Christians to determine how similar two states are in terms of voter preferences:

Source: The Economist

In the visualization above, Wisconsin is more similar to Ohio than Nevada is to Ohio.
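One common way to turn demographic traits into a similarity score is to standardize each trait and measure the distance between states. The sketch below assumes that approach purely for illustration; the source does not say which distance measure The Economist uses, and the state names and values are placeholders rather than real data.

```python
from math import sqrt
from statistics import mean, pstdev


def standardize(column):
    """Rescale a column to mean 0 and standard deviation 1 so that
    traits measured in different units are comparable."""
    mu, sigma = mean(column), pstdev(column)
    return [(value - mu) / sigma for value in column]


def demographic_distance(a, b):
    """Euclidean distance between two standardized profiles
    (smaller means more similar)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Placeholder states and made-up values, not real demographics.
states = ["State A", "State B", "State C"]
population_density = [100.0, 110.0, 25.0]
evangelical_share = [0.20, 0.22, 0.10]

columns = [standardize(population_density), standardize(evangelical_share)]
profiles = {name: [col[i] for col in columns] for i, name in enumerate(states)}

for other in ("State B", "State C"):
    distance = demographic_distance(profiles["State A"], profiles[other])
    print("State A vs", other, round(distance, 2))
```

With these made-up numbers, State B ends up closer to State A than State C does, which is the same kind of statement as the Wisconsin, Ohio, and Nevada comparison.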

A note about partisan non-response bias

I’ve never heard this term before and think the way the team is accounting for this bias in their model makes the model more accurate and unique. They take polling data from major sources like ABC and The Washington Post and track the changes in poll results over time. This means they can account for any irregularities in the data so that large swings in opinion due to news about a candidate don’t impact the model too much.

Looking at the us-potus-model repo

One visualization that caught my eye in the source code The Economist released is this one showing the model results vs. the polls vs. actuals from the 2008, 2012, and 2016 elections. Notice how in 2008 and 2012 the model, prior, and result are much closer together than in 2016? It just shows the level of uncertainty that went into the 2016 prediction.

2008

2012

2016

Speaking of uncertainty, I like this commit message from when the team was refining the model back in March:

We have chronic uncertainty.

Other Podcasts & Blog Posts

In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting:

Dear Analyst 47:30 49183
Dear Analyst #35: Analyzing what people dream about with the Shape of Dreams data visualization https://www.thekeycuts.com/dear-analyst-analyzing-what-people-dream-about-with-the-shape-of-dreams-data-visualization/ https://www.thekeycuts.com/dear-analyst-analyzing-what-people-dream-about-with-the-shape-of-dreams-data-visualization/#respond Mon, 29 Jun 2020 09:02:00 +0000 https://www.thekeycuts.com/?p=49164 Have you ever wondered what the underlying meaning of your dreams are? Chances are you may have tried Googling something like “What does it mean to dream about [INSERT DREAM].” In The Shape of Dreams, Federica Fragapane answers this very question of what people around the world dream about by using Google Search queries from […]

Have you ever wondered what the underlying meaning of your dreams is? Chances are you may have tried Googling something like “What does it mean to dream about [INSERT DREAM].” In The Shape of Dreams, Federica Fragapane answers this very question of what people around the world dream about by using Google Search queries from 2009 to 2019. Federica uses a mix of data storytelling and data visualizations to show what we collectively dream about based on what we search for in Google. The key takeaway: someone on the opposite side of the world probably has dreams similar to yours, showing that we are more connected than we think.

Shape of Dreams

Importance of data visualization

Data visualizations are just as important as (if not more important than) the number crunching and analysis of the data itself. While Excel and Google Sheets are the standard tools for analyzing data, there are a variety of tools for creating charts and visualizations such as Tableau, Google’s Data Studio, and Microsoft’s own Power BI.

Source: Melting Asphalt (Kevin Simler)

I’ve posted about the power of data visualizations in the past including New York Times’ data bootcamp (that teaches data visualization), data visualizations to model COVID-19, and my own class on creating a data-driven presentation. Creating meaningful data visualizations requires you to understand the technical aspects of aggregating data and actually creating the visualization itself. It also requires the creative side of telling a story around the visualization. Federica does an amazing job of telling a story about the Google Search queries about what we collectively dream about as a society.

Structure of Shape of Dreams

I really like how Federica gives the reader two options: read the story about the data, where she takes you through the visualizations with key takeaways, or explore the data yourself. In the first chapter, she simply shows the most common types of dreams by keyword across different languages:

Who doesn’t dream about their teeth falling out?

When you explore the data, you can use the arrow keys to see the dreams people search for by language and by year which leads to some interesting results:

Varying the types of visualizations

As you go through chapter 2 and chapter 3, you see Federica utilizing different types of visualizations to better tell the story behind the dream Google Searches. A motif she uses across the visualizations is a flower’s petals, and you’re able to interact with the petals in chapter 2. To summarize what I imagine to be an extremely large dataset, we see some general categories of dreams in chapter 2:

Federica discovers that searches in English, Portuguese, and Spanish aggregate up to dreams about animals, family, and relationships.

You’ll see a more traditional time-series chart in chapter 3 showing the popularity of a certain type of dream over time. I’d be curious to see the trend of dreams about “pregnancy” in 2020 given the pandemic:

A network of dreams

My favorite visualization is in chapter 4 where you’ll see a network type of visualization that shows two metrics:

  • Languages that share common searches about dreams
  • The number of dreams in common between languages

We actually use a similar type of visualization at work when we want to see how our customers are related to each other inside an organization (and how they share their Coda docs). What I love about the visualization above is that it shows how connected we are as a society given the same type of dreams we have (and subsequently search for on Google).

Using data to get an edge on human conversations

I also discuss a new podcast I started listening to called Against the Rules by one of my favorite authors, Michael Lewis. The episode is all about how there is research (and companies) helping you optimize your conversations with people to get the most benefit from the conversation. Lewis poses the million-dollar question at the end of the episode which is what are the ethics behind using this data to optimize all of your conversations in life from business to romance?

This question is probably getting addressed already at Harvard Business School. In the episode, Lewis interviews Professor Alison Wood Brooks, who has a class at HBS called How to Talk Gooder in Business and Life. If you don’t have access to these types of classes and resources, will that put you at a disadvantage later on in your career, when negotiating a business deal, or when finding a romantic partner?

Taken to the extreme, this reminds me of this scene from the season finale of Westworld (spoiler alert):

Other Podcasts & Blog Posts

In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting:

Dear Analyst 35 34:49 49164
Dear Analyst #34: Trick for finding column index for VLOOKUPs using pride events data https://www.thekeycuts.com/dear-analyst-34-trick-for-finding-column-index-for-vlookups-using-pride-events-data/ https://www.thekeycuts.com/dear-analyst-34-trick-for-finding-column-index-for-vlookups-using-pride-events-data/#respond Mon, 22 Jun 2020 09:27:00 +0000 https://www.thekeycuts.com/?p=49140 This is one of my favorite VLOOKUP tips. Given that it’s pride month, we’ll be applying this tip to a list of all pride events in the United States. Here is the Google Sheet if you want to follow along with this example. Here’s the scenario: you have a super large table in Excel or […]

This is one of my favorite VLOOKUP tips. Given that it’s pride month, we’ll be applying this tip to a list of all pride events in the United States. Here is the Google Sheet if you want to follow along with this example. Here’s the scenario: you have a super large table in Excel or Google Sheets (by large I mean there are many columns) and you need to do a VLOOKUP on the 25th column. Instead of counting 25 columns from the left of your lookup column, you can use this column index trick to quickly get the column you’re after.

Creating column indexes above your lookup table

In the screenshot above, you’ll notice that each column has the column index above it. This is a simple formula of the previous column index added to 1:

This might feel a little strange because we’re used to having the column headers in the first row of our table. Having this column index above the column header, however, makes it easier to supply the col_index parameter to your VLOOKUP formula. In this list of pride events, if I want to get the Start column pulled into my VLOOKUP formula, I simply reference the column index above the column header instead of writing out the number “5” (note that PrideEvents is a named range representing A2:E270 in my list of pride events):
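The same idea can be sketched outside a spreadsheet. The Python below mimics an exact-match VLOOKUP that takes its column number from an index row instead of a hard-coded "5". The `vlookup` helper, the header names other than Start, and the event rows are all made up for illustration; only Start being the 5th column comes from the post.

```python
def vlookup(lookup_value, table, col_index):
    """Mimic an exact-match VLOOKUP: scan the first column of table and
    return the value in the 1-based column col_index."""
    for row in table:
        if row[0] == lookup_value:
            return row[col_index - 1]
    return None  # a spreadsheet would show #N/A here


# Hypothetical headers (only "Start" in position 5 comes from the post).
headers = ["Event", "Location", "State", "End", "Start"]

# The helper row of 1, 2, 3, ... that sits above the headers.
index_row = list(range(1, len(headers) + 1))

# Made-up rows standing in for the list of pride events.
pride_events = [
    ["Example Pride A", "City A", "NY", "2020-06-28", "2020-06-26"],
    ["Example Pride B", "City B", "CA", "2020-06-14", "2020-06-13"],
]

# Reference the index above the "Start" header rather than typing 5.
start_col = index_row[headers.index("Start")]
print(vlookup("Example Pride B", pride_events, start_col))  # 2020-06-13
```

Reading the number from the index row keeps the formula readable: you can see which column is being pulled without counting across the sheet.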

Putting the column index above your new column headers

In this second example, I put the column index above the new table where I want to pull in data from my list of pride events. Notice that the order of columns I want to pull does not match the column order from my lookup table. The simple trick here is that I do a simple cell reference to the column index above the main table so that I know that the order of the columns I want to pull back in this case is 3, 5, 2:

One of the benefits of this trick is that you can move columns around in your lookup table and this VLOOKUP formula will still work, but only if you “reset” the column indexes above your lookup table column headers to be sequential (1, 2, 3, etc.). This is kind of annoying because any time I switch columns around, I have to re-drag the formula of the previous cell plus 1 in row 1 where my column indexes are. Hopefully your columns aren’t moving around too much and this solution works for you.

Using the MATCH() function to find the column index

This is a little more advanced, but another solution is to use the MATCH function to match the column name in your new table with the column names in your lookup table:

Instead of doing a simple reference to the column index in that first row in my new table, I have this MATCH function which tries to match Location, in this case, with the column headers in the lookup table ($A$2:$E$2 represents the column headers from my list of pride events). If it finds a “match,” then the MATCH function returns the column index. You could actually do this solution without having that column index above your new table columns by putting the MATCH function directly in your VLOOKUP formula, but it might make the formula more difficult to debug in the future.
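Here is a rough Python equivalent of combining MATCH with VLOOKUP, to show why the lookup survives reordered columns. The `match` and `vlookup` helpers, the header names, and the rows are hypothetical stand-ins, not the actual sheet.

```python
def match(lookup_value, lookup_range):
    """Mimic MATCH(lookup_value, lookup_range, 0): return the 1-based
    position of the exact match (raises ValueError if it is absent,
    where a spreadsheet would show #N/A)."""
    return lookup_range.index(lookup_value) + 1


def vlookup(lookup_value, table, col_index):
    """Mimic an exact-match VLOOKUP on the first column of table."""
    for row in table:
        if row[0] == lookup_value:
            return row[col_index - 1]
    return None


# Hypothetical headers and made-up rows.
headers = ["Event", "Location", "State", "End", "Start"]
rows = [
    ["Example Pride A", "City A", "NY", "2020-06-28", "2020-06-26"],
    ["Example Pride B", "City B", "CA", "2020-06-14", "2020-06-13"],
]

location_col = match("Location", headers)
print(vlookup("Example Pride A", rows, location_col))  # City A

# Move "Location" to the last column; MATCH finds its new position,
# so nothing has to be renumbered by hand.
new_order = [0, 2, 3, 4, 1]
moved_headers = [headers[i] for i in new_order]
moved_rows = [[row[i] for i in new_order] for row in rows]
moved_col = match("Location", moved_headers)
print(vlookup("Example Pride A", moved_rows, moved_col))  # City A
```

Because the column number is derived from the header text, this version does not depend on a sequential index row staying in sync with the table.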

Pride Easter egg in Google Sheets

To celebrate pride month, here’s a fun Easter egg you’ll find in Google Sheets if you type out “PRIDE” in separate columns (you’ll also see this in the Google Sheets example for this blog post):

Other Podcasts & Blog Posts

In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting:

Dear Analyst 34 23:19 49140
Dear Analyst #33: Comparing one-time vs. monthly recurring donations to support racial justice organizations https://www.thekeycuts.com/dear-analyst-33-comparing-one-time-vs-monthly-recurring-donations-to-support-racial-justice-organizations/ https://www.thekeycuts.com/dear-analyst-33-comparing-one-time-vs-monthly-recurring-donations-to-support-racial-justice-organizations/#respond Mon, 08 Jun 2020 08:59:00 +0000 https://www.thekeycuts.com/?p=49112 The Black Lives Matter movement has come to the forefront in the news media. People around the world are looking for ways to fight racial injustice. If you are in a position to donate to an organization fighting racial injustice, you are joining the ranks of individuals and companies who are supporting the movement with […]

The Black Lives Matter movement has come to the forefront in the news media. People around the world are looking for ways to fight racial injustice. If you are in a position to donate to an organization fighting racial injustice, you are joining the ranks of individuals and companies who are supporting the movement with their dollars. While all these donations are great for all these organizations, I wanted to explore what happens once the news cycle dies down and whether donations will continue. I also make an argument for why you should consider donating on a monthly basis to organizations fighting racial injustice. This is the Google Sheet I discuss in the episode.

Source: Equal Justice Initiative

One-time donation model

As people rush to make donations to organizations fighting racial injustice, what will happen once the protests and police brutality find their way out of the news cycle? Will donations dip? Here’s an extremely oversimplified model of how an organization might see their donations over time:

In month 1 (today) we see a spike in donations, but then the total revenue from donations may decrease back to “normal” levels. What is the spike in month 7? This may be the organization’s annual fundraising gala, when the grant money hits their bank account, or perhaps a large marketing campaign to drive new one-off donations.

The issue with this model is that it’s, to give it a scientific term, lumpy. A lot of resources (human and monetary) are spent producing these fundraising events and applying for grants. To put the average donation size in perspective:

The average monthly online donation is $52 ($624 per year) compared to the average one-time gift of $128. According to Network for Good’s donation data, the average recurring donor will give 42% more in one year than those who give one-time gifts. Monthly donors also have a greater lifetime revenue per donor. Finally, 52% of Millennials are more likely to give monthly vs. a large one-time donation.

Source: Donorbox
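The arithmetic behind the two averages in the quote is quick to check. This sketch uses only the $52 and $128 figures quoted above; the variable names are mine, and the 42% statistic is a separate figure from Network for Good that is not derived here.

```python
from math import ceil

monthly_gift = 52     # average monthly online donation, from the quote
one_time_gift = 128   # average one-time gift, from the quote

annual_from_monthly = monthly_gift * 12
print(annual_from_monthly)  # 624

# How many months until a monthly donor has given more than the
# average one-time donor?
months_to_match = ceil(one_time_gift / monthly_gift)
print(months_to_match)  # 3
```

On these averages, a monthly donor has given more than a one-time donor by the third month.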

I used the same numbers from Network for Good in the Google Sheet. Let’s see what happens with the monthly recurring donation model:

Monthly recurring donation model

This oversimplified model shows that total donation revenue (green bars) increases steadily over time. The increase comes partly from small gains in new donation revenue each month but, more importantly, from a base of recurring donations that comes in every month.
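A rough way to see why the recurring model climbs while the one-time model stays flat is to simulate both. This is not the model from the Google Sheet; the function names, the number of new donors per month, and the retention rate are assumptions chosen for illustration, and only the $52 and $128 averages come from the quote above.

```python
def recurring_revenue(months, new_donors_per_month, monthly_gift=52, retention=1.0):
    """Monthly revenue when every new donor keeps giving each month.
    retention is the assumed share of existing donors kept each month."""
    revenue = []
    active_donors = 0.0
    for _ in range(months):
        active_donors = active_donors * retention + new_donors_per_month
        revenue.append(active_donors * monthly_gift)
    return revenue


def one_time_revenue(months, new_donors_per_month, gift=128):
    """Monthly revenue when each new donor gives once and never again."""
    return [float(new_donors_per_month * gift) for _ in range(months)]


# Assumption: 10 new donors arrive every month under both models.
months = 6
recurring = recurring_revenue(months, 10)
one_time = one_time_revenue(months, 10)
for month, (r, o) in enumerate(zip(recurring, one_time), start=1):
    print(month, r, o)
```

Under these assumptions the one-time model brings in the same amount every month, while the recurring model grows because each month's revenue includes every donor who signed up earlier. Lowering `retention` below 1.0 shows how donor churn slows that growth.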

Benefits of the monthly recurring donation model

As I discuss in the episode, the main benefits for the organization (and why you should consider donating monthly) for this recurring donation model include:

  • Not relying on a large fundraising gala to raise funds for the year
  • Better predictability in terms of revenue which leads to better planning for costs and overhead
  • No time spent applying for grants

There are many other benefits for the organization, and there are benefits for the individual donor as well:

  • Lower dollar amount to “get started”
  • Less pain compared to a large one-time donation
  • More involvement with the community you are supporting

The list goes on and on. Another way to think about donating monthly vs. making a one-time donation is to compare your donation to paying for SaaS.

Similarities to SaaS

I think the challenges of donating monthly come down to:

  • Taking the emotion out of the donation
  • Knowing the expected value of the donation

I speak more about this in the episode, but imagine if you treated your donation like the cost you pay every month for Netflix or for the gym (pre-COVID). You are getting some sort of value from your monthly “donation” and you know that Netflix or your gym is on the hook to provide you with a “deliverable” every month. I think the same line of thinking can be applied to what you can expect from an organization fighting racial injustice beyond the platitude that it “feels good.”

Non-racial justice activism

One thought experiment I pose in the episode is what this current social environment means for other long-standing initiatives like animal rights, women’s rights, and environmental protection. It literally took George Floyd’s life to spark this movement around the world. Will it take a similar event to trigger widespread activism for climate change, women’s rights in Saudi Arabia, etc.? Could funding for these other initiatives decrease in light of what’s going on?

Source: Star Tribune

I hope that all these initiatives are brought to the forefront and people around the world consider giving monthly to the organizations they care about. For a list of organizations to consider donating to for the Black Lives Matter movement, see this list.

Other Podcasts & Blog Posts

In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting:

The post Dear Analyst #33: Comparing one-time vs. monthly recurring donations to support racial justice organizations appeared first on .

]]>
https://www.thekeycuts.com/dear-analyst-33-comparing-one-time-vs-monthly-recurring-donations-to-support-racial-justice-organizations/feed/ 0
]]>
Dear Analyst 33 45:31 49112
Dear Analyst #32: How to use the QUERY function in Google Sheets on COVID-19 data https://www.thekeycuts.com/dear-analyst-32-how-to-use-the-query-function-in-google-sheets-covid-19-data/ https://www.thekeycuts.com/dear-analyst-32-how-to-use-the-query-function-in-google-sheets-covid-19-data/#respond Mon, 01 Jun 2020 09:38:00 +0000 https://www.thekeycuts.com/?p=49090 The QUERY() function in Google Sheets gives you the ability to quickly filter and sort your data similar to how you might get data from a database. If you write SQL queries, the QUERY() function feels easy and natural to use. There are a few caveats as I discuss in this episode. If you want […]

The post Dear Analyst #32: How to use the QUERY function in Google Sheets on COVID-19 data appeared first on .

]]>
The QUERY() function in Google Sheets gives you the ability to quickly filter and sort your data similar to how you might get data from a database. If you write SQL queries, the QUERY() function feels easy and natural to use. There are a few caveats as I discuss in this episode. If you want to follow along with the exercises I discuss in this episode, make a copy of this Google Sheet which contains the QUERY() functions I mention in the episode.

Basic query to find confirmed cases greater than 50,000

Our data set is from the COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University. The data shows confirmed cases, deaths, and recovered cases by country (188 countries) on May 1st:

The first query simply pulls back the list of countries and confirmed cases where the number of confirmed cases is greater than 50,000. Notice how you reference the column letter name versus the actual name of the column in the header row:

The first parameter is covid_data which is a named range in Google Sheets. In this case, it references cells A1:E188 in our data set.
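For readers who think more easily in code than in spreadsheet formulas, here is a rough Python sketch of what this first query does. The rows are made-up placeholders, not the Johns Hopkins figures, and the query string in the comment (including which letters map to which columns) is my assumption about what the formula looks like, since the sheet itself isn't reproduced here.

```python
# Made-up placeholder rows, NOT the real Johns Hopkins figures.
# Assumed column letters: A = country, B = confirmed cases, C = deaths.
rows = [
    ("Country X", 120000, 6000),
    ("Country Y", 40000, 900),
    ("Country Z", 75000, 3),
]

# Roughly what a query string such as "select A, B where B > 50000" does
# (the exact letters used in the episode's sheet are an assumption).
A, B = 0, 1  # list positions standing in for column letters
result = [(row[A], row[B]) for row in rows if row[B] > 50000]

print(result)  # [('Country X', 120000), ('Country Z', 75000)]
```

Note that the filter refers to position `B`, not to a header called "Confirmed", which mirrors how QUERY() addresses columns.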

More SQL-like commands

You can run many database-like commands with the QUERY() function. The next example uses the ORDER BY command to find countries with between 0 and 5 deaths, with the resulting list sorted in descending order:

Check out Ben Collins’ blog post about the QUERY() function to see some of the other SQL commands you can use.
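The filter-then-sort idea can be sketched in Python as well. Again, the rows are placeholders, and both the query string in the comment and the choice to sort by deaths are assumptions on my part rather than the exact formula from the sheet.

```python
# Made-up placeholder rows: (country, confirmed cases, deaths).
rows = [
    ("Country P", 300, 4),
    ("Country Q", 9000, 250),
    ("Country R", 120, 0),
    ("Country S", 800, 2),
]

A, C = 0, 2  # list positions standing in for column letters A and C

# In spirit: "select A, C where C >= 0 and C <= 5 order by C desc"
low_deaths = [(row[A], row[C]) for row in rows if 0 <= row[C] <= 5]
low_deaths.sort(key=lambda pair: pair[1], reverse=True)

print(low_deaths)  # [('Country P', 4), ('Country S', 2), ('Country R', 0)]
```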

Adding in new calculated columns

In the third query, we get a little more advanced and use the LABEL command to name a new calculated “column” called Case Fatality Rate. This calculation is simply Deaths / Confirmed. Unlike SQL, where you alias a column inside the SELECT statement, you put the LABEL at the end of the query:

Coming from SQL, you’ll need to account for the difference in the order of commands in the query in order for it to work correctly.
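As a sketch of the calculated column, taking case fatality rate to mean deaths divided by confirmed cases, the Python below computes the ratio per row and attaches the header afterwards, which is loosely how the LABEL clause behaves. The rows are placeholders and the query string in the comment is an assumption, not the formula from the sheet.

```python
# Made-up placeholder rows: (country, confirmed cases, deaths).
rows = [
    ("Country X", 120000, 6000),
    ("Country Z", 75000, 3),
]

# In spirit: "select A, C/B label C/B 'Case Fatality Rate'"
# The calculation comes first; the column name is attached at the end.
header = ("Country", "Case Fatality Rate")
table = [header] + [
    (country, deaths / confirmed) for country, confirmed, deaths in rows
]

for line in table:
    print(line)
```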

Inability to select column names

You’ll notice that you don’t put the actual names of the columns in your header row in the query. This can be a pro or con of the QUERY() function depending on how your underlying data set is structured.

Columns are changing a lot

If your underlying data is constantly “shuffling,” with columns moving around and the structure not set in stone, the QUERY() function will most likely break because you’re referencing the column letter instead of the column name like in a traditional SQL query.

Columns are fixed

If your columns are not shuffling around a lot, this syntax of selecting the column letter may actually be easier for you. This is because you don’t have to type out the long column name in the QUERY() function. If data is simply getting appended to the bottom of your data set, then the QUERY() function should work fine for you because the letters of the columns will always reference the correct columns of data.
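The trade-off between the two cases can be shown with a small Python sketch. The function names and the data are hypothetical; the point is only to contrast selecting a column by position (like a column letter) with selecting it by header name.

```python
def select_by_position(table, position):
    """Pick a column by position, the way QUERY() picks it by letter."""
    return [row[position] for row in table[1:]]


def select_by_name(table, name):
    """Pick a column by header name, the way SQL or a PivotTable does."""
    position = table[0].index(name)
    return [row[position] for row in table[1:]]


# Placeholder data, first row is the header.
original = [
    ("Country", "Confirmed", "Deaths"),
    ("Country X", 120000, 6000),
]
# The same data after someone swaps the last two columns.
shuffled = [
    ("Country", "Deaths", "Confirmed"),
    ("Country X", 6000, 120000),
]

B = 1  # list position standing in for column letter B
print(select_by_position(original, B))        # [120000] confirmed cases
print(select_by_position(shuffled, B))        # [6000] silently returns deaths
print(select_by_name(shuffled, "Confirmed"))  # [120000] still confirmed cases
```

In this sketch the position-based lookup doesn't raise an error after the shuffle; it quietly returns the wrong column, which is arguably worse than a formula that visibly breaks.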

PivotTables vs. the QUERY() function

One of the reasons I don’t use the QUERY() function too often is because I find PivotTables to be easy enough to use to filter, sort, and aggregate my data to do my analysis. Additionally, your columns can move around in the underlying data set and the PivotTable will still work since it’s not referencing columns by letter but rather by the name in your header row.

Plotting trend lines for COVID-19

One of the articles I discuss in this episode is this Vox article about how the Council of Economic Advisers may have applied a stock trendline in Excel to “forecast” deaths as a result of COVID-19. The article discusses the concept of “smoothing out” volatile data versus prescribing a forecast, and the line between these two concepts is a bit blurry. This is the cubic chart in Excel which you can easily build from the trendline features in Excel:

Source: Vox

And then this is the chart from a CEA Tweet that appears to show the cubic trendline as a potential forecast:

SUM by David Eagleman

A book I discuss at the end of this episode is SUM: Tales from the Afterlives by David Eagleman. I read a chapter from the book called Incentive and how it relates to some recent shows I’ve been watching like Westworld and Devs. Highly recommend checking out the book.

Other Podcasts & Blog Posts

In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting:

The post Dear Analyst #32: How to use the QUERY function in Google Sheets on COVID-19 data appeared first on .

]]>
https://www.thekeycuts.com/dear-analyst-32-how-to-use-the-query-function-in-google-sheets-covid-19-data/feed/ 0
]]>
Dear Analyst 32 45:07 49090
Dear Analyst #31: Writing Google Apps Scripts to sync data from Coda to Google Sheets https://www.thekeycuts.com/dear-analyst-31-writing-google-apps-scripts-to-sync-data-from-coda-to-google-sheets/ https://www.thekeycuts.com/dear-analyst-31-writing-google-apps-scripts-to-sync-data-from-coda-to-google-sheets/#respond Mon, 18 May 2020 09:46:00 +0000 https://www.thekeycuts.com/?p=49069 I worked on a “small” side project recently to sync data between Google Sheets and tables in Coda. The full blog post tutorial is here, and the GitHub repository is here. I started using Google Apps Script last year and it’s a super powerful way to connect different apps you use in the G Suite […]

The post Dear Analyst #31: Writing Google Apps Scripts to sync data from Coda to Google Sheets appeared first on .

]]>
I worked on a “small” side project recently to sync data between Google Sheets and tables in Coda. The full blog post tutorial is here, and the GitHub repository is here. I started using Google Apps Script last year and it’s a super powerful way to connect different apps you use in the G Suite ecosystem. The impetus for creating these two scripts was seeing a few people in the Coda community talk about syncing data between their Google Sheets and Coda. The big caveat is that these are only one-way syncs, but there are several use cases where doing this could be useful in business workflows and making your team more productive.

Writing a script in Google Apps Script

Some Google Apps scripts can be super simple to set up. See this pretty simple workflow below for sending an email automatically when there is data in your Google Sheet:

Most of the “work” with writing these scripts was transforming data so that the model in Google Sheets matches the model in Coda as per Coda’s API. Once that data munging is done, the rest of the script was relatively easy in terms of giving users the ability to add, delete, and modify data. I would highly recommend taking a look at Google Apps Script especially if you use a lot of Google Sheets. You’ll be able to connect your Google Sheet with other applications in G Suite and other 3rd-party apps you use for work.
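To give a feel for the kind of data transformation involved, here is a minimal Python sketch. It is not the Apps Script from the GitHub repository. It assumes the sheet arrives as a 2D list with headers in the first row, and it assumes Coda's row insert endpoint expects a `rows` list of `cells`, each with a `column` and a `value`; check Coda's API reference before relying on that shape. The function name and the sample data are hypothetical.

```python
import json


def sheet_to_coda_rows(values):
    """Turn 2D sheet values (first row = headers) into a Coda-style payload."""
    headers, *data = values
    return {
        "rows": [
            {"cells": [{"column": h, "value": v} for h, v in zip(headers, row)]}
            for row in data
        ]
    }


# Hypothetical sheet contents, shaped like a 2D array of cell values.
values = [
    ["Name", "Stage"],
    ["Candidate A", "Phone screen"],
    ["Candidate B", "Onsite"],
]

payload = sheet_to_coda_rows(values)
print(json.dumps(payload, indent=2))
```

Going the other way (Coda to Google Sheets) would be the reverse reshaping: flattening each row's cells back into a list ordered by the sheet's headers.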

Use cases for syncing data between Coda and Google Sheets

This comes straight from the blog post, but I thought it was worth repeating:

Data synced from your Google Sheet

  • HR & recruiting – All your candidates are stored in a Google Sheet but you want to be able to move candidates through different stages in the interviewing pipeline and Google Sheets isn’t sufficient for your needs. Having all your candidates in a table in Coda means you can use templates like this one to manage candidates more effectively.
  • E-commerce and ERP – Orders, customers, and POs may all be different tabs in a Google Sheet that gets updated through Shopify or some other e-commerce platform. In order to manage your e-commerce business, you may want to see charts, calendar of shipments, and reports that Google Sheets cannot provide easily. Syncing the data from Google Sheets to Coda means you can do ERP properly (see this template as an example).
  • Customer Feedback – You may have a ticketing system like Zendesk or Intercom and all feedback lands in a Google Sheet somewhere. You can do some basic analytics in the Google Sheet but to reply to the feedback means you have to go into Gmail and start replying to customers. If your customer feedback is all in a Coda doc, you can run analytics and send emails using the Gmail Pack (see this template).

Data synced to your Google Sheet

  • 3rd-party vendor reporting – Your vendors may not be using Coda yet, but you have all your vendor data in Coda and need to send them the data in a format they prefer. While you could publish your Coda doc, the vendor still wants the data in a Google Sheet you have edit access to.
  • Data “backup” – Your team may create thousands of rows of data every quarter in a Coda doc and want to start each quarter “fresh.” Coda docs grow with your teams and they may get slow as you add in more functionality, so having a backup of your data in Google Sheets is another reason to sync data from your Coda doc to Google Sheets.
  • Finance & Accounting – Most internal finance and accounting functions still use Excel and spreadsheets for month-end reporting, taxes, and other business-critical activities. As your data grows in Coda, you can keep your finance counterparts in the loop by having your data synced to a Google Sheet which your finance team can use for their reporting and forecasting purposes.

Other Podcasts & Blog Posts

In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting:

The post Dear Analyst #31: Writing Google Apps Scripts to sync data from Coda to Google Sheets appeared first on .

]]>
https://www.thekeycuts.com/dear-analyst-31-writing-google-apps-scripts-to-sync-data-from-coda-to-google-sheets/feed/ 0
49069
Dear Analyst #30: How to learn Excel while staying at home during COVID-19 https://www.thekeycuts.com/dear-analyst-how-to-learn-excel-while-staying-at-home-during-covid-19/ https://www.thekeycuts.com/dear-analyst-how-to-learn-excel-while-staying-at-home-during-covid-19/#respond Mon, 11 May 2020 10:41:00 +0000 https://www.thekeycuts.com/?p=49057 Now that you’re staying home and picking up new hobbies and taking classes online, here are a few tips on how to learn Excel and spreadsheets from an online class. I have seen viewership on my own Excel classes spike since COVID-19 hit which has led me to think about the best way to learn […]

The post Dear Analyst #30: How to learn Excel while staying at home during COVID-19 appeared first on .

]]>
Now that you’re staying home and picking up new hobbies and taking classes online, here are a few tips on how to learn Excel and spreadsheets from an online class. I have seen viewership on my own Excel classes spike since COVID-19 hit which has led me to think about the best way to learn online.

First of all, why are so many people trying to learn Excel? Maybe because schools and universities have moved to online learning, students are questioning the value of their college degrees. Maybe I should start learning skills that will actually help me land a job…enter stage left: Excel and spreadsheets.

Spreadsheets most sought after skill

In episode 22, I brought up an episode of Freakonomics where they discussed different stats around subjects Freakonomics listeners wished they had learned in high school to better prepare them for their current jobs. The high-level numbers:

Skills currently used on their jobs

  • Less than 5% – Percent of survey responders who said they still use calculus, trigonometry, or geometry in their current jobs
  • 70% – Those who use Excel or Google Sheets on a daily basis
  • 75% – Those who visualize data or present data to make an argument on a daily, weekly, or monthly basis

Skills people wished they had learned in high school

  • 0% – Those who wished they had learned other traditional math subjects in high school beyond what they had already learned
  • 65% – Those who wished they had learned skills around analyzing and interpreting data to uncover insights
  • 60% – Those who wished they had learned how to visualize and present data

It’s pretty clear that data-related skills are what’s actually being used on the job. During a pandemic, whether you’ve been furloughed or laid off, are graduating from university, or are in any scenario where your future is unclear and you want to secure a job, learning Excel and data skills may bubble to the top of your to-do list while you’re in quarantine at home. Hopefully these tips will help you learn Excel and spreadsheets and land your next job.

1) Block out time on your calendar to take your class

If you’re a fan of David Allen’s Getting Things Done philosophy, you’ve probably heard the phrase that if it doesn’t get scheduled, it doesn’t get done. Blocking off time on your Google or Outlook calendar to actually take your Excel class, versus taking the class when you feel like it, will ensure you get through the material and get into a state of flow with it.

2) Minimize distractions

While it’s easy to stay connected with family and friends while at home, you really need to put away your phone and apps for doing all your meetings and virtual hangouts. Turning off notifications for Facetime, Facebook, Houseparty, Slack, etc. will ensure you can get some uninterrupted time to learn Excel. There are small nuances to writing Excel formulas that can be easy to overlook when you are distracted by your friends or social media.

3) Connect with the instructor and community

Many online Excel classes encourage you to ask the instructor questions and many platforms such as Skillshare encourage students to participate in the community of other students who are taking the class with you. For my Excel classes, there are several discussions where students ask me questions and either I or another student taking the class will jump in and answer. Active participation ensures you are engaged with the class, and the instructor and students can help keep you accountable.

4) Have Excel open alongside the video

It’s easy to simply watch a screenshare of an instructor doing something in Excel and say: “I get that, that looks easy to do.” It’s one thing to see the instructor write a VLOOKUP() formula but a completely different experience when you write the formula yourself. Having Excel or Google Sheets open next to the window where you are taking the class is important for getting hands-on experience with Excel. Pause the video and try doing what the instructor is doing in Excel.

5) Practice with real use cases from your daily life

Probably the most important tip. To make what you learn from the online Excel class marketable in the real world, you need to use spreadsheets for real-life scenarios. The main way I learned Excel was from looking at other people’s spreadsheets in a work environment. If you know someone who can share an Excel file they use at work (removing sensitive info, of course), this would give you a way to see how people use Excel in the real world. Then you can talk more intelligently about how you might design a spreadsheet during an interview.

Don’t have access to Excel files from people who use Excel every day? Try Googling “financial model Excel example” or “track customers Excel example” and you’ll get all sorts of nice templates. Better yet, take Google Sheets or Excel and start tracking something in your daily life. The number of home workouts you do every week. What you are spending on online deliveries. Track COVID-19 stats for your county or state. By building these simple reporting tools, you’ll get a feel for how to use spreadsheets for a real world use case.

Some of my favorite Excel teachers

Been following some of these instructors for a while now, and can definitely say their classes are worth checking out if you are new to Excel:

MAKRO is back!

One of my favorite Excel streamers is back with this livestream. He makes some good points about how Microsoft is dumbing down Excel for beginners and alienating advanced Excel users. Bless you MAKRO.

Other Podcasts & Blog Posts

In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting:

The post Dear Analyst #30: How to learn Excel while staying at home during COVID-19 appeared first on .

]]>
https://www.thekeycuts.com/dear-analyst-how-to-learn-excel-while-staying-at-home-during-covid-19/feed/ 0 Now that you're staying home and picking up new hobbies and taking classes online, here are a few tips on how to learn Excel and spreadsheets from an online class. I have seen viewership on my own Excel classes spike since COVID-19 hit which has led me... Now that you're staying home and picking up new hobbies and taking classes online, here are a few tips on how to learn Excel and spreadsheets from an online class. I have seen viewership on my own Excel classes spike since COVID-19 hit which has led me to think about the best way to learn online.







First of all, why are so many people trying to learn Excel? Maybe since all schools and universities have pushed to online learning, students may be questioning the value of their college degrees. Maybe I should start learning skills that will actually help me land a job...enter stage left: Excel and spreadsheets.



Spreadsheets most sought after skill



In episode 22, I brought up an episode of Freakonomics where they discussed different stats around subjects Freakonomics listeners wished they had learned in high school to better prepare them for their current jobs. The high-level numbers:



Skills currently used on their jobs



* Less than 5% - Percent of survey responders who said they still use calculus, trigonometry, or geometry in their current jobs* 70% - Those who use Excel or Google Sheets on a daily basis* 75% - Those who visualize data or present data to make an argument on a daily, weekly, or monthly basis



Skills people wished they had learned in high school



* 0% - Those who wished they had learned other traditional math subjects in high school beyond what they had already learned* 65% - Those who wished they had learned skills around analyzing and interpreting data to uncover insights* 60% - Those who wished they had learned how to visualize and present data



It's pretty clear that data-related skills are what's actually being used on the job, and during a pandemic where you may have been furloughed, laid off, graduating from university, or really any scenario where your future is unclear and you want to secure a job, learning Excel and data skills may bubble to the top on your to-do list while you're in quarantine at home. Hopefully these tips will help you gain the skills you need to learn Excel and spreadsheets to help land your next job.







1) Block out time on your calendar to take your class



If you're a fan of David Allen's Getting Things Done philosophy, you've probably heard the phrase that if it doesn't get scheduled, it doesn't get done. Blocking off time on your Google or Outlook calendar to actually take your Excel class, versus taking the class when you feel like it, will ensure you get through the material and get into a state of flow with it.



2) Minimize distractions



While it's easy to stay connected with family and friends while at home, you really need to put away your phone and close the apps you use for meetings and virtual hangouts. Turning off notifications for FaceTime, Facebook, Houseparty, Slack, etc. will ensure you can get some uninterrupted time to learn Excel. There are small nuances to writing Excel formulas that can be easy to overlook when you are distracted by your friends or social media.



3) Connect with the instructor and community



Many online Excel classes encourage you to ask the instructor questions and many platforms such as Skillshare encourage students to participate in the community of other students who are taking the class with you. For my Excel classes,]]>
Dear Analyst 30 29:17 49057 Dear Analyst #29: Working with dynamic array functions and formulas that spill https://www.thekeycuts.com/dear-analyst-working-with-dynamic-array-functions-and-formulas-that-spill/ https://www.thekeycuts.com/dear-analyst-working-with-dynamic-array-functions-and-formulas-that-spill/#respond Mon, 13 Apr 2020 10:28:00 +0000 https://www.thekeycuts.com/?p=49038 Have you ever wondered what an “array-entered formula” is? It’s an intermediate/advanced concept in Excel but in late 2018, Microsoft released dynamic array functions and formulas that “spill” into the cells below your current cell with a function. This makes writing formulas easier and less prone to human error, but there are some tradeoffs to […]

The post Dear Analyst #29: Working with dynamic array functions and formulas that spill appeared first on .

]]>
Have you ever wondered what an “array-entered formula” is? It’s an intermediate/advanced concept in Excel but in late 2018, Microsoft released dynamic array functions and formulas that “spill” into the cells below your current cell with a function. This makes writing formulas easier and less prone to human error, but there are some tradeoffs to using these formulas which I discuss in this episode.

Source: Microsoft

Implicit intersection: what Excel does behind the scenes without you knowing

This is not meant to cause fear, as if Excel were doing something “behind your back.” Many Excel users don’t know that Excel does some magic behind the scenes when a formula is given a range of cells as input but is not necessarily meant to accept a range of cells. In that case, Excel does something called Implicit Intersection.

Source: Exceljet

With dynamic array functions turned on in your workbook, you may have to start using the “@” operator to tell Excel to keep implicit intersection “on.” There are a lot of edge cases where you would need to use the “@” operator so I’d recommend reading this blog post if you would like to learn more.
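To make the idea concrete, here is a rough Python sketch of implicit intersection. It is a simplified model, not Excel's actual implementation: the `implicit_intersection` function, the `(row, column)` cell tuples, and the sample values are all assumptions made up for illustration, and it only covers single-cell, single-column, and single-row ranges.

```python
from typing import Dict, Tuple, Union

Cell = Tuple[int, int]  # (row, column), both 1-based; a made-up representation


def implicit_intersection(
    values: Dict[Cell, float],
    rng: Tuple[Cell, Cell],
    formula_cell: Cell,
) -> Union[float, str]:
    """Reduce a range to a single value, roughly the way the "@" operator does."""
    (top, left), (bottom, right) = rng
    row, col = formula_cell
    # A single-cell range is simply returned.
    if (top, left) == (bottom, right):
        return values[(top, left)]
    # Single-column range: take the cell on the same row as the formula.
    if left == right and top <= row <= bottom:
        return values[(row, left)]
    # Single-row range: take the cell in the same column as the formula.
    if top == bottom and left <= col <= right:
        return values[(top, col)]
    # This sketch treats every other case as an error.
    return "#VALUE!"


# Hypothetical data: A1:A5 holds 10, 20, 30, 40, 50.
column_a = {(r, 1): r * 10.0 for r in range(1, 6)}
a1_a5 = ((1, 1), (5, 1))

in_b3 = implicit_intersection(column_a, a1_a5, (3, 2))  # formula sits on row 3
in_b7 = implicit_intersection(column_a, a1_a5, (7, 2))  # row 7 is outside A1:A5
```

In this model, a formula in B3 that points at A1:A5 quietly picks up the value from A3, while the same formula in B7 has no row in common with the range and ends in an error.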

Bringing array formulas to the masses

I argue that dynamic array functions and spill formulas are giving new Excel users a way to quickly calculate, filter, and sort their data sets without needing to go through a myriad of menus in the toolbar. Given that more jobs these days require working with large data sets and familiarity with various data models (SQL, NoSQL, GraphQL), knowing how to quickly manipulate data that’s structured in one of these database models is becoming more important than ever.
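As a rough illustration of what "spilling" means, here is a small Python sketch. The `spill` helper, the dictionary-based grid, and the sales data are hypothetical and only model the behavior in a simplified way: one expression produces several results, which are written into the cells below the formula cell unless something is already in the way.

```python
from typing import Dict, List, Tuple

Grid = Dict[Tuple[int, int], object]  # (row, column) -> value; a made-up model


def spill(grid: Grid, anchor: Tuple[int, int], results: List[object]) -> Grid:
    """Write a one-column result array downward, starting at the anchor cell."""
    row, col = anchor
    targets = [(row + i, col) for i in range(len(results))]
    # The cells below the anchor must be empty, otherwise nothing spills.
    if any(cell in grid for cell in targets[1:]):
        return {**grid, anchor: "#SPILL!"}
    return {**grid, **dict(zip(targets, results))}


# Hypothetical data set of (region, amount) rows.
sales = [("East", 120), ("West", 80), ("East", 45), ("North", 200)]

# Roughly what a SORT wrapped around a FILTER does: one expression, many cells.
east_sorted = sorted(amount for region, amount in sales if region == "East")

sheet = spill({}, (1, 4), east_sorted)             # fills D1 and D2
blocked = spill({(2, 4): "x"}, (1, 4), east_sorted)  # D2 is already occupied
```

The point of the sketch is that the formula is written once in the anchor cell and the results size themselves, which is why there is no fill-down step to get wrong.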

I think that advanced Excel and SQL users will notice that Excel is getting closer to how PivotTables and SQL operate. With PivotTables, you have calculated fields which are similar to dynamic array functions in that you write the formula once and it applies to your entire PivotTable no matter how you slice and dice your data. In SQL, you are pretty much writing your own user-defined fields and aggregating data from other columns.

Other Podcasts & Blog Posts

In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting:

The post Dear Analyst #29: Working with dynamic array functions and formulas that spill appeared first on .

]]>
https://www.thekeycuts.com/dear-analyst-working-with-dynamic-array-functions-and-formulas-that-spill/feed/ 0 Have you ever wondered what an "array-entered formula" is? It's an intermediate/advanced concept in Excel but in late 2018, Microsoft released dynamic array functions and formulas that "spill" into the cells below your current cell with a function. This makes writing formulas easier and less prone to human error, but there are some tradeoffs to using these formulas which I discuss in this episode.



Source: Microsoft



Implicit intersection: what Excel does behind the scenes without you knowing



This is not meant to cause fear, as if Excel were doing something "behind your back." Many Excel users don't know that Excel does some magic behind the scenes when a formula is given a range of cells as input but is not necessarily meant to accept a range of cells. In that case, Excel does something called Implicit Intersection.



Source: Exceljet



With dynamic array functions turned on in your workbook, you may have to start using the "@" operator to tell Excel to keep implicit intersection "on." There are a lot of edge cases where you would need to use the "@" operator so I'd recommend reading this blog post if you would like to learn more.




https://www.youtube.com/watch?v=1HF0UGMF070




Bringing array formulas to the masses



I argue that dynamic array functions and spill formulas are giving new Excel users a way to quickly calculate, filter, and sort their data sets without needing to go through a myriad of menus in the toolbar. Given that more jobs these days require working with large data sets and familiarity with various data models (SQL, NoSQL, GraphQL), knowing how to quickly manipulate data that's structured in one of these database models is becoming more important than ever.



I think that advanced Excel and SQL users will notice that Excel is getting closer to how PivotTables and SQL operate. With PivotTables, you have calculated fields which are similar to dynamic array functions in that you write the formula once and it applies to your entire PivotTable no matter how you slice and dice your data. In SQL, you are pretty much writing your own user-defined fields and aggregating data from other columns.



Other Podcasts & Blog Posts



In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting:



* SheetsCon: Not a podcast, but a great virtual conference all about Google Sheets and all the replays are free to watch
* devMode.fm #66: Interviewing for a Webdev Job in 2020




]]>
Dear Analyst 37:20 49038
Dear Analyst #28: Filling a formula down to the last row of your data set https://www.thekeycuts.com/dear-analyst-28-filling-a-formula-down-to-the-last-row-of-your-data-set/ https://www.thekeycuts.com/dear-analyst-28-filling-a-formula-down-to-the-last-row-of-your-data-set/#respond Mon, 30 Mar 2020 10:32:00 +0000 https://www.thekeycuts.com/?p=49026 This spreadsheet tip is based on a question I get asked all the time when I teach (well taught) Excel at in-person classes: How do I fill a formula down to the last row of my data set without over-shooting the last row with keyboard shortcuts? This problem occurs with larger data sets where you […]

The post Dear Analyst #28: Filling a formula down to the last row of your data set appeared first on .

]]>
This spreadsheet tip is based on a question I get asked all the time when I teach (well, taught) Excel at in-person classes: How do I fill a formula down to the last row of my data set without over-shooting the last row with keyboard shortcuts? This problem occurs with larger data sets where you have several hundred or thousands of rows and need to quickly apply a formula in a column for all these rows. This screenshot shows the problem:

As the text on the screenshot shows, the Revenue per passenger formula needs to be applied to all rows in the data set, but we don’t know where the data set ends. It could be row 100, 500, or 800,000. If you want to try this exercise for yourself, see this Google Sheet and make a copy for yourself.

Method 1: Double-click the bottom-right of the cell

This is how most people approach this problem, but the downside is that it requires you to use your mouse or trackpad. You basically hover your cursor over the bottom-right corner of the cell that contains the formula (in this case E2) and wait until your cursor turns into a black plus sign. Then you double-click and the formula for Revenue per passenger fills down to the last row in the data set (in this case row 281):

Method 2: Drag-and-drop the formula until it reaches the last row of data set

Even less ideal, you can drag the bottom right corner of the cell down and basically wait until the window scrolls to the last row of the data set. The downsides of this method:

  1. You’re still using your mouse or trackpad
  2. You might under-shoot or over-shoot the last row of your data set because the scroll depends on how far down you are holding down your mouse

Method 3: Press page down while having the first cell selected

This method uses keyboard shortcuts, so it is definitely better than methods 1 and 2. You keep the first cell with the formula selected by holding down SHIFT and then press PAGE DOWN a few times until you get close to the bottom of your data set. The downside is that you might overshoot your data set, which means you have to keep SHIFT pressed while pressing UP ARROW a few times to get the selection to “stop” right on the last row (row 281) in the Google Sheet. You can then press CMD+D on the Mac or CTRL+D on the PC to fill the formula down:

Some people ask me about using CMD+DOWN ARROW at this point to get to the bottom of the column (column E in this case) but the problem is that since all of column E is pretty much empty (rows 2 and below), you will simply go to the last row of the spreadsheet. You are over-shooting the last row of your data set by a lot in this scenario.

Method 4 (most ideal): Go to the bottom of the data set in the column to the left and then use the fill formula down shortcut

This method involves using only keyboard shortcuts and is hence the most ideal. These are the steps:

  1. Move your cursor to the column to the left of your column that contains the formula you want to fill (in this case column D)
  2. Press CMD+DOWN ARROW on the Mac or CTRL+DOWN ARROW on the PC and you’ll most likely go to the last row of the data set (column D in this case) since the data should be contiguous.
  3. Move your cursor to the right which puts you in the last row of your data set but also in the column that contains the formula you want to fill down (cell E281 in this data set)
  4. Press CTRL+SHIFT+UP ARROW to select all the empty cells including the first cell that contains your formula above your current empty cell (in this data set, you’ll have the E2:E281 range selected).
  5. Press CMD+D on the Mac or CTRL+D on the PC to fill the formula down

The reason this method works is because of step 4 where you’re able to select all the empty cells above your empty cell while the first cell of the selection contains the formula you want to fill down. While this is technically a workaround, I’ve found this is the easiest way to get the range selection properly set up before applying the fill formula down shortcut:
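The logic behind Method 4 can be sketched in Python. This is only a simplified model under stated assumptions: the `jump_down` and `fill_down` helpers, the `MAX_ROW` sheet height, the sample numbers, and the idea that column C holds revenue and column D holds passengers are all hypothetical, and `jump_down` only approximates what the CTRL/CMD+DOWN ARROW shortcut does.

```python
from typing import Callable, Dict, Tuple

Sheet = Dict[Tuple[int, str], float]  # (row, column letter) -> value
MAX_ROW = 1000  # hypothetical height of the sheet


def jump_down(sheet: Sheet, row: int, col: str) -> int:
    """Approximate CTRL/CMD+DOWN ARROW: return the row the cursor lands on."""
    nxt = min(row + 1, MAX_ROW)
    if (row, col) in sheet and (nxt, col) in sheet:
        # Inside a contiguous block: stop on its last filled cell.
        while (nxt + 1, col) in sheet:
            nxt += 1
        return nxt
    # Otherwise skip blanks until the next filled cell or the sheet bottom.
    while nxt < MAX_ROW and (nxt, col) not in sheet:
        nxt += 1
    return nxt


def fill_down(
    sheet: Sheet,
    data_col: str,
    formula_col: str,
    first_row: int,
    formula: Callable[[Sheet, int], float],
) -> int:
    """Find the last row via the filled column, then fill the formula column."""
    last_row = jump_down(sheet, first_row, data_col)   # steps 1 and 2
    for row in range(first_row, last_row + 1):         # steps 3 to 5
        sheet[(row, formula_col)] = formula(sheet, row)
    return last_row


# Hypothetical data in rows 2 to 4: revenue in column C, passengers in column D.
sheet: Sheet = {}
for r, (revenue, passengers) in enumerate([(1000.0, 10), (900.0, 30), (400.0, 8)], start=2):
    sheet[(r, "C")] = revenue
    sheet[(r, "D")] = passengers

overshoot = jump_down(sheet, 2, "E")  # column E is empty, so the jump overshoots
last = fill_down(sheet, "D", "E", 2, lambda s, r: s[(r, "C")] / s[(r, "D")])
```

Jumping down in the empty column E lands on the bottom of the sheet, which is the over-shooting problem from Method 3. Jumping down in the filled column D stops on the last row of data, and that row is what bounds the fill.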

Other Podcasts & Blog Posts

In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting:

The post Dear Analyst #28: Filling a formula down to the last row of your data set appeared first on .

]]>
https://www.thekeycuts.com/dear-analyst-28-filling-a-formula-down-to-the-last-row-of-your-data-set/feed/ 0 This spreadsheet tip is based on a question I get asked all the time when I teach (well, taught) Excel at in-person classes: How do I fill a formula down to the last row of my data set without over-shooting the last row with keyboard shortcuts? This problem occurs with larger data sets where you have several hundred or thousands of rows and need to quickly apply a formula in a column for all these rows. This screenshot shows the problem:







As the text on the screenshot shows, the Revenue per passenger formula needs to be applied to all rows in the data set, but we don't know where the data set ends. It could be row 100, 500, or 800,000. If you want to try this exercise for yourself, see this Google Sheet and make a copy for yourself.



Method 1: Double-click the bottom-right of the cell



This is how most people approach this problem, but the downside is that it requires you to use your mouse or trackpad. You basically hover your cursor over the bottom-right corner of the cell that contains the formula (in this case E2) and wait until your cursor turns into a black plus sign. Then you double-click and the formula for Revenue per passenger fills down to the last row in the data set (in this case row 281):







Method 2: Drag-and-drop the formula until it reaches the last row of data set



Even less ideal, you can drag the bottom right corner of the cell down and basically wait until the window scrolls to the last row of the data set. The downsides of this method:



* You're still using your mouse or trackpad
* You might under-shoot or over-shoot the last row of your data set because the scroll depends on how far down you are holding down your mouse







Method 3: Press page down while having the first cell selected



This method uses keyboard shortcuts, so it is definitely better than methods 1 and 2. You keep the first cell with the formula selected by holding down SHIFT and then press PAGE DOWN a few times until you get close to the bottom of your data set. The downside is that you might overshoot your data set, which means you have to keep SHIFT pressed while pressing UP ARROW a few times to get the selection to "stop" right on the last row (row 281) in the Google Sheet. You can then press CMD+D on the Mac or CTRL+D on the PC to fill the formula down:







Some people ask me about using CMD+DOWN ARROW at this point to get to the bottom of the column (column E in this case) but the problem is that since all of column E is pretty much empty (rows 2 and below), you will simply go to the last row of the spreadsheet. You are over-shooting the last row of your data set by a lot in this scenario.



Method 4 (most ideal): Go to the bottom of the data set in the column to the left and then use the fill formula down shortcut



This method involves using only keyboard shortcuts and is hence the most ideal. These are the steps:



* Move your cursor to the column to the left of your column that contains the formula you want to fill (in this case column D)
* Press CMD+DOWN ARROW on the Mac or CTRL+DOWN ARROW on the PC and you'll most likely go to the last row of the data set (column D in this case) since the data should be contiguous.
* Move your cursor to the right which puts you in the last row of your data set but also in the column that contains the formula you want to fill down (cell E281 in this data set)
* Press CTRL+SHIFT+UP ARROW to select all the empty cells including the first cell that contains your formula above your ...]]>
Dear Analyst 30:09 49026
Dear Analyst #27: Splitting a cell diagonally to label y and x-axis and COVID-19 dashboard https://www.thekeycuts.com/splitting-a-cell-diagonally-to-label-y-and-x-axis-and-covid-19-dashboard/ https://www.thekeycuts.com/splitting-a-cell-diagonally-to-label-y-and-x-axis-and-covid-19-dashboard/#respond Mon, 16 Mar 2020 09:06:00 +0000 https://www.thekeycuts.com/?p=49014 This is an Excel trick that’s not super complicated but super useful for labelling a simple table in Excel. Let’s say you have one set of labels along the rows (e.g. “Region”) and then another set of labels along the columns (e.g. “Month”). Cell A1 is now empty because you don’t know which label to […]

The post Dear Analyst #27: Splitting a cell diagonally to label y and x-axis and COVID-19 dashboard appeared first on .

]]>
This is an Excel trick that’s not super complicated but super useful for labelling a simple table in Excel. Let’s say you have one set of labels along the rows (e.g. “Region”) and then another set of labels along the columns (e.g. “Month”). Cell A1 is now empty because you don’t know which label to put in that cell. Do you put “Region” or “Month?” With this trick using the distributed indent horizontal alignment option, you can get something like this:

“Month” is near the top right of cell A1 while “Region” is in the bottom left making it look like you have two separate labels even though it’s all in the same cell. The diagonal line is simply a diagonal border you can add through the Format Cells menu. This is a tip I learned from this YouTube video from Godesignow:

COVID-19 dashboard in a published Coda doc

I also discuss a COVID-19 dashboard I’ve been working on for a month or so that tracks data from Johns Hopkins University, Wikipedia, and departments of health from various states. There are some interesting visualizations once you triangulate these different data sources and start adding in country-level data like population and density. See the dashboard here.

One of the charts from the dashboard

CEO of Shopify discusses lessons learned from StarCraft

One of my favorite episodes of the year so far is from The Pylon Show: Tobi Lütke (CEO of Shopify) visits the show to talk about his experience playing StarCraft and how some of the lessons learned from the game can be applied to a growth mindset and even hiring at Shopify. I wrote a blog post about the lessons I’ve learned about life and startups inspired by Tobi talking publicly about his love for StarCraft. This is one of the best Tweets on how game recognizes game:

https://twitter.com/tobi/status/1183807041678299137

Other Podcasts & Blog Posts

In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting:

The post Dear Analyst #27: Splitting a cell diagonally to label y and x-axis and COVID-19 dashboard appeared first on .

]]>
https://www.thekeycuts.com/splitting-a-cell-diagonally-to-label-y-and-x-axis-and-covid-19-dashboard/feed/ 0 This is an Excel trick that's not super complicated but super useful for labelling a simple table in Excel. Let's say you have one set of labels along the rows (e.g. "Region") and then another set of labels along the columns (e.g. "Month"). Cell A1 is now empty because you don't know which label to put in that cell. Do you put "Region" or "Month?" With this trick using the distributed indent horizontal alignment option, you can get something like this:







"Month" is near the top right of cell A1 while "Region" is in the bottom left making it look like you have two separate labels even though it's all in the same cell. The diagonal line is simply a diagonal border you can add through the Format Cells menu. This is a tip I learned from this YouTube video from Godesignow:




https://www.youtube.com/watch?v=EVRm0QMCRBU




COVID-19 dashboard in a published Coda doc



I also discuss a COVID-19 dashboard I've been working on for a month or so that tracks data from Johns Hopkins University, Wikipedia, and departments of health from various states. There are some interesting visualizations once you triangulate these different data sources and start adding in country-level data like population and density. See the dashboard here.



One of the charts from the dashboard



CEO of Shopify discusses lessons learned from StarCraft



One of my favorite episodes of the year so far is from The Pylon Show: Tobi Lütke (CEO of Shopify) visits the show to talk about his experience playing StarCraft and how some of the lessons learned from the game can be applied to a growth mindset and even hiring at Shopify. I wrote a blog post about the lessons I've learned about life and startups inspired by Tobi talking publicly about his love for StarCraft. This is one of the best Tweets on how game recognizes game:




https://twitter.com/tobi/status/1183807041678299137




Other Podcasts & Blog Posts



In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting:



* The Pylon Show #79: CEO of Shopify joins TLO & Artosis to talk about StarCraft




]]>
Dear Analyst 1 33:15 49014