Dear Analyst
https://www.thekeycuts.com/category/podcast/
A show made for analysts: data, data analysis, and software. This is a podcast made by a lifelong analyst covering Excel, data analysis, and tools for sharing data, along with occasional topics on software engineering and building applications, plus a roundup of favorite podcasts and episodes.

Dear Analyst #62: Using data storytelling to close billions of dollars worth of deals at LinkedIn with Janie Ho
https://www.thekeycuts.com/dear-analyst-62-using-data-storytelling-to-close-billions-of-dollars-worth-of-deals-at-linkedin-with-janie-ho/
Mon, 01 Mar 2021

This episode is all about data storytelling at a “traditional” enterprise company like LinkedIn and also at a major news publication. Janie Ho is a former global account analyst at LinkedIn in NYC where she facilitated data-driven presentations to close revenue deals for LinkedIn’s top global strategic accounts. Currently, she is a senior editor in growth and audience at the New York Daily News under Tribune Publishing. This episode goes into best practices for creating data-driven presentations, learning new skills through non-traditional methods, and the tools journalists use to find new stories to pursue.

Upleveling skills: from SEO to data

As a former journalist at various publications like ABC News and Businessweek, Janie forged a non-traditional path to a career in data.

In New York, a popular platform called Mediabistro held one-night courses, many of them free, and Janie took as many as she could. Several of these courses were on SEO, and her SEO skills ended up being her gateway into data analytics.

I always find it interesting how people from all different backgrounds end up getting into data whether it’s learning Excel, SQL, or some other data tool. It further shows that no matter what your role is, you will come across a spreadsheet at one point or another. In the world of SEO, you have tons of data around keyword performance, traffic estimates, rank, and more to play with.

LinkedIn: an enterprise behemoth

Janie eventually found herself at LinkedIn in 2011 as the first analyst in her group focused on global revenue accounts. When she left LinkedIn three years later, there were 50 analysts. Most of the analysts were recruited from management consulting, so they most likely had some data experience. Luckily, LinkedIn emphasized professional development, so Janie was able to learn not only data skills but also how to build data-driven presentations.

Most people don’t realize that LinkedIn sells expensive enterprise software that powers a lot of hiring functions around the world. Seats for the software cost $10K/year and above. When Janie joined LinkedIn, the company was in high-growth mode since there was so much demand for the product on the enterprise side.

LinkedIn was basically hiring salespeople as fast as they could, and the salespeople were expected to start selling the next day. There wasn’t an extended onboarding period; they just needed people to sell. With all these salespeople doing QBRs and creating new pitch decks for the C-suite, LinkedIn needed many analysts like Janie to help produce these presentations at a fast rate.

Concise business review data presentations

In order to create these presentations, Janie and her fellow analysts were basically downloading LinkedIn usage data and slicing and dicing it in Excel. During these QBRs, she had to show LinkedIn’s top strategic clients how things were going, but also where the opportunities were to spend more on LinkedIn.

Internally, LinkedIn had a program called Data-Driven University which was created by former Bain consultants. Janie would learn the key data storytelling skills from this “university” and turn around and train salespeople. Some examples of slides that Janie would create are below. These are the “after” slides that show how the data could tell a better story where there’s only one key takeaway per slide:

Compare these slides to the slide below, where there are too many elements and the key takeaway for the audience is not clear:

One-click data-driven presentations

The insights team at LinkedIn ended up creating a tool called Merlin that was built on Tableau. All you needed as an analyst was the client’s company ID and all the visualizations would get created with one click. The output was a 50-slide deck with takeaways written in plain English.

One of the neat features of this one-click dashboard was that it would create an “icebreaker” game in each deck depending on which clients you were talking to. You could just plug all the names attending the meeting into the tool, and it would create a slide asking the meeting attendees who the most popular person is on LinkedIn, since the tool obviously had access to all attendees’ LinkedIn information.

LinkedIn’s sales data—sometimes close to a petabyte or more—exists among internal databases, Google Analytics, Salesforce.com, and third party tools. Previously, one analyst on LinkedIn’s team serviced daily sales requests from over 500 salespeople, creating a reporting queue of up to 6 months.

In response, the business analytics team centralized this disparate data into Tableau Server to create a series of customer success dashboards. LinkedIn embeds Tableau Server into their internal analytics portal, nicknamed “Merlin.”

Today, thousands of sales people visit the portal on a weekly basis—equivalent to up to 90% of LinkedIn’s sales team—to track customer churn, risk indicators, and sales performance.

Source: Tableau

Janie still had to download additional usage data and do custom reports and PivotTables to get her clients the data they needed. She eventually learned SQL to further automate her data needs. Nonetheless, this solution in Tableau really helped salespeople get the slides they needed to tell data-driven stories and close deals.
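
To make the idea concrete, here is a minimal, hypothetical sketch of the kind of automation Merlin and Janie's SQL work represent: given a client's company ID, pull its usage data and turn each metric into a one-line, plain-English takeaway. The database file, table, and columns are invented for illustration; this is not LinkedIn's actual implementation.

```python
import sqlite3

def usage_takeaways(db_path: str, company_id: int) -> list:
    """Pull quarter-over-quarter metrics for one client and phrase each as a plain-English takeaway."""
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        """
        SELECT metric, this_quarter, last_quarter
        FROM usage_by_quarter          -- hypothetical table
        WHERE company_id = ?
        """,
        (company_id,),
    ).fetchall()
    conn.close()

    takeaways = []
    for metric, this_q, last_q in rows:
        change = (this_q - last_q) / last_q * 100  # assumes last_q is non-zero
        direction = "up" if change >= 0 else "down"
        # One takeaway per slide: a single number, stated plainly.
        takeaways.append(f"{metric} is {direction} {abs(change):.0f}% quarter over quarter.")
    return takeaways

# Example: one headline per slide for a made-up client ID.
# for line in usage_takeaways("linkedin_usage.db", 12345):
#     print(line)
```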

Data visualization best practices

Through her training at LinkedIn, Janie learned all types of best practices for how to tell data-driven stories. One of the key questions she would ask herself is this: Can you explain the slide in plain English to someone who is not in that specific industry?

If you can’t, chances are the slide could be simplified and data can be removed. We talked about all types of best practices in this episode, but here were a few that stood out:

  • Slide headlines should be in the same position on each slide so your audience isn’t scanning the slide for the headline and instead focuses on the body of the slide.
  • Use colors and charts sparingly: you should have one specific bar, line, or color you want the audience to focus on to grasp the key takeaway from the slide.
  • 3-5 second rule: if you look at the slide for 3-5 seconds, you should be able to understand the takeaway.

The slides are not for you. They are for your audience.

In this following slide, the audience is drawn to one specific bar and color to understand the key takeaway of the slide:
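
As a rough illustration of this "one bar, one color" principle, here is a minimal matplotlib sketch (the quarterly numbers and headline are invented) that greys out every bar except the one carrying the takeaway:

```python
import matplotlib.pyplot as plt

quarters = ["Q1", "Q2", "Q3", "Q4"]
hires = [120, 135, 150, 210]  # invented numbers

# Grey out every bar except the one you want the audience to remember.
colors = ["#c8c8c8", "#c8c8c8", "#c8c8c8", "#1f77b4"]

fig, ax = plt.subplots()
ax.bar(quarters, hires, color=colors)
ax.set_title("Q4 hires were 40% higher than Q3")  # the headline is the takeaway
for side in ("top", "right"):
    ax.spines[side].set_visible(False)  # trim chart junk
plt.show()
```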

Janie saw parallels between her experience at LinkedIn and her former journalist days. You’re tempted to add more data and visualizations to the slides, but you don’t want your audience’s attention to be distracted. You want that one key trend or number to be stamped into your audience’s head which is like writing a really catchy news headline.

Learning and teaching Google Sheets/Excel

According to Janie, 80% of a data analyst’s job is cleaning data despite all the expensive tools and AI that have been developed over the years. Even with the Merlin tool at LinkedIn, analysts still had to use Excel. That’s why she had to learn how to automate as much as she could in Excel and SQL and then pass on these tools to incoming analysts.

They say the best developer is a lazy developer.

After LinkedIn, Janie started working for smaller companies such as nonprofits and would report directly to the CEO. A lot of them were in Google Sheets all day and couldn’t write formulas like VLOOKUP. They were doing things by hand across thousands of rows and manually changing the formatting with the paintbrush tool in Excel.
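
For anyone in that position, a VLOOKUP is just a lookup-and-join. As a hedged illustration (the sheet and column names are made up), here is roughly what a formula like =VLOOKUP(A2, Donors!A:B, 2, FALSE) does, expressed as a pandas merge that covers thousands of rows in one step:

```python
import pandas as pd

# Made-up data standing in for two sheets in a workbook.
gifts = pd.DataFrame({"donor_id": [101, 102, 103], "amount": [250, 75, 500]})
donors = pd.DataFrame({"donor_id": [101, 102, 104], "name": ["Ana", "Ben", "Cara"]})

# Roughly what a VLOOKUP on donor_id does, for every row at once.
gifts_with_names = gifts.merge(donors, on="donor_id", how="left")
print(gifts_with_names)  # donor 103 has no match, so its name comes back as NaN
```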

To teach these CEOs how to use Excel, she would first walk them through the formulas she was building and the final product in Excel. Then she would revert all her changes and ask them to recreate the exact same output she had just shown them.

They don’t know what they don’t know.

Speaking of acquiring skills, Janie made an interesting point about how many people learned web programming skills back in the early 2000s. This was during the heyday of Myspace and Xanga. Myspace users were teaching themselves HTML, CSS, and JavaScript just to do simple things with their Myspace pages. That same need to learn how to edit a website is not as common now with platforms like Facebook.

People were learning these 6-figure skills just to get a unicorn to pop out from their Myspace profiles.

Audience development at The New York Daily News

Janie oversees many different assets at The New York Daily News including the homepage, social media platforms, podcasts, breaking news emails, mobile alerts, and newsletters, just to name a few.

Data is still an important part of what she does in her current role. Tools like Chartbeat and Tableau are used for reporting purposes. OneSignal is used for pushing mobile/web alerts. All the data generated from these platforms is pushed into Google Analytics 360 dashboards built by the national Tribune team.

Twice daily, Janie reports on the best “meta” headlines to NY Daily News journalists (these are the SEO titles from top performing articles). For her team, the One Metric that Matters (OMTM) is getting new subscribers. I think many teams call their OMTM their “north star metric” or something similar. In the world of SaaS, that might be MAUs or DAUs. Here is an example of a chart Janie might show her team during one of these meetings showing the performance of stories:

We talked about how Janie’s team helps journalists predict which stories will be “hits.” The New York Daily News’ biggest source of stories is still news about NYC. They don’t do feature stories on Broadway openings and restaurants anymore given the size of the team. The stats Janie presents are only one half of what journalists rely on to figure out which stories and beats to pursue.

Ultimately, it’s an art and science to find a story to pitch the editors.

You can find Janie on Twitter at @janieho16.

Other Podcasts & Blog Posts

No other podcasts or blog posts this week!

Dear Analyst #61: Empowering businesses and individuals with data literacy skills with Oz du Soleil
https://www.thekeycuts.com/dear-analyst-61-empowering-businesses-and-individual-with-data-literacy-skills-with-oz-du-soleil/
Mon, 22 Feb 2021

Oz is one of the best creators of Excel content I know with his Excel on Fire YouTube channel. Unlike traditional “how-to” videos, his videos blend education with entertainment making the learning process feel like binging your favorite Netflix show. Oz and I met on Google+ way back in the day and in person at the 2014 Modeloff competition. While Oz is an Excel MVP and Excel trainer on LinkedIn, our conversation goes deeper into data literacy and understanding where your data is coming from before it gets into the spreadsheet.

Know just enough Excel to get your job done

When I first met Oz at the Modeloff in 2014, he told me a story about how he discovered the power of Excel for changing people’s lives. This story really shows the human side of a spreadsheet program that is typically associated with business and enterprise use.

Oz was teaching Excel at a medical school and helping the students in his class automate their reports. He met one student who was simply copying and pasting cells up and down the spreadsheet, and was spending an hour doing these manual operations. He realized the student just needed one formula to automate the task she was doing; she just didn’t know what that formula was.

I started learning about people who needed to know how to use certain features in Excel, but didn’t need to know how to learn how to use everything in Excel.

Once the student saw how the formula could eliminate all the tedious work she was doing, it changed how she worked and gave her so much more time to focus on more important aspects of her job.

I think a lot of people approach their tools and software with a similar mindset. You know there is probably a better or faster way of doing something, but you go with what you know. There’s a bit of the JTBD (jobs-to-be-done) framework here. Knowledge workers need to know just enough to solve the problems they face on the job, and can leave the rest of the software’s feature set for the power users.

You’ll work with data no matter what role you have

Prior to our conversation, Oz mentioned to me that he wanted to talk about more than just Excel tips and tricks. These topics are covered ad nauseam by other content creators, and for good reason, as people need and want this training (yours truly has benefited from creating this type of content). What really tickles my fancy are the topics surrounding Excel, and there is no one better to go in-depth with me on these topics than Oz.

Analyst might not be in your title.

Nonetheless, you are or will be sorting, filtering, and summarizing data no matter what department or level you work in. Excel is merely a tool to get you from the raw data to the story you tell, whether it’s to internal stakeholders to launch a feature or to external clients to purchase your product.

Oz talks about how taking an Excel class will get people feeling comfortable with the tool, but it only goes so far. As you get real-world experience, you’ll start to ask questions about data quality and the data source(s). These are topics that go beyond Excel and into the realm of databases, data transformation, and data pipelines; topics I’m trying to cover more of on this podcast.

Oz opined about the dilemma one faces with duplicate data. Do you de-duplicate at the source (perhaps in a view in a database) or do you do it in the spreadsheet? Most analysts (present company included) will make the necessary changes in Excel or Google Sheets for one reason: it’s fast. Harkening back to the previous section’s takeaway: I just need to get a job done and don’t care (for now) how it gets completed.
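
As a concrete sketch of that trade-off (table and column names invented), de-duplicating "at the source" might mean a DISTINCT view in the database, while the fast spreadsheet-style route looks like Excel's Remove Duplicates, or in pandas:

```python
import pandas as pd

orders = pd.DataFrame({
    "order_id": [1, 1, 2, 3, 3],
    "customer": ["Acme", "Acme", "Byte Co", "Crane", "Crane"],
})

# The spreadsheet-style fix: drop duplicates after the data has already landed.
deduped = orders.drop_duplicates(subset="order_id", keep="first")
print(deduped)

# The "fix it at the source" alternative would be a database view such as:
#   CREATE VIEW clean_orders AS SELECT DISTINCT order_id, customer FROM orders;
# so every downstream report starts from already de-duplicated rows.
```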

Before data storytelling, there’s data literacy

I’ve talked about data storytelling on numerous episodes (see the data storytelling episode with the New York Times). It’s a hot topic for a lot of companies as they start incorporating software into their product offerings (if you’re a SaaS company, you’re already swimming in a big data lake).

Before one can create these masterful data-driven stories, Oz believes there is a more fundamental skill one needs to acquire: data literacy. When you look at a report, you should be able to answer questions like “Can I trust the data source?” and “What am I really looking at with this data?”.

A recent article by Sara Brown at MIT Sloan highlights the following data literacy skills today’s knowledge worker should have:

  • Read with data, which means understanding what data is and the aspects of the world it represents.
  • Work with data, including creating, acquiring, cleaning, and managing it.
  • Analyze data, which involves filtering, sorting, aggregating, comparing, and performing other analytic operations on it.
  • Argue with data, which means using data to support a larger narrative that is intended to communicate some message or story to a particular audience.
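
To ground the third bullet above, here is a tiny pandas sketch (the numbers are invented) of the filter-sort-aggregate loop that makes up much of "analyzing data":

```python
import pandas as pd

# Tiny invented dataset: one row per sale.
sales = pd.DataFrame({
    "region": ["East", "West", "East", "South", "West"],
    "amount": [1200, 800, 1500, 300, 1100],
})

big_sales = sales[sales["amount"] > 500]                   # filter
ranked = big_sales.sort_values("amount", ascending=False)  # sort
by_region = ranked.groupby("region")["amount"].sum()       # aggregate
print(by_region)
```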

The article goes on to explain the different steps a company can take to build an effective data literacy plan. An interesting stat Brown highlights in the study is this one from a Gartner survey conducted by Accenture:

In a survey of more than 9,000 employees in a variety of roles, 21% were confident in their data literacy skills.

Should we be surprised by this finding? I think not.

Did you ever need to take an Intro to Data Literacy course in middle or high school? Was learning spreadsheets part of the curriculum? Things change a bit at the university level as deans and presidents realize their students are not meeting the demands of hiring managers. I reference an episode of Freakonomics in episode 22 where they break down the deficiencies in the U.S.’s math curriculum. Key takeaway: a majority of what you learn in the K-12 system does not prepare you for a job requiring data literacy.

Empowering small businesses to use Excel

Oz made a great point about not just the content produced about Excel, but the features many bloggers and trainers decide to demonstrate in their content.

I worry that so much conversation has enterprises in mind, or the start-ups that want to get huge. But there are a lot of small businesses, and they’re lost in conversations that they don’t know aren’t meant for them.

Naturally, the type of professional who can spend a few hundred or a few thousand dollars on comprehensive Excel training probably works at a large enterprise or well-funded startup. But there are millions of flower shops, retail stores, and non-profits who may still be using Excel the way Oz’s student was using it at that medical school.

This is an area Oz is passionate about, and there is clearly a need to provide Excel training for this demographic. Chances are the flower shop won’t need to do complex VLOOKUPs or mess with Power Query. They just need to know the features (hope you’re starting to see the theme here) to get their jobs done.

Is Excel a database?

For many of these small businesses, yes.

Oz has seen small five-person companies with a database platform installed that no one in the company uses because no one knows how to. He saw a non-profit where the DBA was a woman who worked half a day a week. If anyone needed to get data or add data to that database, they had to wait for the four hours a week she was available to handle their requests.

While it pains many of you (I include myself here) to see businesses inefficiently store their data in Excel or a Google Sheet, we must come to accept that not every business scenario needs to have auto-refreshing PivotTables and VBA macros.

Oz talks about the need to have more honesty and empowerment around what is possible with Excel. He hears the database vendors and data science crowd talk about using the latest and greatest database platforms or programming in R or Javascript. These are all great solutions for the enterprise, but who is going to implement these solutions at the flower shop? Perhaps this is the realm for the no-code platforms like Shopify to make e-commerce as simple as possible.

At the end of the day, Oz realized (like many analysts) that his Excel skills are necessary for many businesses whose data are trapped in databases. He would be in conversations with companies who need to create detailed reports, but then argue about which cost center is going to “fund” the project. Then you have green light committees who need to approve the SOW.

You’ll find these types of internal battles at corporates all over the world. But Oz knows if he just gets the data dump from the database, he can clean up the data and get the business the reports and stats they need with his knowledge of Excel, but more importantly, his understanding of the business logic.

Build vs buy

At the very end we talked a bit about a podcast I listened to recently (see Other Podcasts section below) where the classic dichotomy between build vs. buy was brought up. The main idea is that software engineers are not always great at putting a dollar value on the time it takes to build an application (versus just buying the off-the-shelf version).

Like Oz, I agree that Excel and Google Sheets should be treated as development platforms. Oz talked about working on a consulting project where the client was paying something like $60K/year for an industry-specific software application. The issue was that his client was only using a fraction of the features the software offered. When you purchase expensive software like this, you may also need to purchase the customer support for situations where the software breaks.

Instead, Oz was able to develop a prototype in Excel that had just the features the client needed and was using from the expensive enterprise software.

So there are situations where building can be more beneficial than buying the shiny software that’s targeted for your use case and industry. Additionally, you become the customer support because you know the ins and outs of the solution you created which is an empowering feeling.

Other Podcasts & Blog Posts

In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting:

Dear Analyst #60: Going from a corporate accountant to building an Excel training academy with John Michaloudis
https://www.thekeycuts.com/dear-analyst-60-going-from-a-corporate-accountant-to-building-an-excel-training-academy-with-john-michaloudis/
Mon, 15 Feb 2021

It’s a story we’ve all heard before. You’re working a full-time job, and you have more fun doing your side hustle than your 9 to 5. This is what happened to John Michaloudis. He was a financial controller at General Electric but found his passion sharing Excel tips and tricks on an internal GE newsletter which his colleagues ate up. John decided to become an entrepreneur and built an Excel training company from the ground up. We chatted about how he got started, his favorite marketing tactics, and of course, why he loves Excel.

10,000 followers on an internal company blog

At General Electric, there was an internal blog called Colab where employees could write and publish articles only for GE employees to see. As a financial controller, John became well-versed in Excel and decided to contribute to the internal blog. He started posting Excel tips, and eventually he had a weekly column devoted to being better at Excel.

GE’s Colab

John quickly amassed more than 10,000 subscribers to his column as he saw how hungry people were for Excel knowledge. But it was only his side gig at GE.

I liked doing the blog more than my actual job. I felt the subscribers valued me more than my boss valued me.

After getting this positive feedback from his colleagues around the world, he wanted to find a way to take his Excel column to the next level. For the next 12 months, he went off and created a course all about PivotTables. He asked his boss at the time if he could sell his course to the 10K subscribers to his column, but of course compliance told him no. He decided to leave GE, and as a last salvo sent out a message to his followers about a webinar he was going to host about his PivotTables course.

Creating a library of Excel content

Based on the feedback he got from subscribers to his weekly Excel column, John was able to find a few topics to build additional Excel classes around. A perennial favorite of mine, keyboard shortcuts, were high on the list. Creating charts was also a big topic since most of his students now work at companies, and presenting data in a compelling way is important.

John is constantly learning new Excel features but ultimately the content he produces is determined by what his students, the customers, want to learn. He periodically sends his students a survey and asks them what they want to learn about. These topics are what you’ll see on MyExcelOnline, John’s Excel training company.

Taking the leap to become an entrepreneur

While the idea of going off on your own and being your own boss is a romantic one, for many the decision is a matter of dollars and cents. John was (and currently still is) working in Spain, and started earning a few grand from offering his courses on Udemy. He realized this was enough for him and his family to live on, and went full-time on his training company in January 2015.

His advice for aspiring entrepreneurs: don’t just leave with nothing. Create a product and test it out. Use cheap methods like AdWords to validate your idea, 4-Hour Workweek style.

On Udemy, John’s PivotTable course originally earned him about $2K/month, but this went up to $7K/month. The problem was he was also selling his course on his website for $290. If customers who bought the course from his website found out they could get it cheaper on Udemy, it would result in a bad customer experience. So he decided to pull his course off of Udemy.

These online education platforms are a blessing and a curse. While you can earn a lot more from publishing your courses on your own website and domain, these MOOCs spend the money and time to acquire customers for you. I’ve been teaching Excel on Skillshare since 2014 and have always thought about starting my own course off my website, but Skillshare just makes it so easy to tap into a “built-in” audience and I can just focus on creating the educational content.

Early marketing tactics to get customers

For new entrepreneurs, the key to the early game is distribution. For John, the marketing tactics he employed for the start of MyExcelOnline revolved around affiliate sales. A tried and true method.

Some of those affiliates included Chandoo, My Online Training Hub, Excel Campus, and Contextures. The Excel community and the Excel training community especially are small and tight knit. From reading blog posts and attending webinars over the years from many of these content creators and trainers, I can tell how much dedication and work goes into creating these valuable resources.

To further build interest in his classes, John also hosts free webinars that give students a taste of what they can learn in his Excel classes. He’s been doing these webinars for the last 5 years and they’ve driven the most interest in his classes. Then there is the coveted email newsletter, which gives you (the content creator) a direct line of access to current and potential students.

We also chatted a bit about the creative ways other Excel trainers are using social media platforms to reach their target audiences. For instance, Kat Norton runs a TikTok channel called Miss Excel and creates super entertaining videos with Excel tips (she also has her own Excel course linked in her bio):

https://www.tiktok.com/@miss.excel/video/6888079232983993605

John is all about experimenting with new channels and social media strategies, but his target customer is not using platforms like TikTok. His customers are a bit older, and most likely using platforms like Facebook and LinkedIn.

The other factor to consider is that the younger audience on TikTok might not turn into paying customers at a high rate compared to a “traditional” marketing channel like an email list. Nonetheless, it’s great to see so many young people wanting to learn Excel tips and tricks via short video content on TikTok.

New and unknown Excel features

One of John’s annual podcasts is the Excel tips roundup for the year (see the 2020 roundup here). He created a roundup of audio clips from some of the top Excel content creators sharing their favorite Excel tips.

Most of the tips John already knew, but one that stood out for him was importing from PDF using Power Query. This is a relatively unknown feature because it requires you to have Office 365. Exporting and importing from PDF is a huge topic and a lot of people over the years have built custom add-ins to do this in Excel (and made money doing it). Microsoft finally decided to build a native feature and put this into Power Query directly. I tend to think that Power Query and Power BI feel like separate applications from Excel, but they really extend the power and functionality of Excel in new ways.

Source: MyExcelOnline
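
If you don't have Office 365 (and therefore Power Query's PDF connector), one hedged alternative is a Python library like pdfplumber. The file name below is hypothetical, and this assumes the PDF contains a simple, well-formed table:

```python
import pdfplumber  # pip install pdfplumber
import pandas as pd

# Grab the first table on the first page of a (hypothetical) PDF report.
with pdfplumber.open("quarterly_report.pdf") as pdf:
    table = pdf.pages[0].extract_table()

# Treat the first extracted row as the header and the rest as data.
df = pd.DataFrame(table[1:], columns=table[0])
print(df.head())
```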

Near the end of the episode we talked a bit about strategies to improve the speed and performance of your models based on a blog post I read a few weeks ago (see the other podcasts and blog posts section below). John’s advice? Put your data into a PivotTable to build your model versus using formulas to summarize everything.

I’ve never tried this myself, but you could build an entire P&L from a PivotTable, and in the cases where you can’t do it in the PivotTable directly, you can use the GETPIVOTDATA formula to pull the data you need out of the PivotTable for analysis.
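
Here is a rough pandas sketch of that idea (the accounts and amounts are invented): summarize the ledger with a pivot table, then pull individual values out of it the way GETPIVOTDATA would, instead of scattering lookup formulas all over the model:

```python
import pandas as pd

ledger = pd.DataFrame({
    "month":   ["Jan", "Jan", "Feb", "Feb"],
    "account": ["Revenue", "COGS", "Revenue", "COGS"],
    "amount":  [10000, -4000, 12000, -4500],
})

# The PivotTable: accounts down the rows, months across the columns.
pnl = ledger.pivot_table(index="account", columns="month", values="amount", aggfunc="sum")

# The GETPIVOTDATA-style lookup: pull one summarized number into the model.
feb_revenue = pnl.loc["Revenue", "Feb"]
print(pnl)
print("Feb revenue:", feb_revenue)
```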

Other Podcasts & Blog Posts

In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting:

Dear Analyst #59: Enterprise data tools and the rise of data engineering with Priyanka Somrah, VC analyst at Work-Bench
https://www.thekeycuts.com/dear-analyst-59-enterprise-data-tools-and-the-rise-of-data-engineering-with-priyanka-somrah-vc-analyst-at-work-bench/
Mon, 08 Feb 2021

The NY Enterprise Tech meetup was one of my favorite events to attend in-person prior to the pandemic. While the meetup is now all virtual, the speakers they bring in continue to be top-notch–particularly in the data space. The host of the meetup is Work-Bench, a VC focused on enterprise software companies. Priyanka Somrah is a VC analyst at Work-Bench who speaks with the most innovative startups building data tools on a weekly basis. In this episode, Priyanka talks about how data tools can effectively sell into the enterprise, how the data engineering profession has grown in the last few years, and the most effective Go-to-Market (GTM) strategies for data companies.

Note: There were some recording issues with this episode so apologies in advance for some of the gaps in the conversation as well as the background noise.

Flipping VC on its head

Work-Bench is an early-stage enterprise VC firm that flips the VC model upside down. Unlike most VCs that go off and try to source the most innovative companies, Work-Bench hosts quarterly corporate roundtables with Fortune 500 companies. From speaking with the end customers of these data infrastructure and engineering tools, Work-Bench finds the pain points these large enterprises face. From there, they go and find the companies that offer a solution to these problems.

This aligns with what I hear a lot in the B2B SaaS space in terms of creating an effective marketing strategy. Pain point selling or solution selling are some buzzwords you might hear in this regard. Instead of selling the features of your product, you focus on the problems your product solves for your target customer. Seems like a reasonable strategy marketing and sales people should adopt, but all the cold spam InMails and connection requests on LinkedIn might convince you otherwise.

How to become enterprise-ready

One of the main challenges for data infrastructure tools is selling into the enterprise. A company’s data is one of the most important assets the company has, and Priyanka provides some best practices on how data tools can better prepare themselves to be a legitimate enterprise solution:

  1. Security – Being able to safeguard the customer’s data is priority number one. This means getting SOC 2 certification or offering a single sign-on feature.
  2. Scale – Can the tool reduce the administrative overhead and grow and scale as the customer’s requirements grow and scale?
  3. Flexibility – This is becoming a table stakes feature. Is the tool modular enough to integrate with a customer’s existing tool stack?
Source: Work-Bench SOC 2 Playbook

Assuming the data tool meets these enterprise standards, Work-Bench then acts as the matchmaker between the various Fortune 500 companies they are connected with and the startup that provides a relevant solution.

Big data and the rise of data engineering

Big data has really changed the face of the world. As a result, the ETL process and data modeling have changed considerably as well. With tools and processes changing, the skills and expertise required by data professionals need to adapt. As we saw in the last episode about data engineering at Canva, the data engineer is more than just an individual contributor focused on BI tools. The data engineer now needs to have knowledge about the entire data pipeline and how the different aspects of the pipeline interact with each other.

The modern data landscape is merging analytics and engineering.

What are some tools that exemplify the merging of these two professions? One tool Priyanka called out is dbt (data build tool), an open-source data transformation tool that empowers the data analyst with transformation powers. Check out this blog post Priyanka wrote to learn more about some of these tools.

Moving into the world of data operations, Priyanka talked about metadata tools that modernize the data cataloging process. Some of these tools (open-source) include Amundsen from Lyft and Nemo from Facebook.

Metadata tools can programmatically capture the important points about your data as it flows through your pipeline into your data warehouse. The goal is to map out the lineage of the data, understand what’s causing delays in the pipeline, and assist with pipeline debugging. These tools give you a view of how your data is transformed to make sure the quality stays high throughout the lifecycle.
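
As a toy illustration only (this is not how Amundsen or Nemo are implemented), "capturing metadata as data flows through the pipeline" can be as simple as recording, for each step, what ran, when it finished, and how many rows came out:

```python
import datetime

pipeline_metadata = []  # in practice this would land in a metadata store, not a list

def track(step_name: str, rows_out: int) -> None:
    """Record a tiny lineage entry for one pipeline step."""
    pipeline_metadata.append({
        "step": step_name,
        "rows_out": rows_out,
        "finished_at": datetime.datetime.utcnow().isoformat(),
    })

# Toy pipeline: each stage reports how much data it produced,
# which makes row-count drops and slow steps easy to spot later.
track("extract_clickstream", rows_out=1_000_000)
track("clean_and_dedupe", rows_out=980_000)
track("load_to_warehouse", rows_out=980_000)

for entry in pipeline_metadata:
    print(entry)
```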

Governance, risk, and compliance

I think this is an overlooked area of the data infrastructure space, which means it’s ripe for innovation and new entrants. As mentioned earlier, one of the most important factors for a data tool to become enterprise-ready is providing security and privacy for your customers’ data. Priyanka spoke about new regulations such as GDPR and CCPA forcing companies to be stricter with how they handle their users’ data. Priyanka wrote this blog post highlighting the different areas of compliance companies should be aware of along with a B2B staple: a map of all the tools.

Whether it’s evidence collection or providing tools for your auditors, more data startups are creating innovative solutions to help customers with a myriad of compliance needs.

The GTM motion for data infrastructure companies

This was one of the most important topics I wanted to discuss with Priyanka given my own background in this area. Unfortunately, this is one of the questions that got messed up in the recording, so I’ll try my best to summarize Priyanka’s response.

Priyanka first started by saying a key consideration for early-stage companies is asking yourself: do you go directly to the enterprise or start by selling to other startups? Figuring out your “wedge” can be an early differentiator for your brand and sales strategies.

In Work-Bench’s portfolio, the companies that go after SMB/mid-market have products that customers can quickly get started with. These are tools and platforms that most likely offer a generous free tier for the data team to experiment with. Product-led growth is probably also common with these tools since the SMB/mid-market consists of companies of all shapes and sizes. For enterprise-focused GTM, the founders probably have a lot of enterprise experience and are well-versed in selling a solution to multiple buyers at a large enterprise.

Work-Bench has their own GTM playbooks where they take best practices from a GTM “win” from one company and try to find a pattern that other companies can apply to their own GTM processes.

What to look out for in the data management space

Priyanka is most excited about tools in the data operations space like dbt and Fivetran, and data warehouse tools like Snowflake and BigQuery. The most important feature of these tools: they give people a self-serve way to query data.

Thinking further down the pipeline, you have the actual consumers of the data analytics. In terms of data democratization, Priyanka thinks a shared interface where data is unlocked for end consumers could be an interesting feature for data tools to adopt. A world where the data engineer and the data analyst can collaborate together on the pipeline instead of each individual focusing on just their part of the story.

Conclusion

You don’t have to work at a VC like Work-Bench to stay on top of all the companies in the data infra space. Priyanka started the Data Source to give people an inside scoop on all things data and data infrastructure. Priyanka is always talking to a lot of people in the data space to help hone her theses about the space, and you can follow her newsletter to see her “learn in public” as she does her research.

I’m always looking to learn. I want to know when I’m wrong.

Other Podcasts & Blog Posts

In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting:

Dear Analyst #58: Canva’s data warehouse initiative to increase reliability and tooling for analysts with Krishna Naidu
https://www.thekeycuts.com/dear-analyst-58-canvas-data-warehouse-initiative-to-increase-reliability-and-tooling-for-analysts-with-krishna-naidu/
Tue, 02 Feb 2021

Data warehouses have come a long way since the days of Oracle and Microstrategy. A data warehouse should be able to grow and expand with the business it supports. At Canva, the amount of data coming into the data warehouse has exploded in the last few years given the platform’s surging usage. In this episode, Krishna Naidu, a data engineer at Canva, talks about how his team is making it easier for analysts to get the data they need and the tooling to analyze data. The goal of the data warehouse team at Canva is to maintain reliability, improve access and tooling, and oversee compliance with regulations. At the end of the episode, we also discuss our mutual love for keyboard shortcuts.

A design tool for the rest of us

I never considered myself a “good” designer or artist. I still feel lost sometimes in Photoshop and Figma, but Canva makes the design process super seamless for a newbie like myself. I use the tool at least once a week for a variety of use cases. All the thumbnails on my YouTube channel were created on Canva because I can create a decent design in five minutes or less.

Krishna didn’t use Canva before he joined the company. His first foray into Canva was creating a birthday invitation for his daughter. But he quickly saw the power and potential of Canva after seeing his family members use the tool and a family friend who uses Canva for creating marketing brochures. Once Krishna joined Canva, the scope of the mission became clear to him. It’s not just about making design easy, but also giving people the ability to get their designs seen by the right people. Like many other SaaS tools, Canva has also added more collaboration features as more teams become distributed.

Canva invitations

Structure of Canva’s data team

Given Canva’s size (1,500+ employees according to LinkedIn), the data team is quite mature relative to other SaaS companies. They have data analysts, scientists, and engineers.

The data engineering team (where Krishna works) is broken out into three sub-teams:

  1. Streaming – Internally this team is known as Canvalytics and they focus on capturing all the clickstream data from the product. This team helps Krishna’s team with getting data into the data warehouse.
  2. Platforms – They manage the data lake and tooling for data scientists.
  3. Data Warehouse – This is Krishna’s team, and they provide tooling for the users of the data warehouse. They also enforce controls and governance of the data warehouse, and their primary business stakeholders are Canva’s data analysts.

The data coming into the data warehouse is constantly growing, which is a good sign because it means the number of Canva users is growing. On top of that, new product features being added to the platform mean more clickstream data needs to be captured and transformed in the data warehouse. To better cope with the expanding data footprint, Krishna’s team has architected some interesting solutions to support the company’s growth.

A sandboxed build environment for analysts

When the data team was smaller, it was easier for all analysts to work in the same data warehouse environment. If an analyst made a change to a dataset, then they might work with the data engineering team to roll the change out and that change would be communicated out to the rest of the analysts.

With more analysts, it becomes easier to step on each others’ toes since one analyst might make a change on one dataset (where they are building and testing their models), but then another analyst might be doing a separate analysis on that dataset. Before you know it, collisions occur and the “source of truth” gets lost as the data team tries to figure out which changes need to be applied to all the data sets.

Krishna and team created mini build environments for analysts so that each analyst has their own sandboxed view of the data to experiment with. If an analyst needs to make a change to a dataset, they would submit a pull request to the dev environment, and this goes through a bunch of CI/CD checks set up by the data engineering team. This is pretty similar to the software development process (more on this later). Almost 30 analysts will be able to use these new build environments. In a nutshell, these build environments re-create all the schemas and views and clone tables from the data warehouse so that analysts get a quick copy of all the main fact tables.

Inspiration from GitLab

The inspiration for this project came from a company with a fully distributed team: GitLab. When your entire team is distributed, it’s even more important that any changes you make to the codebase are properly tested and communicated to all your colleagues who are working on the same codebase.

GitLab co-founders Dmitriy Zaporozhets and Sid Sijbrandij.

The secret to success: efficient cloning

The Canva data engineering team makes use of Snowflake’s “cloning” feature. As I mentioned above, the build environment makes a quick copy of the tables in the data warehouse without the expensive operations normally associated with copying tables. It’s done entirely in the cloud.

Snowflake and other modern data warehousing platforms are revolutionizing the way analysts can access the large amounts of data being produced within their organizations.

Historically, a data warehouse would slow down if a lot of users were querying the warehouse concurrently or if a big batch process was taking place. Snowflake separates where the data is stored from where it is processed (different compute resources for load and transform). “Cloning” your dataset means creating a pointer to the dataset. As changes are made to the dataset, you just get the diffs on the data (just like you would when committing code changes to a repo).
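
To make that concrete, here is a rough sketch (not Canva’s actual setup) of what creating an analyst sandbox with Snowflake’s zero-copy cloning could look like from Python. The account, warehouse, database, schema, and table names are all hypothetical, and it assumes the snowflake-connector-python package is installed.

```python
# A minimal sketch of creating an analyst sandbox with Snowflake's zero-copy cloning.
# All connection details and object names (ANALYST_WH, ANALYTICS, PROD, SANDBOX_JANE)
# are hypothetical placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="your_account",
    user="your_user",
    password="your_password",
    warehouse="ANALYST_WH",
    database="ANALYTICS",
)

cur = conn.cursor()
try:
    # A dedicated schema for one analyst's experiments.
    cur.execute("CREATE SCHEMA IF NOT EXISTS ANALYTICS.SANDBOX_JANE")

    # Zero-copy clone: the new table starts out as a pointer to the source table's
    # underlying storage, so nothing is physically copied until the data diverges.
    cur.execute(
        "CREATE OR REPLACE TABLE ANALYTICS.SANDBOX_JANE.FACT_EVENTS "
        "CLONE ANALYTICS.PROD.FACT_EVENTS"
    )
finally:
    cur.close()
    conn.close()
```

Because the clone only stores pointers plus any subsequent diffs, spinning up (and tearing down) a sandbox like this stays fast and cheap until the analyst starts modifying the data.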

Reducing headaches and increasing confidence in the warehouse

It is still to be determined how much time shifting to Snowflake and these new build environments will save Canva. The most important benefit of this new architecture is that the data pipeline has become more reliable.

Krishna mentioned that the amount of time analysts spend in the data warehouse might actually increase because they now have to run their own tests on changes they make to datasets. The bigger picture, however, is that analysts and the data engineers who support them don’t have to worry about explaining why a given report is broken (because something in the pipeline broke). We talked about how you might be preparing for a board meeting, and the last thing you want to be faced with is a report that won’t update because something in the pipeline broke.

You can’t put a price on peace of mind and confidence in your data warehouse performing as it should :).

Crossing over from analytics to engineering

As Krishna explained the new build environments for analysts, it became clear that the skills analysts will need resemble those of software engineers. Data “development” at Canva is starting to look similar to application development. In addition to core analysis and reporting responsibilities, analysts will need to know how to write proper documentation, write and execute tests, submit pull requests, and do peer reviews. These are all practices common in software engineering, not data analytics.

We’ve seen the blend of data analysis and data engineering in previous episodes (see episode 55 and the FirstMark conversation about what the definition of a data analyst is). The fine folks at dbt coined the phrase “analytics engineering” which encompasses a lot of the skills Canva analysts have:

Analytics engineers provide clean data sets to end users, modeling data in a way that empowers end users to answer their own questions. While a data analyst spends their time analyzing data, an analytics engineer spends their time transforming, testing, deploying, and documenting data. Analytics engineers apply software engineering best practices like version control and continuous integration to the analytics code base.

Source: dbt blog

Do new analysts need all these “engineering” skills to succeed as an analyst at Canva? Krishna says no:

We care more about the analyst’s creativity and skills in seeking answers from the data.

During the Canva onboarding process, analysts get training on things like submitting pull requests and running tests. These skills can be taught. What’s harder to teach is the curiosity one needs to dig into the data and the creativity to tell a data-driven story.

What’s next for the Canva data eng team?

According to Krishna, the data warehouse team should never settle. Krishna believes the team should continue to focus on increasing productivity for analysts.

Other bottlenecks in the pipeline might include getting access to data (e.g. PII data). The team may also want to know what’s coming into the data warehouse so this means getting observability stats on the data coming in. Perhaps the team wants to let analysts know the data might be 3 days old. Then comes all the automation and testing of these notifications so that the rest of the organization is made aware of these “health” metrics. Sounds like the data eng team is going through their own version of continuous development and improvement :).

Another interesting project the team may focus on in the future is not limiting the data warehouse to internal reporting purposes. What if you could surface interesting insights back to actual Canva users? What are the access requirements in this case? As a Canva user myself, I think it would be super interesting to see how my designs are being used and viewed by others.

Productivity hack: keyboard shortcuts

If you’ve followed my blog for some time, you’ve probably seen that I’m a huge fan of keyboard shortcuts (particularly in Excel). In fact, I created a whole class on this subject.

Krishna and I spoke about a podcast I recently listened to about Vim (see the “Other Podcasts & Blog Posts” section below). In the episode, Alex Smith, a software engineer at DEV, talks about how he first learned Excel keyboard shortcuts while working in finance. He then transitioned to a job in software engineering and saw how fast his colleagues were at using the keyboard to navigate Vim.

Krishna also uses a few keyboard shortcuts to be productive, and he spoke about two of them in detail on the episode.

Other Podcasts & Blog Posts

In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting:

Dear Analyst #57: Automating weekly reports, working with stakeholders, and data definitions with Nadja Jury of Education Perfect https://www.thekeycuts.com/dear-analyst-57-automating-weekly-reports-working-with-stakeholders-and-data-definitions-with-nadja-jury-of-education-perfect/ https://www.thekeycuts.com/dear-analyst-57-automating-weekly-reports-working-with-stakeholders-and-data-definitions-with-nadja-jury-of-education-perfect/#respond Mon, 25 Jan 2021 05:21:00 +0000 https://www.thekeycuts.com/?p=50616 One of the best feelings is knowing you’ve streamlined and automated a report such that your colleague doesn’t have to spend hours creating the report every week. Whether you automate the report through a bunch of Excel formulas or setting up some sort of data pipeline, analysts are always thinking about driving internal efficiencies. This […]

One of the best feelings is knowing you’ve streamlined and automated a report such that your colleague doesn’t have to spend hours creating the report every week. Whether you automate the report through a bunch of Excel formulas or setting up some sort of data pipeline, analysts are always thinking about driving internal efficiencies. This was one of my first jobs as an analyst, and also what Nadja Jury, a data scientist at Education Perfect, is doing at her company. In this episode, Nadja discusses how she went about automating weekly reports for the head of customer support, communicating data with internal stakeholders, and setting up data definitions so the whole company is on the same page (spoiler: it involves “ski passes”).

From customer support to backend junior developer

Education Perfect supports students in an online learning environment by providing assessment and collection of student feedback for teachers. Nadja started in customer support, then moved to the enrollment team, and ultimately became a backend junior developer. You don’t come across many people who take that kind of career trajectory, so I asked Nadja how she found herself as a backend developer:

At university, I studied computer science and six months before I graduated, the team pulled me aside to see what I wanted to do. I was interested in an internal data role.

Starting in customer support most likely gave Nadja a lot more context on what types of issues Education Perfect’s customers care about as she moved to a more data-focused role. Sometimes it can be easy to “detach” from the front lines of the business when you’re deep in spreadsheets and databases all day.

Education Perfect didn’t always have a dedicated data person. In September 2019, a product manager laid out plans for a data team which now consists of 4 people who help the company make sense of all the data they have on how students and teachers are using the product.

Weekly reports that don’t suck

One of Nadja’s projects involves automating the reports for the customer support team. Efficiency is already top of mind for the CS team since they aim for a 10-minute turnaround time on answering tickets.

The manager of the Inbound team used to take data coming from their email system to build end-of-week reports. These reports include tracking against the CS team’s KPIs, data coming from their email system, and more. The end result? Multiple spreadsheets that need to be copied and pasted in multiple places (who hasn’t been there before?).

So what is the new workflow that Nadja came up with? The head of the Inbound team just needs to export the data and drop the files into a single Google Drive folder. A Fivetran integration with Google Drive automatically connects the data from that folder with the company’s main data warehouse. Nadja then set up dashboards in Mode, built off of the data warehouse, that generate all the necessary reports for the CS team to show how they are tracking towards their KPIs. According to Nadja, the team was already using Fivetran, so it was easy to start the integration with Google Drive.

Time saved? It typically took the head of customer support 2-4 hours every Friday morning to build these reports, and now it takes less than 30 minutes.

What’s interesting about Nadja’s process for designing the Mode dashboard is that she first saw what the head of customer support had created in Google Sheets. Nadja then drew some new charts in a paper notebook and met with the head of customer support to see if these were the charts that would show what the CS team needs to report every week. Instead of focusing on the numbers and trends, this strategy allows both the analyst and the stakeholder to focus on the charts and see if they would tell the right story.

Helping designers make better decisions about content

Another report that Nadja and her colleague are working on is for the company’s designers. This report shows how the educational content is actually being used by students. With over 1TB in the main questions table, there is a ton of data to analyze. Since students on the platform are submitting answers to questions, the data team can look at every attempt on every question in every lesson and pull trends and insights to give to the designers. These trends can then help determine what type of content needs to be tweaked to ensure students have a good experience on the platform.

Working effectively with internal stakeholders on data definitions

Given how new the data team is at Education Perfect, most of 2020 was spent identifying the gaps and potential confusion around existing reporting. Thus far, the data team has been meeting with their colleagues to find the best solutions to their data and reporting needs.

Nadja told an interesting story highlighting how even the definitions of certain metrics and data points can be a bit ambiguous. One of her colleagues asked Nadja a question about some metric, but Nadja happened to be on her lunch break. The colleague then sent the same question to another member on the data team. When Nadja returned to her desk and responded to her colleague’s question, she realized the answer she gave and the one her data teammate gave were completely different.

I imagine this problem is prevalent even in organizations with mature data teams. Ask yourself, what does an active user mean for your company? For Nadja and Education Perfect, this could be referring to teachers, students, or some other user type. What does “active” really mean?

This led to Nadja spearheading a “data definition doc” where the entire company’s metrics are listed in one doc for everyone to see. This doc creates transparency for the organization, and also ensures new teammates have one place to look to understand the common “data language” spoken across the company.

Inventing new metrics based on your company culture

One specific metric that Nadja and the data team had difficulty defining was that of a student accessing a subject in the curriculum. Given the current data definition doc, there were no metrics that her team could use to describe this activity. They researched a bunch of SaaS metrics but still couldn’t find something that resonated with internal stakeholders.

Turns out many of Nadja’s colleagues are based in the South Island (New Zealand) where skiing is popular in the winter. Since everyone likes to ski, the data team called this metric a “ski pass.” When they introduced this metric to the team, it just clicked with all internal stakeholders. Nothing like some shared cultural context to inform how your data should be labeled. “Ski pass” is literally the name of a column in a table in their data warehouse.

Machine learning and Python

I asked Nadja what tools she’s interested in learning next, and she is currently exploring machine learning and getting data into a Python notebook. She took a 10-week course last year and is hoping to apply these skills to the growing data team at Education Perfect. Additionally, she is learning some of the custom features of dbt for transforming data in their data warehouse. It’s always great to see analysts going beyond spreadsheets and formulas and learning about all aspects of the data pipeline.

Dear Analyst #56: Self-serve dashboards, Excel, and data accuracy with BI Analyst John Napoleon-Kuofie of Farfetch https://www.thekeycuts.com/dear-analyst-56-self-serve-dashboards-excel-and-data-accuracy-with-bi-analyst-john-napoleon-kuofie-of-farfetch/ https://www.thekeycuts.com/dear-analyst-56-self-serve-dashboards-excel-and-data-accuracy-with-bi-analyst-john-napoleon-kuofie-of-farfetch/#comments Mon, 18 Jan 2021 05:05:00 +0000 https://www.thekeycuts.com/?p=50601 In this episode, I had the pleasure of speaking with John Napoleon-Kuofie, a senior business intelligence analyst at Farfetch. In this conversation, we talked about how John’s career led him to Farfetch, a traffic dashboard he’s built for his stakeholders at Farfetch, and how Excel was his gateway into SQL and the wonderful world of […]

In this episode, I had the pleasure of speaking with John Napoleon-Kuofie, a senior business intelligence analyst at Farfetch. In this conversation, we talked about how John’s career led him to Farfetch, a traffic dashboard he’s built for his stakeholders at Farfetch, and how Excel was his gateway into SQL and the wonderful world of data. One of the reasons I enjoy conversations like these is because you get to learn from someone who is in the data trenches, as it were.

The path to Farfetch

Farfetch is an e-commerce company focused on boutique fashion companies. Before landing at Farfetch as a customer insights analyst, John was studying mathematics at university and thought he was going to work at a bank after graduation. He ended up working at a media agency where he helped build statistical models demonstrating the value of advertising for the agency’s clients.

What’s so great about this part of John’s career (and many entry-level analysts) is that you get to do a little bit of everything. John was working in Excel, R, and other bespoke tools. During this phase of your career, you are constantly learning and experimenting with new tools to figure out what type of career you want to end up in. John wanted to stay focused on analytics for his company’s customers.

Storytelling may be more important than the data itself

Data storytelling is both an art and a science. It’s not just doing the number crunching and creating the analysis, but pulling out the salient points and creating a compelling story with the data. This skill is so important that big news outlets like the New York Times have created data bootcamps to help their journalists become more proficient in data analysis.

John discusses working with a telecoms client at his former media agency, and the client was cycling through different creatives in their online ads. Each ad had a different celebrity, and John noticed that the efficacy of their ads could be improved. Using data and a bit of marketing, his team convinced the client to adopt a more consistent advertising strategy with one celebrity instead of multiple. In marketing speak, this led to stronger brand recall and the numbers backed it up.

I think many online classes teach you how to use all the knobs and switches in Excel, R, SQL, and Python, but the real value analysts can provide is creating these data-driven stories to make decisions. (I’m really passionate about this subject and have an online class about this topic).

Self-service “traffic” dashboards

In order to help its clients generate sales, Farfetch utilizes multiple marketing strategies including pay-per-click advertising, affiliate advertising, and SEO. To help internal stakeholders figure out the proper marketing mix to maximize sales, John created traffic dashboards in Looker and Tableau. The key to these dashboards is that they are self-service, so his colleagues can slice and dice the data the way they want.

An example of a metric these dashboards track is website visits. The dashboards allow people to find out which channel is driving the most traffic so you can figure out whether to invest more or less into that channel. Revenue, costs, and conversion rates are additional metrics you can find on these dashboards.

John brought up an interesting problem that many marketing teams would be envious of: Farfetch gave the marketing team an unlimited marketing budget.

Of course, this freedom came with one big condition: the marketing team had to achieve a specific cost per sale.

Thus, the marketing team needs to pull different levers in order to maximize ROI on every dollar it spends on Google Adwords, Facebook advertising, and other channels.

In order to build out these dashboards, John had to triangulate multiple data sources and “make them talk to each other.” These sources include Google Analytics, app providers, data from Facebook/Google, and of course 1st party data from Farfetch’s own database. This is one of the most challenging types of projects because multiple data sources have their own definition of a “Customer” or “Sale,” and it’s your job to do all the VLOOKUPs and custom SQL views to unite these data sources together.
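
As a rough illustration of that stitching work (this is not John’s actual pipeline, and the file names, keys, and columns below are made up), here is how you might reconcile two sources that each have their own notion of a “sale” using pandas:

```python
# Hypothetical sketch: uniting two marketing sources that define a "sale" differently.
# File names and column names are made up for illustration.
import pandas as pd

# Google Analytics export: channel-attributed transactions.
ga = pd.read_csv("google_analytics_export.csv")   # columns: date, channel, transaction_id, revenue
# Internal orders table: the business's own definition of a completed sale.
orders = pd.read_csv("internal_orders.csv")        # columns: order_id, order_date, order_total, status

# Normalize each source before joining.
ga["date"] = pd.to_datetime(ga["date"])
orders["order_date"] = pd.to_datetime(orders["order_date"])
completed = orders[orders["status"] == "completed"]

# Join on the shared key; this is the "make them talk to each other" step.
# Rows that only exist in one source show up as NaNs you then have to investigate.
combined = ga.merge(
    completed,
    left_on="transaction_id",
    right_on="order_id",
    how="left",
)

# Channel-level view: GA-attributed revenue vs. internally booked revenue.
summary = combined.groupby("channel").agg(
    ga_revenue=("revenue", "sum"),
    booked_revenue=("order_total", "sum"),
)
print(summary)
```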

The next-level questions these dashboards could answer might include: which products should be promoted in which regions? By incorporating cost of goods sold (COGS), the dashboard might be able to identify a product that is selling really well but is ultimately unprofitable for the business because the shipping costs eat into the entire profit margin.

Excel is the gateway drug to SQL

John talked about how when he first started at the media agency, he thought he knew Excel inside out. After seeing how his colleagues were using Excel, he realized he was still a beginner in the tool. John picked up Excel by analyzing the files his colleagues built and by simply Googling when he didn’t know how to do something (one of the key skills I mentioned in my 2020 lessons learned post).

As John became more proficient in Excel, he started picking up SQL as well. He discusses a pattern I’ve seen with other analysts who are learning SQL: querying small datasets in SQL that you can also query with formulas in a spreadsheet. Jumping straight into SQL can be difficult, so by using a visual IDE like Excel, you can double-check your work to ensure the SQL query returns the result you expect.
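
Here is a small, hypothetical example of that cross-checking habit: run the same aggregation once in SQL and once with a spreadsheet-style group-and-sum, then confirm the two answers match. The data is made up; the point is the workflow.

```python
# Hypothetical sketch: verify a SQL aggregation against a "spreadsheet-style" one.
import sqlite3

import pandas as pd

# A tiny made-up dataset you could just as easily keep in a spreadsheet.
df = pd.DataFrame({
    "channel": ["seo", "seo", "ppc", "ppc", "affiliate"],
    "visits":  [120, 80, 200, 150, 60],
})

# The spreadsheet-style answer (what a PivotTable or SUMIF would give you).
pivot = df.groupby("channel")["visits"].sum()

# The same answer via SQL.
conn = sqlite3.connect(":memory:")
df.to_sql("traffic", conn, index=False)
sql_result = pd.read_sql_query(
    "SELECT channel, SUM(visits) AS visits FROM traffic GROUP BY channel ORDER BY channel",
    conn,
).set_index("channel")["visits"]

# If the two disagree, either the query or the pivot has a bug worth finding.
assert pivot.equals(sql_result.astype(pivot.dtype)), "SQL result does not match the pivot"
print(pivot)
```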

At Farfetch, John is using BigQuery along with Connected Sheets so that analysts can quickly see data from BigQuery directly in a Google Sheet.

Moving on up from analyst to manager

John has moved from an individual contributor to a manager role. To John, being a manager has its own challenges, but there are times he misses being “close to the data”:

As an analyst, you like touching the data and doing the work. You like the satisfaction of having a problem to solve and using the tools you have to solve it. As a manager, problems are more abstract. It’s all about passing on the wisdom. Your focus shifts to creating the right environment for other people to excel (no pun intended) at their job.

Despite being a manager, John’s passion for data and learning new skills has not waned. He is currently learning Python for IT use cases, and hopes to apply some of his new Python skills at Farfetch. His company has created a Python package to automatically spit data into Excel so that others can quickly build reports off of the data.
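
A package like that might wrap something as simple as pandas’ Excel writer. The sketch below is just an assumed illustration of the general idea, not Farfetch’s internal package; the table, file path, and function name are hypothetical.

```python
# Hypothetical sketch of a helper that pushes query results into an Excel file
# for report builders. Not Farfetch's internal package; names are made up.
# Assumes pandas and openpyxl are installed.
import pandas as pd

def export_to_excel(df: pd.DataFrame, path: str, sheet_name: str = "data") -> None:
    """Write a DataFrame to an Excel sheet that others can build reports on."""
    with pd.ExcelWriter(path, engine="openpyxl") as writer:
        df.to_excel(writer, sheet_name=sheet_name, index=False)

if __name__ == "__main__":
    sales = pd.DataFrame({
        "region": ["EU", "US", "APAC"],
        "orders": [1200, 950, 430],
    })
    export_to_excel(sales, "weekly_sales.xlsx")
```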

In closing, John said you still need to earn your stripes in Excel before you move on to other languages (like Python). I couldn’t agree with him more. Build a solid foundation in a spreadsheet before picking up the latest tools.

Dear Analyst #55: Using Google Translate to quickly translate text with Le Grand Débat National data https://www.thekeycuts.com/dear-analyst-55-using-google-translate-to-quickly-translate-text-with-le-grand-debat-national-data/ https://www.thekeycuts.com/dear-analyst-55-using-google-translate-to-quickly-translate-text-with-le-grand-debat-national-data/#comments Tue, 12 Jan 2021 05:27:00 +0000 https://www.thekeycuts.com/?p=50583 This is a super simple formula in Google Sheets, and I don’t want to understate its utility. You can literally translate text from any language into another language. This formula came out on Google Sheets in early 2019 I believe. You basically don’t have to copy and paste into Google Translate anymore to get the […]

This is a super simple formula in Google Sheets, and I don’t want to understate its utility. You can literally translate text from any language into another language. This formula came out in Google Sheets in early 2019, I believe. You basically don’t have to copy and paste into Google Translate anymore to get the translation you need. This function could be helpful for those of you who are working with PDFs that contain tables of text in a different language, and you need to convert the text to your native tongue. The Google Sheet for this episode is here.

If you don’t get the reference above in relation to language translation, watch this video 🙂

Le Grand Débat National dataset

One of the reasons I thought about this function is that I am trying to learn a new language myself (French). I came across this dataset on Kaggle, and it is a heap of data from Emmanuel Macron’s initiative to increase debate all over France in 2019 regarding issues like taxes, democracy and citizenship, and the structure of government. I have no idea how the French government collected and aggregated all this data. I believe if you were not able to attend a debate in person, you could also answer questions online (which probably contributes to the bulk of the responses).

The dataset contains a whopping 170 million words with contributions from 250,000 French citizens. It looks like there are mixed reviews about the effectiveness of Macron’s initiative. For analysts and democracy, this was a big step in engaging citizens in public discourse and increased transparency around the data that was collected. We also get a rich dataset to utilize the GOOGLETRANSLATE function.

Applying the function to our dataset

I have a subset of one of the CSVs from this dataset in the Google Sheet. You’ll see in Column E we have some messy text with random characters but it’s clearly in French:

If I want to translate that title column into English, I start writing the GOOGLETRANSLATE function in column F. You’ll notice that when you start typing “GOOGLE” in cell F1, you’ll see only two built-in Google functions:

I imagine that Google will come out with many more functions that allow you to query other Google services. I think it’s strange there are only two built-in Google functions right now, but it may be Google’s way of nudging you to purchase an upgraded Google for Work account to utilize more features (similar to Office 365). I can see more functions that allow you to query your Gmail, Google Calendar, and Tasks. There are some built-in BigQuery functions if your organization has Connected Sheets and is using BigQuery as a data warehouse.

Anyway, back to GOOGLETRANSLATE. The function takes 3 arguments:

  • Text to translate
  • Source language (current language of the text)
  • Target language (language you want to translate to)

The caveat is you need to know the two-letter abbreviation for the last two parameters (full list here). If we want to translate the text in column E from French to English, the formula looks something like this (assuming the first French title is in cell E1): =GOOGLETRANSLATE(E1, "fr", "en")

Drag this formula down, and you’re able to translate all the text into the language you need:

Translating text in Excel

I don’t believe a similar function exists in Excel, but you can translate text you’ve selected in Excel. This might work for one-off scenarios where you just have a passage of text you need translated, but I find the Google Sheets function more applicable to real-world scenarios. You also need to have an Office 365 subscription just to utilize the translate selection feature.

Assuming you have an Office 365 subscription and you have a dataset like the Le Grand Débat National data (text in multiple cells that need to be translated), I can see a scenario and workflow like this that might be appropriate for your use case:

  1. Get the table of data from the PDF into Excel using the Insert data from picture feature
  2. Copy/paste the data into Google Sheets
  3. Use GOOGLETRANSLATE to translate the text to the language you want
  4. Copy/paste back into Excel (if that’s the format you need)

Are business analysts technologists?

One of the blog posts/podcasts I mention in this episode is this fireside chat from Data Driven NYC. Jeremiah Lowin (CEO of Prefect) and Tristan Handy (CEO of Fishtown Analytics) are asked to define what a “data analyst” is, what the data analyst responsibilities are, and how they compare with data engineers and data scientists.

Tristan says that analysts don’t necessarily identify themselves as “technologists,” but use technology in service of answering business questions. If you are an analyst, the whole fireside chat is worth a watch/listen as you consider new skills to develop for 2021 and how you should work with your fellow data engineer and data science colleagues.

Other Podcasts & Blog Posts

In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting:

Dear Analyst #54: 5 lessons learned in 2020 and 5 skills for data analysts to learn in 2021 https://www.thekeycuts.com/dear-analyst-54-5-lessons-learned-in-2020-and-5-skills-for-data-analysts-to-learn-in-2021/ https://www.thekeycuts.com/dear-analyst-54-5-lessons-learned-in-2020-and-5-skills-for-data-analysts-to-learn-in-2021/#respond Tue, 22 Dec 2020 12:02:12 +0000 https://www.thekeycuts.com/?p=50393 With the last episode for 2020, I wanted to take a look back and pull out some of the main themes and topics from the podcast. As people rushed to pick up new data skills to adapt to our changing environment this year, I think the precedent has been set for how one can learn […]

With the last episode for 2020, I wanted to take a look back and pull out some of the main themes and topics from the podcast. As people rushed to pick up new data skills to adapt to our changing environment this year, I think the precedent has been set for how one can learn almost anything from home. You may have picked up Excel and data analysis skills this year. What are some skills you should think about for 2021? This episode provides five skills for data analysts to consider adding to their toolbelt in 2021.

First, a look back at some of the main topics from 2020:

1. Excel errors will still happen

Aside from the latest features Microsoft or Google releases for Excel and Google Sheets, the only spreadsheet news that makes it to mainstream media is mistakes that lead to a financial loss.

It all started with episode 38 where I spoke about the JPMorgan Chase trader that caused a $6.2B loss. Given the popularity of this episode, I did a few follow up episodes including episode 40 on the two Harvard professors who created a report that may have led to incorrect austerity policies. There was also episode 49 all about how Tesla might have underpaid by $400M for the acquisition of SolarCity. Many of these stories can be found on the EuSpRIG website (which I covered in episode 47).

These stories go back to the early 90s. After 30 years, I expect these stories to continue making headlines. When analysts are lazy, put under time constraints, or are hastily putting together spreadsheets, these errors will undoubtedly occur.

2. Excel’s custom data types, what’s the hype?

Excel’s custom data types came onto the scene a few months ago. The Excel community raved about the new feature, but I think the jury is still out on the real use cases and audience for this feature. Touted as going “beyond text and numbers,” the feature is still in active development and is still missing some crucial capabilities. I dig deeper into this feature in episode 51. A more recent feature announcement called LAMBDA may give analysts the ability to extend Excel beyond what it’s meant to do. Just like the custom data types feature, we’ll wait to see how this feature actually gets used in the workplace.

It’s rare to see new features for Excel completely change the way analysts do their work. This video from MAKRO makes a good point about new features kind of dumbing down the existing features that power users have come to love (and master).

3. Google Apps Script for the win

This year was all about Google Apps Script for me. Having done some heavy VBA scripting during my financial analyst days, writing some Google Apps Scripts reminded me of how powerful scripts can be when you have a specific problem to solve that native features in Google Sheets cannot solve.

Specifically, I wanted to sync data from Google Sheets into Coda and vice versa. I detailed my experience writing this Google Apps Script to sync data in episode 31 along with some use cases I think this script unlocks for the workplace. Given the robustness of the Google Sheets API and the fact that the scripting language feels like JavaScript, I think it’s a relatively simple platform for data analysts to pick up. Another Google Apps Script I talked about was in episode 42, where I showed how a script can fill values down (I also show the equivalent VBA script).

While scripting is not usually a skill analysts may think of when it comes to creating models and analyses, it is an invaluable tool for building workflows.

4. Filling formulas down

This next theme is more of a surprising one for me. Filling formulas down a column seems like a pretty mundane operation in a spreadsheet. In episode 28, I discuss four methods for filling formulas down to the last row with data. Since publishing that episode, the accompanying blog post continues to be one of the most visited posts on the blog.

My theory is that analysts who become better at spreadsheets will start running into the edge cases of what the spreadsheet can do. This results in searching Google for very specific questions that allow them to perform some operation in the spreadsheet faster. While filling formulas down is not that difficult, doing it in the context of not “overshooting” your last row of data is not something that’s built natively into Excel or Google Sheets. Sometimes the most mundane things in a spreadsheet (and life) can have an outsize impact on your overall happiness with the tool.
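
The episode itself covers the Excel, VBA, and Google Apps Script ways of doing this; purely as an assumed Python analogue, here is how the same idea (fill a formula down to the last row with data, without overshooting) might look with openpyxl. The file name and column layout are made up.

```python
# Assumed Python analogue of "fill a formula down to the last row with data"
# using openpyxl. The file name and column layout are made up.
from openpyxl import load_workbook

wb = load_workbook("report.xlsx")
ws = wb.active

last_row = ws.max_row                 # last row that contains data
for row in range(2, last_row + 1):    # assume row 1 holds headers
    # Fill column C with a formula referencing columns A and B on the same row,
    # stopping exactly at the last data row instead of overshooting.
    ws.cell(row=row, column=3, value=f"=A{row}*B{row}")

wb.save("report.xlsx")
```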

5. Learning in public with Shawn Wang

The final highlight for 2020 would have to be episode 50 where I interviewed Shawn Wang about his 4,000-line VBA script for trading derivatives. So far this is the only episode where I had a guest on the podcast (but I plan on doing more of these in 2021).

While the VBA script was interesting to walk through, the more interesting part of this episode was Shawn’s journey from Excel to Python and eventually to JavaScript. While many analysts become masters of the spreadsheet, many more opportunities open up for those analysts who go on to web development, data science, and other related fields. Shawn’s “learn in public” attitude is also something I see more of in the developer tools community as people tinker with new web frameworks. Through this active learning and building in public, you get the interesting cross-pollination of ideas and innovation that is akin to academics who write and publish papers for their peers.

Whatever your thing is, make the thing you wish you had found when you were learning. Don’t judge your results by “claps” or retweets or stars or upvotes – just talk to yourself from 3 months ago. I keep an almost-daily dev blog written for no one else but me. Guess what? It’s not about reaching as many people as possible with your content. If you can do that, great, remember me when you’re famous. But chances are that by far the biggest beneficiary of you trying to help past you is future you. If others benefit, that’s icing.

-Shawn Wang

Now onto five skills and tips for data analysts to think about acquiring in 2021:

1. Master the tools of your craft: Excel, Google Sheets, SQL

Most of you are already proficient in Excel or Google Sheets, but I would highly recommend getting proficient in SQL as well. Knowing how to pull your data from whatever database technology you use will prevent you from having to ask your engineering or data science counterparts to write queries for you. This also gives you an additional skill to add to your toolbelt.

Database platforms and the data visualization companies built on top of them (Mode, Looker, Tableau) are becoming more accessible to non-engineers (at least that’s how these platforms are marketing themselves). This means that anyone can query the databases as long as they have access to them. This gives you agency and control over your own data workflow.

Just knowing the basics of these tools in 2021 is a given, but I would challenge you to go deeper, to the point where you find yourself searching Google for edge cases like how to fill formulas down to the last row. I wrote a bit about mastering your tools earlier this year on my Coda profile which might be relevant for this tip.

2. The art of data storytelling

You may not think your job is to tell stories, but this is probably the most important skill to learn for 2021 since it directly impacts the rest of your organization.

In addition to building PivotTables, setting up dashboards, and creating a scalable reporting system, the best data analysts are able to summarize their findings in a way that inspires action. The New York Times actually holds an internal data bootcamp to teach their reporters and journalists the fundamentals of data analytics (I talk about this in episode 16). Their reporters are already amazing storytellers. Equipping them with the ability to tell data-driven stories only makes their stories more compelling and interesting to read.

Source: New York Times

3. Think like a coder

In my opinion, the lines between data analysts, data engineers, and software engineers will continue to blur. This is happening because the tools that we use continue to democratize who has access to build on them.

If you are used to just modeling or building out your reports in spreadsheets or some data visualization platform, I encourage you to “think like a coder.” What that means is asking yourself the question: “What would happen to my model/analysis/tool if someone else had to run it 100 times? Would it break?”

When your spreadsheet needs to be used by your colleagues in a repeatable fashion, you’ll start thinking about all the edge cases where the tool might break. You’ll consider adding in error checks and abstracting certain parts of the tool behind dropdowns or checkboxes. All of these prevent the end user from inputting something they shouldn’t. Extending this analogy further, you’re turning your spreadsheet into something that could be used by millions of people for a long time. This is just like the software we use today that is constantly stress-tested for bugs and errors.
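
In code terms, “thinking like a coder” often just means validating inputs before the rest of the workflow runs, so that bad data fails loudly instead of silently breaking a report. A tiny, hypothetical Python sketch (the column names and rules are made up):

```python
# Hypothetical sketch of the defensive checks a "coder mindset" adds to a
# repeatable analysis so that bad inputs fail loudly instead of quietly
# producing a broken report. Column names and rules are made up.
import pandas as pd

REQUIRED_COLUMNS = {"date", "channel", "revenue"}

def validate_input(df: pd.DataFrame) -> pd.DataFrame:
    missing = REQUIRED_COLUMNS - set(df.columns)
    if missing:
        raise ValueError(f"Input is missing required columns: {sorted(missing)}")
    if df["revenue"].lt(0).any():
        raise ValueError("Found negative revenue values; check the source export")
    if df["date"].isna().any():
        raise ValueError("Found blank dates; these rows would silently drop out of the report")
    return df

def weekly_report(df: pd.DataFrame) -> pd.DataFrame:
    df = validate_input(df)
    df = df.assign(date=pd.to_datetime(df["date"]))
    return (
        df.groupby([pd.Grouper(key="date", freq="W"), "channel"])["revenue"]
        .sum()
        .reset_index()
    )
```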

4. Understand the full data pipeline

Similar to the previous tip, data analysts should understand the full end-to-end workflow for how their models and reports are created. I sometimes show the following image during my data analysis trainings:

When I was an analyst, I was mostly concerned with the “presentation” or “data warehouse” portion of this diagram. This means building reports in spreadsheets or perhaps Tableau. Given that all parts of the pipeline have moved to the cloud, anyone within your organization can access the ingestion tools, data prep tools, and database tools (assuming they have the right permissions).

The reason you need to understand the full data pipeline is to increase your efficiency with pulling data and to know the provenance of the data behind your analysis. If your company has questions about how your data was collected, cleaned, or transformed, you can say more than just “it came from this database.” This will help specifically with answering questions about data discrepancies between the different reporting systems your organization uses. If you have control over your data pipeline, you can set up your data structures for long-term success (see episode 45).
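If you want to see how small a toy version of this pipeline can be, here’s a hedged Google Apps Script sketch: ingest raw data from an API, do a light prep step, and land it in a reporting sheet. The URL, field names, and sheet name are all hypothetical placeholders, not a real service:

// Toy pipeline: ingestion -> prep -> presentation, all in one script.
function runMiniPipeline() {
  // Ingestion: pull raw JSON from a (hypothetical) source system.
  var response = UrlFetchApp.fetch('https://example.com/api/orders');
  var orders = JSON.parse(response.getContentText());

  // Prep: keep only the fields the report needs and coerce types.
  var rows = orders.map(function (order) {
    return [order.date, order.region, Number(order.amount)];
  });

  // Presentation: write the cleaned rows into the reporting sheet in one shot.
  var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('Orders');
  sheet.getRange(2, 1, rows.length, 3).setValues(rows);
}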

5. Always be curious and ask questions

This skill doesn’t just apply to 2021. I want to end on this skill because it’s something I’ve carried with me throughout the years, and it has served me well from two perspectives:

  1. Finding the key drivers and trends behind a model or analysis
  2. Learning new tools and platforms outside of spreadsheets

This goes back to the previous tip about understanding the full data pipeline. If you aren’t curious about your data pipeline or data lineage, you won’t find the need to learn how to use AWS services like S3 or Lambda. By asking questions about the data pipeline, you’ll naturally start learning how your systems are glued together.

From a soft skills perspective, I spoke about the skills a data analyst should have in episode 5, where I reviewed a blog post by Mode CEO Derek Steer. In the blog post, one of the skills Derek talks about is asking good follow-up questions. It’s almost like being a detective or reporter trying to find the true culprit behind a crime, even when the data tries to convince you that something else is the cause of the problem. My former manager once told me that a good analyst “always asks questions until they get to the truth.” So that’s what you should do for 2021: find the truth.

Other Podcasts & Blog Posts

In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting:

The post Dear Analyst #54: 5 lessons learned in 2020 and 5 skills for data analysts to learn in 2021 appeared first on .

]]>
Dear Analyst #53: Making your Google Sheets do more for you with Google Apps Script and how to become more data-driven https://www.thekeycuts.com/dear-analyst-53-making-your-google-sheets-do-more-for-you-with-google-apps-script-and-how-to-become-more-data-driven/ https://www.thekeycuts.com/dear-analyst-53-making-your-google-sheets-do-more-for-you-with-google-apps-script-and-how-to-become-more-data-driven/#respond Mon, 14 Dec 2020 05:40:00 +0000 https://www.thekeycuts.com/?p=50349 When I worked in FP&A, I discovered that VBA could automate a lot of tedious tasks I was doing in Excel. From creating charts to formatting data, I realized that the possibilities with VBA were endless. As I started using Google Sheets more, I found that Google Apps Script offers similar functionality to extend what […]

The post Dear Analyst #53: Making your Google Sheets do more for you with Google Apps Script and how to become more data-driven appeared first on .

]]>
When I worked in FP&A, I discovered that VBA could automate a lot of tedious tasks I was doing in Excel. From creating charts to formatting data, I realized that the possibilities with VBA were endless. As I started using Google Sheets more, I found that Google Apps Script offers similar functionality to extend what your Google Sheets can do. The specific use case I wanted to solve was syncing data between my Google Sheets and other workplace tools. This episode talks about how I picked up Google Apps Script, and how you can level up your skills to be more data-driven in your job. Original slides for this episode are here.

This episode was adapted from a talk I gave for Promotable.io’s “Breaking into data” series. The original presentation I gave is here.

Starting with the macro

The way I started with VBA was simply recording a macro. You hit record, do a bunch of stuff in Excel, and then see what code gets generated from the actions you took. For example, this little script selects the range A1:A6 in your spreadsheet and applies right-align formatting to the cells (among other things):

What’s nice about these macros is that you don’t have to know how to write code. At least initially. Just by doing stuff in Excel, you can see how VBA interprets those actions in the VBA editor (as shown above).

The first thing I tried to do with VBA back in the day was simply select some cells. This is the Range("A1:A6").Select portion of the script above. Then you can hit “play” on the macro, and Excel will select these cells for you without you touching your mouse or keyboard! The first time I saw this happen in my Excel file, it was mind-blowing. I realized I could control my spreadsheet just by pushing play.

Doing more with Google Sheets with Google Apps Script

Google Apps Script is the VBA of Google Sheets. Since Google Sheets has an extensive API, you can access pretty much any part of the Google Sheets UI. The reasons I like using Google Apps Script include:

  • It’s free
  • The language looks and feels like JavaScript
  • Lots of built-in services to access not only Google Sheets, but also Gmail, Google Calendar, and other products in Google Workspace (formerly G Suite)

I was worried that learning Google Apps Script would be difficult since it’s different from VBA. I started with simple tutorials like this one (it teaches you how to programmatically create a Google Doc file and send you the link via Gmail). Google is clearly trying to target “citizen developers” like myself who don’t really have any formal programming experience but know just enough to be dangerous. Tutorials like the one below make it seem like anyone can use Google Apps Script and take advantage of its robustness:

https://www.youtube.com/watch?v=JE4pF40ujh8&feature=emb_logo

Just like I first selected a range of cells with VBA, I did super simple tasks and workflows with Google Apps Script like selecting some cells or applying some formatting to numbers.
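For reference, a rough Apps Script equivalent of those first steps might look like the snippet below. The range and formats are just examples, not the exact script I wrote:

// Select a small range, right-align it, and format the numbers.
function firstAppsScriptSteps() {
  var range = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet().getRange('A1:A6');
  range.activate();                      // the Apps Script version of Range("A1:A6").Select in VBA
  range.setHorizontalAlignment('right'); // apply right-align formatting
  range.setNumberFormat('#,##0.00');     // thousands separators with two decimals
}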

The Google Apps Script editor

Data transformation and munging

The first time I heard the term “munging,” I thought it was some kind of disease. This is all data munging is:

The process of transforming and mapping data from one “raw” data form into another format with the intent of making it more appropriate and valuable for a variety of downstream purposes such as analytics.

Source: Wikipedia

I started using Google Apps Script with one goal in mind: I wanted to “sync” data from a table in Coda (my company’s product) to a Google Sheets file. In order to do this, I had to utilize the Coda API to get data out of my tables in Coda, and then use the Google Sheets API to get data into my Google Sheet. The problem is that data returned from the Coda API is structured in a format that is different from what Google Sheets needs.

When you view your data in Coda via the Google Apps Script debugger, the data looks like this:

At first, I had no idea what I was looking at. After further analysis, I realized that the “rows” in my Coda table were each of the numbers with a “+” next to it (the second arrow in the screenshot above). Each column of data in my table is represented by a “row” in this debugger output. Once I realized all my data is there, I just needed to figure out how to transform the arrays of data from the Coda API into the multi-dimensional array format that Google Sheets needs via the API, like this:

// Each inner array is one row of the sheet, listed in the order the columns appear.
var values = [
  [ "Green", "1,000,000", "$2.99" ],
  [ "Red", "3,000,000", "$1.99" ]
];

If you are used to playing around with formulas in Excel or Google Sheets to get data to look exactly the way you like (see previous episode on extracting text from a middle of a cell), playing with these arrays of data should be pretty straightforward for you.
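To give a feel for the transformation step, here’s a simplified sketch of reshaping Coda API rows into that row-by-row format. The response shape and column names below are assumptions for illustration; the actual script in the repo handles the real structure returned by the Coda API:

// Turn Coda API rows into the multi-dimensional array that setValues() expects.
function codaRowsToSheetValues(codaResponse) {
  var columns = ['Color', 'Units', 'Price']; // hypothetical column names

  return codaResponse.items.map(function (item) {
    // Each Coda row becomes one inner array, ordered to match the sheet's columns.
    return columns.map(function (col) {
      return item.values[col];
    });
  });
}

// Writing the reshaped data into the sheet in one call:
// sheet.getRange(1, 1, values.length, values[0].length).setValues(values);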

Working with the “data model” instead of the spreadsheet UI

The biggest lesson I learned from building this Google Apps Script out (you can see the script in this repo) is this: use the data “behind” the spreadsheet instead of moving the cursor around the UI.

What does this mean?

When you record a macro, the code ends up looking like this where you’ll inevitably see a bunch of Range selects:

Source: eduCBA

This is fine on a small spreadsheet. But if you have a spreadsheet with hundreds of thousands of rows, moving the cursor from one cell to the next to pull data out of Excel or Google Sheets can get quite slow. “Pasting” data into each cell one at a time is also inefficient.

Instead, I started pulling data one table at a time so that the entire data model was stored in a variable somewhere in my Apps Script. This means I can cycle through each row on the backend, apply some transformations, and then sync the data over to my Google Sheet. I would also avoid syncing one cell or row at a time. When possible, I would try to sync an entire range of data into the Sheet to avoid unnecessary operations.

Shifting your mindset from selecting cells and setting values in cells via Google Apps Script to working with the data model becomes important as you deal with bigger tables of data.
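Here’s a small sketch of the difference (illustrative only): the first function writes one cell at a time and issues one call per cell, while the second pushes the whole table back in a single setValues() call:

// Slow: one API call per cell.
function writeCellByCell(sheet, rows) {
  for (var i = 0; i < rows.length; i++) {
    for (var j = 0; j < rows[i].length; j++) {
      sheet.getRange(i + 1, j + 1).setValue(rows[i][j]);
    }
  }
}

// Fast: work with the data model in memory and sync it over in one operation.
function writeBatched(sheet, rows) {
  sheet.getRange(1, 1, rows.length, rows[0].length).setValues(rows);
}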

Key takeaways on being more data driven

If you work in marketing, customer support, human resources, or some other role where working with data is not your primary responsibility, you’ve probably had to deal with a spreadsheet at one point or another. These takeaways are a few tips to help you become more data-driven in your role. I think any knowledge worker who works with spreadsheets every day has the ability to create powerful workflows in Google Apps Script.

1. Be skilled in workflows in addition to tools

Sometimes I get asked about which tools to use once you’ve “mastered” Excel or Google Sheets. First of all, there is always more to learn about your tools, especially spreadsheets.

Unless your job requires you to learn SQL or Python or some hot new tool, I would encourage you to think about becoming proficient in workflows.

Understand how your team’s Salesforce data ends up in a database and then gets reported out by a different tool. Find processes where people are copying/pasting data from Google Sheets into email to send out weekly updates and use tools like Zapier to automate these manual processes.

2. Work through short tutorials to learn Google Apps Script (or any workflow tool)

Instead of signing up for a 5-hour long class on how to use Google Apps Script or some other tool, take small tutorials (<10 minutes) to learn the basics. Why? Chances are 80% of what you are trying to automate on your job can be accomplished by a few small features in Google Apps Script.

Many people think they need to learn all the little details about Google Apps Script by taking multiple classes, but most of your learning will happen via doing and Googling. Plain and simple.

3. Know VLOOKUP and PivotTables if you claim you are “proficient” in Excel or Google Sheets

Many people put “proficient” in Excel or Google Sheets on their resume, but the minute you ask them about doing a VLOOKUP or summarizing data in a PivotTable, they freeze.

Once you understand all the nuances of lookups and PivotTables, you’ll already be in the upper echelon of all spreadsheet users out there. The big advantage here is that you’ll start to see how the tools you use at work are basically lookups to each other, with all the data stored in rows and columns. Think about every tool you end up exporting a CSV from.
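If it helps to see the idea behind a lookup spelled out, here’s a hedged Apps Script sketch of what a VLOOKUP is doing under the hood: build a key-to-value map from one table and join it onto another. The sheet names and column positions are made up for illustration:

// VLOOKUP by hand: enrich each order with the customer's name.
function vlookupSketch() {
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var orders = ss.getSheetByName('Orders').getDataRange().getValues();       // [customerId, amount]
  var customers = ss.getSheetByName('Customers').getDataRange().getValues(); // [customerId, name]

  // Build the lookup table (key -> value).
  var nameById = {};
  customers.forEach(function (row) {
    nameById[row[0]] = row[1];
  });

  // "Join" the name onto each order, with a fallback like VLOOKUP's #N/A.
  var names = orders.map(function (row) {
    return [nameById[row[0]] || 'Not found'];
  });

  ss.getSheetByName('Orders').getRange(1, 3, names.length, 1).setValues(names);
}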

4. Break your problem down into small pieces

Are you tasked with doing a big analysis? Need to create a data-driven presentation to close a sale? The data problem, like many other problems in life, should be broken down into small steps.

With Google Apps Script, I knew I wanted to sync data over into Google Sheets, which felt like an insurmountable problem at the time. I started with the basics (like selecting cells in a Google Sheet using Google Apps Script). Then I moved on to writing data into cells. Before you know it, you’ve solved a bunch of small problems, and the big problem doesn’t feel so big anymore.

5. You don’t need to have a computer science or data science degree

I majored in marketing. I didn’t touch Excel until my first job out of college. The idea that you need a computer science or data science degree (or a bootcamp) to be data-driven has been debunked, in my opinion.

More and more knowledge workers who are comfortable with using spreadsheets and SaaS tools realize that they are actually “developers” in their own right. When you think something looks complicated or you think something can be automated, just Google it.

I really like this Stack Overflow blog post from 2016 digging into whether developers need college degrees. Guess how most developers learned how to do their jobs? Self-teaching.

Source: Stack Overflow

As the blog post discusses, at the end of the day, whether you need a degree depends on the specific job you’re applying for. For most roles out there (especially at startups), I think being data-driven, curious, and knowing how different tools play together will take you a long way.

Other Podcasts & Blog Posts

In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting:

  • No other episodes/blog posts this week!

The post Dear Analyst #53: Making your Google Sheets do more for you with Google Apps Script and how to become more data-driven appeared first on .

]]>