As a flex player on a data team, you might play the role of a data scientist, data analyst, or data engineer. Sarah Krasnik is one of those people who has held all these roles. In this conversation, Sarah gets into the weeds of what most data analysts do: helping business partners make better decisions with data. Prior to her current role as an independent consultant, she worked on different data challenges faced by operations, marketing, and customer support functions. Eventually, she managed a data engineering team focused on the data platform and infrastructure. Speaking with Sarah conjured up memories of working with bad data, doing manual data tasks, and playing mediator for business stakeholders. We also chat about a popular blog post Sarah wrote on SaaS debt.
Building and maintaining a homegrown data pipeline
Sarah’s last role before striking out on her own was at Perpay, a financial services company focused on the buy now/pay later space. The company is a data-driven organization (as are most companies these days). The data that Sarah’s team was looking at was all marketing data, specifically data that influences customer conversion rates. The problem that they were trying to solve was how the marketing team could send more personalized emails and messages to potential customers to get them to convert.
The marketing team originally used a tool called Iterable: you send customer data to the platform, and the platform knows when to send the right customized email. For instance, abandoned cart emails are super effective at increasing conversions, and Iterable could help with this task.
The data engineering team’s goal was to figure out how to get data about the customer and get it into Iterable. This is a classic data activation scenario. Over time, Sarah’s team started building a solution in-house. The biggest challenge was getting the data out of the data warehouse and having it notify Iterable’s API. As the use case for Iterable and the in-house solution grew, the data engineering team had to constantly figure out what was in Iterable and check diffs (seeing what changed from the previous state to the current state) to debug issues. Eventually the team moved to a paid solution called Census to help with the movement of data from the data warehouse to Iterable. Sarah reflected on the evolution of the solution:
At a startup you have to be ruthless with prioritization. I realized that the data eng team was spending too much time maintaining this in-house solution. This stood out to me as a generic problem where you spend hours per month maintaining the system. When is the cost of the paid solution cheaper than the hours required for maintaining something in-house?
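The diff checking that made the in-house sync expensive to maintain can be sketched roughly as follows. This is a minimal illustration in Python, not Perpay's actual code: the function name `diff_states`, the customer IDs, and the record fields are all hypothetical, and a real sync would read the current state from the warehouse and the previous state from whatever was last sent to Iterable.

```python
from typing import Any, Dict

Record = Dict[str, Any]


def diff_states(
    previous: Dict[str, Record], current: Dict[str, Record]
) -> Dict[str, Dict[str, Record]]:
    """Compare two snapshots keyed by customer ID (hypothetical schema)."""
    # Customers present now but not in the last sync.
    added = {k: v for k, v in current.items() if k not in previous}
    # Customers that were synced before but no longer appear.
    removed = {k: v for k, v in previous.items() if k not in current}
    # Customers present in both snapshots whose attributes differ.
    changed = {
        k: v for k, v in current.items() if k in previous and previous[k] != v
    }
    return {"added": added, "removed": removed, "changed": changed}


# Made-up example data: what was last synced vs. what the warehouse holds now.
previous_sync = {
    "c1": {"email": "a@example.com", "cart_items": 0},
    "c2": {"email": "b@example.com", "cart_items": 2},
}
current_warehouse = {
    "c2": {"email": "b@example.com", "cart_items": 3},
    "c3": {"email": "c@example.com", "cart_items": 1},
}

changes = diff_states(previous_sync, current_warehouse)
for kind, records in changes.items():
    print(kind, sorted(records))
```

Only the records in `added` and `changed` would need to be sent downstream, which is also what makes a diff useful for debugging: it shows exactly what a given sync run was about to change.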
Automating a manual forecasting process with SQL scripts
Sarah was also a quantitative analyst at OneMain, a private lender in the fintech space. The affiliate marketing team was responsible for marketing loans so that they show up on sites like NerdWallet, Credit Karma, and Lending Club. The problem was how to increase conversions by reducing costs, another very common marketing problem that can be solved with better data. Sarah’s team was in charge of forecasting metrics like cost per loan and cost per conversion for these affiliate marketing channels.
If you have ever built a manual forecast in an FP&A role, at some point you compared “actuals” with the forecast. The goal is to get the two to line up closely. If they don’t, you have to figure out what led to the variance.
In OneMain’s case, comparing the forecast to actuals was a super manual process. Sarah’s goal was simply to reduce the time it took to pull the data and compare the actuals to the forecast. Through various SQL scripts and a dashboard in Looker, she was able to save ~20 hours/week of work across a variety of people. The solution Sarah built was also version-controlled so you could see the updates she made to the process over time (more on version-controlling later).
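The forecast-versus-actuals comparison can be illustrated with a short sketch. Sarah's solution used SQL scripts and Looker; the Python below is a stand-in that shows only the arithmetic, and the channel names, the numbers, and the 10% flagging threshold are assumptions made up for the example.

```python
from typing import Dict, List


def variance_report(
    forecast: Dict[str, float],
    actuals: Dict[str, float],
    threshold_pct: float = 10.0,  # assumed cutoff for "needs a closer look"
) -> List[dict]:
    """Compare actuals to forecast per channel and flag large variances."""
    rows = []
    for channel in sorted(forecast):
        # Skip channels with no actuals yet, or a zero forecast (no valid %).
        if channel not in actuals or forecast[channel] == 0:
            continue
        variance = actuals[channel] - forecast[channel]
        variance_pct = round(variance / forecast[channel] * 100, 2)
        rows.append(
            {
                "channel": channel,
                "forecast": forecast[channel],
                "actual": actuals[channel],
                "variance": variance,
                "variance_pct": variance_pct,
                "flagged": abs(variance_pct) > threshold_pct,
            }
        )
    return rows


# Made-up cost-per-loan figures for two hypothetical affiliate channels.
forecast_cost_per_loan = {"affiliate_a": 200.0, "affiliate_b": 150.0}
actual_cost_per_loan = {"affiliate_a": 250.0, "affiliate_b": 153.0}

report = variance_report(forecast_cost_per_loan, actual_cost_per_loan)
for row in report:
    print(row)
```

Flagged rows are the ones where someone would then dig into what led to the variance. Keeping a script like this in version control gives the change history mentioned above.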
Building consensus on metric definitions
DAU, MAU, ARPU. There are a ton of standardized metrics in the SaaS world. While these metrics are all great for showing your company’s performance to investors, there may not be agreement internally on what these metrics mean. How do you ensure all teams and stakeholders are on the same page on what a DAU even means?
Sarah was a data engineer at Slyce, a visual search API and SDK. A typical use case for their technology would be an integration with Macy’s, where a customer takes a picture of a dress and is shown similar dresses. Whether you upload an existing image, take a picture of an image, or take a picture of text, these are all considered searches. As you can imagine, the definition of a “search” can get quite ambiguous with all the different ways someone can do a “search.”
Sarah met with the sales and product teams to ensure cross-functional alignment on the definitions of a search. The way the sales team was reporting and communicating on searches to customers was different from how the product team was defining searches. The goal was to create a document that defined the various types of searches and create a dashboard that the sales team could send to customers.
This type of “glue” work involves gathering requirements, understanding what’s important to different teams, and building consensus across teams. The data team stayed neutral during the process and simply acted as the mediator between the different teams.
In addition to getting alignment, Sarah’s team also streamlined the process for pulling data about searches. Before she got involved, the team was pulling data from three different data sources to get a complete picture of searches. The data engineering team unified everything into one system.
Nothing better than having a single source of truth for all teams to pull from. Easier to debug if there’s only one system that is causing problems. And if there’s something wrong with the numbers, at least you know everyone’s data is wrong. With multiple data sources, some people might be wrong and some people might be right. It will just take the data team more time to diagnose the true problem.
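One way to picture a shared metric definition is a single piece of code that every report calls, so sales and product cannot drift apart. The sketch below is an assumption-laden illustration, not Slyce's implementation: the event type names (`image_upload`, `camera_photo`, `text_photo`) are hypothetical labels for the three kinds of search described above, and the event format is invented.

```python
from collections import Counter
from typing import Dict, Iterable

# Hypothetical event types that the teams agreed count as a "search".
SEARCH_EVENT_TYPES = {"image_upload", "camera_photo", "text_photo"}


def count_searches(events: Iterable[Dict[str, str]]) -> dict:
    """Count searches using the one agreed definition."""
    counts = Counter(
        event["type"] for event in events if event["type"] in SEARCH_EVENT_TYPES
    )
    return {"total": sum(counts.values()), "by_type": dict(counts)}


# Made-up events; "app_open" is not a search under the agreed definition.
events = [
    {"type": "image_upload"},
    {"type": "camera_photo"},
    {"type": "camera_photo"},
    {"type": "text_photo"},
    {"type": "app_open"},
]

summary = count_searches(events)
print(summary["total"], summary["by_type"])
```

If the definition of a search changes, it changes in one place, and every dashboard built on it moves together.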
Paying down your SaaS tool debt
One of the reasons I really wanted to have Sarah on the podcast was to have her discuss this blog post she wrote about “SaaS debt.” You often hear of technical debt, but analytics teams can also develop data debt, according to Sarah. What exactly is SaaS debt, data debt, or “process debt”? Consider this scenario:
- You work on a marketing team and everyone uses Mailchimp to send email campaigns.
- Person A on your team drafts the content to send the email.
- Person B manages the list of contacts to send the email campaign to.
- Person C pulls the latest list of customers from a database to give to Person B to upload into Mailchimp.
- Person C goes on vacation, and Person B is in charge of pulling the data and uploading to Mailchimp. Person B makes a mistake but doesn’t realize it.
- The e-mail goes out to the wrong customers, and the team is scrambling to figure out what went wrong in the upload process, the list of customers that was pulled from the database, etc.
This is a contrived example, but imagine this type of process happening across a bunch of SaaS tools your team uses. In the blog post, Sarah talks about automating things with Zapier and Google Sheets. These tools add more debt to the process where one simple mistake can be costly.
In the software world, there are more robust solutions for preventing incorrect code from being pushed to production. Namely, using Git and version control allows people to see what’s going to be checked into the main branch before it gets pushed live. Sarah’s argument is that some of these best practices in the software world kind of exist in the SaaS tools we use every day, but not really. The solution Sarah proposes is three-fold:
- Templating – Templates in the SaaS world are pretty rare. Using templates that have been tested reduces errors and redundancy.
- Testing – There are a lot of testing frameworks for analytics teams to use on their data. Our SaaS tools should also have similar tests that can be automated without affecting “live” data.
- Versioning – In the Mailchimp example above, perhaps Person B could submit a change to the email campaign. Then other team members on the marketing team could review the changes and catch any bugs before the email campaign goes live.
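To make the testing idea concrete, here is a rough sketch of a check that could run on a contact list before anything touches “live” data. It is hypothetical throughout: the function `validate_contact_list`, the validation rules, and the sample addresses are assumptions for illustration, and it does not call Mailchimp or any other real API.

```python
from typing import Iterable, List, Set


def validate_contact_list(
    contacts: Iterable[str], allowed_segment: Set[str]
) -> List[str]:
    """Return a list of problems found; an empty list means the upload looks safe."""
    errors = []
    seen = set()
    for email in contacts:
        normalized = email.strip().lower()
        if "@" not in normalized:
            # Very loose sanity check, not full email validation.
            errors.append(f"invalid email: {email}")
        elif normalized in seen:
            errors.append(f"duplicate: {normalized}")
        elif normalized not in allowed_segment:
            # The contact is not part of the intended campaign audience.
            errors.append(f"not in target segment: {normalized}")
        seen.add(normalized)
    return errors


# Made-up data: the intended audience vs. the list someone pulled by hand.
target_segment = {"a@example.com", "b@example.com"}
pulled_list = ["a@example.com", "A@example.com ", "c@example.com", "not-an-email"]

problems = validate_contact_list(pulled_list, target_segment)
for problem in problems:
    print(problem)
```

In the Mailchimp scenario above, a check like this would run before the upload, and a non-empty result would block the send until someone reviewed it.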
Sarah’s blog post got a ton of responses from people who work in marketing and revenue operations. These teams are typically using a lot of SaaS tools and sometimes stitching and integrating them together with Zapier or Google Apps Script. If our SaaS tools could implement some of these changes Sarah proposes, Sarah believes these tools could work a lot better for the end user.
Other Podcasts & Blog Posts
No other podcasts mentioned in this episode!