Unlike the NBA, the data analytics field doesn’t make it easy to pick the top “plays” of 2021. This list is by no means exhaustive and is based only on my reading of news and trends in the data analytics world. It’s easy to talk about big changes to tools and platforms that occurred throughout the year, but I’ll try my best to focus on actual job or organizational trends that affect the data analytics profession. The bigger swag will be my predictions for 2022. They might be just a continuation of trends from 2021 or heck…even 2011. Certain things won’t change much (like the use of Excel at work), but who uses data and in what forms will constantly change.
What happened in 2021?
1. The “Analytics Engineer” job title
While a data analyst spends their time, well, analyzing data, the analytics engineer helps prepare data for data analysts. The analytics engineer is similar to a data engineer in many ways, but is a bit closer to the business. It’s probably common for an analytics engineer to become a business analyst and vice versa. I think the organization that put this role on the map is dbt (read more here).
I used to think an enterprising data or business analyst did the job of an analytics engineer. This may still be true at smaller companies. In most large organizations, I think there is now a clear separation between those who transform data and those who analyze it. Take a listen to episode #58 with Krishna Naidu of Canva to hear how analysts at Canva need to think more like software engineers, which further shows why the analytics engineer role is here to stay.
A good video on what an analytics engineer does:
2. Democratization of datasets and visualizations
I remember working in a role where one of the head business analysts was tired of getting ad-hoc requests to see customer data sliced and diced different ways. Each request required the analyst to write SQL queries to answer some C-suite executive’s question. While these questions are important for making business decisions, the one-off questions and subsequent SQL writing were not sustainable. The business analyst ended up organizing a 1-hour crash course on SQL and the data warehouse for all his business stakeholders so the executives would feel empowered to pull the data themselves.
Can you guess what happened during that training?
Eyes were glazed over, people ended up answering emails on their laptops, and the business analyst was still fielding one-off questions after the training.
With the data visualization tools enterprises use today, the C-suite doesn’t need to run their own SQL queries anymore. With a basic understanding of filtering and sorting, you can manipulate charts to answer the questions you have for your data. Business and data analysts are no longer the bottlenecks for a dataset. Data is permissioned up and down the organization and access is granted as long as you have a business need for the data. Take a listen to episode #57 with Nadja Jury of Education Perfect to hear how data analysts partner with business stakeholders to deliver the exact data stakeholders need.
3. Low-code and no-code tools
How could a 2021 wrap-up of data trends be complete without mentioning the words “low-code” or “no-code?” I’d say this trend started in industries and professions outside of data analytics. We saw tools for design, website building, and general business automation pop up over the last year or two and they’ve taken enterprises by storm. Data analysts, business analysts, and operations professionals can now stitch together their own mini data pipelines and keep data sources in sync.
These tools have led to the democratization of data mentioned above. Data the marketing team cares about can now be viewed and integrated with data the customers team cares about. Bugs the engineering team is working on can be viewed and commented on by the design team and the design team can attach the “fix” to the bug using various low-code solutions. Similar to the analytics engineer role, when there are job titles mentioning “no-code” like this one from Deloitte, it’s a sign that the trend is here to stay.
In my opinion, Excel is still the #1 low-code tool; it was just never branded as such. Take a listen to episode #21 where I talk about no-code solutions in Excel. I also gave a talk about this subject at the no-code conference back in 2019.
Now onto predictions for 2022:
1. Python and R will become required languages
Wait, I thought only data scientists need to know Python and R? There are a few reasons why I think data analysts will need to know Python and R. First, given the amount of data being stored in data warehouses these days, using Excel to analyze the data won’t be scalable unless you first query the data (using SQL) and spit out a smaller summary dataset to analyze. Guess what tools are great for analyzing large amounts of data? You guessed it: Python and R.
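To make the "query first, then analyze a smaller summary" idea concrete, here is a minimal sketch in Python. It is only an illustration: the in-memory SQLite database stands in for a real data warehouse, and the `orders` table, its columns, and the `summarize_by_region` function are hypothetical names I made up for the example.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("East", 120.0),
        ("East", 80.0),
        ("West", 200.0),
        ("West", 50.0),
        ("West", 50.0),
    ],
)


def summarize_by_region(connection):
    """Aggregate in SQL so only one row per region leaves the database."""
    query = """
        SELECT region, COUNT(*) AS order_count, SUM(amount) AS total_amount
        FROM orders
        GROUP BY region
        ORDER BY region
    """
    return connection.execute(query).fetchall()


# The summary is small enough to inspect directly or hand off to a spreadsheet.
summary = summarize_by_region(conn)
for region, order_count, total_amount in summary:
    print(region, order_count, total_amount)
```

The raw table could have millions of rows, but the summary has one row per region, which is the part you would actually analyze further in Python, R, or even Excel.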
Second, I think the demarcation between different data roles within a company (data analyst, data engineer, data scientist) will fade over time. As a data analyst, you’ll eventually need to figure out how to set up a notebook and use different libraries. Not knowing these languages will only hamper the value you can provide to your company.
Finally, you can get a sense of what companies are looking for in new employees entering their data teams by the classes and bootcamps offered by universities and third-party training companies. Excel training classes have been around since Microsoft took over the enterprise. In the last few years, however, data science “certificates” and bootcamps have been the new sexy programs universities are offering potential students. Excel and SQL are part of these programs, but Python and R are definitely the focus. Take a look at episode #65 where Caiti Donovan discusses the data science bootcamp she took at Columbia University’s School of Engineering.
2. Domain-specific machine learning tools
This is in line with the no-code trend. There will be less manual work in massaging datasets to pull out insights or make predictions. Machine learning tools are starting to fall into the hands of regular analysts who can build and run models off of their datasets without having to write any code. All the big cloud companies are offering these low-code machine learning platforms so business analysts and “citizen analysts” can run models on datasets in their data warehouse.
These tools will get more specific over time. Perhaps there is a model just for customer and HR data. If your company uses Salesforce, there may be an application that specializes in doing machine learning just on your Salesforce data.
For the data analysts stuck in Excel or Google Sheets, an example of this trend is the “Explore” tab in Google Sheets. On the surface, it looks like a simple add-on that does the data visualization work for you. All you need to feed it is your dataset. But this Explore tab could get much more advanced as it understands the shape and type of data you are feeding it so that the recommended visualizations are domain-specific and help you make real business decisions.
3. Excel remains the tool to do everything
Other Podcasts & Blog Posts
In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting:
- Freakonomics #483: What’s Wrong With Shortcuts?