Dear Analyst #92: Generating insights from vehicle telemetry data and crafting a data strategy with Victor Rodrigues
Data can come from different places, and one area I don’t hear about too often is vehicles. Victor Rodrigues is from Brazil and transitioned into a career in data six years ago. Before that, he worked in various IT roles, including network and infrastructure administration. He eventually relocated to Dublin to work as a cloud specialist for Microsoft, helping organizations with digital transformation. We discuss a specific data project involving vehicle telemetry data, what happens when data and the real world collide, and selling data solutions into the enterprise.
Pulling performance data about bus fleets and trucks
Who knew that cars generated so much data? With all the sensors and chips installed in cars these days, data is constantly being generated in real time. According to Statista, modern connected cars generate 25GB of data every hour.
One of Victor’s first roles in the data field was as a data engineer setting up data pipelines for collecting data from vehicles. The startup he worked at helped its customers collect telemetry data about their buses, trucks, and other vehicles, and then delivered insights from this data back to the customer. The North Star goal was to reduce the cost per kilometer by any means possible.
Vehicles have various apps and IoT devices collecting and producing data. The data collected would answer questions like: How long does a vehicle sit stopped in traffic? How often is the engine running? Victor’s job involved building ETL/ELT pipelines to collect this raw data and transform it into a data model that could be used for analytics and reporting.
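To make the "transform" step a bit more concrete, here is a minimal sketch in Python. It is not Victor’s actual pipeline: the `Reading` fields, the `summarize` function, and the sample values are all hypothetical, and a real pipeline would run in a warehouse or streaming tool rather than in memory. It only shows the idea of turning raw readings into per-vehicle totals that can answer the two questions above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Reading:
    """One raw telemetry reading (hypothetical schema, for illustration only)."""
    vehicle_id: str
    timestamp_s: int    # seconds since the start of the trip
    speed_kmh: float
    engine_on: bool


def summarize(readings):
    """Aggregate raw readings into per-vehicle totals.

    Each interval between two consecutive readings is attributed to the
    state reported by the earlier reading.
    """
    by_vehicle = defaultdict(list)
    for r in readings:
        by_vehicle[r.vehicle_id].append(r)

    summary = {}
    for vehicle_id, rows in by_vehicle.items():
        rows.sort(key=lambda r: r.timestamp_s)
        engine_on_s = 0
        stopped_engine_on_s = 0
        for prev, cur in zip(rows, rows[1:]):
            dt = cur.timestamp_s - prev.timestamp_s
            if prev.engine_on:
                engine_on_s += dt
                if prev.speed_kmh == 0:
                    stopped_engine_on_s += dt
        summary[vehicle_id] = {
            "engine_on_s": engine_on_s,
            "stopped_engine_on_s": stopped_engine_on_s,
        }
    return summary


# Made-up sample data: one bus reporting once a minute.
sample = [
    Reading("bus-1", 0, 0.0, True),
    Reading("bus-1", 60, 30.0, True),
    Reading("bus-1", 120, 0.0, False),
    Reading("bus-1", 180, 0.0, False),
]
result = summarize(sample)
```

The output is a small, report-friendly table (one row per vehicle), which is the kind of shape a BI tool can work with directly.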
Don’t sleep on data strategy and architecture
Before Victor could get into the fun parts of analyzing the data, he had to build out a data strategy and architecture. This is the part where you have to decide which tool is best for each part of the data pipeline.
Do you go with Google Cloud or Microsoft Azure? What is the best architecture? What’s the most cost-effective solution? I remember when I was studying for the AWS Solutions Architect exam, I came across AWS’ Well-Architected Framework. These are playbooks for picking the right tools (within the AWS ecosystem) for various use cases and scenarios.
In setting the data strategy and architecture, the main variable that affected Victor’s decision was cost. His team first started with Google Cloud, piping data into BigQuery, Google Cloud’s main data warehouse solution. All of the big data warehouse tools allow you to trial the platform before throwing all your data in. He found that BigQuery was the most cost-effective solution, but collecting data in Google Cloud wasn’t as good as with other cloud providers.
The ultimate architecture looked something like this:
- Ingest billions of rows of data in Microsoft Azure
- Pipe the data into Google Cloud BigQuery for data modeling and analytics
- Use Tableau and Power BI for data visualization
Finding insights from the data and delivering impact
Victor had all this data streaming into his multi-cloud architecture, so what happens next? He helped figure out what KPIs to track and what insights his team would deliver to the customer. Here are a few insights Victor gleaned from the data and the recommendations they suggested to the customer.
1) Hitting the clutch too often
One of his customers managed a fleet of buses. Through the data, Victor found that certain bus drivers were pressing the clutch too often. This would wear out the clutch and ultimately hurt the engine on the bus, which leads to higher maintenance costs. The simple recommendation was to press the clutch less often. Perhaps this came in the form of new training for bus drivers.
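As a rough illustration of how an analyst might surface those drivers, here is a small Python sketch. Everything in it is an assumption for the sake of the example: the trip tuples, the `flag_heavy_clutch_users` name, and the rule of flagging anyone above 1.5 times the fleet’s median presses per kilometer. The episode doesn’t describe the actual method Victor’s team used.

```python
from statistics import median


def flag_heavy_clutch_users(trips, factor=1.5):
    """Return drivers whose clutch presses per km exceed factor * fleet median.

    trips: iterable of (driver_id, clutch_presses, distance_km) tuples.
    The threshold rule is a hypothetical choice, not a known industry standard.
    """
    totals = {}
    for driver, presses, km in trips:
        prev_presses, prev_km = totals.get(driver, (0, 0.0))
        totals[driver] = (prev_presses + presses, prev_km + km)

    # Presses per kilometer for each driver with a non-zero distance.
    rates = {d: p / km for d, (p, km) in totals.items() if km > 0}
    if not rates:
        return []

    threshold = factor * median(rates.values())
    return sorted(d for d, rate in rates.items() if rate > threshold)


# Made-up sample trips.
trips = [
    ("ana", 100, 50),
    ("ana", 120, 60),
    ("bruno", 400, 50),
    ("carla", 90, 45),
]
flagged = flag_heavy_clutch_users(trips)
```

Normalizing by distance matters here: a driver on a long route will naturally press the clutch more times in total, so raw counts alone would be misleading.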
2) Running air conditioning while vehicle is idle
It gets pretty hot in Brazil. Victor found that delivery trucks would sometimes sit idle while the engine and air conditioning were still running. This doesn’t seem too strange if a delivery person is making a quick delivery. They drive to their destination, get out of the truck, make the delivery, and go on their way. It wouldn’t be out of the ordinary for the engine and AC to stay running for 5-10 minutes while these activities are happening.
What is out of the ordinary is the engine and AC running for an hour or more while the truck is idling. Turns out delivery drivers would stop for their lunch break and keep the AC running so that when they got back in their trucks, the cabin was nice and cool. Across a fleet of trucks, this behavior would obviously add to gas costs. The tough recommendation here is to deal with the heat when you return to the truck and keep the AC off while you’re at lunch.
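Here is a minimal Python sketch of how the "quick delivery" stops could be separated from the "lunch break" stops. The sample format, the `long_idle_periods` name, and the one-hour default threshold are assumptions I’m making for illustration. The real telemetry schema and thresholds weren’t covered in the episode.

```python
def long_idle_periods(samples, min_seconds=3600):
    """Find stretches where a vehicle sat still with engine and AC running.

    samples: list of (timestamp_s, speed_kmh, engine_on, ac_on) tuples for a
    single vehicle, sorted by timestamp. Returns (start, end) pairs, measured
    from the first to the last idling sample of each stretch, that lasted at
    least min_seconds.
    """
    periods = []
    start = None
    last = None
    for ts, speed, engine_on, ac_on in samples:
        idling = speed == 0 and engine_on and ac_on
        if idling:
            if start is None:
                start = ts
            last = ts
        else:
            if start is not None and last - start >= min_seconds:
                periods.append((start, last))
            start = None
    # Close out a stretch that runs to the end of the data.
    if start is not None and last - start >= min_seconds:
        periods.append((start, last))
    return periods


# Made-up samples every 10 minutes: a 70-minute stop, then a 5-minute stop.
samples = [(0, 40, True, True)]
samples += [(t, 0, True, True) for t in range(600, 4801, 600)]
samples += [(5400, 35, True, True), (6000, 0, True, True),
            (6300, 0, True, True), (6600, 20, True, True)]
idle = long_idle_periods(samples)
```

With the one-hour threshold, the short delivery stop is ignored and only the long stop is reported, which mirrors the distinction between normal and out-of-the-ordinary idling described above.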
With both of these insights, Victor says it’s one thing to see the data in a data viz tool on your screen. It’s a whole other world when you comprehend the real-world impact and effects of the data. In the air conditioning example, the data shows the engine running for long periods of time. As an analyst, you have to be a detective and figure out what the real underlying cause is. I think it’s easy to forget that this data is usually driven by human beings taking normal human actions.
Creating and selling data solutions for organizations
At Microsoft, Victor’s role involves helping customers figure out what they can do with their data at lower costs. Like many who work in enterprise sales, customer discovery is the number one priority. The tools, platform, and technology are secondary.
Victor says what he sees in the field a lot are organizations that are simply collecting data but not doing anything with it. He meets with C-level executives who are very bullish on digital transformation at their respective organizations. The problem is that some of the people who join these organizations may not have come from a data background (similar to Victor). At the C-level, it’s a lot of education in terms of tying data solutions to business goals. While these conversations are happening, Victor also works “bottoms-up” by getting the organization’s developers on board.
As Victor reflects on his wide-ranging career in data, he offers some advice to those who are also thinking about transitioning to a career in data. You gain all this product experience in each organization you join, and it could culminate in being a data consultant for one of the big public cloud companies.
Building your data career through skills, certifications, and community
One thing I didn’t value before is the ability to sell something.
Getting a job in data is like any job. You’re going to be selling your skills (listen to episode #90 with Tyler Vu to learn more about this). As a potential shortcut, Victor suggests getting certifications. All the big cloud companies like AWS and Google offer industry-accepted certifications that show you have the fundamentals to get the job done. Furthermore, Victor suggests participating in the various communities behind the tools and platforms you’re learning like Python and SQL.
Speaking from experience, the Excel community is full of intelligent, helpful, and creative people. I started blogging about Excel almost 10 years ago because I saw others doing it in the community. One of my favorite blog posts from the archives is my recap of the 2013 Modeloff competition where the Excel community came together to solve Excel riddles. This image with Mr. Excel always brings me joy:
What’s coming up next for Microsoft
I’ve been pretty impressed with the pace of updates from the Excel team. You’ll generally see updates every few weeks. For instance, SPLIT() is a function that was available in Google Sheets for a while. Excel launched their own version called TEXTSPLIT() last month (among other updates). I asked Victor what new shiny toys are coming out on the Azure side, and he talked about data mesh, HoloLens, metaverse apps, 5G, and more.
Other Podcasts & Blog Posts
No other podcasts mentioned in this episode!