Dear Analyst #55: Using Google Translate to quickly translate text with Le Grand Débat National data

This is a super simple formula in Google Sheets, and I don’t want to understate its utility. You can literally translate text from any language into another language. This formula came out on Google Sheets in early 2019 I believe. You basically don’t have to copy and paste into Google Translate anymore to get the translation you need. This function could be helpful for those of you who are are working with PDFs that contain tables of text in a different language, and you need to convert the text to your native tongue. Google Sheet for this episode is here.

If you don’t get the reference above in relation to language translation, watch this video 🙂

Le Grand Débat National dataset

One of the reasons I thought about this function is that I am trying to learn a new language myself (French). I came across this dataset on Kaggle and it is a heap of data from Emmanuel Macron’s initiative to increase debate all over France in 2019 regarding issues like taxes, democracy and citizenship, and the structure of government. I have no idea how the French government collected and aggregated all this data. I believe if you were not able to attend a debate in person, you can also answer questions online (which probably contributes to the bulk of the responses).

The dataset contains a whopping 170 million words with contributions from 250,000 French citizens. It looks like there are mixed reviews about the effectiveness of Macron’s initiative. For analysts and democracy, this was a big step in engaging citizens in public discourse and increased transparency around the data that was collected. We also get a rich dataset to utilize the GOOGLETRANSLATE function.

Applying the function to our dataset

I have a subset of one of the CSVs from this dataset in the Google Sheet. You’ll see in Column E we have some messy text with random characters but it’s clearly in French:

If I want to translate that title column into English, I start writing the GOOGLETRANSLATE function in column F. You’ll notice that when you start typing “GOOGLE” in cell F1, you’ll see only two built-in Google functions:

I imagine that Google will come out with many more functions that allow you to query other Google services. I think it’s strange there are only two built-in Google functions right now, but it may be Google’s way of nudging you to purchase an upgraded account to Google for Work to utilize more features (similar to Office 365). I can see more functions that allow you query your Gmail, Google Calendar, and Tasks. There are some built-in Google Bigquery functions if your organization has Connected Sheets and is using BigQuery as a data warehouse.

Anyway, back to GOOGLETRANSLATE. The function takes 3 arguments:

  • Text to translate
  • Source language (current language of the text)
  • Target language (language you want to translate to)

The caveat is you need to know the two-letter abbreviation for the last two parameters (full list here). If we want to translate the text in column E from French to English, the formula looks like this:

Drag this formula down, and you’re able to translate all the text into the language you need:

Translating text in Excel

I don’t believe a similar function exists in Excel, but you can translate text you’ve selected in Excel. This might work for one-off scenarios where you just have a passage of text you need translated, but I find the Google Sheets function more applicable to real-world scenarios. You also need to have an Office 365 subscription just to utilize the translate selection feature.

Assuming you have an Office 365 subscription and you have a dataset like the Le Grand Débat National data (text in multiple cells that need to be translated), I can see a scenario and workflow like this that might be appropriate for your use case:

  1. Get the table of data from the PDF into Excel using the Insert data from picture feature
  2. Copy/paste the data into Google Sheets
  3. Use GOOGLETRANSLATE to translate the text to the language you want
  4. Copy/paste back into Excel (if that’s the format you need)

Are business analysts technologists?

One of the blog posts/podcasts I mention in this episode is this fireside chat from Data Driven NYC. Jeremiah Lowin (CEO of Prefect) and Tristan Handy (CEO of Fishtown Analytics) are asked to define what a “data analyst” is, what the data analyst responsibilities are, and how they compare with data engineers and data scientists.

Tristan says that analysts don’t necessarily identify themselves as “technologists,” but use technology in service of answering business questions. If you are an analyst, the whole fireside chat is worth a watch/listen as you consider new skills to develop for 2021 and how you should work with your fellow data engineer and data science colleagues.

Other Podcasts & Blog Posts

In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting: