Dear Analyst #90: Biostatistics, public health, and the #1 strategy to land a job in data with Tyler Vu

You go to a family gathering and everyone is fawning over your cousin who has a cushy stats job at Harvard. Knowing your cousin, you think to yourself: if my cousin can do it, so can I. Next thing you know, you are a research fellow at Harvard University. Tyler Vu was studying applied math at Cal State Fullerton and didn’t realize he had a passion for Biostatistics until his fellowship at Harvard. He is currently getting his PhD in Biostatistics at UCSD and is the youngest person to ever pursue a PhD in Biostats at UCSD. In this episode we talk about doing network analysis for the public health sector, facial/voice recognition, and the #1 strategy Tyler thinks everyone should use to land their next job or internship in data.

Predicting HIV rates when you are missing data

As a neophyte to the data science and machine learning space, I found that Tyler definitely veered into concepts that were quite foreign to me as he discussed his current PhD thesis. His thesis involves analyzing social networks within the context of public health, knowing that a lot of the data is missing. We talk about why finding the HIV rate in a sample is different from other metrics you could get from a sample.

For instance, if you want to get the average height of people in the U.S., you pick a random sample of people, find the average height, and extrapolate this to the rest of the population (roughly). This is a straightforward analysis since each person’s height is independent of everyone else’s.
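Here is a minimal sketch of that height example in Python. The population is simulated, and the mean of 170 cm and spread of 10 cm are made-up numbers purely for illustration. It draws a simple random sample and uses the sample average as the estimate for everyone.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100,000 heights in cm, each drawn independently.
# The mean (170) and spread (10) are made-up illustration values.
population = [rng.gauss(170, 10) for _ in range(100_000)]

# Simple random sample: every person is equally likely to be picked.
sample = rng.sample(population, 1_000)

sample_mean = statistics.mean(sample)
# Standard error of the mean; this formula relies on the heights being independent.
standard_error = statistics.stdev(sample) / len(sample) ** 0.5

print(f"population mean: {statistics.mean(population):.1f} cm")
print(f"sample mean:     {sample_mean:.1f} cm (+/- {2 * standard_error:.1f})")
```

The standard error line is the part that quietly depends on independence. Once people's outcomes depend on who they know, that shortcut no longer holds.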

In the case of public health, people are connected via social networks. With HIV, predicting whether someone tests positive or negative is dependent on the people you are connected with and whether those people have tested positive or negative. In this type of analysis there’s a lot of bias and “non-parametric estimation of network properties,” according to Tyler. I’m not even going to pretend I know what these terms mean. There’s actually very little published work on this subject so Tyler’s thesis would be adding a lot to the current research on this subject.
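To get a feel for where the bias comes from, here is a toy sketch in Python. It is not Tyler's method, and everything in it is invented: two communities with no ties between them, a 30% positive rate in one and 2% in the other, and a hypothetical `referral_sample` function that recruits people by following ties outward from a single seed.

```python
from collections import deque

# Toy network, invented for illustration: 1,000 people in two communities
# (ids 0-499 and 500-999). Each person is tied to the two people on either
# side of them in their own community; there are no cross-community ties.
COMMUNITY_SIZE = 500

def neighbors(person):
    base = (person // COMMUNITY_SIZE) * COMMUNITY_SIZE
    offset = person - base
    return [base + (offset + step) % COMMUNITY_SIZE for step in (1, 2, -1, -2)]

def is_positive(person):
    # Made-up status pattern: 30% positive in the first community, 2% in the second.
    if person < COMMUNITY_SIZE:
        return person % 10 < 3
    return person % 50 == 0

people = range(2 * COMMUNITY_SIZE)
true_rate = sum(is_positive(p) for p in people) / len(people)

def referral_sample(seed, size):
    """Recruit people by following ties outward from one seed (breadth-first)."""
    seen, queue, recruited = {seed}, deque([seed]), []
    while queue and len(recruited) < size:
        person = queue.popleft()
        recruited.append(person)
        for friend in neighbors(person):
            if friend not in seen:
                seen.add(friend)
                queue.append(friend)
    return recruited

recruited = referral_sample(seed=0, size=100)
referral_rate = sum(is_positive(p) for p in recruited) / len(recruited)

print(f"true rate:     {true_rate:.2f}")
print(f"referral rate: {referral_rate:.2f}")
```

Because the recruits all come from the seed's own community, the referral estimate reflects that community's rate rather than the overall one. Real networks are far messier than this, which is presumably part of why the missing-data version of the problem is hard.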

Source: Alteryx community

Training a voice and face machine learning model

Tyler has a history of working on one-of-a-kind projects. During his undergrad years, he worked on a project that combined face and voice recognition. Kind of like having a double authenticator system if you wanted to unlock an iPhone, for instance. Since you’re combining both image and voice features to train a model, it creates a “highly dimensional problem.”

Tyler helped code the project entirely in MATLAB. Given the tools and frameworks available, Tyler was pleasantly surprised by the speed with which they were able to go from hypothesis to working app on this project.

Predicting “fragile” countries

During Tyler’s research at Harvard, he worked on a project to help predict which countries will become “fragile.” This is the definition of a “fragile state” according to the United States Institute of Peace:

Each fragile state is fragile in its own way, but they all face significant governance and economic challenges. In fragile states, governments lack legitimacy in the eyes of citizens, and institutions struggle or fail to provide basic public goods—security, justice, and rudimentary services—and to manage political conflicts peacefully. 

The project’s aim was to predict which countries might become fragile so that governments could plan ahead for these issues.

Tyler’s project involved using a super learner machine learning method created by Mark J. van der Laan. Eventually his team took an Occam’s Razor approach and settled on the simplest model that could predict future fragile countries. That model was a simple classification tree, which had a 90% test accuracy.
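For a sense of how simple a classification tree can be, here is a minimal Python sketch of a one-split tree (a "stump"). This is not the team's model or data: the 0-10 "governance score" feature and the labels are invented toy values, and `fit_stump` is a hypothetical helper that picks the split with the lowest Gini impurity.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def fit_stump(values, labels):
    """Find the threshold on one feature that minimizes weighted Gini impurity."""
    best_threshold, best_score = None, float("inf")
    candidates = sorted(set(values))
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold

# Invented toy data: a hypothetical 0-10 "governance score" per country and
# whether that country was later labeled fragile (1) or not (0).
governance_score = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
became_fragile = [1, 1, 1, 1, 1, 0, 1, 0, 0, 0]

threshold = fit_stump(governance_score, became_fragile)
predictions = [1 if x <= threshold else 0 for x in governance_score]
accuracy = sum(p == y for p, y in zip(predictions, became_fragile)) / len(became_fragile)

print(f"predict fragile when score <= {threshold}")
print(f"training accuracy on the toy data: {accuracy:.0%}")
```

A real tree would repeat this split search on each side and would be scored on held-out test data rather than the data it was trained on.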

Tyler brought up an interesting point about simplicity and machine learning models. Usually the model will be super simple if the data was well collected and accurate. In the absence of good data, Tyler says this is where you start doing the more advanced neural network type of analysis.

The #1 strategy to get your next internship, job, or grad school program

We shifted the conversation from data and machine learning to landing a job in data. Tyler had a lot to say on this subject and I think any aspiring data analysts and data scientists could learn a thing or two from Tyler’s strategy.

Tyler describes the current state of affairs: blind resume submissions. Recruiters have to sift through hundreds of resumes for popular internships and jobs, and the only way you can stand out during this screening phase is:

  1. You went to an Ivy League school or
  2. You interned or worked at a FAANG company (Facebook, Amazon, Apple, Netflix, Google)
Source: George Pipis, Medium

Tyler says the job search is less about finding a job and more about standing out. The question is: how far are you willing to go to stand out?

Tyler didn’t go to an Ivy League school before applying for his current program and also didn’t have big tech experience. He says the way any applicant can stand out is by finding the email of the hiring manager and sending them a cold email explaining why you would be a good candidate for the position. Here’s an interesting (although somewhat devious) strategy to find the correct email address for the hiring manager, according to Tyler:

  1. Create a fake resume that has the top credentials like attended Harvard and was a SWE at Google
  2. Submit that fake resume to the job or internship you’re interested in
  3. This fake resume will most likely get past the recruiter screen so you get an email from the recruiter or hiring manager on next steps
  4. You then email the hiring manager from your real email with a personalized message

Can’t knock the hustle!

Building a sales agency on the side

As if doing a PhD in Biostatistics doesn’t keep Tyler busy enough, he also found time to start a side business helping marketing agencies close deals. This came out of left field but further shows how scrappy Tyler is when it comes to executing on an idea.

Tyler was scrolling through Twitter and saw some Tweets from people talking about making money from remote sales work. This is not your ordinary insurance or car sales type of work. This is the type of sale where a large company is trying to land a new client who will spend thousands of dollars on services. When marketing agencies and coaches need help closing new clients, they use Tyler’s network of salespeople to seal the deal.

We ended the conversation with Sarah Silverman talking about shooting her shot when it comes to playing pickup basketball and standup comedy. From getting into a PhD program at UCSD to strategies to land your dream internship, it’s clear Tyler believes in shooting shots.

Other Podcasts & Blog Posts

In the 2nd half of the episode, I talk about some episodes and blogs from other people I found interesting: