Jean-Nicholas Hould On Data Science

Getting your first job in data science

A few weeks ago, I came across a great post from David Robinson about his first year as a data scientist at Stack Overflow. The post went into great detail about how David landed his job there and the things he’s been working on since then.

In a section of the post, David advised graduate students who wish to get into data science to create public artifacts. David landed his job partly because of some public artifacts he created: blog posts and answers to questions on StackExchange.

His advice really resonated with me. Many of the great things that happened to me in the last few years are the result of making my work public: meeting new people, landing a job at PasswordBox, creating and selling a side project. The best way to get a job if you don’t have any experience is to make your work public.

Making My Work Public

I learned to code fairly late by tech-world standards. I was 24 years old. At that age, I was temporarily living in Chicago to attend the Starter League, a three-month intensive coding boot camp. As part of the program, we had a final project where we had to form a small team and ship a web project of our choice. At the time, I was already passionate about data. I had a few years of experience as a digital analytics consultant at a creative agency under my belt.

At the Starter League, I met Sam and Enrique, two great guys who eventually became my teammates for the final project. After a few iterations on the idea, we decided to build an analytics platform for Tumblr. We called it MountainMetrics. The project solved a pain Enrique was experiencing while managing the Chicago History Museum Tumblr account: tracking the number of followers over time.

At the end of the three-month boot camp, we had a fully functional product and at least one user, the Chicago History Museum. We open sourced the code. Little did I know at the time, but the project would be featured on Hacker News and attract the attention of many interesting people, including the Tumblr engineering team. More importantly, this project helped me land a job in data science at PasswordBox.

MountainMetrics wasn’t in any way a technological feat. It was a simple Rails web application that queried data from multiple APIs and reported the data back to the end user in a sensible way. However, it demonstrated that I had a few very important skills in data science: I could ship, I was passionate about data and I had enough tech skills to make things happen.

Done is better than perfect

We always want to show our best side. We fear being criticized. Psychologically, we humans want to be loved and accepted. This is one of the reasons why we want our work to be perfect before showing it to the world. This is also why so many people struggle to ship anything.

In his post, David Robinson talks about how he used to work on scientific papers during his Ph.D. Those papers need to be “perfect” before they are published. They need to go through a slow revision process, and often they are never made public.

The good news is that you don’t have to make your work perfect before making it public. What you ship is not set in stone. You can come back and improve it. Don’t get lost in the details, just get some interesting work out the door. The worst that can happen is that nobody notices.

What should you share?

Share things that can provide value to people. Don’t take for granted that everybody knows what you know. It might be trivial for you to write about statistical concepts like the Beta Distribution, but it’s not the case for everyone.

Here are a few ideas on what you can do:

  • Write a post about a new concept you learned
  • Analyze open data sets
  • Open source some code you wrote
  • Answer questions on public forums

The list could go on and on. What matters is that you start small and that you deliver.

A long journey starts with a single step

If you are not willing to play the long game, stop now. There will always be something new to learn in our field. You need to embrace that. Every week, there is a new skill you can pick up, a new paper on machine learning, a new technology that you could learn. Don’t try to learn everything before starting to apply your knowledge.

People who want to get into data science generally want to know all of the skills they should learn before getting a job. They spend an absurd amount of time on forums discussing which skills they should learn to get hired. I think that’s a form of procrastination. Start applying what you know. Learn to extract value from data, no matter how you do it.

Whether you are transitioning from another career or you are just starting out, leverage your experiences. If you have worked as an accountant in the past, how can you use those skills to transition into data science? Perhaps there are startups out there that need a data analyst to understand the financials of their marketing acquisition channels. Over time, you can incorporate more advanced techniques into your work.

There are many jobs that involve working with data to make better decisions. Many of them are not labelled “data scientist” or “data analyst”. Cast your net wide. The transition from being an accountant to a data scientist building predictive models is generally not done in a single step. You have to work your way there, one role or project at a time.

As you progress in your career, making your work public will help you create new opportunities, meet new people and get external feedback on your work. Take a moment and think about the people you look up to in data science. They most likely have one thing in common: they created public artifacts.