How I Am Preparing for a Career Transition to a Data Science Position from Scratch
“There’s a growing realization across all sectors that data science skills have become essential for competing and improving in today’s marketplace,” says Michael Galvin, executive director of Data Science Corporate Training for Metis, a data science skills training company that works with individuals and businesses.
Data Science is an amalgamation of Technology, Domain Knowledge and Applied Mathematics. Ideal candidates should showcase strong technical and mathematical skills, compounded with domain and business knowledge. It is such a vast field, with such a multitude of applications, that deciding where to begin can be overwhelming. On top of this, information overload and the hype created by marketers end up confusing anyone who wants to step into the field.
Broadly speaking, there are three main kinds of roles to choose from, each with its own specialized skill set. The three roles are as below:
1. Data Scientists - They help companies interpret and manage data and solve complex problems using expertise in a variety of data niches. They generally have a foundation in computer science, modeling, statistics, analytics, and math - coupled with a strong business sense.
2. Business Analysts - They are responsible for bridging the gap between IT and the business, using data analytics to assess processes, determine requirements and deliver data-driven recommendations and reports to executives and stakeholders. They generally have deep domain/business expertise and foundational knowledge in data modelling, statistics, analytics and ML models.
3. Data Engineers - They feed data into models defined by data scientists. They're also responsible for taking theoretical data science models and helping scale them out to production-level models that can handle terabytes of real-time data. They have deep knowledge of technology (e.g. machine learning models, data hosting, cloud computing) and foundational knowledge in statistics and mathematics.
Here is an interesting layout which shows the differences between the three roles:
![]()
Nowadays, new automation tools allow analytics models to be created by people with less expertise, so related skills such as business knowledge and effective communication are becoming important in setting data science aspirants apart.
How My Data Science Journey Started...
My interest in data science arose when I started working in the software solutions business for my company. My job required me to collaborate with the software R&D teams who were developing specialized apps using ML algorithms trained on large datasets. I was fascinated by the potential of data science to solve real-world business problems and understood that this would become the future norm.
To prepare myself for the future, I began my data science journey almost one year ago by taking an introductory course on "Data Analytics" from Boston University. Here is a link to the free course: https://g.co/kgs/1fXddu
The course raised my intellectual curiosity and my desire to pursue formal learning in data science. I could envision how my product management background and domain knowledge could be raised to a new level if I applied analytics techniques to arrive at data-backed decisions. I also did some informal research and found that this was an identified critical skill gap within my organization.
Broadly speaking, in any organization the analytics team supporting the business units is made up of people who can speak the language of data science (Data Analysts, ML Engineers, Data Architects etc.). The business needs insights to make decisions. However, only a handful of people within the analytics team understand business dynamics and are able to translate the findings into business insights. My plan was to learn the data science skill set so that I could become the critical bridge between the business and the data science team. I have a solid product management background and am therefore already halfway through the journey. This plan also helped me choose the right data science program and select the most appropriate courses.
The next step was to search for an accredited program. I knew it had to be an online program because I did not want to leave my current job. Additionally, I did not want it to be very expensive because of ROI concerns. I already hold an MBA degree and therefore did not want to commit to yet another master's degree. After thorough research online and weighing the top 10 data science programs on several key factors, I narrowed down my choice to the Graduate Program in Applied Business Analytics offered by Boston University. This graduate-level program offers course credits which I could use in the future if I am ever interested in pursuing a master's or PhD program.
Here is the link in case you are interested:
https://www.bu.edu/met/programs/graduate/applied-business-analytics/
I am now in the final term of the program and will be finishing the course by November of this year. There are quite a few real-life lessons and personal realizations from attending the program, which I would like to share for the benefit of future aspirants:
1. It is important to apply what the course teaches beyond the classroom. It is not enough to attend the course and submit assignments to pass and get credits. I am applying the key techniques and relating them to my product management work. This helps me reinforce the concepts, and I am getting astonishing outcomes at my workplace.
2. A deeper understanding of core concepts is very important. The course has taught a number of ML techniques and practical applications which numerous companies (e.g. Yelp, Instacart, Netflix, Zillow) use in building their core business models. To gain a solid understanding of the core concepts, I have been building my own versions of the business models of these popular companies, leveraging publicly available datasets (via Kaggle) and applying ML algorithms. For this purpose I have also taken basic/intermediate level free courses in Python, SQL, ML etc. via the Coursera and edX websites.
3. Storytelling is the key. No matter how knowledgeable you become, if you cannot articulate your findings by packaging them together in the form of a story, all the effort is wasted. This is an art they don't teach you in these programs, but at the same time one gets plenty of opportunities to present a story via Capstone Projects, Assignments etc. I have also attended storytelling courses and developed several executive-level presentations in my job, which has helped me improve my presentation skills.
The Bumpy Road Ahead...
Now comes the most difficult part of the journey. In spite of all the hype around Data Science and the plentiful job opportunities, the reality is very different. First, there are now more job seekers in the market, partly because of the unemployment and layoffs caused by the Covid-19 situation. Secondly, the term "Data Science" is being loosely interpreted by companies to classify a wide variety of roles, ranging from Data Analyst to Decision Scientist, under one umbrella term. The expectations out there are very high in terms of knowledge, experience and qualifications, which could leave a majority of job aspirants very demotivated.
Upon doing some preliminary research on available job postings, I noticed that roughly 70% of the jobs out there are related to Product Analytics. Only about 20% of the roles are for Data Modelling and the remaining 10% are for Data Engineers. Generally, for Data Modelling roles the companies require PhD-level qualifications, and likewise for Data Engineers they require Computer Science degrees. With my background and education in Data Science, I feel Product Analytics roles are the most suitable ones to focus on. Here is a summary of key roles and their skill set requirements:
Product Analytics (~70% on the market)
Requirements: practical experience launching products; strong business acumen; advanced SQL skills
Examples: Product Analytics at Airbnb; Data Scientist at Lyft; Data Scientist at Facebook; Product Analyst at Google
Data Modeling (~20% on the market)
Requirements: knowledge of machine learning (not only how to use it but also the underlying math and theory); strong coding ability
Examples: Data Scientist, Algorithms at Lyft; Data Scientist, Algorithms at Airbnb; Applied Scientist at Amazon; Research Scientist at Facebook
Data Engineering (~10% on the market)
Requirements: end to end data scientists with data engineering skills; knowledge of distributed systems; MapReduce and Spark; practical experience working with Spark; strong coding ability
Examples: Data Scientist, Foundation at Airbnb; Data Scientist at some startups
As you can see, there is still a long way to go before I reach the home stretch of this journey. The good news is that I do have a line of sight to all the required skill sets to take me to the finish line. I am also sharing the specific topics that I plan to study over the next 3 months in order to complete this part of the journey.
Preparation for Specific Topics
Product Experience
In order to answer interview questions related to product knowledge, I have compiled the resources below, which I am using to strengthen my product skills:
Resources:
Cracking the PM Interview by Gayle Laakmann McDowell and Jackie Bavaro
Decode and Conquer by Lewis C. Lin
Case Interview Secrets by Victor Cheng
SQL
Practice makes perfect! Although I have acquired basic SQL knowledge, I am now working on gaining intermediate/expert knowledge of this topic. Here are some of the resources that I have been using to improve my skills.
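As an illustration of the kind of intermediate SQL that product analytics questions tend to involve, here is a minimal practice sketch run through Python's built-in sqlite3 module. The `events` table, its columns and the sample rows are all hypothetical, made up purely for practice; the query computes daily active users and daily buyers using conditional aggregation.

```python
import sqlite3

# Hypothetical in-memory table of product events (made-up data for practice only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (user_id INTEGER, event_date TEXT, event_type TEXT)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        (1, "2020-09-01", "view"),
        (1, "2020-09-01", "purchase"),
        (2, "2020-09-01", "view"),
        (2, "2020-09-02", "view"),
        (3, "2020-09-02", "view"),
        (3, "2020-09-02", "purchase"),
    ],
)

# Daily active users and daily buyers.
# COUNT(DISTINCT ...) ignores NULLs, so the CASE expression counts
# only the users who had at least one purchase event that day.
query = """
SELECT event_date,
       COUNT(DISTINCT user_id) AS active_users,
       COUNT(DISTINCT CASE WHEN event_type = 'purchase' THEN user_id END) AS buyers
FROM events
GROUP BY event_date
ORDER BY event_date
"""
daily_rows = conn.execute(query).fetchall()
conn.close()

for event_date, active_users, buyers in daily_rows:
    print(event_date, active_users, buyers)
```

Swapping in a public dataset and adding questions of your own (conversion rate per day, repeat buyers and so on) is one way to turn a sketch like this into regular practice.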
Resources:
Statistics and Probability
Resources:
Khan Academy has a Statistics and Probability course.
This Online Stat Book covers the basics of statistical inference.
Harvard has a Statistics 110: Probability course, which is an introductory course on probability with practical problems. If you prefer reading to listening, Penn State has an Introduction to Probability Theory course.
A/B Testing
Machine Learning
Resources:
To start, I recommend this free Applied Machine Learning course by Andreas Mueller
Coursera - Machine Learning by Andrew Ng
Udacity - Machine Learning Engineer Nanodegree
I have written this blog after seeing and hearing from a lot of people who do not know how to make a career transition to the data science field. I hope that this article and the resources provided here inspire them to take the first step of the journey. I know this is an overwhelming area, so if you want more advice, feel free to contact me. My email id: tarunth@yahoo.com.
In my next blog I will share some tips and resources to help you ace data science interviews.
Stay tuned!
Best Wishes,
Tarun Thadani