Machine learning and artificial intelligence are current buzzwords in the corporate world. In very simple terms, machine learning is the art and science of building programs that learn from the data they process. A machine learning program should get better at performing a task as it learns from more data. In other words, such programs are very good at learning patterns in the data, and they use these patterns to make decisions on the fly.
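To make the idea of "learning from data" concrete, here is a minimal sketch in Python. It is a toy illustration only: the hidden rule (y = 2x + 1), the learning rate, and the function names `train` and `mean_squared_error` are assumptions chosen for this example, not something taken from a real application. The program starts out knowing nothing and adjusts two numbers a little after every example it sees.

```python
# Toy illustration: a program that "learns" the rule y = 2x + 1 from examples.
# The data, learning rate, and function names are illustrative assumptions.

def train(examples, epochs, lr=0.01):
    """Fit y ~ w*x + b by nudging w and b after each example (gradient descent)."""
    w, b = 0.0, 0.0  # the program starts with no knowledge of the rule
    for _ in range(epochs):
        for x, y in examples:
            error = (w * x + b) - y  # how wrong the current guess is
            w -= lr * error * x      # adjust the slope to shrink the error
            b -= lr * error          # adjust the intercept to shrink the error
    return w, b


def mean_squared_error(examples, w, b):
    """Average squared gap between the predictions and the true values."""
    return sum(((w * x + b) - y) ** 2 for x, y in examples) / len(examples)


# Noise-free examples generated from the hidden rule y = 2x + 1.
examples = [(x, 2 * x + 1) for x in range(-5, 6)]

# The more passes over the data, the smaller the prediction error becomes.
for epochs in (1, 10, 100):
    w, b = train(examples, epochs)
    print(epochs, "passes -> error:", mean_squared_error(examples, w, b))
```

Running it prints an error that shrinks as the program makes more passes over the examples, which is the sense in which a machine learning program "gets better" with data. Real systems use far richer models and much larger datasets, but the principle is the same.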
A new role with high demand
Machine learning has revolutionised several application areas. Some of the more popular ones are computer vision and speech recognition. However, we are still at an early stage of building applications that apply machine learning algorithms. Just as HTML was the hot technology of the 1990s that powered the early web, we seem to be at a similar stage with machine learning and its superset, artificial intelligence. Building machine learning applications is the responsibility of a new type of employee called the “Data Scientist”. While software developers and statisticians have been around for at least the last three decades, data scientist is a relatively new role for which many firms are actively hiring.
What are the qualifications required?
The qualifications for a data scientist are mostly related to computer science, statistics, and mathematics. In fact, most firms are happy to hire qualified people from any of these three fields. A good data scientist is part software engineer, part mathematician and part statistician, and a key skill is the ability to write code. It is not an easy role to get into, simply because being a computer science graduate is not enough on its own. While a data scientist need not be a mathematician or a statistician, he or she must have a strong command of both subjects and be able to apply them to data science problems.
Aditya Narvekar
Value for money
Because being a data scientist requires expertise in computer science, mathematics and statistics, firms are willing to pay top dollar for well-qualified individuals. As per a survey conducted by IEEE in the US, data scientists had a median salary of $164,500 in 2020, an 8% increase over the median salary of $152,500 in 2019.
A data scientist has to specialise in many areas
Data scientist is a relatively new role which has come to the fore in the last 10 years or so. With the rise of internet firms like Amazon, Facebook and Google came increasingly large datasets, which these firms use by building predictive models on top of them. Data science professionals are now broadly classified into two groups: data scientists and data engineers. Data engineers specialise in capturing data and then extracting it into a format that can be easily used to build predictive models. Data scientists, on the other hand, use the datasets created by data engineers to build those models. In terms of skills, a data engineer needs a better understanding of distributed computing, which is required for processing large datasets, while a data scientist must specialise in understanding the different types of models and their application to various problems.
Young industry with a lot of potential
Organisations are still relatively data illiterate, and data science remains a relatively young industry. As data science continues to grow, acquiring and retaining experienced and knowledgeable data scientists and data engineers will be important. By 2025, humanity is expected to generate 175 zettabytes of data. It is therefore very likely that industry will have a robust demand for data scientists and data engineers to mine this data and build applications that solve several of our most common problems.
The author is an Assistant Professor, Data Science at SP Jain School of Global Management.