Shweta Gupta, VP – Technology at Digital Vidya: Data Scientists always have their learning mode on. This drives them to upgrade their skills and helps them stay updated with the trends and insights of the industry. Sometimes, it becomes quite cumbersome to read all the information available online, such as blogs, reports and case studies. To make it easy for aspiring as well as existing Data Scientists to stay updated with the latest trends, we interview Data Science industry experts. It was a refreshing conversation with Abhishek Sharma that extended the horizons of my knowledge about Data Science. Without further ado, I invite you to read his wonderful revelations.
Abhishek graduated with a Bachelor's in Computer Science and started working as a Software Engineer. Simultaneously, he was learning more and more about Data Science and Machine Learning. As he delved deeper into the subject, he realized the power and usefulness of the ideas and decided to concentrate his full efforts on learning the craft.
How did you get into Data Analytics? What interested you in learning Data Analytics?
Abhishek: In my final year of engineering I was first exposed to Neural Networks and Probabilistic Computing. That’s when I learnt more about data analysis and methods to derive insights from data, and it quickly struck a chord with me. The fact that these methods can derive meaningful insights from data really got me hooked and sparked an interest.
What was the first data set you remember working with? What did you do with it?
Abhishek: The first data set was from a competition hosted by IISC where we had to classify tweets into different classes. This was the first time I was faced with a real dataset and had to learn about the whole Data Science pipeline: how to process the data and generate new and useful features for the model. I ended up using a Naive Bayes Classifier which I wrote myself, and it was a great learning experience.
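For readers who have not seen one, a hand-written Naive Bayes text classifier of the kind mentioned above might look roughly like the sketch below. This is only an illustration, not Abhishek's competition code: the interview does not say which variant, features or classes were used, so the multinomial model, the add-one smoothing, the whitespace tokenizer, the class name `NaiveBayesTextClassifier` and the toy "sports"/"tech" examples are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing.

    Hypothetical helper written for illustration only.
    """

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = text.lower().split()        # naive whitespace tokenizer
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        tokens = text.lower().split()
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior: fraction of training documents in this class.
            score = math.log(doc_count / self.total_docs)
            total_words = sum(self.word_counts[label].values())
            denom = total_words + len(self.vocab)
            # Log likelihood of each token, smoothed so unseen words
            # do not drive the probability to zero.
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up training data (not from the competition).
train_texts = [
    "great match what a goal",
    "the team won the final",
    "new phone launch today",
    "this laptop has a fast processor",
]
train_labels = ["sports", "sports", "tech", "tech"]

clf = NaiveBayesTextClassifier().fit(train_texts, train_labels)
prediction = clf.predict("what a great goal by the team")
print(prediction)
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, which is a common design choice for this kind of classifier.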
Was there a specific “aha” moment when you realized the power of data?
Abhishek: Generating new insights is not a one-time process with a fixed template that one adheres to and applies to a problem. It is an exploratory process where you look deep into the data, and at every point you encounter an “aha” moment that builds towards a complete understanding of the problem. But to get to that “aha” moment you have to go through several setbacks, which I think are necessary to solve any task.
What is your typical day-in-a-life in your current job? Where do you spend most of your time?
Abhishek: I work in Digital Advertising, and on most days we work towards improving our models’ performance and devising new ways to make them more interpretable, so that every stakeholder knows the ins and outs of the process. Most of the time is spent on understanding the problem statement, identifying the right kind of datasets for the problem and posing the right metric for evaluation.
How do you stay updated on the latest trends in Data Analytics? Which are the Data Analytics resources (i.e. blogs/websites/apps) you visit regularly?
Team, Skills and Tools
Which are your favourite Data Analytics tools that you use in your job, and what are the other tools used widely in your team?
Abhishek: Python’s scientific stack is pretty mature and it covers most of the stuff that I need in my job. For deep learning I prefer PyTorch.
What are the different roles and skills within your data team?
Abhishek: We have a mix of domain experts, data engineers and data scientists in our team who help us ask the right kind of questions and look in the right places for their answers.
Can you describe some examples of the kinds of problems your team is solving this year?
Abhishek: We are working on various optimization problems for our advertisers, which would help them achieve their KPIs and better understand user behaviour.
How do you measure the performance of your team?
Abhishek: There are certain metrics that one poses at the start, which give us a fair sense of how much improvement we were able to make using the current approach. But interpretability and communication are important as well and are a big criterion for performance measurement.
Industry Readiness for Data Science
Are the industries looking to understand what they can do with data? Do they have the required data in place?
Abhishek: Yes, I think industries are not just looking to understand their data but are looking at new and exciting ways to analyze it and generate insights from it. Every new technique and method gives them a new understanding and improves upon their existing work. Special effort is required to get the data out of the silos and munge it into a format that can produce those insights, and this is what separates good models and solutions from average ones.
Which are the top 3 problems at the forefront of Data Science, based either on industry or on technology area?
Abhishek: Since Data Science is such a new field, more and more effort is being spent on better understanding different black box models and why a certain technique works better than others for some problems. This kind of study opens new doors for people to explore other ideas and build a solid understanding of the subject. Also, data now comes in different formats (e.g. image, text, sound, tabular), and this has generated interest in industry in how to make use of all of it to get better and more accurate models. Every year the scale at which we produce data increases, and it poses quite an exciting challenge to work towards learning at scale, which is the need of the hour.
Advice to Aspiring Data Scientists
According to you, what are the top skills, both technical and soft, that are needed for Data Analysts and Data Scientists?
Abhishek: The ability to code is very important and highly desired. One should develop a habit of reading papers and blog posts and implementing them on real datasets to test their validity. Your ability to present your ideas and findings is also crucial to your success.
How much focus should aspiring data practitioners put on working with messy, noisy data? What are the other areas that they must build their expertise in?
Abhishek: It’s very important to work with raw data while keeping an eye on data quality, because those two are separate things. You almost never know what you don’t know about your data, so scepticism and some sense of data intuition are the best sources of guidance you will have.
What is your advice for newbies, Data Science students or practitioners who are looking at building a career in the Data Analytics industry?
(i) Programming and software skills – R, Python, SAS or Excel – I would recommend Python as it is a mature language and the community is very supportive and vibrant.
(ii) Visualization Tools – It depends on the data. For tabular data, one can start off by going over it in Excel and later move on to heavy machinery like GGPlot and Matplotlib. For other formats (images, text, sound, etc.), it’s better to work with small samples to understand what they represent and pick up on any patterns and oddities.
(iii) Statistical foundation and applied knowledge – It’s very beneficial if you are familiar with different statistical methods because they serve as a good platform to understand the problem and provide you with certain guarantees that are sometimes very important in the process.
(iv) Machine Learning – Today data is everywhere and comes in different forms, so different methods work well with different kinds of data. Pick any model and learn how and why it works; this should give you a better understanding of the whole process.
What are the changing trends that you foresee in the field of Data Science and what do you recommend the current crop of data analysts do to keep pace?
Abhishek: Deep learning is something I am very excited about. It has proved its usefulness in different industries, and the wide range of problems it is able to tackle amazes me, so it’s important to learn more about it and understand why it works.
Would you like to share a few words about the work we are doing at Digital Vidya in developing Data Analytics Talent for the industry?
Abhishek: I think it is very important to develop talent for Data Analytics and provide a pathway for them to exercise their knowledge and make an impact. Digital Vidya is a great platform for people to jump-start their career in data analytics.
Are you inspired by the opportunity of Data Science? Start your journey by attending our upcoming orientation session on Data Science for Career & Business Growth. It’s online and Free :).