Praveen has over 15 years of experience, primarily in the ‘Digital’ space. The initial part of his career was all about managing large eCommerce operations for Fortune 100 clients. He then moved into the digital marketing space, managing large-scale global digital marketing operations for leading brands. Along the way, he realized that data and analytics would disrupt the traditional reporting world and push the industry to adopt them. Around 6 years ago, he decided to get deeper into analytics and eventually made it to where he is today. He currently leads Digital Analytics and Insights delivery at Course5 Intelligence.
What was the first data set you remember working with? What did you do with it?
Praveen Sathyadev: Interestingly, on my first day at work post-MBA, I had to build a QBR (Quarterly Business Review) for our team running ecommerce store operations for a leading PC e-tailer. I had a bunch of Excel sheets and emails to work with.
I built a decent presentation, but I wish I had embarked on the analytics journey sooner, as it could’ve made me look smarter that day. In reality, my first data set was a notebook in which I recorded the pocket money my father paid me when I was 9, and I still have it. So far, I have only managed to pay him back the interest.
Was there a specific “aha” moment when you realized the power of data?
Praveen Sathyadev: Every time I look at any fresh data or data I’ve worked on, there is an ‘aha’ and an ‘ouch’ moment. Data can provide more than just information and insights. It also helps you learn a lot about your ability to process it.
How do you stay updated on the latest trends in Data Analytics? Which Data Analytics resources (e.g. blogs/websites/apps) do you visit regularly?
Praveen Sathyadev: My biggest source is my teams. They teach me every day, helping me stay knowledgeable and humble. Other sources include LinkedIn (if you keep the clutter away), DAA blogs and forums. I also strongly recommend webinars and articles collated by organizations like Digital Vidya.
Share the names of 3 people/publications/research that you follow in the field of Data Science or Big Data Analytics.
(i) Tuhin Chattopadhyay, a friend and a scholar who has an interesting take on analytics in general, and Subarna Roy, another interesting leader in this space.
(ii) Data Elixir – fantastic read
(iii) Analytics India Magazine
Team, Skills and Tools
Which are your favourite Data Analytics tools that you use in your job, and what other tools are widely used in your team?
Praveen Sathyadev: This is a pretty wide-open question and I must say I’m agnostic. That said, some tools are better than others. Good old Excel is good, and so are R, Python and the other tools out there. A smart analytics professional has to be familiar with several tools and not get stuck with one.
What are the different roles and skills within your data team?
Praveen Sathyadev: I’ll try and depict the roles as a flow:
Data Collector → Data Integrator → ETL Specialist → Data Solutions Architect → Data Visualizer → Cloud Enabler → Analyst → Data Scientist/Advanced Analytics Specialist → AI Specialist → Analytics Leader.
Can you describe some examples of the kinds of problems your team is solving this year?
Praveen Sathyadev: The most critical problem statements are the following:
(i) Simpler ways to consume and act on large, varied data (leveraging AI).
(ii) Building a ‘Unified Customer Profile’ and running analytics on top of it.
(iii) Measuring marketing success and recommending opportunities to optimize.
How do you measure the performance of your team?
Praveen Sathyadev: Continuous improvement and innovation. The key metric is the number of successful ideas generated over a period of time that address specific known and unknown problems or needs.
Big Data Team, Skills and Tools
In the vast Big Data landscape, skills are changing swiftly. Which technology do you see dominating the ETL and real-time data space? (Hive and Spark appear to be taking the leadership position; what attributes are driving their growth?)
Praveen Sathyadev: Hive and Spark are indeed taking over, but there are other very promising open-source projects aiding this endeavour (the Hadoops of the world).
How do aspiring Data Engineers demonstrate their capability to handle the tools, technology, data and domain? Is a certificate (Cloudera/Hortonworks) a clear differentiator?
Praveen Sathyadev: If you have already graduated, join a niche analytics firm and learn this on the job. Nothing compares to that. If you are a student, even an internship works. Many companies, including Course5 Intelligence, encourage such initiatives.
Are analytical skills, Statistics and Machine Learning must-have or good-to-have skills for Data Engineers?
Praveen Sathyadev: Good to have for now, and a must-have within the next 5 years, or sooner, if you plan to stay relevant.
Industry Readiness for Data Science
Are the industries looking to understand what they can do with data? Do they have the required data in place?
Praveen Sathyadev: Yes, of course. Everyone is trying to understand their data, hence the whole buzz in the market. They have the required data in most cases but often struggle to use it properly. They end up wasting time setting up infrastructure with limited understanding.
What are the top 3 problems facing Data Science, whether by industry or by technology area?
(i) Inheriting bad or incomplete data from the past (Inbred Data) – hoarding bad data gives bad outputs.
(ii) Inability to connect digital and offline data (applicable to CPG, retail and other industries with both digital and non-digital data).
(iii) Too much or too little data leading to poor outcomes (loads of bias).
Industry Readiness for Big Data
Is Big Data becoming a reality in the industry beyond the social giants like Facebook, Google, Yahoo? If yes, which industries are actually moving towards the power of Big Data Analytics? If no, what is the outlook for adoption?
Praveen Sathyadev: Absolutely yes. Traditional financial and CPG companies have loads of legacy data piled up and they have realized the need for big data. As a matter of fact, any industry going through digital transformation is looking at Big Data as a key component of its success.
Name 3 Industries and the kind of problems that they are solving using Big Data.
(i) Retail – Integrating and building ‘Single View of Customer’ through big data.
(ii) CPG – Integrating their entire supply chain value cycle to enable optimization through big data.
(iii) Insurance – Leveraging big data to simplify operations and enhance the customer experience.
Who in the Industry is your typical client for Big Data? Is it the CTO, CIO, CMO or special data leaders?
Praveen Sathyadev: It depends very much on the industry and size, but we see a trend towards CTO/CIO alignment. That said, such leaders are creating a specific role, such as Chief Data Officer or something along those lines, to ensure there is focus on data.
Advice to Aspiring Data Scientists
According to you, what are the top skills, both technical and soft skills, that Data Analysts and Data Scientists need?
Praveen Sathyadev: I hope the aspirants actually understand the difference between these roles. With that assumption, here are my high-level recommendations:
(i) Technical Skills – You must have an understanding of basic querying languages, along with the ability to process large data using top ETL solutions. Cloud is getting even more important, and understanding its basics is relevant. If you aspire to be a data scientist, R/Python alone will not suffice.
You have to understand the statistical concepts and back them with hands-on experience. The best recommendation I give my team is to get familiar with concepts through hands-on work (visualization/BI skills are highly recommended). Last but not least: ‘Automate or Optimize’ everything you build, continuously.
(ii) Soft Skills – It’s all about ‘Story Telling’. Consciously read analytics publications and attend webinars to hone this skill.
How much should aspiring data practitioners focus on working with messy, noisy data? What other areas must they build expertise in?
Praveen Sathyadev: Well, there is no escape from messy data. And if it is not messy or noisy, where’s the fun? As detailed in my previous response, focus on building skills around storytelling, dashboarding and comms in general (PPT/PDF etc.). Adoption is key for any analytical output.
What is your advice for newbies, Data Science students or practitioners who are looking to build a career in the Data Analytics industry?
Praveen Sathyadev: All the skills listed below are relevant. But your reputation comes from experience. Stay in touch with the industry and keep updating yourself. Certifications also help validate your skills. Try to get industry-focused, as it will help you build long-term domain depth.
(i) Programming and software skills – R, Python, SAS or Excel
(ii) Visualization Tools
(iii) Statistical foundation and applied knowledge
(iv) Machine Learning
What are the changing trends that you foresee in the field of Data Science and what do you recommend the current crop of data analysts do to keep pace?
Praveen Sathyadev: Artificial Intelligence and Machine Learning are here to disrupt, but they are still more buzz than well-understood concepts. Underneath this buzz is a strong data weave. Stay relevant in this space and you will survive for the next 10 years at least. Stay aligned with new technology innovation and best practices. Follow thought leaders and disruptors.
Big Data Solution Space
What kinds of structured and unstructured data do companies have? What sizes are we talking about?
Praveen Sathyadev: Put simply, readily decipherable data is structured and everything else is unstructured. Structured data could be generated by an enterprise tool and based on a known or defined schema. Unstructured data could be text, verbatims, etc. The size can be in any range. Generally, petabyte-plus data is considered Big Data worthy.
Are there legacy systems that are being replaced? If yes, which legacy skills are being replaced?
Praveen Sathyadev: This depends on the industry. There are a few industries, like travel and legacy retail, that are still using age-old systems. In general, the transition is towards cloud-based solutions. Skills have to align with the industry and the type of data.
What is the size of clusters/environments that are being deployed for the clients? What are the production challenges?
Praveen Sathyadev: Clusters can range from 1000 to 4000 nodes with over 5000 cores and 30TB+ of RAM. Size and scale are not just numbers here. They are growing by the day. One of the top production challenges is enabling real-time processing for a large data set that never seems to shrink. So, building a future-ready cluster is keeping us all sleepless.
Would you like to share a few words about the work we are doing at Digital Vidya in developing Data Analytics Talent for the industry?
Praveen Sathyadev: I find Digital Vidya to be a humble but strong contributor in this space. While there are many other players sprouting up in this space, Digital Vidya is definitely far more relevant in comparison. They have helped me expand my circle and given me a new perspective on this field. Definitely a good go-to source for industry-relevant information.