Anirudh is an entrepreneur and engineer from Carnegie Mellon. He has built software and managed teams for 13 years, working extensively with machine learning and NLP. Apart from 3LOQ, he has also built other enterprise tech companies like KeyPoint Technologies and Cafyne.
While partnering with banks and telecom companies to improve their customer engagement rates through new technology, he realized that customers who crossed a certain usage threshold exhibited greater product engagement. This set him on the path to building the world’s first AI engine that builds habits, Habitual.AI. Habitual.AI is currently engaged with leading banks to enable habitual usage of their digital and mobile banking platforms.
What interested you in learning Data Analytics?
Anirudh Shah: I got my first taste of data analytics as a career option when I joined the enterprise tech start-up KeyPoint Technologies as a software developer. I ended up transitioning to natural language processing and machine learning domains, which were new and esoteric tech applications back then in 2007. I loved the work and the rest, as they say, is history.
What was the first data set you remember working with? What did you do with it?
Anirudh Shah: The first large data set that I remember working on came just after we launched 3LOQ. Our telecom client, a Southeast Asian operator, had shared call-data records for more than 2M post-paid subscribers covering a period of 6 months. This dataset was more than 1,000 GB and was very rich: it provided data points ranging from the phone used to the location of the subscriber and the duration of the call.
Was there a specific “aha” moment when you realized the power of data?
Anirudh Shah: It was when we created a machine learning model that was able to predict the kind of establishment in an area by observing the following data inputs (among other things):
- The number and kind of people that were in and around that area
- How their numbers varied throughout the day
How do you stay updated on the latest trends in Data Analytics? Which Data Analytics resources (i.e. blogs/websites/apps) do you visit regularly?
Anirudh Shah: I follow DataTau, Hacker News, KDnuggets, Analytics India, Digital Vidya and Analytics Vidhya.
Share the names of 3 people/publications/research that you follow in the field of Data Science or Big Data Analytics.
Anirudh Shah: Tomas Mikolov, Yaser Abu-Mostafa, Facebook AI Research (FAIR), OpenAI
Team, Skills and Tools
Which are your favorite Data Analytics tools that you use in your job, and which other tools are widely used in your team?
Anirudh Shah: HPCC by LexisNexis, Spark, Hive, TensorFlow, SciPy/pandas, H2O, Superset
What are the different roles and skills within your data team?
- Data Engineer: Responsible for making sure that the data is sanitized and validated for every run.
- Machine Learning Engineer: Responsible for making the models production-ready.
- Principal Data Scientist: Responsible for data exploration and feature engineering.
Can you describe some examples of the kinds of problems your team is solving this year?
Anirudh Shah: We’re automating the process of building product habits with our patent-pending technology Habitual.AI.
How do you measure the performance of your team?
Anirudh Shah: Performance is mainly based on:
- The model performance
- Number of errors in the data
- Software bugs
Big Data Team, Skills and Tools
In the huge Big Data landscape, skills are changing swiftly. Which technology do you see dominating the ETL and real-time data space?
Anirudh Shah: The open nature of Hive and Spark, as well as the large investment by the giants (Google, Facebook etc.) in these platforms, are the real reasons behind their swift adoption.
Are analytical skills, statistics and machine learning must-have or good-to-have skills for Data Engineers?
Anirudh Shah: Basic statistics and probability are must-have skills for data engineers.
Industry Readiness for Data Science
Are industries looking to understand what they can do with data? Do they have the required data in place?
Anirudh Shah: The BFSI, Telecom, e-Commerce and Content sectors have all the required data in place.
What are the top 3 problems in Data Science, either by industry or by technology area?
Anirudh Shah: For the BFSI Sector: Credit Risk, Marketing, Operations.
Advice to Aspiring Data Scientists
In your view, what are the top skills, both technical and soft, that Data Analysts and Data Scientists need?
Anirudh Shah: Curiosity is important – understanding the ‘why’ behind decisions and solutions is key to professional and team growth.
How much should aspiring data practitioners focus on working with messy, noisy data? What other areas must they build their expertise in?
Anirudh Shah: The real world is very messy. The sooner aspiring practitioners get their hands dirty, the better. Kaggle competitions are a great start.
What is your advice for newbies, Data Science students or practitioners who are looking at building a career in the Data Analytics industry?
- Programming and software skills – R, Python, SAS or Excel:
  - Start with Excel, then graduate to Python or R. Sticking to open source technologies has a number of advantages as they have widespread adoption.
- Visualization tools:
  - Start with Excel and graduate to ggplot/Seaborn.
- Statistical foundation and applied knowledge:
  - An understanding of basic probability is a must.
  - How mean/median/quartiles work in a wide variety of data sets and what they actually mean.
- Machine Learning:
  - Tree-based methods: GBT, RF, DT
  - Logistic regression
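To make the point about means, medians and quartiles concrete, here is a minimal Python sketch. It is an illustration only, not something from the interview: the `summarize` helper and the call durations are hypothetical, and it uses nothing beyond the standard `statistics` module. It shows how a single extreme value pulls the mean far away from the median, while the quartiles still describe where most of the data sits.

```python
import statistics


def summarize(values):
    """Return the mean, median, and first/third quartiles of a list of numbers."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "q1": q1,
        "q3": q3,
    }


# Hypothetical call durations in seconds; one very long call skews the data.
durations = [30, 45, 50, 60, 65, 70, 80, 3600]
summary = summarize(durations)

for name, value in summary.items():
    print(f"{name}: {value}")
```

Here the mean is 500 seconds even though seven of the eight calls are under 90 seconds, while the median is 62.5 seconds. On skewed data like this, the median and quartiles are usually the more honest summary.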
What are the changing trends that you foresee in the field of Data Science and what do you recommend the current crop of data analysts do to keep pace?
Anirudh Shah: Make sure that you understand the basics of software development and approach data science by becoming a “full stack” engineer (full stack: exploratory data analysis, feature engineering, model development and optimization, production deployment, data integrity, ops, testing).