Anshu is a strategic leader with a keen sense of piloting & managing business, processes, technology deliveries and people. She has over 20 years of experience with premier investment banks and technology firms. She was recognized as one of the top 10 influential data science leaders of 2017 by Analytics India Magazine for setting up a high-performing data science unit for AIG in India. Her initiation into the world of data analytics & data science was unintentional. While working at a large global investment bank, she faced the curious challenge of solving a particular business problem. She then collaborated with a data scientist, who opened the windows to clustering algorithms, networks and visualization. Since then she has been a believer and is super invested in the power and possibilities of data science.
What was the first data set you remember working with? What did you do with it?
Anshu: It was the daily usage data of over 30,000 users – big data all right, given it was a time series recorded every second. The first challenge was to organize that data and then to draw meaningful insights from it.
How do you stay updated on the latest trends in Data Analytics? Which Data Analytics resources (e.g. blogs/websites/apps) do you visit regularly?
Anshu: The world of data science is evolving faster than any of us would have imagined. And while there are journals, sites and handles to follow to stay abreast of the ‘science’ part of it, it is equally or more engaging to look at the applications and how the same algorithm or model can be applied to different business areas and industries. Bangalore itself has a very rich data science community of practitioners and thought leaders.
Share the names of 3 people/publications/research that you follow in the field of Data Science or Big Data Analytics.
Anshu: Locally, I find Analytics India Magazine an interesting mix of engineering and applied concepts. I follow a lot of interesting handles and people, such as Dataminr and DJ Patil, and then dive in if the topic attracts me. Closer to home, Gramener is a unique company that takes data storytelling to another level.
Team, Skills and Tools
What are the different roles and skills within your data team?
Anshu: From data mining, which employs data engineering techniques and is honestly highly underrated, to building statistical and machine-learning-driven models, the teams have had a variety of skills.
Big Data Team, Skills and Tools
In the huge Big Data landscape, the skills are swiftly changing. Which technology do you see dominating the ETL and real-time data space?
Anshu: Data is hugely democratized now and people are moving towards less proprietary and more collaborative tools. Hive and Spark have been big game changers and this space continues to evolve. Hadoop still remains a favourite for some of those trying to dabble in this space.
How do aspiring Data Engineers demonstrate their capability in handling the tools, technology, data and domain? Is a certificate (Cloudera/Hortonworks) a clear differentiator?
Anshu: My personal view is to focus on the problem to solve, and then apply the right tools and technology. A lot of times we tend to first jump into the tool or technology selection thereby adding a bias.
Personally, I am not a big fan of ‘certification’ – adding a certification doesn’t guarantee critical analytical thinking but yes, it definitely can be viewed as an endorsement of your technical abilities.
Are analytical skills, statistics and machine learning must-have or good-to-have skills for Data Engineers?
Anshu: Just as the software programming world has moved to ‘full stack developers’, in the parallel analytics universe ‘applied analytics’ is valued more than operating in a data or a model silo. Having an appreciation of how data will be consumed or applied, and understanding the business context, is a huge value add.
Industry Readiness for Data Science
Are the industries looking to understand what they can do with data? Do they have the required data in place?
Anshu: Of course there is an abundance of data – transactional, behavioural, cognitive, all kinds – and every day, every hour, every minute more data is being produced. Making sense of that data and using it for business gains is no longer just ‘desirable’; it is now a necessity. Whether a small to medium enterprise or a large-scale one such as Amazon or Uber, every business uses data science heavily to run efficiently.
Which are the top 3 problems on top of the Data Science list, either by industry or by technology area?
Anshu: If it’s not on their list already, then I would believe Bangalore’s traffic management should certainly be (laughs!). But on a serious note, effective marketing and converting sales opportunities, risk control and avoidance, operational efficiency and customer delight are top of mind for data scientists, irrespective of the industry they belong to.
Industry Readiness for Big Data
Is Big Data becoming a reality in the industry beyond the social giants like Facebook, Google, Yahoo? If yes, which industries are actually moving towards the power of Big Data Analytics? If no, what is the outlook for adoption?
Anshu: Absolutely. Big Data is no longer limited to the media uploads of your Yahoo and Facebook reference. Today, everyone from service industries (telecom, insurance, banking, airlines and others) to aggregators (Uber, Booking.com and more) uses Big Data. Beyond this, the manufacturing industry is very high on the power of Big Data and the related analytics it can provide for preventive maintenance and forecasting productivity, which has now become more sophisticated than ever.
Farmers in our country today have the benefit of using tractors fitted with IoT sensors that transmit to the nearest service centre in case of a breakdown. Not only this, they are able to get proactive, localized information and advice.
Doctors are able to quickly research and connect prognosis not just by Googling but also by using the correlations provided by underlying models that run on large datasets of research and medical information.
All this is made possible by Big Data and other related technology advancements.
Who in the Industry is your typical client for Big Data? Is it the CTO, CIO, CMO or special data leaders?
Anshu: Depending on the problem statement, the internal client could be a CTO, CMO or CIO, which goes to show that analytics is no longer limited to deriving market intelligence – it is now pervasive across the operations and growth of the business.
Advice to Aspiring Data Scientists
According to you, what are the top skills, both technical and soft, that are needed for Data Analysts and Data Scientists?
Anshu: Apart from the technical skills, which would be the tools and frameworks, some of the other fundamental skills would be to understand the business domain and context. I would also recommend that those in this space make an effort to understand and appreciate the big picture. Keeping oneself updated and connected in the industry is also important, and platforms such as Kaggle help keep your problem-solving edge sharp.
How much focus should aspiring data practitioners put on working with messy, noisy data? What are the other areas that they must build their expertise in?
Anshu: Noisy, messy data – well, if you don’t dirty your hands you will only be building models that don’t move the needle and will be considered academic. Real business is not about creating perfectly aligned datasets; data engineers and data scientists are valued because of their ability to create insightful nuggets of information from underlying data. Having said that, putting in strategic solutions to help streamline underlying data may very well be a worthwhile exercise too.
What is your advice for newbies, Data Science students or practitioners who are looking at building a career in Data Analytics industry?
Anshu: Focus on technique first and then the tools, because tools change every so often in this fast-growing industry.
Programming and software skills – R, Python, SAS or Excel
Statistical foundation and applied knowledge
Machine learning, while aspirational and much in demand, needs a certain flair for computer science fundamentals, mathematics and programming. Make an honest assessment of why you are in it, and then take the plunge.
Don’t let data (or the quality of it) deter you.
What are the changing trends that you foresee in the field of Data Science and what do you recommend the current crop of data analysts do to keep pace?
Anshu: Knowledge of basic statistical concepts and being able to leverage libraries and program is now a default assumption. Additionally, one should continue to be hands-on. Aside from technically enriching yourself, also understand the power of unstructured and public data sources.
Big Data Solution Space
Are there legacy systems that are being replaced? If yes, which legacy skills are being replaced?
Anshu: Traditional business intelligence areas are already morphing into insight-driven models backed by AI/ML and analytics. Traditional reporting and dashboards have made way for on-demand, visually powerful dashboards and rich mobile apps that allow for more intuitive decision making.
Would you like to share a few words about the work we are doing at Digital Vidya in developing Data Analytics Talent for the industry?
Anshu: Digital Vidya has been at the forefront of grooming aspiring data analytics practitioners. This is a critical time for the industry, as demand is outstripping capacity. In the future, Digital Vidya and related firms have a greater responsibility and role to play in building a robust talent pool.