Breaking the data science ‘geek cycle’
(Published in Market Leader: Quarter 4, 2014)
Lack of careers awareness in schools and universities is leading to a serious shortage of new data scientists.
There is a perception that data analysis and econometrics are a dark art: incredibly complicated and a black box. This is not the case. The hardcore maths behind the scenes may be complex, but conceptually it’s just a logical process that anyone can follow.
The misconception is partly the fault of data scientists – it is up to them to explain complex models in an understandable way, or to “take the geek out of the equation”. This makes the work infinitely more valuable as it actually becomes usable: you can create the most amazing model in the world, but if you can’t communicate it, it’s worthless. This is why we need to break the geek cycle of data science.
Data science is often considered a new phenomenon, but it has been around for years. The people doing it have just been called different things: econometricians, market researchers, web analysts, programmers, database analysts – they just don’t sound as cool as ‘data scientist’.
Data science has become prominent due to the massively increasing demand for data-handling skills and the need to couple them with the mindset to interpret and deliver decisions from data. This has created a significant shift in the need for data scientists: demand has increased exponentially in the UK over the past 18 months (Figure 1).
According to McKinsey research, by 2018 the US will experience a shortage of 190,000 skilled data scientists and 1.5 million managers/analysts.
A shift in demand without a matching rise in supply has the knock-on effect of wage inflation. Economics 101 states that when the demand for something increases (the green demand curve; see Figure 2), wages will rise to offset the shortage of supply. This has meant average salaries have doubled in the UK since 2010, from £30,000 to £60,000.
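For readers who like to see the logic laid out, the Economics 101 argument can be sketched in a few lines of Python. This is only a toy illustration: it assumes straight-line demand and supply curves, and the function name and every number in it are made up for the example rather than taken from Figure 2 or from any salary data.

```python
def equilibrium_wage(demand_intercept, demand_slope,
                     supply_intercept, supply_slope):
    """Wage at which a linear demand curve meets a linear supply curve.

    Assumed (hypothetical) curves:
      demand: workers wanted  = demand_intercept - demand_slope * wage
      supply: workers offered = supply_intercept + supply_slope * wage
    Setting the two equal and solving for the wage gives the result below.
    """
    return (demand_intercept - supply_intercept) / (demand_slope + supply_slope)


# Illustrative numbers only: demand shifts outwards (intercept 100 -> 140)
# while the supply curve stays exactly where it was.
wage_before = equilibrium_wage(100.0, 1.0, 20.0, 1.0)
wage_after = equilibrium_wage(140.0, 1.0, 20.0, 1.0)

print(f"wage before shift: {wage_before}, wage after shift: {wage_after}")
```

With supply unchanged, the only way the market can clear after demand moves outwards is at a higher wage, which is the wage inflation described above.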
Why is there a shortage of good data scientists?
The shortage has become an issue because the recent rise in demand has exposed a gap in the supply of commercially savvy data experts. The massive hype around ‘Big Data’ (it’s still just data) has pushed companies to make more intelligent decisions.
So, now the issue is known, can’t we just source people with these skills or retrain others? Retraining is difficult because you need familiarity with data coupled with an understanding of how to apply various techniques in a commercial environment. This is difficult even for an experienced person to learn.
In terms of sourcing new people, the main problem is a lack of knowledge that this kind of industry exists, but it is also about applicability in a commercial environment. More specific issues are:
Awareness: When I was doing A-levels in economics and maths, the perfect combination for data science disciplines such as econometrics, I didn’t have a clue that you could make a career out of interpreting and using data.
I fell into my econometrics degree entirely by accident. When faced with which economics degree to choose (straight economics or econometrics), through laziness I chose the one that didn’t need me to select any further modules: econometrics. And I am now eternally grateful for that choice.
This highlights the issue of younger people not having an idea of what choices they have going into university, which can easily end up shaping their lives. If there were wider awareness in schools of different careers, this wouldn’t be an issue.
University careers process: The problem of awareness also creeps into universities. As an econometrician, the career options I apparently had open to me were accountancy (PWC and E&Y are great companies but were really not for me) or civil service (GES, agricultural economics and so on).
I was again very fortunate to stumble across a careers ad for a marketing econometrician, placed by one of the pioneers of the technique, Ohal. Had I not seen the ad, I would have been doomed to go down the accountancy/civil service route: not necessarily a bad thing, but definitely not for me.
Course syllabus at universities: The other issue is what is taught at universities, especially British ones. It’s still the case that courses such as econometrics are taught in a non-commercial way – they tend to focus on the macro economy rather than the micro, or the individual business. This means graduates are less commercially-minded and business-focused than they should be.
However, this is not the case at universities elsewhere in Europe. My wife’s course (she is also a data scientist and, yes, our dinner conversations are enthralling) was in operational research in Germany, which not only concentrated on the analytical techniques companies actually used, but also had an element of commercial work experience built in. In over 15 years of recruiting data scientists, I have found that universities in Germany, Italy, Spain and Lithuania tend to lead the way in commercially-focused degrees that set graduates up much better for working life.
Communication: The final piece of the puzzle is finding people who can translate complex analysis into a digestible format that others can follow. Such people tend to be a rare breed, but they are essential to the data science industry: if no one gets it, then no one will trust it, which means no one will use it.
This is the problem with just hiring ‘maths people’: building great models alone is not enough to make better-informed business decisions. What you need on top of this is the commercial application of the model, where the analyst needs to translate the model into business solutions.
If the model isn’t commercially viable, or isn’t communicated clearly, then all that analysis and brainpower goes to waste.
So what can be done to fill this void? The most important thing is raising awareness and also the profile of data analytics. As a data analytics company, we are doing a few things to help, and we hope that others will do the same.
Universities: We are in the process of setting up programmes with universities to help promote awareness of the commercial analytical environment to undergraduates. This ranges from giving talks in seminars on what we do and how companies use our work, to helping create course modules that are more business-applicable.
Schools: I’m fortunate enough to have some very good teacher friends, so we can also go into schools and let them know what we’re doing.
Communication: We work very closely with our clients to help knock down the geeky barriers of data analysis. This includes explaining things in plain English, not trying to show them how clever we are, but showing simply and clearly how we do the modelling. This even involves building up models with them from scratch so they can see that data analysis is a very logical (and slow!) process that doesn’t involve a mythical black box. This builds awareness and knowledge with clients to make the techniques more accessible, well-known and therefore trusted.
Our overall mission is to break this geek cycle to grow the industry in terms of size and commercial maturity:
- Get more people aware of, and interested in, careers in data.
- Ensure these people are commercially aware and able to communicate complexity in a comprehensible way.
Our industry is very exciting and the work we do with clients is getting more and more useful with every project. We want to make sure this continues and that data becomes the backbone of all commercial decisions.
Michael Cross is director of Brightblue Consulting firstname.lastname@example.org