- 16th April 2018
- Posted by: Manolis
“We are looking for a mixture between a statistician, scientist, machine learning expert and engineer: …The ideal candidate understands human behavior and knows what to look for in the data.”
That is a real excerpt from a recent job listing for a data scientist. From startups to Fortune 500 companies, businesses are on the hunt for that rare individual who likely has a Ph.D. in mathematics, is versed enough in Python or R to write production-ready code, has an intuitive grasp of their coworkers’ behavior, and likely also has experience in whatever industry the business is in.
Sound like a tall order? Perhaps it’s these lofty expectations that have been fueling the corporate hiring reality that data scientists are about as hard to find as unicorns.
In recent years, data scientist has become one of the most sought-after positions for enterprises, but finding someone with this skill set is a challenge. The number of open positions is ballooning. In fact, Forbes predicts that by 2020 the number of job listings for data science and analytics roles will grow by roughly 364,000 to 2.72 million, with most of these positions concentrated in finance and insurance, professional services, and IT.
But actually finding these people often proves tough. Multiple surveys over the past few years have suggested there is a data scientist shortage. These positions stay open an average of five days longer than the market average of 45 days. When companies do land one of these needle-in-a-haystack hires and the hire proves their merit, that person is often poached by a bigger company that can offer an even bigger salary. This has driven the average tenure in these positions down from 2.5 years to two.
This leaves businesses in a bit of a conundrum. They supposedly need data scientists, who are being presented as the greatest thing since sliced bread, but finding and keeping these employees is a tenuous mission at best. But there could be another way out of this situation: What if organizations don’t need data scientists as much as they think they do?
It may be hard to imagine, but thinking through the details makes it clear that the current mantra that businesses must have a data scientist needs retooling. What if, instead of it being mandatory to put a single person at the helm of these many disparate duties, there are other options that could help organizations achieve their data science goals without tying them to hiring one perfect data scientist?
The following are a few areas enterprises should consider as they rethink the role of the data scientist today.
A Change in Infrastructure
In the rush to not miss out on the big data hype of the early 2010s, companies stood up a lot of different tools that were supposed to help make sense out of the endless onslaught of data points they were collecting. The outcome of this for many companies was a messy architecture that needed to be rejiggered and jury-rigged to manage whatever was the most pressing analytics issue of the moment.
Unfortunately, this means that the proper infrastructure isn’t in place for today’s data scientists. Thus, these professionals end up spending a lot of their time performing maintenance instead of doing data science. They end up fixing this infrastructure and its accompanying engineering problems, a task below their pay grade, and are forced to focus on data quality and data management — definitely not as sexy as that job description made things sound.
The reality is, finding an alternative is necessary for many businesses to succeed. Organizations that are hard-pressed to find someone who can handle the messy data infrastructure they’ve amassed can’t expect that person to want to stick around, performing work that’s menial compared with the task they were originally hired to do.
If companies want to get the most out of their data scientists, they must put in some upfront work with more appropriate personnel to ensure that the infrastructure works as a support system for their data science efforts, not merely as one more tool in a very disorganized toolbox.
Considering the Value of Expertise in Data Analytics
When a better infrastructure is in place, it should be able to capture data, curate it, store it and retrieve it. Then someone needs to bridge the last mile so it becomes operational. Yet calling this person a data scientist might be a bit of a misnomer. Perhaps companies should instead focus on the specific skills they need, not on a jack of all trades.
Often, the simplest approach is the best when it comes to data, so hiring smart people who are skilled in data, but don’t necessarily have a Ph.D. or the expertise to invent a new algorithm, may be the best choice.
Otherwise, companies should explicitly ask for what they need. Do they need a machine learning expert? The job description should say so, instead of relying on the buzzword “data scientist” to attract attention to a listing.
Turning to a Team
The most obvious answer to needing to find talent that can cover a plethora of skills is simple — hire a plethora of talent.
Instead of tasking a single data scientist to manage a company’s analytics goals, companies should amass teams that can be more than the sum of their parts. Yes, this is a more costly option than hiring one data scientist, but the benefits of a team are that you can search for people with a core competency instead of that all-around unicorn. If companies opt for this team-based structure, this may mean there is still a lead data scientist that relays findings up to the C-suite, but these teams could help round out areas where a single data scientist would fall flat. Then your best coder, who isn’t a natural at giving board-worthy presentations or intuitively understanding how staff will use their data models, can focus on what they do best. And they can rely on another member of the team, who may not be the most creative coder but excels at interpersonal skills, to be in charge of operationalizing data science projects across an organization, so teams outside of the data science group actually use them.
Instead of offloading a ton of work on your other IT-centric employees, this team’s mission should be to seek and instill new efficiencies in a company’s data management. Initially, this will require work and focus, but the payoff in the long run should be a data management team that is more agile and capable of handling a company’s needs.
A New Approach
Instead of trying to fill millions of jobs with a small number of people qualified to transform a company’s data analytics efforts from soup to nuts, businesses could bypass the need for a data scientist in three ways: adopting an infrastructure that makes their data more accessible, understanding that the buzz around the job title “data scientist” might be a distraction, and leveraging a team of people to meet their data science goals instead of a single person.
Companies could implement full-stack tools that provide them with rich insights through analytics, targeting, user segmentation and multivariate testing, to name a few capabilities. These tools must have a user-friendly interface, so that the data-driven insights normally available only to the data scientist are easily accessible to a group of people in the organization. This can give companies an out-of-the-box capability to understand the story their data is telling them, without requiring a specific skill set from a single person.
For enterprises that do this, when Google comes calling to hire away their best and brightest talent, they will still have the infrastructure and personnel in place and will not have to restart their data-driven enterprise goals from scratch.