When is your company ready for a data engineer vs. a data scientist? There’s no hard-and-fast answer. Rather than asking what position to hire for, Galvanize Principal Instructor and Lead Data Scientist Benjamin Skrainka recommends founders and CEOs consider the skills they need on their team and work backward from there. Generally speaking, the needs of your company will evolve from the skill set usually packaged as ‘data engineer’ to the skills usually packaged as ‘data scientist,’ and possibly back to data engineering again. The trick to building your dream team is understanding what problems you need to solve and what skills are going to get you the best solution.
From the start, you need software infrastructure to both store and access your company’s data. That’s what a data engineer does for you: builds and maintains your data infrastructure, and makes sure your data is accessible to the folks who will analyze it and make it useful to your company. Those folks are your data scientists. As a rule, founders and CEOs will need a data engineer in the early stages of their company’s life. They might also consider contracting a data scientist at this stage to ensure that the initial infrastructure is built in a way that will still be useful down the line, when the business is ready for a data scientist on staff (so that it doesn’t have to be redone).
This is also an important time to keep scalability in mind. If you’re planning on growing (you are), you’ll want your engineers building something scalable.
Once you’ve amassed enough data to produce meaningful insight, and you have the resources to start asking questions about optimizing your business (you’re beyond the just-getting-it-to-run part), you’re ready for a data scientist.
“If you bring data scientists in too early, they may not be happy, and some companies make that mistake,” Skrainka said. (If you’re a data scientist, you can read about how to find the right employer for you here.)
As the company continues to grow, there may be cause down the line to bring data engineering back onto the team.
“As the actual scale of data or the speed requirements become higher, you’re more likely to need specialists with data engineering talent who understand those things,” Skrainka said. “You might need data engineers who understand building and debugging massively paralleled, distributed scales of systems.”
‘Massively paralleled’ means you have so much data that you have to store it across many machines. You therefore need someone who can make it computable in segments and then combine those partial computations into a correct overall result, such as an average across all of your data. This might be where your data engineers and data scientists need to work together.
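To make the idea of computing in segments concrete, here is a minimal Python sketch. It is an illustration under simple assumptions, not a description of any particular distributed system: the names `PartialResult`, `summarize_partition`, and `combine` are hypothetical, and plain lists stand in for data that would really live on separate machines. Each segment reports a sum and a count, and the overall average is computed from those, because simply averaging the per-segment averages gives the wrong answer when segments differ in size.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class PartialResult:
    """Summary one machine sends back for its segment (hypothetical type)."""
    total: float
    count: int


def summarize_partition(values: Iterable[float]) -> PartialResult:
    """Compute the sum and count for a single segment of the data."""
    total = 0.0
    count = 0
    for value in values:
        total += value
        count += 1
    return PartialResult(total, count)


def combine(partials: Iterable[PartialResult]) -> float:
    """Merge per-segment summaries into one average over all the data."""
    total = 0.0
    count = 0
    for partial in partials:
        total += partial.total
        count += partial.count
    if count == 0:
        raise ValueError("cannot average an empty data set")
    return total / count


# Assumption: three lists stand in for data stored on three machines.
partitions = [[10.0, 20.0], [30.0], [40.0, 50.0, 60.0]]

partials = [summarize_partition(p) for p in partitions]
overall_mean = combine(partials)

# The tempting shortcut: average the per-segment averages.
segment_means = [sum(p) / len(p) for p in partitions]
naive_mean = sum(segment_means) / len(segment_means)

print(overall_mean)  # 35.0
print(naive_mean)    # differs, because the segments are not the same size
```

The design choice worth noticing is that each segment returns a small summary rather than its raw data. Deciding which summaries are enough to reconstruct the right answer is the kind of question where data engineers and data scientists tend to need each other.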
Bear in mind, the terms ‘data scientist’ and, to a lesser extent, ‘data engineer’ mean different things to different people. That’s why it’s so crucial to look for the specific skills you need to solve your company’s problems and hire for those. There could be a data engineer out there who can fill your data science needs, and vice versa; there could also be data scientists out there who aren’t the right fit to solve your specific problems. There are five essential skills in the world of data, and as Skrainka likes to say, they exist on a continuum with data engineering on one end and data science on the other. There is overlap, more or less depending on the individual, and some candidates will be stronger in specific areas suited to your team’s needs. The skills are:
- Math and Statistics
- Machine Learning
- Data Engineering
- Big Data Tools
“Very few people can master all of these or be good at all of them, so people are somewhere on the spectrum,” Skrainka said.