Building Full-Stack Vertical Data Products

I’ve been in the Data Science space for a number of years now. I first got interested in AI/Machine Learning in 2009, and my background is typical of many people in my field: I come from Physics and Mathematics.

One trend I’ve run into both at Corporates and Startups is that there are many challenges to deploying Data Science in a bureaucratic organisation – or delivering Enterprise Intelligence. Running into this problem led me to be interested in building data products.

One of the first people I saw building AI startups was Bradford Cross, and he’s been writing lately about his predictions for 2017 in the Machine Learning startup space.

I agree with his precis that we’ll begin to see successful vertically-oriented AI startups solving full-stack industry problems that require subject matter expertise, unique data, and a product that uses AI to deliver its core value proposition.

At Elevate Direct we’re working on the problem of sourcing and hiring contractors, which addresses one of the fundamental problems that companies have: hiring the best contractor talent out there.

So what are some of the reasons that it can be hard to deploy Data Science internally at a corporate organisation? I think a number of them are related to patterns we already see in software delivery more generally.

  1. Not being capable of building consumer-facing software – Large (non-tech) organisations sometimes struggle to build and deliver software internally. I’ve seen a number of organisations fail to do this, and their build process can take 6 months.
  2. Organisational anti-patterns – I’ve seen some organisations that inhibit their own ability to deploy product. Some of these anti-patterns are driven by concerns about the risk of deploying software, and they often end up with diffuse ownership, where an R and D team can blame the operations team and vice versa.
  3. Building Data Products is risky – Building data products is hard and risky. I think you really need to approach data products in a lean-startup kinda way: deploy often, and if it works it works, if not cut it. Sometimes the middle management of large corporates is risk-averse and so finds these kinds of projects scary. Building them also needs a lot of expertise: subject-matter expertise, software expertise and machine learning expertise.
  4. Not allowing talented technical practitioners to use Open Source/pick the tools – I once worked at a FTSE 100 company where it took me about 6 weeks to be able to install Open Source software tools such as R and Python. It severely restricted my productivity. In that same amount of time at a startup, my team probably deployed about 1000 changes into production, to a customer-facing app. This reminds me of point 3 above. Don’t restrict the ability of your talented and well-trained people to deliver value. It makes no sense from a business point of view. Data Science produces value only when it produces products or insights for the business or the customers.
  5. Not having a Data Strategy – Data Science is most valuable when it aligns with the business strategy. Too often I’ve seen companies hiring data scientists before they have actual problems for them to work on. I’ve written about this before.
  6. Long-term outsourcing deals – This is an insidious one, and one that came from a period of time when “IT didn’t matter”, before big Tech companies proved the value of, for example, e-commerce in the consumer space. It’s impossible to predict what will be the key tech for the next 10 years, so don’t lock yourself to a vendor for that period of time. Luckily this trend is reversing: we’re seeing the rise of agile, MVP, cloud computing, design thinking and getting closer to the customer. A great article on this re-shoring is here.

I think fundamentally a lot of these anti-patterns come from not knowing how to handle risk correctly. I like the idea in that RedMonk article that big outsourcing is a bit like CDOs in finance. Bundling the risk into one big lump doesn’t make the risk go away.

I learn this day after day working on building data products and tools at Elevate. Being honest about the risks, and working hard to drive them down in an agile way, is the best we can do.

Finally, I think we’re just getting started building Data Products and deploying data science. It’ll be interesting to see what other anti-patterns emerge as we grow up as an industry. This is also one of the reasons I’ve joined a startup, and why I’m very excited to work on an end-to-end Data Product that is solving a real business problem.
