I think database technologies don’t get enough love and attention.

Anyway, recently I was playing with the BigQuery API. Very impressive stuff!

I will write up a proper post on this in the future 🙂

Data Science and Soft Skills


I once did an internship under Andrew Fogg at

I learned a lot about data science during that period, but one of the hardest lessons was the importance of soft skills and project management in any data science project.

John Foreman, another idol of mine, talked a bit about this in his book about data.

So although I am not a super experienced data scientist, I am going to talk about what I have learned so far from the data science projects I have been involved in.

Sometimes it is a development project

Sometimes you will encounter data science projects that actually need data engineering or software engineering. I think it is OK for data scientists to do some scripting and maybe hack together some web applications, but that is a bit different from what a software engineering team should do.

Data Science is not software engineering

For reasons I do not yet fully understand, some parts of software engineering project management work in data science projects and some do not. In my experience, treating it as an agile project seems to work, yet daily scrum meetings can sometimes be too much. Too much interaction with business partners can also derail analytics projects.

Gantt charts or burndown charts work to some degree

I have successfully used these in data science projects. They communicate to non-technical stakeholders that progress is being made, which is something they often lack the mental model to judge for themselves.

Solving a problem as stated, without further exploration, is not a good idea

Sometimes you are given a data science project and a suggested technique, and as an analyst you try to solve the problem exactly as stated. This generally backfires. Interaction with the business helps here, as does asking lots of questions to understand what their motivations are.

Deadlines are lies

I have never done an analytics project that went the way I expected. One reason is that some things are what I call ‘linear’ tasks and some things are ‘non-linear’ tasks. Applying a basket analysis algorithm can be a linear task, for example, but only if one has the right data set prepared and is familiar with the programming language and tools being used.

So be very firm and explicit with your stakeholders about what is linear and what is not.

Of course, if you are in an environment that does not allow you to control your own deadlines and has unrealistic expectations for good quality analytics work, then it is probably a sign that the universe is telling you to clean up your LinkedIn profile.

I will explore more of these concepts in the future.