So this is a quick review of a book that ended up in my mailbox a few months ago.
Firstly the good: this is a good academic introduction to a variety of techniques, all in one reference book. I particularly liked the discussion of process mining and survival analysis, as I feel these techniques are often neglected in discussions of data science. I know that the author of the Lifelines library, Cameron Davidson-Pilon, has done some screencasts on Data Origami about survival analysis and its applications to, say, customer churn modelling, but this is the first time I have seen it in a book aimed at data scientists.
I believe that Bart is an expert in risk modelling, so there is a lot of discussion of financial services applications. This is fine and a good addition to the literature on data science, since a lot of that literature is focused on machine learning applications for social networking websites or the e-commerce sector. This focus on financial services may be due to the fact that Bart is based in Europe as opposed to the Bay Area.
An interesting addition to the data science literature is his applications chapter, which includes Business Process Analytics. As someone who has worked on some business process mining, I’ve not seen many references to this field in the literature, and certainly none in book form, so this is a worthy addition.
The bad: the print quality of the book is terrible and the paper is a colour that makes reading it extremely difficult. I also felt that the typeface used for the mathematical equations was hard to read. This may not be Dr Baesens’ fault. I felt that some of the material was not new to me, but this is fine: I’ve probably got more experience in this sector than the target audience, which seems to be Masters or PhD students in STEM subjects who are close to finishing and are considering a career in data science.
I would also have liked more discussion on how to present your ideas to clients, but I guess this is for a separate book or a book on ‘Creating Data Products’.
Nevertheless, I would recommend the book to any MSc or PhD students interested in a career in data science, and to any analysts like myself who want a good reference for survival analysis and process mining. I think those chapters and subchapters make this a worthy addition to my own library. I also think that the discussion of risk modelling and customer churn modelling is excellent, as it takes a bottom-up approach, from the mathematical models and data processing through to how a model could be produced and evaluated. Together with, say, a good Coursera course, this could be excellent preparation for interviews for data science roles.
Disclaimer: Dr Bart Baesens sent me a copy of this book for review, but I have no stake in its success.