Three things I wish I knew earlier about Machine Learning

I’ve been working with Machine Learning models in both academic and industrial settings for a few years now. I’ve recently been watching the excellent Scalable ML from Mikio Braun to learn some more about Scala and Spark. His video series talks about the practicalities of ‘big data’ and so made me think what I…

Image Similarity Database…

Image similarity questions are very common in e-commerce and fashion. This is particularly the case with the question of similar colours. I based the following on the excellent work by my friend Thomas Hunger. My implementation has only a few alterations compared to his, but I felt it was worth putting online even if I’m…

Hacking a Paris corpus from Inside Airbnb

There is an excellent resource called Inside Airbnb which has some data sources included in it. I hacked together a script to extract a corpus from the descriptions in Paris, and then applied this code. On GitHub I’ve put up the code and examples of this. One problem with this example is that currently there are no…