2016: In Review

I’m mostly writing this for me, but maybe it will be interesting to you too! Here are some things that happened in 2016 (just to me personally, and mostly about programming). This is based on the excellent post by Julia Evans.

Open Source

I continued being involved in PyMC3. This has taught me a lot about programming and the challenges of shipping software. The code reviews by Thomas Wiecki and the others have been amazing.

I helped pick the new logo and worked on PyMC3 becoming a project fiscally sponsored by NumFOCUS. For those of you who don’t know, NumFOCUS is an organisation that supports open source projects, the conferences associated with them, and diversity in open source. It largely focuses on the Python ecosystem but has branched out to other projects.

Learning about this has taught me a lot about the governance aspects of OSS – and our responsibilities to encourage more people into this ecosystem. I consider that an important part of my duties as a member of the Open Source world.

Talks

  • Spoke at PyData London – about Statistics with Python.
  • Spoke at the Toulouse Data Science Meetup – about the PyData ecosystem.
  • Keynoted at PyData Amsterdam – I spoke about the current PyData ecosystem and what various tools like Dask, NumPy, Numba, etc. are all for.
  • Gave a talk at the Bayesian Mixer in London on the state of PyMC3. I spoke a bit about the new tools in Variational Inference, which has been a research topic of mine for the past year. I wish I had time to finally write some slides on that.

Doing the PyData keynote was kind of exciting/scary (me??? keynote???) and I think it turned out well and I’m happy I did it. I love the PyData community and I’m happy with the talk I gave.

It’s been fun to experience some of the other places that are doing Data Science and forming communities. At each of these events I’ve met a lot of cool people. It’s great to see our industry grow up!

In 2017 I’ll be keynoting at PyCon Colombia in February. I’m excited to give this talk. I also want to go to a conference like NIPS/KDD/ICLR/ICML to stay a bit closer to some of the improvements in the Machine Learning world from academia and industry.

cool: Writing for Hakka Labs

  • I was honoured to be featured on Hakka Labs. Hakka Labs run the excellent Data Eng Conference and publish some awesome content on their blog. I wrote about Three Things I learned about Machine Learning; this is an ongoing journey where I realise how little I know.

cool: Blog

Some of my favourite posts this year have been:

  • A map of the PyData Stack – This was an idea floated with Thomas Wiecki before. I finally got around to doing this for my keynote. The aim was to give people a ‘map of the PyData stack’ and show what different tools were for.
  • I interviewed one of my heroes – Greg Linden, who helped devise the first Collaborative Filtering algorithm in production at Amazon.
  • I did some other interviews – I liked this one too, with Masaaki Horikoshi, one of the most prolific contributors to the PyData ecosystem.

I’ll continue to do some interviews over the next year, and hopefully add them to a revised book.

cool: moving to London

I moved to London in late March. I’ve found it very exciting to be close to the Machine Learning and Data Science communities there. It was a hectic few months adjusting to new job(s), but I’m glad I made the move.

I think everyone should spend some time in a major city when they’re young.

I hope to blog a bit more about work in the next few months.

cool: Teaching Data Science

My friend John Sandall mentioned a Teaching Assistant gig at General Assembly.

I helped about 20 students from various backgrounds learn more about Data Science. Sharing my own experiences reminded me that 1) I knew stuff and 2) teaching is hard.

I recommend that all Data Scientists and Engineers teach if they get the time. It’s a great experience and I learned a lot about what was easy and hard in Machine Learning.

conclusions?

some things that worked:

  • asking a lot of questions about how computers work (not a surprise)
  • working on a team of people who know more stuff than me, and listening to what they have to say
  • asking for advice from people who are more experienced than me
  • at work, figuring out what’s important to do and then doing the work to get it done, especially if that work is boring / tedious
  • working on one thing at a time (or at least not too many things)
  • getting a bit better at software “process” things like design documents and project plans
  • learning how to mentor junior data scientists – this is something I’m continuing to do
  • learning more about leading teams in ML – which is hard. I probably won’t be doing too much people stuff over the next few months.