Dev 007: February 2016

Saturday, February 13, 2016

Reminiscence of writing a book on Machine Learning: Challenges, Lessons Learnt, Experiences and Insights...

I spent about six months writing a book on machine learning, “Apache Mahout Essentials”, which was recently published by Packt Publishing (UK).

I’m sharing my experience in this article, as it may help others who want to pursue the same.

So, I got an invitation to write a book, what’s next?

When I got an email from Shaon (Acquisition Editor at Packt Publishing) inviting me to write a book, I immediately replied that I was occupied (if not overloaded) with my MSc and office work and would not be able to do it. Shaon then approached me again, saying they could offer flexible timelines for chapter deliverables, and asked me to give it a second thought.

Then I spoke to Abi with three possible options in hand. One was “not writing the book”, which she eliminated straight away, saying that “even writing a bedtime story book is something she wouldn’t miss out on”.

I also spoke to Rafa, who was the Head of Research at Zaizi some time back. He assured me that I could do this and gave me a piece of advice that was just three words long but helped me vastly throughout the journey of writing the book: “Step by step!”

So, I want to emphasise that, even though I’m getting some recognition for writing a book, it would have been just a rejected invitation if it weren’t for these people. I have no words to express my gratitude to them for the motivation they provided.

From my side, the steady and compelling reason to start writing this book was my unquenchable curiosity about machine learning and the desire to learn more.

Yup, decided to go ahead and try it out. But still…

So, I started writing, and in no time I realised that it was not as easy as I had imagined.

One reason was that I was following an MSc in Artificial Intelligence, where we had to complete 4 modules in 8 weeks (with exams the following week and no study leave) and had lectures all weekend from 8 to 5 (those who went through this struggle will realise the pain ;)). Apart from that, I was working full time as well. To make the situation even worse, I had to travel for 2 hours daily, as I stayed outside Colombo.

So, I decided to use the travel time effectively and read the required content on my smartphone, even while standing in a crowded train. There was a period when I worked almost every hour without a break. As a result, I got stressed out and was sick most of the time.

This is where "focusing on one thing at a time" helped me, as it was so overwhelming to think about all the items in my “things-to-do” list. Also, I planned out the structure and the content with a fresh mind before starting to write. And then I spent the whole night before the deadline finalising everything.

However, regardless of the problems that came my way, I was determined to complete what I had started. I remember one day when I had a terrible ear infection and was still struggling until 3 a.m. to meet a chapter deadline.

Shaon and Nikhil (Content Editor at Packt Publishing) were working with me during this time, and they were kind enough to give me flexible chapter deadlines that did not overlap with my university exams.

Finally, it was all worth the effort!

The book went through several stages of reviews, revisions, etc. before publishing, and the happiest moment of all was when I completed all the first drafts.
The next happiest was probably getting the opportunity to choose an image with n shades of grey as the cover page. ;)

Reading has been my favourite and most consistent hobby since childhood, yet I was unaware of the publishing process a book has to go through before it reaches readers’ hands. So, getting to know the process itself was another exciting factor.

In addition to learning and writing about ML concepts, planning how to structure and present the content so that others can understand it was a novel experience as well.

Finally, writing a book was one of the items on my bucket list, and it turned out to be immensely rewarding, exceeding my expectations.

However, this is just one milestone in the long journey of machine learning. There is a lot to learn, a lot to experience and a lot of things that need to get better :)

Recap on WiML and NIPS 2015

I presented at WiML 2015 and attended NIPS 2015, which were held in Canada. I thought of sharing that experience in this blog. I know it's a little late, but better late than never... ;)

WiML was founded by two women researchers from Microsoft Research (Hanna Wallach and Jenn Wortman) when they were sharing a room at NIPS. Very few women engage in machine learning research compared to men, especially in our region. WiML was formed to provide women machine learning researchers with an opportunity to collaborate and share their research experiences.

A few quick facts on WiML:

Support network for women researchers
Share knowledge about their research work
Initiated as a proposal for a Grace Hopper conference session
Co-located with the Grace Hopper conference (Women in Computing)
2008: co-located with NIPS

What did I present there?

At WiML, I presented an approach to collectively analyse and retrieve different forms of content, such as images, video and text, that are embedded within other forms of content. I have given more information on this at the link below:

http://jayaniwithanawasam.blogspot.com/2012/12/content-extraction-and-context.html

A few notes I took from the WiML invited talks

The slides from the speakers for the invited talks are available at: http://www.thetalkingmachines.com/blog/2016/1/15/real-human-actions-and-women-in-machine-learning

Superhuman multitasking - Raia Hadsell - DeepMind

Games as a platform to implement and test AI applications
Why? Difficult and interesting for humans, a huge variety of games, built-in evaluation criteria and rewards
Atari 2600 games
Reinforcement Learning
Deep Q-Learning
Knowledge/policy distillation - model distillation (model compression: compressing the knowledge in an ensemble into a single model)
Create intelligent agents that can learn many tasks, e.g. multiple Atari games
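
As a rough illustration of the distillation idea in the notes above, the sketch below compares a teacher's action scores with a student's by turning both into probability distributions and measuring the KL divergence between them. This is only a minimal sketch: the function names, the scores and the use of a temperature-softened softmax are assumptions made for illustration, not details taken from the talk.

```python
import math


def softmax(values, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same actions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def distillation_loss(teacher_scores, student_scores, temperature=1.0):
    """How far the student's action distribution is from the teacher's."""
    teacher = softmax(teacher_scores, temperature)
    student = softmax(student_scores)
    return kl_divergence(teacher, student)


# Hypothetical action scores for three actions in one game state.
teacher_q = [1.0, 2.0, 0.5]
matching_student = [1.0, 2.0, 0.5]
poor_student = [2.0, 0.5, 1.0]

print(distillation_loss(teacher_q, matching_student))
print(distillation_loss(teacher_q, poor_student))
```

A student that reproduces the teacher's preferences gets a loss of zero, while one that prefers a different action gets a positive loss; training the student means pushing this number down across many states.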

Structured data/facts at scale (and a bit of machine learning at Google) - Corinna Cortes - Google Research

Structured snippets - extracting structure from unstructured content - less clicking, more convenient
Problem - how do we find good tables on the web?
Feature design:

The semantics of a table are often determined by the surrounding text
Detecting subject columns - the other columns contain properties of the subject

Determining column classes using the Google Knowledge Graph
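
To make the "subject column" idea more concrete, here is a toy heuristic: pick the leftmost column whose cells are mostly text and mostly distinct, and treat the remaining columns as properties of that subject. The thresholds, the function names and the sample table are all assumptions for illustration; this is not the feature set described in the talk.

```python
def looks_numeric(cell):
    """True if the cell parses as a number (commas allowed as separators)."""
    try:
        float(cell.replace(",", ""))
        return True
    except ValueError:
        return False


def guess_subject_column(rows, threshold=0.8):
    """Index of the leftmost mostly-text, mostly-distinct column, or None."""
    n_cols = len(rows[0])
    for col in range(n_cols):
        cells = [row[col] for row in rows]
        text_share = sum(not looks_numeric(c) for c in cells) / len(cells)
        distinct_share = len(set(cells)) / len(cells)
        if text_share >= threshold and distinct_share >= threshold:
            return col
    return None


# Hypothetical table body: rank, country, capital city.
table = [
    ["1", "Canada", "Ottawa"],
    ["2", "France", "Paris"],
    ["3", "Japan", "Tokyo"],
]

print(guess_subject_column(table))
```

Here the rank column is skipped because it is numeric, so the country column is chosen as the subject and the capital is read as one of its properties.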

Is it all in the phrasing? - Lillian Lee - Cornell University

Does phrasing affect memorability?
Memorable and non-memorable movie quotes
Memorable quotes use less common word choices
Memorable quotes tend to be more general in ways that make them easy to apply in new contexts
“These aren’t the droids you are looking for” :)
http://www.cs.cornell.edu/~cristian/memorability.html
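
One simple way to put a number on "less common word choices" is to score a sentence by how surprising its words are under a word-frequency model. The sketch below uses a tiny unigram model with add-one smoothing; the corpus, the function names and the choice of a unigram model are assumptions for illustration and are much simpler than the language models used in the actual study.

```python
import math
from collections import Counter


def train_unigram(corpus_sentences):
    """Count word frequencies in a reference corpus of 'ordinary' language."""
    counts = Counter(
        word for sentence in corpus_sentences for word in sentence.lower().split()
    )
    return counts, sum(counts.values())


def avg_surprisal(sentence, counts, total):
    """Mean negative log2-probability per word, with add-one smoothing."""
    vocab = len(counts) + 1  # one extra slot for unseen words
    words = sentence.lower().split()
    bits = [-math.log2((counts[w] + 1) / (total + vocab)) for w in words]
    return sum(bits) / len(words)


# Hypothetical reference corpus.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat saw the dog",
]
counts, total = train_unigram(corpus)

print(avg_surprisal("the cat sat", counts, total))
print(avg_surprisal("the droids sat", counts, total))
```

The sentence containing the unseen word "droids" gets the higher average surprisal, which is the sense in which a distinctive phrase uses less common words than ordinary language.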

Interactive and Interpretable Machine Learning Models for Human Machine Collaboration - Been Kim, AI2/University of Washington

Communication from machine to human - provide intuitive explanation
Bayesian Case Model (BCM) - prototypes and subspaces to help humans understand machine learning results
BCM on recipe data
Subspaces, the sets of features that play important roles in the characterization of the prototypes
Learns prototypes, the “quintessential” observations that best represent clusters in a dataset
Prototype clustering and subspace learning. In this model, the prototype is the exemplar that is most representative of the cluster.
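
The notion of a prototype as the most representative member of a cluster can be sketched with a medoid: the cluster member with the smallest total distance to all the others. Note that this is a deliberately simple stand-in and not the Bayesian Case Model itself; the points and function names are made up for illustration.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def medoid(points):
    """The member of the cluster with the smallest total distance to the rest."""
    return min(points, key=lambda p: sum(euclidean(p, q) for q in points))


# Hypothetical cluster: three similar observations and one outlier.
cluster = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (3.0, 3.0)]
prototype = medoid(cluster)

print(prototype)
```

Because the prototype is an actual observation rather than an average, it can be shown to a person as a concrete example of what the cluster contains.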

Other events I attended

Lean In Circles

Dedicated to helping all women achieve their ambitions.
Founded by Sheryl Sandberg - COO, Facebook

NVIDIA

GPU computing - speeding up deep learning matrix calculations
NVIDIA DIGITS - interactive deep learning GPU training system
Demo showing how GPUs can speed up training of deep neural networks

Career advice session

Helpful for any career, not specifically machine learning
http://wiml2015.weebly.com/list-of-careeradvice-tables.html

Finally NIPS!

NIPS is one of the top machine learning conferences in the world. Below are a few important deep learning techniques that were highlighted at the conference.

CNN

Recognise images, used in computer vision
Object proposal generation, image segmentation
Feed forward neural networks
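
The core operation inside a CNN layer can be sketched in a few lines: slide a small kernel over the image and take a weighted sum at each position. This is a minimal single-channel sketch with no padding, stride or learning; the image and kernel values are made up for illustration.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical 3x4 image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Kernel that responds where brightness increases from left to right.
vertical_edge = [[-1, 1], [-1, 1]]

feature_map = conv2d_valid(image, vertical_edge)
print(feature_map)
```

The feature map is non-zero only where the dark and bright regions meet, which is how a convolutional filter picks out a local pattern regardless of where it appears in the image.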

RNN

Networks with recurrent connections that form cycles (signals travelling in both directions)
Used in NLP
Designed to recognise sequences such as speech signals or text
Process arbitrary sequences of input
Speech recognition, handwriting recognition
LSTM - question answering

Type of RNN
LSTM often outperforms other sequence learning methods such as conventional RNNs and HMMs
Grammar as a Foreign Language
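
The recurrent connection can be illustrated with a plain (non-LSTM) RNN step: the new hidden state depends on the current input and on the previous hidden state, h_t = tanh(W_xh x_t + W_hh h_(t-1) + b). The sketch below is a minimal forward pass only; the weights are arbitrary made-up values, and nothing is trained.

```python
import math


def rnn_step(x, h_prev, w_xh, w_hh, bias):
    """One recurrent step: mix the current input with the previous state."""
    h_new = []
    for k in range(len(h_prev)):
        total = bias[k]
        total += sum(w_xh[k][i] * x[i] for i in range(len(x)))
        total += sum(w_hh[k][j] * h_prev[j] for j in range(len(h_prev)))
        h_new.append(math.tanh(total))
    return h_new


def run_rnn(sequence, w_xh, w_hh, bias):
    """Feed a sequence of any length through the same step, one item at a time."""
    h = [0.0] * len(bias)
    for x in sequence:
        h = rnn_step(x, h, w_xh, w_hh, bias)
    return h


# Hypothetical weights: 1 input feature, 2 hidden units.
w_xh = [[0.5], [-0.3]]
w_hh = [[0.1, 0.2], [0.0, 0.4]]
bias = [0.0, 0.1]

state_a = run_rnn([[1.0], [0.0]], w_xh, w_hh, bias)
state_b = run_rnn([[0.0], [1.0]], w_xh, w_hh, bias)
print(state_a)
print(state_b)
```

The two sequences contain the same values in a different order and end in different hidden states, which is what lets a recurrent network model order in speech or text. An LSTM replaces the single tanh update with gated updates so that information can be kept over longer sequences.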

So, that’s it for now. :) I might write a detailed blog post on NIPS if I get some free time in the future. NIPS is somewhat overwhelming, and I need to go through the ideas presented there again to get a clear grasp of cutting-edge technologies in machine learning.

I have shared some thoughts on this on the Zaizi blog as well.

http://www.zaizi.com/blog/wiml-and-nips-conference-montreal-2015

Also, I shared my thoughts at the 2nd Colombo Machine Intelligence Meetup, which was held at WSO2 in February 2016.

http://www.meetup.com/colombo-machine-intelligence/events/228052923/

My slides are available at:

http://www.slideshare.net/JayaniWithanawasam/thoughts-from-wiml-2015-and-nips-2015
