Machine learning @ Coursera

[TL;DR] Machine learning is not as hard as I imagined. It is a great, fun toolset to learn, and Andrew Ng’s course on Coursera is the best place to start!

There is something about darkness that brings all our fears to life. It is tempting to say this (illogical manifestation of fears) only occurs during our childhood. The slightest rattle in the attic or creak of a floorboard is enough to unlock your wildest and most horrific imaginings: perhaps the bogeyman hiding under your bed, or the *monster* lurking outside your house waiting to pounce on you (something I am still terrified of as I put the garbage bins out on collection night).

*But* shine a light on the darkness the next morning and your fears vanish! The strange, seemingly inexplicable noises and phenomena you observed earlier suddenly seem absurd and can be explained (raccoons on the roof causing the rattle, somebody walking down to the kitchen to get a drink, etc.).

As strange as this may seem, this fear of darkness and all that lies within is rooted in our prehistoric past. Weirdly, and worse, annoyingly, this fear transcends the realm of darkness and caves and follows us into our modern daily lives plainly, simply and even boringly as the FEAR OF THE UNKNOWN.

For me, that fear has for years been Machine Learning. Actually, it was a combination of fear, laziness and brushing off ML as just statistics and a whole lot of heuristics. The long Thanksgiving weekend was upon me and I really, really, really wanted to try something different and “new”. A lunch with a machine learning expert-colleague-friend somehow got me curious enough to give it a shot. And since not a day goes by when you don’t hear how awesome Andrew Ng is in the field, his ML course on Coursera seemed like the most natural place to start!

Beset by notions of the course being heavy on statistics, I’d be lying if I proclaimed any eagerness for it. However, I was curious to see what the fuss was about! It seemed every developer, her gardener and his dog were using ML for *something*, so it couldn’t really be as fearsome as I had made it out to be?

You see where this is going, don’t you? Like a true [hb]ollywood flick, this has all the promise of a carefree ladies’ man relinquishing his bachelor lifestyle when he falls for “the one”! Well ok, let us not kid ourselves. This is real life, not Bollywood! More like man learns to appreciate friends he seldom focussed on… or better yet, hardworking family man learns to appreciate all the things his partner does for him!

Firstly, on ML itself. It is surprisingly not as heavy on statistics as I had feared. On the contrary, it is pleasantly focussed on linear algebra and, from what my limited experience has shown me, uses the concept of linear regression to devastating effect. As scary as that sounds, this is actually the most fun part of ML (apart from the large scale infra side of things). Knowing this 10 years ago would have exponentially altered my life, but learning it now is better than learning it *in* 10 years’ time!
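To give a flavour of what “linear regression plus a bit of linear algebra” looks like, here is a minimal sketch in plain Python (not the course’s Octave code). The function names, the toy data and the learning rate are my own choices for illustration; the hypothesis is just a dot product between the parameters and the features, and the parameters are fitted with batch gradient descent.

```python
def predict(theta, x):
    """Hypothesis h(x) = theta . x, where x[0] is the constant 1 (intercept feature)."""
    return sum(t * xi for t, xi in zip(theta, x))


def cost(theta, X, y):
    """Halved mean squared error: J = 1/(2m) * sum((h(x) - y)^2)."""
    m = len(y)
    return sum((predict(theta, x) - yi) ** 2 for x, yi in zip(X, y)) / (2 * m)


def gradient_descent(X, y, alpha=0.1, iterations=2000):
    """Batch gradient descent: all parameters are updated together on each pass."""
    m, n = len(y), len(X[0])
    theta = [0.0] * n
    for _ in range(iterations):
        errors = [predict(theta, x) - yi for x, yi in zip(X, y)]
        grad = [sum(e * x[j] for e, x in zip(errors, X)) / m for j in range(n)]
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta


# Toy data generated from y = 1 + 2x; the leading 1.0 in each row is the intercept feature.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
theta = gradient_descent(X, y)
```

On this toy data the fitted `theta` should land very close to `[1, 2]`, the intercept and slope used to generate it. A real implementation would use a matrix library instead of nested list comprehensions, which is exactly the part Octave takes care of.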

Secondly, the course is a testament to the hard work Andrew Ng has personally put in, not only in the study of the field but also in improving himself as a presenter. He has clearly taken a topic that has great scope and distilled it in a clear and structured manner. It is almost as if he has a knack for predicting the kind of questions you would have about a topic and then addressing them immediately. Each lesson builds on the previous ones and is actually very easy to follow. Even the math is layered in such a way that you can take as much as you like and go as deep as you want without it being a hurdle to proceeding (for example, the derivations of the cost and gradient functions, while not explained, are extremely fun exercises that you should try out – coming in another blog).
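As a taste of that exercise, here is a sketch for the linear regression case only. I am assuming the usual notation: $m$ training examples, $n$ features, and a hypothesis $h_\theta(x) = \sum_{k=0}^{n} \theta_k x_k$ with $x_0 = 1$. The cost is the halved mean squared error:

$$
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2
$$

Applying the chain rule, and noting that $\partial h_\theta(x^{(i)}) / \partial \theta_j = x_j^{(i)}$, gives the gradient:

$$
\frac{\partial J}{\partial \theta_j}
= \frac{1}{2m} \sum_{i=1}^{m} 2 \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
= \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
$$

The factor of $\frac{1}{2}$ in the cost is there purely so that it cancels the 2 from differentiating the square.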

Finally, the choice of language. At first, being faced with Octave drew the same gasp it would from any self-respecting developer. But not having to implement (and optimise) numeric computing routines is a huge blessing! Even for an Octave/Matlab newbie (or at least someone who had not used it for 20 years), the language and environment were painless to get (back) into.

There *are* a couple of ways the course could have been improved. Firstly, the language. While Octave was a pretty appropriate environment for this exercise, I am not convinced the same could not have been achieved with NumPy/SciPy or in other “popular” languages. This is not a big deal, just putting it out there. It definitely gives me a chance to try implementing a lot of the algorithms in another language like Haskell or Swift! And by the way, if I had to *really* pick on two things about Octave, they would be:

  1. A lack of a good debugger. The only way to set breakpoints, it seems, is via a dbstop function, which is extremely unintuitive. How about a Pythonesque “break” or “set_trace” statement?
  2. 1-based indexing. This was a major annoyance for me. It is fine when your equations all refer to and start with “1”. But when you in fact introduce a 0th term, just the cognitive load of translating between the indexes added a good 30% to my assignment times!
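To make the indexing complaint concrete, here is a small Python sketch (the helper name and the numbers are made up for illustration). In a 0-based language the math subscript and the array index are the same thing; in Octave every subscript has to be shifted by one in your head. Regularisation is a good example, because the penalty deliberately skips the 0th parameter.

```python
def octave_index(j):
    """Hypothetical helper: map a 0-based math subscript j (as in theta_j)
    to the 1-based position that holds it in an Octave vector."""
    return j + 1


# theta_0, theta_1, theta_2 stored 0-based: theta[j] really is theta_j.
theta = [4.0, 0.5, -1.2]

# The regularisation penalty sums theta_j^2 for j >= 1, skipping the intercept theta_0.
# In Python that is theta[1:]; in Octave the same slice is written theta(2:end).
penalty = sum(t ** 2 for t in theta[1:])

# The "first regularised parameter" is theta_1 in the math, position 2 in Octave.
first_regularised_position = octave_index(1)
```

None of this is hard, but doing the `j + 1` translation on every line of every assignment is where the extra time went.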

Then there were the quizzes. Personally, I am not a fan of “Exam” type assessments. I could do assignments/projects all day! Even within the quizzes, I loved the numerical ones, as I knew how to apply the formulas I had learnt, but I felt less at ease with the ones that required more analysis of the wording. My OCD mind always tries to dissect the hidden meaning behind how questions are worded and ends up making wrong assumptions.

But overall, it is a very effective and awesome course for teaching the concepts of ML, whether you are an experienced developer or a mathematician. I definitely recommend it (if there is still anybody out there who hasn’t done the course yet!).