Wednesday, September 11, 2013

Map is Map is Map

If you've been involved in or following developments in data science over the last few years, you've heard the term "map reduce". It's one of the new primitives that is part of several big data frameworks. When you're learning how a technology works, it helps to notice when two concepts are closely related. If you're still trying to wrap your brain around map reduce, here's a light-bulb moment that helped me.

Map functions are found in a few programming languages, like Perl and Python. Initially, I thought of them as completely different from the map in map reduce. At some point, however, it clicked for me and I realized that in principle they are the same operation: apply a function to every element of a collection, independently of the others. All the tricks and idioms that I used with those map functions translate almost directly to things I can do with the map reduce primitives in big data frameworks. Map is map is map.

In hindsight it's almost embarrassing that I didn't immediately connect these dots in my mind, but that's how these things sometimes go. Sometimes learning a new technology is really as easy as making the mental connection with a concept you've already mastered.
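To make the connection concrete, here is a minimal sketch in Python. The first part uses the built-in map; the second is a toy, single-machine imitation of the map reduce pattern doing a word count. The names lines, mapper, and map_reduce are made up for this illustration and do not come from any particular big data framework; a real framework would spread the same steps across many machines.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical sample input: each string stands in for one input record.
lines = ["map is map", "is map"]

# The familiar language-level map: apply a function to every element.
lengths = list(map(len, lines))

def mapper(line):
    """Emit a (key, value) pair for every word in one record."""
    return [(word, 1) for word in line.split()]

def map_reduce(records, mapper, reducer):
    """Toy in-memory map reduce: map, group by key, then reduce per key."""
    grouped = defaultdict(list)
    # Map step: this is the same built-in map used above.
    for pairs in map(mapper, records):
        for key, value in pairs:
            grouped[key].append(value)
    # Reduce step: fold the values collected for each key into one result.
    return {key: reduce(reducer, values) for key, values in grouped.items()}

counts = map_reduce(lines, mapper, lambda a, b: a + b)
print(lengths)  # [10, 6]
print(counts)   # {'map': 3, 'is': 2}
```

The mapper is an ordinary function of one record, exactly what you would hand to map in Python or Perl. The grouping and reducing around it are the only parts that are specific to the map reduce pattern.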

Perl map
Python map
Map Reduce