Sunday, March 17, 2013

Bulldozer or a bus?

Is your system a Bulldozer or a Bus? It can't be both at the same time.

This is a quick introduction to a metaphor. I made it up last week, maybe it will catch on. Big data systems fall into one of two categories: they are either a bulldozer or a bus.

I know this is a picture of an excavator. Bulldozer creates better alliteration and for our metaphor they both fit.
First: Bulldozers. Bulldozers have relatively few users (the drivers) and they need to move a lot of dirt (data). All of the mechanics and resources of the system are aligned for this purpose: moving (computing) lots of stuff. Bulldozers are often used to find inferences or create things that serve the purposes of buses. I hate to risk stretching the metaphor too far, but you can compare the creation of pre-computed indexes to bulldozers creating "roads" for buses to drive on. It's a great example of how these different types of systems support each other.

Buses, unlike bulldozers, are designed to handle lots of users (passengers) on the system. Computer resources are allocated, optimized, and balanced to handle the processing required when dealing with many simultaneous users.
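
To make the "roads" idea a bit more concrete, here is a minimal sketch in Python. It is only an illustration of the metaphor, not how any particular product works: the function names (build_index, lookup) and the tiny document set are made up for this example. One heavy batch job (the bulldozer) builds a pre-computed index, and many cheap lookups (the bus passengers) ride on it afterwards.

```python
from collections import defaultdict


def build_index(documents):
    """Bulldozer job: a single pass over all the data, started by one driver."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return dict(index)


def lookup(index, word):
    """Bus request: a cheap read against the prebuilt index, one per passenger."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical sample data, just to have some dirt to move.
documents = {
    1: "bulldozers move dirt",
    2: "buses move passengers",
    3: "passengers ride buses",
}

index = build_index(documents)   # done once, offline
print(lookup(index, "move"))     # done many times, by many users
```

The point of the split is that the expensive work happens once in build_index, while lookup stays small enough to be called by lots of simultaneous users.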

The "data physics" of bulldozers and buses are different. Mixing these roles leads to poor performance and upset stakeholders.
Some specific examples:
Buses:
  • Cloud Foundry
  • Heroku
  • Cloudfront
  • App Engine
  • OpenShift
  • Cloudify
  • AppFog
Bulldozers:
  • Most of the Hadoop ecosystem
  • Most of the Greenplum products
  • Datameer
  • InfoSphere
Software that supports both types of systems (but could be easily mistaken for one or the other... in other words they could be categorized by context):
  • Any of the Infrastructure As A Service systems (AWS, Azure, Rackspace, VMware)
  • Much of the database technology out there (MySQL, SQL Server, Oracle DB, SAS)
  • Most visualization software
Even programming languages can loosely be categorized with these metaphors. All general purpose programming languages can really serve as functioning components on a bus, but languages such as R and Matlab are designed to be used as parts on bulldozers.

One of the defining themes of the big data era has been an awareness that bulldozer tasks are not ephemeral, and more organizations are investing in bulldozer systems specifically to support them. In my opinion, the biggest danger comes when folks spend money to load bulldozer features onto a bus to try to make it perform better. It often doesn't work, but it plausibly could in some circumstances, which is why people keep trying to do it. Another common occurrence is trying to allow too many users on a bulldozer. Bulldozers make poor buses. Trying to use bulldozers as buses not only destroys bulldozer performance, but usually assumes a level of subject matter expertise that isn't present. Not everyone is qualified to be a driver of a powerful and flexible system.

A good metaphor will help explain the differences and help guide expectations for both yourself and your customers as you work with these very different systems in your enterprise.

Thursday, March 14, 2013

5213 year old mind

The history of writing can be traced back to 3200 BC. Add Anno Domini Nostri Iesu (Jesu) Christi 2013 to that number and we can calculate that 5213 years have passed between then and when I'm writing this. Who knows how much further it goes, but the evidence goes back that far.

I think there are two primary reasons why humans fail at intellectual endeavours. One is ignorance and the other is ineptitude. I'm not pointing fingers; I'll be the first to admit that I've been guilty of both, so relax. This dichotomy was laid out in the excellent book "The Checklist Manifesto", which I recently read, so I'm borrowing the idea here and I'm not trying to be overly detailed. As I think back on my experiences, I can probably lump each failure I've witnessed into one of these two categories and leave the discussion of the finer details for another time.

Ignorance is lack of knowledge, and ineptness is clumsiness. If you are inept, chances are you'll suffer self-inflicted ignorance and be unaware of the knowledge you require to be successful. However, even if you are not inept and explore all opportunities to discover the required knowledge, ignorance can still be a contributing factor to failure. So ignorance really ends up being a root cause that's sometimes unavoidable. Research projects, in my mind, usually tackle issues related to ignorance. Most other projects deal in some way with resolving ineptness. Some examples to illustrate my point: There's probably a great way to cure all forms of cancer, but we are ignorant of it, so we need to research and uncover it. Building a bridge over a river is in large part an exercise in avoiding ineptness by referencing previously disseminated knowledge about the proper way to build a bridge.

I think that ineptitude can be mitigated by evaluated experience. Experience by itself means nothing, but experiences that you later reflect on mean everything. If you have 20 years of "experience", it might as well be worth 1 year of experience if you just kept repeating the mistakes of the first year over and over. The importance of evaluating experience was recently highlighted in another book I just read, "Leadership Gold". Sometimes we all watch people with lots of "experience" act very inept. The truth is that they might have spent the time, but they let it pass them by. John Maxwell explained it better.

So how do we combat ignorance? By sharing knowledge. I'm only a few paragraphs into this post and I've already explicitly mentioned two recent books that have helped. There are probably thousands of others that have influenced me or influenced other people that have influenced me as I try and write down this thought.

Some advice was given to me when I was much younger and is the real point I'm trying to make with this post:

"It's fine to have a young body, but there's no excuse not to have a 5000 year old mind."

I forget who told me this; it's been a very long time. But this line has stuck with me and been a driving mantra. The wisdom of the ages is available at our fingertips, and it's our job to find the best parts of it and apply them to our endeavours. To not do that is an inept and clumsy way to live.

Here is General James Mattis summarizing the concept: "The problem with being too busy to read is that you learn by experience (or by your men’s experience), i.e. the hard way. By reading, you learn through others’ experiences, generally a better way to do business, especially in our line of work where the consequences of incompetence are so final for young men."

If you're reading this post, you've probably already figured that out. Find somebody in your life that is young and give them the good news. They don't have to wait through years of experience to get access to wisdom. They can get to it right now.

Tuesday, March 5, 2013

The Alchemist and The Astrologer

This 17th century painting, "The Alchemist", is in the public domain via Creative Commons and free for our use and discussion. Cornelis Bega's only other existing painting of a scientist, "The Astrologer", is proprietary. I'd have to pay the British National Gallery for the rights to use it on the Internet on a yearly basis. I checked, and it's $225 for three years on an Internet homepage, not including VAT, which itself is a whole other discussion. This annoys me deeply. It's the same artist and the same time frame, with drastically different licensing. In my opinion, ironically (or hypocritically), it is antithetical to the stated mission of the National Gallery: promoting access to art.

I'd prefer to show The Astrologer because it more closely relates to one of my favorite analogies: computers are to computer science what telescopes are to astronomy. The point being that computer scientists don't study computers any more than astronomers study telescopes. They are both exploring something else more fundamental.

Use of The Astrologer image is denied to me by the greed of a foreign gallery and its associated government, which extorts an imaginary "value added tax" for a value created centuries ago. Fortunately, Alchemy serves as a satisfactory proxy in lieu of the more apropos 17th century astronomer for adding a little bit of classy art to my presentations and media related to computer science. So... it gets used and shared, while The Astrologer gets ignored in obscurity.

There is some deeper point to be made here about open source software and the open innovation model. But I'm going to leave that discussion for a different time and place. Set all those great points aside. At least we should be able to share and discuss 400 year old art without some thug with a lawyer extorting money. They didn't make the art, why are they demanding my money to be able to show it to others?