Friday, February 1, 2013

Modular vs Integrated wrt Boeing 787 project



A colleague forwarded this article. I think the conclusion it reaches is wrong.
http://blogs.hbr.org/cs/2013/01/the_787s_problems_run_deeper_t.html

First, I have no opinion on the 787 project and have no problem with Boeing's approach. The article attempts to blame the supposed cost and schedule problems of the Boeing 787 project on premature modularization. Maybe that's a problem when building airplanes or power plants (I have never designed either, so I'll suspend my disbelief). However, I strongly disagree with regard to software. Granted, software is different from most engineering undertakings. But I think we can reason through some useful engineering considerations from within that context.

There are two primary reasons to integrate early in software projects: decreased development time or increased optimization (aka decreased run time).  Sure, faster development and faster software are both good things.  But there are trade-offs to attaining each.

Decreased time: The software term for early and overly tight integration to reduce development time is a “hack” or “prototype”. This can be fine in some cases but usually means that the solution is myopic and there is an increased cost to applying it to other use cases or problems. When writing software, which is mostly tools for others to use in their professions, we want it to be as widely applicable as possible (while still being effective for the original intended purpose). Modifying the software or deploying it in situations outside the scope of the original prototype is harder and more costly because, during prototype hacking, middle components are left out and back-end components are often built in a manner that doesn't scale well. Modularization reduces the cost of changing these components to meet new use cases.

Increased optimization: Alternatively, integration pursued as an optimization can also be problematic, because it increases the cost of change. Tightly integrated code is more difficult to troubleshoot and debug. Don Knuth famously noted this when he said “Premature optimization is the root of all evil (or at least most of it) in programming”. Changing requirements are also a common occurrence in software projects. Optimizing prematurely means that you may be required to untangle a web of dependencies before making changes. You "outsmart yourself" when that clever trick to reduce run time has to be laboriously undone because the tables have turned on where execution time matters. Modularity assists in addressing shifting requirements by reducing the cost of change.

As an example, separation of concerns is a core tenet of how we bounce data around the Internet. The protocol that handles routing of information (IPv4) isn't the same protocol that handles reliable delivery (TCP). Some use cases don't need perfect transmission of each bit of data, e.g. Voice over IP (VoIP), so when those came along TCP was replaced with UDP for that use case. If you miss a few bytes of audio while someone is speaking, it doesn't help if those bytes catch up a few seconds later. Our human brains happen to be very good at filling in the small gaps in audio, especially human speech. To look at that autonomy from the other direction: because of the growth of the Internet, it looks like we'll run out of IPv4 address space soon. Updating to IPv6 doesn't break the protocols layered on top of it (TCP, UDP, and applications such as VoIP). This is all possible because the stack is designed with modularity.
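To make the layering point concrete, here is a toy sketch in Python. It is not how a real network stack is implemented; the names NetworkLayer, IPv4Network, IPv6Network and ChunkedTransport are all made up for illustration. The only thing it tries to show is that the "transport" piece talks to the "network" piece through one small interface, so the addressing scheme can be swapped without touching the transport code.

```python
import ipaddress
from typing import List, Protocol


class NetworkLayer(Protocol):
    """Hypothetical interface: anything that can carry a payload to an address."""

    def deliver(self, dest: str, payload: bytes) -> str: ...


class IPv4Network:
    def deliver(self, dest: str, payload: bytes) -> str:
        # Raises ValueError if dest is not a valid IPv4 address.
        addr = ipaddress.IPv4Address(dest)
        return f"IPv4 -> {addr} ({len(payload)} bytes)"


class IPv6Network:
    def deliver(self, dest: str, payload: bytes) -> str:
        # Raises ValueError if dest is not a valid IPv6 address.
        addr = ipaddress.IPv6Address(dest)
        return f"IPv6 -> {addr} ({len(payload)} bytes)"


class ChunkedTransport:
    """Toy stand-in for a transport layer: splits data into numbered chunks.

    It knows nothing about address formats; it only calls NetworkLayer.deliver.
    """

    def __init__(self, network: NetworkLayer, chunk_size: int = 4) -> None:
        self.network = network
        self.chunk_size = chunk_size

    def send(self, dest: str, data: bytes) -> List[str]:
        log = []
        for seq, start in enumerate(range(0, len(data), self.chunk_size)):
            chunk = data[start:start + self.chunk_size]
            log.append(f"seq={seq} {self.network.deliver(dest, chunk)}")
        return log


# The same transport code runs over either network implementation.
v4_log = ChunkedTransport(IPv4Network()).send("192.0.2.1", b"hello!")
v6_log = ChunkedTransport(IPv6Network()).send("2001:db8::1", b"hello!")
```

Nothing in ChunkedTransport changes between the two calls at the bottom. That is the property I'm arguing for: the cost of changing the addressing scheme stays inside the component that owns addressing.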

Let's think of another popular and current example: SOLR, Lucene, and Tika. This is a great stack of software that, when put together, provides one of the best enterprise search capabilities available (and it's open source). SOLR indexes text. Tika transforms various documents into text. Had these complementary functionalities been tightly integrated into the same software, it might have decreased development time or run time, but the added complexity and dependencies would also mean that the use cases would be far more limited than they currently are. Plenty of projects use SOLR and Tika to do unique and interesting things as part of other systems. The modular approach to development of these projects enabled that.

Modularity allows components of a system to evolve independently. Tika can continue to parse new file types and SOLR can continue to do neat stuff with indexes, and thanks to their modular approaches neither has to wait on the other. If I have a specific set of file types that Tika developers will never see and that I can't share with them, I can write my own parser and output to something SOLR can read. Since software advances so quickly, this matters more in software than in other professions. Integration increases the cost of change to any single functionality or concern. Unintended side effects of advancements cascade through the software. Understanding interfaces is a necessity, and the work required to think through modularization is useful in catching unanticipated problems early and creating maintainable, usable software.
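The "write my own parser" idea can be sketched in a few lines of Python. To be clear, this is not the Tika or SOLR API; Parser, PlainTextParser, PipeDelimitedParser and TinyIndex are hypothetical names I made up, and the pipe-delimited format is an invented stand-in for an in-house file type. The point is only that the indexer accepts plain text and doesn't care which parser produced it.

```python
from typing import Dict, List, Protocol, Set


class Parser(Protocol):
    """Hypothetical contract: turn raw bytes into plain text."""

    def to_text(self, raw: bytes) -> str: ...


class PlainTextParser:
    def to_text(self, raw: bytes) -> str:
        return raw.decode("utf-8")


class PipeDelimitedParser:
    """Invented in-house format: fields separated by '|'."""

    def to_text(self, raw: bytes) -> str:
        return " ".join(raw.decode("utf-8").split("|"))


class TinyIndex:
    """Toy inverted index; it only ever sees plain text."""

    def __init__(self) -> None:
        self.postings: Dict[str, Set[str]] = {}

    def add(self, doc_id: str, text: str) -> None:
        for word in text.lower().split():
            self.postings.setdefault(word, set()).add(doc_id)

    def search(self, word: str) -> List[str]:
        return sorted(self.postings.get(word.lower(), set()))


index = TinyIndex()
index.add("a.txt", PlainTextParser().to_text(b"modular search stack"))
index.add("b.dat", PipeDelimitedParser().to_text(b"secret|modular|format"))
hits = index.search("modular")  # both documents mention "modular"
```

Adding PipeDelimitedParser required no change to TinyIndex, and a change to how the index stores postings would require no change to either parser.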

In the case of airplanes, “flying x people y miles” seems to me to imply a very static use case, so tight integration might be good. Certainly in the case of Apple’s product lines tight integration has served them well due to their ability to maintain a homogeneous software and hardware environment. But I’d dispute even that to some degree. Even though Apple products appear to be tightly integrated, the hardware manufacturers have famously made last-minute changes that were facilitated by the modularity of the hardware components. Anyone who has ever disassembled an iPhone has probably been as amazed as I was to see how modular the components are. So the author’s argument using his experience at Apple as a precedent falls apart under scrutiny.

I doubt modularity caused the failure of the 787 effort. Perhaps they were implementing modularity at the wrong level or in the wrong way. Perhaps the actual causes of the failure weren’t related to integration vs modularity at all. “asked suppliers to create their own blueprints for parts” seems like a pretty big red flag in my mind, fraught with all sorts of chances to mess up simple things like English-to-metric conversions, or for differences between design software packages to cause problems. “Lawyers will probably get involved” is another red flag. Writing business sub-contracts which don’t account for known unknowns seems like very poor decision making. The author seems to be stretching in the assertions he is able to make about building things if he is blaming this failure on modularity. (That’s my polite way of saying that I think he missed the mark entirely.)