This week I read the book Facts and Fallacies of Software Engineering by Robert L. Glass, in preparation for attending a new meetup devoted to discussing classic books about software development. The author opines that the information in the book is ageless, and that remains mostly true some fifteen years after its initial release. The book mentions Agile development only once, in passing, right near the end. As much as that management technique was meant to address some of the issues discussed in the book, its pioneers have since gained a deep appreciation for what Agile (and Scrum) do well and what they don’t do so well, and with that in mind have moved on to newer approaches. (One should also bear in mind that every approach should be adapted to local conditions and used as a guideline rather than as a brittle, formalist prescription to be followed to the letter.)
One of the observations Glass makes is that the history of software development is littered with new methods, paradigms, and so on, that promise to revolutionize the field by orders of magnitude. The truth, he suggests, is that the innovations, while valuable, tend to improve things by no more than forty percent, a far cry from “orders of magnitude.”
And why is this so? Mostly, it’s because software is incredibly complex, a fact that is too rarely understood or appreciated. This complexity is embedded in every step and aspect of software development and engineering, and an improvement in any individual facet, no matter how great, can have only so much effect on the practice as a whole. Some facets of the process are (my list): identifying customer/business/operational requirements, identifying system requirements, identifying user requirements, language features, low-level computer science methodology (classic, foundational algorithms), interface features, architecture and interface paradigms, testing methodologies, governance schemes, and tools and languages to accomplish all of these.
I spent five years analyzing the logistics of aircraft maintenance and supply, where individual aircraft were represented less as individual entities in their own right and more as collections of, as my mentor described it to me, “several thousand parts flying in close formation!” Of those thousands of parts, each one had a rate at which it needed to be maintained and/or replaced. Even if some parts almost never required maintenance while others required a lot more, the sheer number of parts overall meant that making even major improvements to the reliability or maintainability of a handful of the most “troublesome” parts could have only a limited effect on the overall maintainability and support overhead for any aircraft or group of aircraft. This was often a source of disappointment for our customers, and I can see why. The same must be true of evangelists for new techniques and ideas in software engineering. Nonetheless there will always be evangelists. They have things to sell and they might not know about the historical trajectories of previous ideas.
Still, forty-percent improvements are not trivial by any means, and a long succession of such improvements has had an incredible impact on our ability to produce software. Not only are we producing larger and more complex software systems more quickly, but more people are creating far more of them.
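To get a feel for how such gains accumulate, here is a quick bit of arithmetic (my own illustration, not from the book): successive forty-percent improvements compound multiplicatively.

```python
# My own back-of-the-envelope arithmetic (not from the book): no single
# innovation yields an order-of-magnitude gain, but 40% gains compound.
factor = 1.0
for n in range(1, 8):
    factor *= 1.4  # each successive innovation improves things ~40%
    print(f"after {n} improvements: {factor:.2f}x")
```

By this rough measure it takes about seven successive forty-percent improvements to cross an order of magnitude (1.4**7 ≈ 10.5), which fits the observation that the cumulative effect has been enormous even though no single step was.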
That said, it also remains true that a large proportion of software projects fail, and that happens for many reasons. I can’t remember if Glass says this directly, but I would say that the reasons for most failures don’t have to do so much with the quality or efforts of individual programmers (though the quality of individual teams is quite important, more on that in a bit), but rather have to do with not identifying the correct problem in the first place. I am particularly sensitive to this because, as my website’s tagline suggests, solving the right problem is what I consider to be my “superpower.”
Correctly identifying the problem to be solved entails several parts: understanding the customer’s process, figuring out how to abstract the important aspects of that process so it can be automated and improved, and coming up with a high-level architecture within which to implement it. Having a good governance methodology helps as well, but it is rarely the source of failure. The biggest reason for failure of any project, of course, is poor team dynamics, which affects every aspect of the process from discovery and design through implementation. I couldn’t find the exact quote, but I read somewhere that the outcome of battle is usually determined in the minds of the commanders. What’s more, the outcome is often determined before the battle even begins. This is akin to saying that if you don’t know what problem you are trying to solve, you are likely doomed before you start.
There is some tension between system architecture on the one hand and flexibility and adaptability on the other. Starting with the architecture seems like a top-down approach, while attempting to use Agile and Scrum methods to elucidate the requirements on the fly seems like a bottom-up approach. It’s not that either approach can’t work in its pure form in limited circumstances, but the tension between the two is best resolved with a hybrid approach. It’s perfectly OK, and even desirable, to have a good concept of the overall architecture as early as possible, so the entire effort can work within that framework. It’s also OK, and also desirable, to continually gather feedback from the customer as the project proceeds so course corrections can be made.
The key, in my mind, is always to come up with a flexible and modular architecture that can be easily adapted to situations as they are identified. This is why I always strive to break down a customer’s process into its most basic components. Once that’s done I can identify common and repetitive themes which can be addressed by common building blocks of functionality. I can then design and implement a system based on the smallest number of building blocks, which can be combined in numerous ways with minimal customization, to address the problems at hand. The more modular the solution, the simpler the building blocks, and the less customization is required, the more the result can be made efficient, robust, flexible, maintainable, and approachable and comprehensible (by both the user and the builder/maintainer). (This does not mean that software can be universalized; some uniqueness is unavoidable and necessary, but that is another topic the book discusses.)
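As a toy sketch of that building-block idea (all names here are mine and purely hypothetical, not from the book or any real project): each block is a tiny, generic operation, and a “customized” solution is just a particular combination of blocks.

```python
# A minimal sketch of "small building blocks, combined with minimal
# customization": each block is generic, and a workflow is a composition.
from functools import reduce

def compose(*steps):
    """Combine building blocks into one pipeline, applied left to right."""
    return lambda data: reduce(lambda acc, step: step(acc), steps, data)

# Small, reusable blocks (hypothetical examples):
strip_blanks = lambda rows: [r for r in rows if r.strip()]
normalize    = lambda rows: [r.strip().lower() for r in rows]
deduplicate  = lambda rows: list(dict.fromkeys(rows))  # preserves order

# Two different "customer processes" assembled from the same blocks:
clean_report   = compose(strip_blanks, normalize)
unique_entries = compose(strip_blanks, normalize, deduplicate)

print(unique_entries(["  Apple", "apple ", "", "Pear"]))  # ['apple', 'pear']
```

The point is not the specific functions but the shape: the fewer and simpler the blocks, the more combinations you can cover without writing one-off code.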
That is always my way of thinking, anyway. So how does this comport with Glass’s observations?
In his discussion of software quality (itself a potentially elusive term) he talks about a list of -ilities: portability, reliability, efficiency, usability, testability, understandability, and modifiability. All of the items on this list match my own observations, with the exception of portability, which turns out to be a special case I’ll revisit shortly. I don’t think this means I’m especially insightful; I think it means that anyone who’s seen and done enough in the field over a long enough period of time is likely to come to similar conclusions.
That said, Glass enumerates 55 facts and ten fallacies. He classifies his facts into categories as follows:
- Twelve of the facts are simply little known. They haven’t been forgotten; many people haven’t heard of them. But they are, I would assert, fundamentally important.
- Eleven of them are pretty well accepted, but no one seems to act on them.
- Eight of them are accepted, but we don’t agree on how–or whether–to fix the problems they represent.
- Six of them are probably totally accepted by most people, with no controversy and little forgetting.
- Five of them, many people will flat-out disagree with.
- Five of them are accepted by many people, but a few wildly disagree, making them quite controversial.
And yes, he knows that doesn’t add up to 55 and explains why.
The interesting thing about his observations is that he provides a context for each and explains the nature of the associated controversies, if there are any. I don’t recall having a strong disagreement with anything he described, which may or may not be meaningful.
In my parenthetical about uniqueness I note that different solutions have to address different needs. This necessarily limits how general any piece of software can be. This also affects portability. Smaller concepts are portable. These have to do with specific data structures, optimizations, and so on. They are often the stuff of pure computer science, at least in its early days. Larger concepts are less portable. It might be a good idea to share lessons about super-efficient sorting mechanisms across languages and platforms, but it’s less necessary for hardcore analytic simulations to run on an iPhone. Portability is not always the Grail, and in many cases it needn’t be worried about much or at all.
Glass also relates the timeless observation that a solution should have as little complexity as possible, but not less than it actually needs. This is a critical point that is obvious to those with experience but not obvious to others. If you need power, and if you need flexibility, and if you need to be able to deal with a wide range of truly different considerations, then you have to include custom approaches to each one. You can generalize that as much as possible, but unique processes and properties must be represented as needed. Think of this like compression software (think PKZip): some naive observers have claimed that they can compress and compress and compress a data set down to a single byte. Ahh, but how much information would you lose? Is the process reversible? Clearly, since there are more than 256 unique sets of data we might want to compress, we would need more than 256 possible representations in our compressed output. Compression processes exploit repeating patterns in the source data. Once all the substitutions are made, no further compression is possible; the compression stops when the compressed output appears completely random. There are situations where “lossy” compression is acceptable (JPEG is a lossy compression mechanism). In these cases the character of the input is sufficiently maintained in an abridged output, even though the process is not reversible. A JPEG image still looks pretty good to the viewer, even if it’s not as clear as the original RAW image.
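You can see all of this with a few lines of Python’s standard zlib module (my illustration, not the book’s): patterned input shrinks dramatically, patternless input does not, and already-compressed output, which looks random, cannot be squeezed again.

```python
# Lossless compression works by substituting repeated patterns, so
# patterned input shrinks a lot, random input does not, and compressed
# output (which looks random) cannot be compressed further.
import os
import zlib

patterned  = b"abcdef" * 1000    # 6000 bytes of pure repetition
random_ish = os.urandom(6000)    # 6000 bytes with no patterns to exploit

once = zlib.compress(patterned)
print(len(patterned), "->", len(once))                         # huge reduction
print(len(random_ish), "->", len(zlib.compress(random_ish)))   # no reduction

# Lossless means fully reversible -- unlike JPEG, nothing is discarded:
assert zlib.decompress(once) == patterned

# The compressed stream already looks random, so a second pass gains nothing:
print(len(once), "->", len(zlib.compress(once)))
```

The single-byte claim fails for exactly the counting reason above: a one-byte output has only 256 possible values, so it cannot reversibly represent more than 256 distinct inputs.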
An insight that seems counterintuitive but makes perfect sense once you think about it is this: research suggests that about sixty percent of all software engineering effort goes into maintenance. So far, so good, right? So is this a good thing or a bad thing? It turns out that, of this sixty percent, almost forty percent is devoted to adding new features or making things easier to use. Only about seventeen percent is devoted to fixing things that are outright broken. (The other few percent have to do with migrating to new hardware, hosts, tools, systems, or whatever, as old ones become obsolete.) In this context, performing more and more “maintenance” on a software system turns out to be an indication of its quality. A well-designed system that does its job and can be modified over a long period of time without breaking must be all the things I listed earlier (i.e., modular, efficient, robust, flexible, maintainable, and approachable and comprehensible). That’s a good thing. Poorly designed and brittle systems often don’t get used long enough to absorb a lot of maintenance effort.
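Reading those figures as percentage points of total effort (my own arithmetic, and my own assumption about how the numbers nest), the breakdown looks like this:

```python
# My arithmetic on the figures quoted above, read as shares of total effort:
maintenance  = 60   # percent of all software engineering effort
new_features = 40   # points spent adding features / improving usability
bug_fixes    = 17   # points spent fixing things that are outright broken
migration = maintenance - new_features - bug_fixes  # "the other few percent"
print(migration)                           # 3 points for migration/obsolescence
print(round(bug_fixes / maintenance, 2))   # 0.28: under a third of maintenance
                                           # is repair work
```

That last ratio is the punchline: most “maintenance” is really continued development, not repair.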
As much as I stated that the outcome of battle is often decided in the minds of commanders, I should also point out that there are different levels of commanders. The author does describe how (non-technical) management has to provide an environment for deeply knowledgeable technologists to succeed, and then has to get out of the way. At that point the technological “commanders” have to come up with a good architecture, good algorithms, a good way of working with customers, and so on. The key is to be able to get the various parties to understand and work well with each other. To this extent the quality of the members of the team is extremely important. Now, there is a Pareto distribution (eighty percent of the effect is generated by twenty percent of the sources) in the quality of software practitioners, just as there is a Pareto distribution of competence or effect in every other area of life. Sturgeon’s Law (“My dear sir, ninety percent of everything is crap!”) may even be said to apply. That means that not all practitioners are equal. Some are not just twenty or fifty percent better than others; they might be ten or twenty times better. (Don’t even get me started about hiring by unicorn-like lists of specific technologies in lieu of considering experience, ability to learn, and adaptability…)
This has a few ancillary effects, like the observation that (the way I hear it when I worked at Westinghouse) “Nine women can’t have a baby in a month.” This says that throwing bodies at a problem may not only not speed up the solution to the problem, it may actually slow the solution down. Local knowledge, experience, and continuity among team members is extremely valuable. I’ve always said that I would rather build a team using a smaller number of experienced practitioners than from a large number of less experienced ones. Skill, knowledge, good communication, and trust are extremely important and should be fostered at every opportunity. If you have a lot of turnover you’re probably doing it wrong (either hiring the wrong people or not properly incentivizing the right ones to stick around). This does not say that some people don’t have to go; some problems you just can’t fix. That happens.
I will say that in the rare instances where I’ve hired people to work on teams of my choosing, in situations I understood intimately, I had a very high rate of success (essentially 100%). I achieved this by not being overly specific about the requirements and experience I was looking for. What I was doing was so specific and “long-haired” that I knew I was going to have to teach them a lot when they came on board. What I looked for instead was raw intelligence combined with flexibility and adaptability (e.g., I didn’t care which languages they knew, so long as they knew at least two; they didn’t necessarily need to come from the same industry, but they did need to come from a technical background with a certain amount of heft; etc.).
The second section of the book, which describes ten fallacies, was a bit less interesting to me. The most important one, I think, has to do with the mistaken notion of managing only by things you can explicitly measure. While identifying specific, efficient metrics can be very important, there remains a soft aspect of management that cannot be overlooked. I think everyone is in agreement on this.
The bottom line is that the book felt appropriate and familiar to me. Even if I didn’t know all the practitioners, details, and history of every item in the book, most of the material was recognizable from my own experience; it all had the ring of verisimilitude, or “truthiness.” I think this book is useful for experienced practitioners, to contextualize their own experience, but it may be even more valuable to newer practitioners. They might not get the same things out of it. What they might get instead is a guide to things to look for. Rather than having things seem familiar because you’ve already experienced them, they might seem familiar when you experience them for the first time.
They say that experience is something you get right after you needed it. Wisdom, by contrast, is knowing how to apply knowledge before you need to. Forewarned is forearmed, right? I think this book contains a lot of useful wisdom.