Wednesday, May 28, 2014

Engineering Principles

Engineering principles, software best practices, methodologies, process, etc. We're constantly inundated with golden rules to code by, which amazingly are almost always boiled down into short sentences or even three-letter acronyms (e.g. NIH, DRY). Most seem content to believe that the sum total of engineering can be distilled into a distinct, infallible ruleset (and the reduction to acronyms might even hint that it's subsymbolic).

Especially frustrating is the moment when, in the middle of an honest discussion, some well-educated blog reader (much like you and I) quips one of these engineering principles with the fortitude and gusto to imply that not only is the discussion ended, but the answer has been provided by the holy commandments of best practice. Even worse is when a disagreement is not defeated on its merits, but paralyzed with the pinpoint insertion of best practice between the argument's metaphorical cervical vertebrae.

After all, you aren't the kind of jackass to encourage an epidemic of NIH, are you?

I certainly am. I also frequently reject code reuse, disregard the stability of my interfaces, ignore dependency management, and repeat myself. I learned these filthy habits from some of the best software engineers in the world. Allow me to explain.

NIH versus Limiting External Dependencies

To adhere absolutely to either of these principles is an impossible contradiction. Refusing to pass up anything available that works for your case, no matter how small, means accepting any dependency that might fit the bill. Even if you are using only the tiniest, most trivial slice of that pie, pies happen to be baked whole. This is like inviting folks with children to your barbecue and assuming the children won't track dirt into the house.

The children are going to drag dirt into the house, and you can't control them. Dependency cycles can now form outside your control. You might wake up next Monday with dependency conflicts.

At the same time, if it's a really good library, use the damn thing. Especially if it has managed to stay lean while being battle-tested (the limit of an open source project as it approaches infinity is a monolith).

Interfaces Should be Stable

In theory, you should design your public interfaces once while in a stupor of enlightened engineering bliss. After that moment, they should remain constant for the remainder of forward-flowing time. 20,000 years from now robot archeologists will discover your brilliance, still churning away on one last node.

Except every single project EVER starts off with a huge disclaimer that the interfaces haven't stabilized yet, and you shouldn't rely on them until some fuzzy date roughly 20,000 years from now (may or may not be Valve time).

Don't get me wrong, at some point you need to freeze the interface. But let's examine why. Presumably your interface is going to be used by a whole heck of a bunch of people. It's going to be tweeted some day, retweeted, voted up to the top on HN and spread around as the glorious solution to that problem we all have (probably deployment related). At that point, you will have at least 13 million users of your beautifully crafted library. It's open source, so you won't make a dime from it, but before you go to sleep at night, a single tear will roll down your cheek as you smile to yourself knowing the world is a better place.

Except that's not going to happen. Because your library isn't open source; it's something nobody else will ever want, and your two users sit across from you and to your left. Your two coworkers are the only people using your library, and they're only calling into it from three places in the entire company's codebase. If your code lives in one repo, you can change the ENTIRE interface and fix up those three places in a single commit. If your code is spread out, it's still at most four commits to change everything. The cost of changing the interface is practically zero. Furthermore, the advantage gained by this low overhead is substantial. You can move quickly. You can change. You can learn from how it's used by 100% of your user base (it takes but a few minutes to get feedback from BOTH coworkers) and adapt. You don't need to stuff yourself with nootropics in search of some supernatural cognitive plateau the first time you give it a go.

And when you do end up in that situation where you have hit the open source lottery and actually have more than two users and want to change the interface long after the concrete has set? Create a new thing. The versioning of your interface can just as easily be embodied in the name itself. Luckily, there's no rule about that (yet). But really, that thing you built today because you had to get that darn thing working? That's going nowhere, ever.
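The "version in the name" idea can be sketched in a few lines. This is a hypothetical example (the function names and signatures are invented for illustration, not taken from any real library): the old interface stays frozen for existing callers, while the new name is free to evolve.

```python
# Hypothetical sketch: versioning an interface through its name instead of
# freezing it forever. Existing callers of resize_image keep working; the
# new name carries the redesigned signature.

def resize_image(data, width, height):
    """Original interface: callers pass explicit dimensions."""
    return (width, height, data)

def resize_image_v2(image, scale):
    """Redesigned interface under a new name; callers migrate when ready."""
    width, height, payload = image
    return (int(width * scale), int(height * scale), payload)
```

Callers on the old name never break; the handful of places that want the new behavior switch to `resize_image_v2` at their leisure.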

Dependency Management

Ah yes, one must manage dependencies very wisely. So wisely, in fact, that this is a problem that continues to plague (almost) every platform, language, ecosystem or distro. That moment when you kick off a build and gosh darn it, one of those kids tracked dirt all over your house. Damn that guy! Doesn't he understand the FIRST THING about dependency management? Ugh, everyone (except you) is an idiot!

Vendor them. Oh, but then you can't get updates! Well, sorry folks, if software engineering were easy we'd all be getting paid minimum wage. I strongly recommend you vendor (or the functional equivalent of) your dependencies and move on with life. Dependency management is an impossibly hard problem. Sure, there are various tools out there to help you, but those tools are the product of a labor of sweat, blood and tears. Even so, you're probably going to run into a problem at some point.
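The essence of vendoring can be shown in a minimal Python sketch, under the assumption of a `vendor/` directory checked into your repo (the directory name and the `leftpad` module here are stand-ins, not real packages): a pinned copy of the dependency lives in your tree and is resolved before anything else.

```python
# Minimal sketch of vendoring: the dependency's source is copied into your
# repo (a "vendor/" directory, simulated here with a temp dir) and placed
# ahead of site-packages on sys.path, so builds never depend on the network
# or on whatever version happens to be installed.
import os
import sys
import tempfile

# Stand-in for a dependency you copied in at a known-good version.
vendor_dir = tempfile.mkdtemp()
with open(os.path.join(vendor_dir, "leftpad.py"), "w") as f:
    f.write("def pad(s, n):\n    return s.rjust(n)\n")

# Putting vendor/ first on the search path means the pinned copy always wins.
sys.path.insert(0, vendor_dir)

import leftpad
print(leftpad.pad("7", 3))  # "  7"
```

The trade-off is exactly the one above: you give up automatic updates in exchange for builds that behave the same way every single time.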

This rule is also the bastard father of Interfaces Should be Stable. The assumption here is that if everyone simply followed the rules, none of this would be a problem at all. Good luck with that. It's unrealistic and impractical.

The best way to solve a problem is to eliminate it. Sure, a tool might exist that "fixes" the problem, but now you've introduced a tool, and tools are built the same way our now-broken system was built, with code by a human. Humans never do anything completely right.

Don't Repeat Yourself

This rule is actually pretty good, except when it's completely misunderstood, which is way too often. But even when it's understood, it can lead to some ridiculous abstractions. In addition to the engineering atrocities committed in the name of DRY, if someone discovers you ditched DRY for something as stupid as simplicity, they'll start hurling insults at your code like WET ("we enjoy typing" or "write everything twice").

If being DRY requires mind-bending backflips, abstractions, extensive use of the keyword `mutable`, and metaprogramming, simply stop. Despite common misconception, code is not for computers. Code is for humans to read, which is then nicely translated into something a computer can do. One of the most (if not the most) important functions of code is to be easily understood by your coworkers, who were ruthlessly dropped onto your project yesterday with orders to fix the mess you made no later than tomorrow.
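Here's a contrived before-and-after (all names invented for illustration) showing the trade: one "DRY" function full of flags versus two repetitive functions that each say exactly what they do.

```python
# The "DRY" version: one generic function, many knobs, hard to read.
# Every new case grows another parameter and another branch.
def format_record(record, kind, upper=False, sep=", "):
    parts = [str(v) for v in record]
    if kind == "user":
        parts = parts[:2]  # users only show name and email
    if upper:
        parts = [p.upper() for p in parts]
    return sep.join(parts)

# The "WET" version: a little repetition, but each function is obvious
# to the coworker who inherits this code tomorrow.
def format_user(record):
    name, email = record[0], record[1]
    return f"{name}, {email}"

def format_order(record):
    return ", ".join(str(v) for v in record)
```

Both versions produce the same strings; the difference is that the second pair can be read, fixed, and changed independently without decoding a flag matrix.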

The other common misapplication of DRY seems to be this idea that code should be reusable, and therefore never repeated.

Code Reuse

Oh the abominations that have been created in the name of code reuse. Again, the tendency of any project is towards a monolithic, one-size-fits-all creation.

When you're shopping for a replacement part for an actual, real-world widget, and you have the option between the one-size-fits-all and the actual replacement part, what has painful experience taught us? The one-size-fits-all will probably work, but it won't work great. It will most likely fit awkwardly. It's nearly impossible to think about code reuse without assuming the one-size-fits-all mentality. In addition, there is probably zero chance anyone (including your teammates) is ever going to reuse your code.

If you are writing infrastructure code, or something akin to standard library code, then by all means make the code reusable. For the remaining 99.999% of us, please stop with the overuse of generics and parameterization and over-engineering.

First and foremost, your code should do what it was supposed to do in the first place, clearly, concisely and as simply as possible. If that isn't reusable, so what? And if someone wishes your prefix trie could also send emails, kindly tell them where to shove the pull request. Isn't it interesting how new projects are so exciting, so elegant, so simple while the old ones are an overwhelming, burdensome array of factory factories and page-long configuration objects? Do one thing and do it well. Keep it simple.
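Picking up the prefix trie mentioned above, here's a minimal sketch (an illustrative implementation, not anyone's library) of what "do one thing and do it well" looks like: it inserts words and answers prefix queries, and nothing else.

```python
# A prefix trie that does exactly one thing: store words and answer
# "does any word start with this prefix?" No generics, no configuration
# objects, and definitely no email.

class Trie:
    def __init__(self):
        self.children = {}   # maps a character to the child Trie node
        self.is_word = False

    def insert(self, word):
        node = self
        for ch in word:
            node = node.children.setdefault(ch, Trie())
        node.is_word = True

    def has_prefix(self, prefix):
        node = self
        for ch in prefix:
            if ch not in node.children:
                return False
            node = node.children[ch]
        return True
```

Twenty-odd lines, readable in one sitting. If a feature request shows up that doesn't fit in this shape, that's a signal it belongs somewhere else.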


Keep It Simple

This one is for real. Simplicity is hard, but it's important. In fact, it's probably the most important. Simple things are easy to understand, fix, maintain and change. Unfortunately, simplicity is very hard to define in concrete terms, measure or enforce. The truth is, engineering can't be boiled down into convenient limericks and acronyms. Engineering is really, really hard. As pointed out, many engineering best practices are completely at odds with each other or at odds with reality. The best thing, in my unimportant opinion, is to constantly consider, "is the next poor soul to work on this going to understand it?" That is absolutely important, and if you need to break a few rules to get there, then by all means. Software engineering principles are like pirate law: they're loose guidelines, and they should never be treated otherwise.