Wednesday, September 14, 2016

How Open Source Devolves

You know what I'm talking about. Why are you forced to use the build flag "utf8strings" to generate correct Python Thrift code? Why is the default behavior of MySQL to truncate data (among a million other things)? Why, over time, do so many projects/libraries/services become obtuse and require a wealth of knowledge to use correctly? Why do so many open-source things come with absolutely bonkers default behavior?

Let me show you an example.

https://issues.apache.org/jira/browse/THRIFT-395

Let's systematically break this down. At the time, the Python code generated by the Thrift compiler was completely unaware of unicode strings. It was essentially broken, especially when talking to other Thrift code. Thrift contains two string-like types: string and binary. Binary is for raw bytes, while string is for UTF-8-encoded strings. Python at the time wasn't correctly encoding unicode strings as UTF-8, so it was broken. Essentially every other Thrift target language was doing the right thing.
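To make the string/binary distinction concrete, here is a minimal Python 3 sketch. It is not Thrift's actual code: the helper names are made up for illustration, and the 4-byte big-endian length prefix is an assumption based on my understanding of how Thrift's binary protocol frames these fields. The point is only that a string has to be UTF-8 encoded before it goes on the wire, while binary is passed through untouched.

```python
import struct

def encode_string_field(text: str) -> bytes:
    """Hypothetical helper: UTF-8 encode the text, then length-prefix the bytes."""
    payload = text.encode("utf-8")
    # Assumed framing: 4-byte big-endian length, then the payload.
    return struct.pack("!i", len(payload)) + payload

def encode_binary_field(raw: bytes) -> bytes:
    """Hypothetical helper: raw bytes are length-prefixed as-is, with no encoding step."""
    return struct.pack("!i", len(raw)) + raw

text = "h\u00e9llo"  # five characters, but six bytes once UTF-8 encoded
wire = encode_string_field(text)
print(len(text), len(wire) - 4)
```

The character count and the byte count disagree as soon as the text leaves ASCII, which is exactly the case a unicode-unaware generator gets wrong.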

Now if you notice in that thread, a tortured programmer soul was disturbed by this change, because it would break his existing code. This argument is the cancer of the open-source world: if the world is broken, it must remain broken because fixing it will break my thing.

But this isn't true. This person's code would only be broken if 1) this code change landed, 2) they upgraded their Thrift libraries to the new version containing the change, and 3) they refused to go through their broken codebase and change "string" to "binary". This person is willing to upgrade versions of Thrift, but unwilling to run sed. Maybe they're on a Mac, and BSD sed can be tricky? I don't know. But this person could also just NOT upgrade Thrift, and everything they've written will continue to work. Or they could both upgrade Thrift and use some sed.
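If sed really is the obstacle, the same fix is a few lines of Python. This is a rough sketch under assumptions: the IDL snippet and the function name are invented for illustration, and the regex is deliberately naive (it would also rewrite the word "string" inside comments, and it converts every string field, not just the ones that actually hold raw bytes).

```python
import re

# Hypothetical IDL snippet, made up for this example.
IDL = """\
struct Blob {
  1: string payload,
  2: i32 stringCount,
}
"""

def string_to_binary(idl_source: str) -> str:
    """Replace the whole word 'string' with 'binary'; identifiers such as stringCount are left alone."""
    return re.sub(r"\bstring\b", "binary", idl_source)

converted = string_to_binary(IDL)
print(converted)
```

In a real codebase you would review the diff afterwards, but that is a one-time cost paid by the one person who depended on the broken behavior.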

Yet, because of this one person, the ENTIRE world gets to add "utf8strings" to their Python Thrift builds.

Look, this is like if Ford made a truck and accidentally forgot one of the wheels. Then one person figures out how to load the truck bed so that it drives (albeit shakily) on 3 wheels. Then Ford issues a recall and this person protests, so they cancel THE ENTIRE RECALL and EVERY truck continues to be shipped with 3 wheels. The 4th wheel is included in the truck bed when you drive it off the lot, in case you want a truck with 4 wheels instead of 3.

And if you go looking, you will find exactly this, over and over and over. This is literally how open source development works. You can't fix the world, you have to keep it broken.

This is how open source sucks.

Don't even get me started on committee governance models. Let's go ahead and dilute any individual expertise on the committee by giving everyone an equal vote.
