Over engineering and paranoid delusions

I think the mind is an interesting computer. Probably the best computer around, and the one with the highest error rate. Our minds are so bad that we had to build computers to formalise our logic. Our minds are so good that we still cannot understand or mimic them with computers.

Some of the major problems occur when the human mind talks to the computer. The battle ground for this war of logic versus association occurs in none other than the mind of a programmer.

Most programmers (including myself) cannot write long, drawn-out stretches of code without making some syntax errors, more runtime errors and some nasty logic errors. When we see our mistakes they are obvious. After all, you cannot argue against the cold logic of electrons flowing through semiconductors.

Eventually we come to an agreement with our computer pals. Things work kinda... it goes well... for a while.

This is where paranoia sets in. We all indulge in occasional paranoia and this is normal. Most of us set it aside and move on. Some of us move into padded one bedroom apartments. The rest try to cover every possible outcome the universe has to offer with complicated programming.

"What if the database server is off?" "What if the DNS fails?" "What if this process is interrupted half way by a server crash?"

I will not deny that some of the questions above are valid. The context in which they are valid is worth noting, however. Most systems, for instance, assume that their database is there. Just let them crash.

Why would I say such a thing?

If you write code for every conceivable outcome and only 5% of those situations actually occur, then 95% of that code is useless. I've seen this. I've done this.

A prime example of this psychotic behaviour presents itself when a programmer decides that external systems should not be trusted. These external actors call functionality in your system, but because of your lack of trust you validate every single byte of data they send. Your code swells to ridiculous proportions. Your paranoid delusions suck you into a world of confusion and madness, and you start to distrust your own code, and the code written by those who come after you. You write in checks for everything. You picture your code being discovered and studied by hyper-intelligent energy beings from the distant future. You've done it! You've written a bug-free program!

You release your rock-solid, unbreakable monster state machine into the world, and as you lie back in your chair and watch how it goes, something bad happens. One of the checks you are doing is causing perfectly valid system behaviour to be devoured as an error. Because checks are just logical branches and not crashes, there is no stack trace pointing at the culprit, so you roll up into a little ball and suck your thumb while frantically wading through thousands of lines of code to find it. You debug n levels deep and find nothing, because by now you are sure it is one of your checks.
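Here is a rough sketch of that difference in Python. The function names and the "amount must be positive" rule are invented for illustration and do not come from any real system; the point is only that a failed check returns quietly while a crash tells you where to look.

```python
def record_payment_paranoid(ledger, amount):
    """Defensive version: a failed check is just another branch."""
    # Hypothetical over-eager rule: refunds (negative amounts) are valid
    # in this made-up system, but the check quietly treats them as errors.
    if not isinstance(amount, int) or amount <= 0:
        return False  # nothing is logged, nothing crashes
    ledger.append(amount)
    return True


def record_payment_trusting(ledger, amount):
    """Trusting version: do only what is necessary and let bad input blow up."""
    ledger.append(int(amount))  # garbage raises ValueError/TypeError with a traceback
    return True


ledger = []
record_payment_paranoid(ledger, -50)  # valid refund silently dropped
record_payment_trusting(ledger, -50)  # refund recorded
print(ledger)
```

With the paranoid version the refund simply vanishes and the caller gets a `False` that is easy to ignore. With the trusting version, genuinely bad input such as `"abc"` stops the program at the exact line that could not cope with it.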

Days are spent in a caffeine-induced stupor searching for the smoking gun, and nights are spent tossing and turning while debugging strings of cheese in your dreams. You wake up with hunches and lie down at night with disappointment.

Finally, after blaming yourself, you find an incorrect configuration setting. The day is saved, but you will never be the same. You hang your head in shame as you sneak past people wanting to ask you if you found that difficult bug.

The moral of the story is now clear. ONLY do what is necessary. Accomplish this first. Plan for obvious outcomes, not for unlikely ones. A perfect parallel comes from a snippet of code that did its rounds on the web:

if (true == false) { panic(); }

I learned this lesson fairly early on by reading and debugging other people's code. I have done this myself as well, blaming my checking code when something completely unrelated caused the problem. I have also written checks that broke everything.

GUIDs are bad mkay

A globally unique identifier is a unique reference number used as an identifier in computer software. The term GUID also is used for Microsoft's implementation of the Universally Unique Identifier (UUID) standard.
The value of a GUID is represented as a 32-character hexadecimal string, such as {21EC2020-3AEA-1069-A2DD-08002B30309D}, and is usually stored as a 128-bit integer. The total number of unique keys is 2^128 or 3.4×10^38. This number is so large that the probability of the same number being generated randomly twice is negligible.
Still, certain techniques have been developed to help ensure that GUID numbers are not duplicated (see Algorithm below). - From Wikipedia/GUID
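The figures in that quote are easy to check with Python's standard `uuid` module. This is just a quick sanity check of the 32 hex characters, the 128-bit integer and the size of the key space, using the example GUID from the quote; the variable names are mine.

```python
import uuid

# The example GUID from the quote above (the constructor accepts the braces).
guid = uuid.UUID("{21EC2020-3AEA-1069-A2DD-08002B30309D}")

hex_digits = guid.hex   # the hexadecimal characters without dashes or braces
as_integer = guid.int   # the same value as a single integer
key_space = 2 ** 128    # number of distinct 128-bit values

print(len(hex_digits))         # 32
print(as_integer < key_space)  # True
print(f"{key_space:.1e}")      # 3.4e+38
```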

I have quite often seen GUIDs used as primary keys in databases. Much has been written against this practice, but I feel that more voices are louder, so I am also going to chime in. GUIDs are good for a few things:

Making your database incomprehensible

I once worked on a database that used GUIDs. The biggest problem is that when you're inspecting the tables with your eyeballs, GUIDs are long scary numbers. You have to do some funky trick where you remember the beginning and the end. If it's a transactional table, then god help you. The trick will not let you write quick selects though, and you will have to copy and paste the GUID everywhere you go. If it is your primary key then you're even worse off, because you have to join on them, and you will have a bunch of them lying around in your table structure. It will look all FBI and shit, but that's where the fun ends... REALLY!

Making your database slow

GUIDs as keys are nasty. The DBMS has to work with 128-bit values (or 32-character hex strings, if you store them as text) instead of ints. Ouch. Indexes on GUIDs will then be huge, and primary keys are indexed by default. So generally a bad idea.
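To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes a 4-byte integer id and a hypothetical table of one million rows, and it counts only the raw key bytes; real index sizes depend on the DBMS, page layout and fragmentation, none of which is modelled here.

```python
import uuid

guid = uuid.uuid4()

int_key_bytes = 4                    # assumption: a typical 32-bit auto-increment id
guid_binary_bytes = len(guid.bytes)  # GUID stored in a native/binary column
guid_text_bytes = len(str(guid))     # GUID stored as text, dashes included

rows = 1_000_000  # hypothetical table size

for label, size in [("int", int_key_bytes),
                    ("guid (binary)", guid_binary_bytes),
                    ("guid (text)", guid_text_bytes)]:
    # Raw key data only; every index or foreign key that repeats the key pays this again.
    print(f"{label:14} {size:2d} bytes/key, {size * rows / 1_000_000:.0f} MB per million rows")
```

Every foreign key and every index that repeats the key multiplies the difference.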

Adding no value whatsoever

Have you ever wondered why your bank account number is not a GUID? The reason is quite simple. You could make a mistake and choose someone else's bank account number. With check digits there is no such problem, because you would have to mess up both specific numbers in the sequence and the check digit at the end. If data integrity and validation are your goal, then check digits are the way to go, not GUIDs.
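As an illustration, here is a small sketch of a check digit using the Luhn algorithm. Luhn is my choice for the example because it is well known from payment card numbers; I am not claiming that any particular bank uses it for account numbers, and the account number below is made up.

```python
def luhn_check_digit(payload: str) -> int:
    """Return the Luhn check digit for a string of decimal digits."""
    total = 0
    # Walk from the rightmost payload digit and double every second digit,
    # starting with the rightmost one.
    for position, char in enumerate(reversed(payload)):
        digit = int(char)
        if position % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return (10 - total % 10) % 10


def is_valid(number: str) -> bool:
    """True when the last digit matches the check digit of the rest."""
    return luhn_check_digit(number[:-1]) == int(number[-1])


account = "7992739871"  # hypothetical account number without its check digit
full = account + str(luhn_check_digit(account))

print(full)                     # 79927398713
print(is_valid(full))           # True
print(is_valid("79927398703"))  # False: one digit mistyped
```

A single mistyped digit changes the expected check digit, so the typo is rejected instead of landing on somebody else's account.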

Making your database look like it was written by rainman

An over-engineered database looks ugly, works ugly, and is a PITA to maintain because you need to be a savant to carry the mental overhead while you're working. GUIDs, though all unique, also have a nasty quality of looking the same. I know you can level that same argument against normal auto ids, but I'm sure there is an easy way around that. With a GUID, appending a character at runtime will not make it more legible. If you are giving users these numbers then you will see that most people are more sand dweebs than rain men when it comes to remembering, repeating, or even copying down extremely long letter/number combinations.

End?

Before you embark on the wonderful journey of GUIDs, think about it carefully. Do you really, really, really need identifiers that are globally unique? Does that warm feeling of having planned for the most random and unforeseeable eventuality really make it worth the sacrifice?

So what else?

If you're working with a transactional table, you can use normal auto identities. If your auto identities are for something sensitive like bank account numbers, add check digits. If you have a lookup table and want to ensure an identifier for each record, make it text based and put a unique key on it, like ChequeAccount, SavingsAccount etc... That will guard you from auto id renumbering and save you the pain of having to look up your lookups, so to speak, when working with them in code.
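A minimal sketch of that layout, using Python's built-in `sqlite3` module with an in-memory database. The table and column names (`account_type`, `code`, `account`) are hypothetical; only the idea of an auto id plus a unique text code comes from the paragraph above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Lookup table: integer auto id plus a readable, unique text code.
    CREATE TABLE account_type (
        id   INTEGER PRIMARY KEY AUTOINCREMENT,
        code TEXT NOT NULL UNIQUE
    );
    -- Transactional table: a plain auto identity is enough.
    CREATE TABLE account (
        id              INTEGER PRIMARY KEY AUTOINCREMENT,
        account_type_id INTEGER NOT NULL REFERENCES account_type(id)
    );
    INSERT INTO account_type (code) VALUES ('ChequeAccount'), ('SavingsAccount');
""")


def account_type_id(code):
    """Resolve a lookup row by its stable text code, not by a remembered number."""
    row = conn.execute(
        "SELECT id FROM account_type WHERE code = ?", (code,)
    ).fetchone()
    return row[0]


conn.execute(
    "INSERT INTO account (account_type_id) VALUES (?)",
    (account_type_id("SavingsAccount"),),
)
rows = conn.execute("""
    SELECT a.id, t.code
    FROM account a
    JOIN account_type t ON t.id = a.account_type_id
""").fetchall()
print(rows)  # [(1, 'SavingsAccount')]
```

Code refers to `'SavingsAccount'` rather than a magic number, so it keeps working even if the lookup rows are reloaded and end up with different auto ids.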

Finally end?

I sometimes look at tables using GUIDs and wonder whether there were really good reasons to use them. After close inspection I have not found one good use for them. Identifiers live in a world of context. Your ID number does not make sense when you dial it on your telephone, and your dentist's practice number will not be used as your bank account number when you transfer funds. Human error is a big problem, with users especially, and it is something that check digits guard against. GUIDs, however, seem to make it easier to make a mistake and more difficult to remember the numbers themselves. Nobody is served well by the security. Unless you are using it for security purposes, in which case you just wasted 5 minutes of your life reading this.