How To Maximize Job Security By Secretly Writing Bad Code

Disclaimer: While the tips in this code are absolutely true and will make your code unmaintainable, this post is satire and should be treated as such. Please don’t purposely write bad code in production.

By a sudden stroke of bad luck, half of your team has been laid off due to a lack of budget. Rumor has it that you’re next. Fortunately, you know a little secret trick to ensuring job security in your cushy software development position — by writing bad code of course!

After all, if no one can setup, deploy, edit, or read your code, then that means you’re now a “critical” member of the team. If you’re the only one who can edit the code, then you obviously can’t be fired!

But your code has to pass a bare minimum quality, or else others will catch on to how terrible and nefarious you are. That’s why today, I’m going to teach you how to maximize your job security by secretly writing bad code!

Couple Everything Together, Especially If It Has Side Effects

A little known secret is that you can write perfectly “clean” looking code, but still inject lots of potential bugs and issues into your code, just by unnecessarily flooding it with I/O and side effects.

For example, suppose you’re writing a feature that will accept a CSV file, parse and mutate it, insert the contents into a database, and then also insert those contents into a view via an API call to the database.

Like a good programmer, you could split out the obvious side effects (accepting the CSV, inserting data into a database, inserting data into a view, calling the API and fetching the data) into separate functions. But since you want to sneakily write bad code until the guise of being clean code, you shouldn’t do this.

What you should do, instead, is hardcode the CSV name and bundle all of the CSV parsing, mutation, and all of the insertion into one function. This guarantees that no one will ever be able to write a test for your code. Sure, someone could attempt to mock all of the side effects out, but since you’ve inconveniently bundled all of your side effects together, there isn’t any way for someone to easily do this.

How do we test this functionality? Is it even testable? Who knows?

If someone were insane enough to try, they would first have to mock the CSV out, then mock the CSV reading functionality, then mock the database, then mock the API call, and then finally test the mutation functionality using the three mocks.

But since we’ve gone through the lovely effort of putting everything in one function, if we wanted to test this for a JSON file instead, we would have to re-do all of the mocks. This is because we basically did an integration test on all of those components, rather than unit tests on the individual pieces. We’ve guaranteed that they all work when put together, but we haven’t actually proven that any of the individual components work at all.

The main takeaway here — insert as many side effects into as few functions as possible. By not separating things out, you force everyone to have to mock things in order to test them. Eventually, you get a critical number of side effects, at which point you need so many mocks to test the code that it is no longer worth the effort.

Hard to test code is unmaintainable code, because no sane person would ever refactor a code base that has no tests!

Create Staircases With Nulls

This is one of the oldest tricks in the book, and almost everyone knows this, but one of the easiest ways to destroy a code base is to flood it with nulls. Try and catch is just too difficult for you. And make sure you never use Options/Optionals/Promises/Maybes. Those things are basically monads, and monads are complicated.

You don’t want to learn new things, because that would make you a marketable and useful employee. So the best way to handle nulls is to stick to the old fashioned way of nesting if statements.

In other words, why do this:

try:
    database = mysql_connector.connect(...)
    cursor = mydb.cursor()
    query = "SELECT * FROM YourTable"
    cursor.execute(query)
    query_results = cursor.fetchall()
...
...
except:
....

When you could instead do this?

database = None
cursor = None
query = None
query_results = None

database = mysql_connector.connect(...)
if(database == None):
    print("Oh no, database failed to connect!")
    cursor = database.cursor()
    if(cursor == None):
        print("Oh no, cursor is missing!")
    else:
        query = "SELECT * FROM YourTable"
        if(query == None):
            print("Honestly, there's no way this would be None")
        else:
            cursor.execute(query)
            query_results = cursor.fetchall()
            if(query_results == None):
                print("Wow! I got the query_results successfully!")
            else:
                print("Oh no, query_results is None")

The dreaded downward staircase in this code represents your career and integrity as a programmer spiraling into the hopeless, bleak, and desolate void. But at least you’ve got your job security.

Write Clever Code/Abstractions

Suppose you’re, for some reason, calculating the 19428th Fibonacci term. While you could write out the iterative solution that takes maybe ten lines of code, why do that when you could instead write it in one line?

Using Binet’s formula, you can just approximate the term in a single line of code! Short code is always better than long code, so that means your one liner is the best solution.

But often times, the cleverest code is the code that abstracts for the sake of abstraction. Suppose you’re writing in Java, and you have a very simple bean class called Vehicle, which contains “Wheel” objects. Despite the fact that these Wheel objects only take one parameter for their constructor, and despite the fact that the only parameters that this car takes are its wheels, you know in your heart that the best option is to create a Factory for your car, so that all four wheels can be populated at once by the factory.

After all, factories equal encapsulation, and encapsulation equals clean code, so creating a CarFactory is obviously the best choice in this scenario. Since this is a bean class, we really ought to call it a CarBeanFactory.

Sometime later, we realize that some drivers might even have four different kinds of wheels. But that’s not an issue, because we can just make it abstract, so we now have AbstractBeanCarFactory. And we really only need one of these factories, and since the Singleton design pattern is so easy to implement, we can just turn this into a SingletonAbstractBeanCarFactory.

At this point, you might be shaking your head, thinking, “Henry, this is stupid. I might be trying to purposely write bad code for my own job security, but no sane engineer would ever approve that garbage in a code review.”

And so, I present to you Java’s Spring Framework, featuring:

Surely, no framework could do anything worse than that.

And you would be incredibly, bafflingly, and laughably wrong. Introducing, an abstraction whose name is so long that it doesn’t even fit properly on my blog: HasThisTypePatternTriedToSneakInSomeGenericOrParameterizedTypePatternMatchingStuffAnywhereVisitor

Conclusion

Bad code means no one can edit your code. If no one can edit your code, and you’re writing mission critical software, then that means you can’t be replaced. Instant job security!

In order to write subtle yet destructively bad code, remember to always couple things together, create massive staircases with null checks (under the guise of being “fault-tolerant” and “safe”), and to write as many clever abstractions as you can.

If you ever think you’ve gone over the top with a clever abstraction, you only have to look for a more absurd abstraction in the Spring framework.

In the event that anyone attempts to argue with your abstractions, gently remind them:

  1. Spring is used by many Fortune 500 companies.
  2. Fortune 500 companies tend to pick good frameworks.
  3. Therefore Spring is a good framework.
  4. Spring has abstractions like SimpleBeanFactoryAwareAspectInstanceFactory
  5. Therefore, these kinds of abstractions are always good and never overengineered.
  6. Therefore, your SingletonBeanInstanceProxyCarFactory is good and not overengineered.

Thanks to this very sound logic, and your clean-looking but secretly bad code, you’ve guaranteed yourself a cushy software development job with tons of job security.

Congratulations, you’ve achieved the American dream.

Advertisements

Singletons are Glorified Globals

You’re working on a snippet of code, and out of the blue, you happen to need a class of which you should only have one instance of, and which needs to be referenced by other classes.

Sounds like a job for a singleton! Or is it?


You want to use a singleton to store state


Realistically speaking, if you’re using a singleton solely to store state, you are doing it wrong. What you’ve made isn’t a singleton — what you’ve made is a bunch of glorified globals. Using singletons to store state seems appealing at first, because you want to avoid using a global and every programmer knows that globals are evil. But using a singleton in this case is just abstracting the globals one layer back so that you can feel happy about your code not having globals.

Furthermore, having globals will make your code unpredictable and bug-prone. This becomes exponentially worse as the number of dependencies in your singleton increases. If you are using a singleton because you need its global properties, it’s because you are not taking advantage of dependency injection (DI) / inversion of control (IoC). You should be passing the state around with IoC, not by creating a giant global and passing around the global fields.


Singletons make your code painfully difficult to test and debug


Singletons can’t be easily tested when they are integrated with other classes. For every class that uses a singleton, you will have to manually mock the singleton and have it return the desired test values. On top of this, singletons are notoriously difficult to debug when multi-threading is involved. Because your singleton is a major dependence for several classes, not only is your code tightly coupled, but you will also suffer from hard-to-find multi-threading bugs due to the uncertain nature of globals.


You want to use a singleton to avoid repeating an expensive IO action


Occasionally, you may be tempted to use a singleton to perform an expensive IO action exactly once, and then store the results. An incredibly common example of this is using a singleton for a database connection. But doing so in addition to having multi-threading will cause massive headaches unless your connection is guaranteed to be thread-safe.

In general, database concurrency will not be easy to implement if you are using a singleton for the connection. What you really want is a database connection pool. By caching the connection, you avoid having to repeatedly close and open new connections, which is expensive. As an added bonus, if you use a connection pool, you simply won’t need to use a singleton.


You want to pass data around, but you don’t need to modify the data


You just need a data transfer object. No need for a singleton here. And worse, giving your singleton access to methods that can modify the data makes your singleton a god object. It knows how to do everything, knows all of the implementation details, and is likely coupled to basically everything, violating almost every software development principle.

Worse yet, using a singleton to provide context is a fatal mistake. If a singleton provides context to all the other classes, then that means every class that interacts with the singleton theoretically has access to all the states/contexts in your program.


You are using a singleton to create a logger


Actually, this is really one of the few acceptable use for a singleton. Why? Because a logger does not pass around data to other classes, provides no context, and there is generally minimal coupling between the logger and the classes that require the logger. All the logger needs to know is that given some log request or string, it should output the log as a file or to the console.

As you can see, a logger will provide nothing to classes that require it — there is nothing to grab from the logger. Therefore, it’s impossible to use the logger as a glorified global container. And best of all, loggers are incredibly easy to test due to how simple they are. These properties make loggers an excellent choice for a singleton.

In the future, you’ll probably find a scenario where you’re considering using a singleton for any of the reasons above. But hopefully, you’ll now realize that singletons are not the answer — inversion of control is.

Everything in Haskell Is a Thunk

There are a lot of misconceptions about Haskell.

Some people think everything in Haskell is a function. Others think that Haskell is just an implementation of Church’s lambda calculus. But both of these are false.

In reality, everything in Haskell is a thunk.

A thunk is a value that has not been evaluated yet. If everything in Haskell is a thunk, then that means that nothing in Haskell is evaluated unless it has to be evaluated.

From a non-functional programmer’s perspective, a thunk may initially seem to be a useless feature. To understand why thunks are useful, consider the following Python code.

x = 5 / 0

Traceback (most recent call last):
  File "", line 1, in 
ZeroDivisionError: division by zero

Unlike Haskell, Python evaluates the results immediately, which means that Python will crash as soon as an error occurs.

On the other hand, Haskell is different. In fact, you can string together dozens of lines of code, all of which contain an error, and no error will appear until one of those lines get evaluated.

let x = 3 `div` 0
let y = x

-- Can even divide x and y!
let z = x `div` y

let stringX = show x

-- Evaluate stringX, which
-- evaluates show x, which evaluates x
-- which finally evaluates 3 `div` 0, causing
-- the error to occur
putStrLn stringX

In this example, no matter what, Haskell will not give an error until something is evaluated. A common misconception is that a function call will always force the results to be evaluated, but that is simply not true.

In the example above, z is assigned the value of x divided by y, and yet there is no error. It is only when the value of stringX is evaluated via putStrLn (which prints a String) that Haskell throws an error.

One way to imagine thunks is to think of every value and function as being wrapped in a box. No one is allowed to peek into the box until the order is given. For all Haskell knows, the box could have anything in it, and Haskell wouldn’t care. As long as there are no type conflicts, Haskell will happily pass the box along.

In the code above, Haskell is fine with dividing two errors (both x and y, when evaluated, will give a division by zero error). However, it is only because x and y are both thunks, and Haskell doesn’t actually know the true values of x and y.

This gives Haskell the benefit of aggressively optimizing code, allowing for massive performance boosts.

Imagine if Haskell had an expensive function, F, that takes in one parameter, takes 10 seconds to run, and returns the evaluated value 5. Say that this same function also existed in Python.

If we were to run F ten times, we would find that Haskell only takes about 10 seconds to finish, while Python would take 100 seconds. Here, Haskell’s functional properties and lazy evaluation allows it to outperform Python.

In Python, functions are allowed to have side-effects. This means that if a Python function is run ten times, with the same input, it can give ten different results. On the other hand, for Haskell, if a function is run ten times, with the same input, it will always give the same result. This means that if there are multiple copies of the same function, called on the same input, Haskell can be certain that those results will all be the same. This is known as referential transparency, and it’s a feature that exists in most functional programming languages.

This property, combined with Haskell’s lazy evaluation, allows Haskell to memoize function calls. Now, if Haskell sees multiple copies of this expensive function called on the exact same input, it will simply evaluate one of the function calls, cache the result, and replace every future function call of F on that same input with the cached result.

What About Infinite Lists?

One consequence of everything being a thunk in Haskell is that Haskell is able to create infinite lists, and pass infinite lists around. In other words, Haskell is able to process and manipulate infinite lists as if they weren’t infinite, because they aren’t. Although the lists are technically infinite, Haskell only takes the elements that it wants.

For example,

let x = [1..]
let y = [1..]

x ++ y
-- concatenate all of x with all of y
-- to infinity
[1, 1, 2, 2, 3, 3, 4, 4, 5, 5 ...]

Here, x is stored to the infinite list from 1 to infinity, and y is stored to the exact same infinite list. Haskell is perfectly fine with storing these infinite data structures, because Haskell does not actually evaluate these structures until it needs to. When they are finally evaluated, they perform exactly like an infinite list — because that is exactly what x and y are.

In Haskell, treating everything as a thunk allows big performance boosts, and also gives the programmer the ability to store infinite data structures. So the next time someone says, “Everything in Haskell is X”, gently remind yourself that unless X is a thunk, they’re most likely wrong.

Why Keyword Arguments in Python are Useful

In Python, there are two types of arguments : Positional arguments and keyword arguments.

A positional argument is a normal argument in Python. You pass in some data as input, and that becomes your positional argument.

def foo(posArg):
  print(posArg)

There’s nothing unique or inherently special about positional arguments, but let’s say you have a function that evaluates your pet. Is your pet happy? Is your pet healthy? Is your pet playful?

def evaluatePet(isHappy, isHealthy, isPlayful):
  if(isHappy):
    print('Your pet is happy!')
  
  if(isHealthy):
    print('Your pet is healthy!')
 
  if(isPlayful):
    print('Your pet is playful!')

That’s fine and dandy, but what does it look like when we call the function?

 
evaluatePet(True, True, False)

We get the output :

Your pet is happy!
Your pet is healthy!

The result is correct, but the function call is absolutely unreadable.
A reader who has never read the documentation for evaluatePet will have a difficult time understanding what it does. From a quick glance, it takes three booleans. But what do those booleans describe? Whether it’s alive? Whether it’s a ghost? Whether it’s a flying ten thousand feet tall purple dinosaur?

The solution to this issue of readability is to avoid using a positional argument, and instead use a keyword argument.

A keyword argument is an argument that follows a positional argument, and allows the user to pass in arguments by explicitly stating the argument’s name, and then assigning a value to it.

In other words, you can call evaluatePet(True, True, False) in any of the following ways, without changing anything in the evalulatePet function.

#Explicitly calling the names and
#assigning all three of the arguments
evaluatePet(isHappy = True, isHealthy = True, isPlayful = False)

#Switching the order of the arguments.
#Since the arguments are named, 
#the order can be anything you like.
evaluatePet(isHealthy = True, isPlayful = False, isHappy = True)

#You can use positional arguments AND keyword arguments
#at the same time, as long as the keyword arguments
#are AFTER the positional arguments.
evaluatePet(True, isHealthy = True, isPlayful = False)

#Keyword arguments can ALWAYS 
#be switched around in any order
evaluatePet(True, isPlayful = False, isHealthy = True)

However, there are some things that you can’t do.

#Putting keyword argument before
#positional argument is illegal
#Will error, 
#"Positional argument follows keyword argument."
evaluatePet(isPlayful = False, isHealthy = True, True,) 

#Also will error for the same reason.
evaluatePet(isPlayful = False, True, isHealthy = True) 

You can see that with keyword arguments, the arguments are explicitly assigned. There is no confusion. The reader can simply look at a line like :

evaluatePet(isHealthy = True, isPlayful = False, isHappy = True)

And they will automtaically know, “Oh. This function takes in three booleans which are, isHealthy, isPlayful, and isHappy.”

It would be a huge understatement to say that this is the only thing that keyword arguments can do.

You can also load in defaults.

def evaluatePet(isHappy = False, isHealthy = False, isPlayful = False):
  if(isHappy):
    print('Your pet is happy!')
  
  if(isHealthy):
    print('Your pet is healthy!')
 
  if(isPlayful):
    print('Your pet is playful!')

Now, all three arguments become optional, and become automatically assigned to False if that specific argument has not been assigned.

#All three are automatically set to False, 
#so nothing is printed.
evalulatePet()

#You can set just one to True, 
#and the rest will automatically be False.
evaluatePet(isHappy = True)
evaluatePet(True)

Convenient, isn’t it? You can give your function a ton of default values, and then allow the user to change any defaults they don’t like, without requiring them to rewrite all the default values.

Underneath all of this magic, Python created a dictionary with a key value pair, where the keys are the argument names, and the values are the values you assign to those argument names.

If you want to prove this fact, you can use a true keyword argument by putting a double asterisk before an argument.

def foo(**kwargs):
  print(str(kwargs))

foo(bar = "henry", baz = "dang", foobar = "prg")

Output :

{'bar': 'henry', 'foobar': 'prg', 'baz' : 'dang'

In other words, Python has been converting evaluatePet’s arguments into a dictionary.

Naturally, Python wants the group of keyword arguments together, because it is cheaper to lump all the arguments together if they are all within one specific range (and not broken up between multiple ranges). In addition to this, Python can’t accept a positional argument after a keyword argument because it is impossible to determine which argument you are referring to. Are you referring to the first argument? Or the argument after the keyword argument?

These two reasons combined are why you can’t put in positional arguments, and then keyword arguments, and then another positional argument.

You might argue that Python should be able to do this :

evaluatePet(isHappy = False, isHealthy = False, False)

Since there are only three arguments, and two of them are keyword arguments, the third argument must be “isPlayful”.

However, Python’s philosophy is
“Special cases aren’t special enough to break the rules.”

So instead of Python automatically iterating over your arguments to figure out which argument hasn’t been assigned yet (you shouldn’t do this anyway since iterating over a list is expensive), Python simply says, “This is a special case. Follow the rules and deal with it.”

So while Python could potentially have allowed this special case to work, their mantra of sticking strongly to rules prevents you from doing so.

In a nutshell, keyword arguments are simply augments to Python’s core philosophy that “readability counts”. Without keyword arguments, readers must examine the documentation to understand what the arguments mean, especially if there are many arguments. The use of defaults also makes functions shorter if the user is unlikely to modify the defaults.

Shorter argument lists? Argument defaults? Understandable parameters? That’s elegant.

Five Great Practices for Safer Code

You’re sitting at your desk, glaring at your monitor, but it glares back at you with equal determination.

Every change you make introduces new bugs, and fixing a bug causes another bug to pop up.

You don’t understand why things are randomly breaking, and the lines of code just increase every day.

However, by coding in a rigorous and specific fashion, you can prevent many of these issues simply by being slightly paranoid. This paranoia can save you hours in the future, just by dedicating a few extra seconds to include some additional safeguards.

So without further ado, let’s jump right into the top five tips for safer code.

1. Stop Accepting Garbage Input


The common phrase “Garbage in, Garbage out” is one that rings strongly with many programmers. The fact is, if you accept garbage input, you’re going to pass out garbage output. If your code has any modularity at all, then something like this will likely happen :

def foo(input):
  do_stuff

def bar(input):
  do_other_stuff

garbage_input = 'Hi. I'm garbage input.'

some_variable = foo(bar(garbage_input))


As you call foo and bar and other functions, all of which depended on garbage_input, you find that everything has turned into garbage. As a result, functions will start throwing errors a few dozen passes down the line, and things will become very difficult to debug.

Another common mistake is attempting to correct the user’s input in potentially ambiguous cases, which leads to the second tip.

2. Don’t Try to Correct Garbage Input


Let’s take an example scenario :

Imagine you had a box that exported values from 0 to 1 on a display, depending on the number the user passed in.

One day, you suddenly get a value of 1.01, a value slightly higher than the maximum. Now, this should raise a red flag for most programmers. However, some programmers resort to doing the following :

def calculateValue(temperature):
  do_calculations

def getBoxValue(temperature):
  if calculateValue(temperature) > 1 :
    return 1
  elif calculateValue(temperature) < 0 :
    return 0
  else:
    return calculateValue(temperature)

The technique shown above is known as clamping, which is basically restricting the value to a certain range. In this case, it is clamped to 0 and 1. However, the problem with the above example is that it is now impossible to debug the code.

If the user passed in bad input, you would get a clamped answer, instead of an error, and if the calculateValue function was buggy, you would never know. It could be slightly inflating the value, and you would still never know, because the values would be clamped.

As an exaggerated example, if calculateValue returned 900,000,000, all you would see is “1”. Instead of embracing and fixing bugs, this tactic throws them under the carpet in the hopes that no one will notice.

A better solution would be :

def calculateValue(temperature):
  do_calculations

def getBoxValue(temperature):
  if(calculateValue(temperature) > 1
       or calculateValue(temperature) < 0):
    raise ValueError('Output is greater than 1 or less than 0.')
  else:
    return calculateValue(temperature)

If your code is going to fail, then fail fast and fix it fast. Don’t try to polish garbage. Polished garbage is still garbage.

3. Stop Double Checking Boolean Values in If Statements


Many programmers already adhere to this principle, but some do not.

Since Python prevents the bug caused by double checking a boolean value, I will be using Java, as the bug can only happen in languages where assignment is possible in if statements.

In a nutshell, if you do this :

boolean someBoolean = true;

if(someBoolean == true) {
  System.out.println('Boolean is true!');
} else {
  System.out.println('Boolean is false!');
}

In this case,

if(someBoolean == true)

Is exactly equivalent to :

if(someBoolean)

Aside from being redundant and taking up extra characters, this practice can cause horrible bugs, as very few programmers will bother to glance twice at an if statement that checks for true/false.

Take a look at the following example.

boolean someBoolean = (1 + 1 == 3);

if(someBoolean = true) {
  System.out.println('1 + 1 equals 3!');
} else {
  System.out.println('1 + 1 is not equal to 3!');
}

At first glance, you would expect it to print out “1 + 1 is not equal to 3!”. However, on closer inspection, we see that it prints out “1 + 1 equals 3!” due to a very silly but possible mistake.

By writing,

if(someBoolean = true)


The programmer had accidentally set someBoolean to true instead of comparing someBoolean to true, causing the wrong output.

In languages such as Python, assignment in an if statement will not work. Guido van Rossum explicitly made it a syntax error due to the prevalence of programmers accidentally causing assignments in if statements instead of comparisons.

4. Put Immutable Objects First In Equality Checks


This is a nifty trick that piggy backs off the previous tip. If you’ve ever done defensive programming, then you have most likely seen this before.

Instead of writing :

if(obj == null) {
  //stuff happens
}

Flip the order such that null is first.

if(null == obj) {
  //stuff happens
}

Null is immutable, meaning you can’t assign null to the object. If you try to set null to obj, Java will throw an error.

As a result, you can prevent the silly mistake of accidentally causing unintentional assignment during equality checks. Naturally, if you set obj to null, the compiler will throw an error because it’s checking a null object when it expects a boolean.

However, if you are passing around methods inside the if statement, it can become dangerous, particularly methods that will return a boolean type. The problem is doubly bad if you have overloaded methods.

The following example illustrates this point :

final int CONSTANT_NUM = 5;

public boolean foo(int x){
  return x%2 != 0;
}

public boolean foo(boolean x){
  return !x;
}

public void compareVals(int x){
  if(foo(x = CONSTANT_NUM)){
    //insert magic here
  }
}

In this example, the user expects foo to be passed in a boolean of whether or not x is equal to a constant number, 5.

However, instead of comparing the two values, x is set to 5. The expected value if the comparison was done correctly would be false, but if x is set to CONSTANT_NUM, then the value will end up being true instead.

5. Leave Uninitialized Variables Uninitialized


It doesn’t matter what language you use, always leave your uninitialized variables as null, None, nil, or whatever your language’s equivalent is.

The only exception to this rule is booleans, which should almost always be set to false when initialized. The exception is for booleans with names such as keepRunning, which you will want to set initially to true.

In Java’s case,

int x;
String y;
boolean z = false;

In particular, for Python especially, if you have a list, make sure that you do not set it to an empty list.

The same also applies to strings.

Do this :

some_string = None
list = None

Not this :

some_string = ''
list = []

There is a world of a difference between a null/None/nil list, and an empty list, and a world of a difference between a null/None/nil string, and an empty string.

An empty value means that the object was assigned an empty value on purpose, and was initialized.

A null value means that the object doesn’t have a value, because it has not been initialized.

In addition, it is good to have null errors caused by uninitialized objects.

It is unpleasant to say the least when an uninitialized string is set to “” and is prematurely passed into a function without being assigned a non-empty value.

As usual, garbage input will give you garbage output.

Conclusion


These five tips are not a magical silver bullet that will prevent you from making any bugs at all in the future. Even if you follow these five tips, you won’t suddenly have exponentially better code.

Good programming style, proper documentation, and following common conventions for your programming language come first. These little tricks will only marginally decrease your bug count. However, they also only take about an extra few seconds of your time, so the overhead is negligible.

Sacrificing a few seconds of your time for slightly safer code is a trade most people would take any day, especially if it can increase production speed and prevent silly mistakes.

Why You Shouldn’t Validate Emails with Regex

Foo, a programmer working at FooBar Inc., is happily working on a registration form, when he think to himself,

“Hmm. I should probably check if the email is valid.”

Somewhere along the line, Foo connect the words “email” and “valid” with “regular expressions”.

But as the great Jamie Zawinski once stated,

Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems.

 

Implementing Regular Expressions to Validate Email Addresses


 

Now, for the sake of demonstration, let’s look into the thought processes of Foo.

First, Foo thinks, “Well, what makes an email address valid?”

He quickly scribbles down a list.

  1. Email must contain an “@” symbol.
  2. Email must contain a “.” symbol.
  3. Email must not contain spaces.
  4. Email can have the character sets [a-zA-z]
  5. Email can have the number sets [0-9]

Foo also knows that the order must always go as follows :

  1. Word
  2. “@” symbol
  3. “.” symbol
  4. Domain Name

Okay. Seems simple enough, right?

Foo writes the following regex :

^[a-zA-Z0-9]+@[a-zA-Z0-9]+\.[a-zA-Z0-9]+

And it works flawlessly! At first, anyway.
It turns out that a regex statement like this won’t account for emails like “henry.dang@henrydangprg.com” or “henry-dang@henrydangprg.com”

The solution? As long as it’s [az-A-Z0-9._], it should be fine, up until the “@” symbol.

^[a-zA-Z0-9.-]+@[a-zA-Z0-9]+\.[a-zA-Z0-9]+

So far so good! Looks like everything’s working. But hold on, your friend tries to sign up with his email, “henry.dang@henry.dang.prg.com”, which is an absolutely valid email. Now you have to compensate for multiple periods after the “@” symbol!

But even after Foo fixes that, he realizes that his input could be infinitely large if someone strung together infinite periods! (EX : henry.dang@henry.henry.henry.henry.henry.henry.henry… and so on)

But wait! There’s more! The user could have a plus sign in his email, but not in the domain. And wait! What about foreign characters? And apostrophes? And all those other wild characters that hardly anyone would use in an email, but would still be valid if they really wanted to use it?

The fact of the matter is that Foo will most likely be unable to account for every single possible letter and combination. There are simply too many, and attempting to match all of them will simply bring in angry customers who have some esoteric symbol in their email that you failed to account for.

 

So What’s the Solution?


 

The solution is simple. Don’t use regex. It is not the right tool for the job. All you need to do is check if the user has an “@” and a “.” in their email. Anything else is extraneous and will lead to some user emailing you about how they can’t register with their email.

There is no point in attempting to check the infinite possibilities for an email. Send the user a confirmation email. If their email is valid, they will receive an email. If not, then their email was invalid, and they should change it.

For the stubborn or curious user, there is a solution available here. Clocking in at a whopping 6424 characters, this monstrous and unreadable regular expression is the last thing you want to use in your code.

Taking a Quack at Duck Typing

Your friend, Ruby, goes out and buys¬†Java a pet duck. But wait, on closer inspection, it’s not a duck at all! It just walks like a duck.

Quack!

And quacks like a duck.

“What’s the matter?” Ruby asks.

“It’s not a duck!” Java complains, distraught that the pet has no inheritance relations with the Duck class.

“If it walks like a duck, and quacks like a duck, it’s a duck!”

So what is duck typing?


Duck typing is a feature that allows a language to call a method on an object, if it has the method. Ruby doesn’t care what the object is, but rather, what methods the object has.

However, a caveat of duck typing is that the it tends to only be built-in to the language if the language handles type checking during runtime. This means that duck typing will only work on dynamically typed languages, such as Ruby. In languages like Java and C++, type checking is done during compile-time. As a result, if there is a type conflict, the program will not even compile. Languages that do type-checking during compile-time are called statically typed languages.

In Java, declaring an integer will look like this :

int x = 0;

The type (int) is explicitly stated, which means that Java checks for types during compile-time, making it a statically typed languages.

In Ruby, it would look like this :

x = 0

Ruby sees that x was assigned a number at runtime. Since x was assigned a number, Ruby automatically knows that x must be a Fixnum.

How does it work?


First, we need two different objects that have the same method name. However, the method can’t be given to the object via the same superclass, because it would instead be inheritance.

Let’s take a look at a simple example of duck typing.

class Duck
  def quack
    puts "Quack!"
  end
end

class Alien
  def quack
    puts "I'm not a duck, but... Quack!"
  end
end

def try_quack(duck)
  duck.quack
end

try_quack(Duck.new)
try_quack(Alien.new)

Output :

Quack!
I'm not a duck, but... Quack!

How did Ruby know?


In Ruby’s mind, the Alien instance and the Duck instance are essentially the same. When we called the try_quack method, Ruby wanted an object that had a quack method.

In this case, when we passed in the Duck instance, Ruby saw that the Duck object would quack, so it called the Duck instance’s quack method.

For the Alien, even though the Alien class has nothing to do with the Duck class, it still has a quack method. As a result, Ruby happily calls the Alien class’s quack method.

Conclusion


Duck typing is a powerful feature of Ruby that allows you to call methods on seemingly different objects, as long as those objects have the same method names. As a result, there is no need for inheritance. You simply call the method, and if the object has it, it will work.

Ruby doesn’t care who the object is, but rather what it is.