Simply Explained : Multithreading

John and Mary both walk into work. Let’s consider both John and Mary to be a “thread”. We’ll define a “thread” to simply be a sequence of consecutive actions. In this case, let’s say that John is a “thread”, and Mary is also a “thread”, because they can both execute actions in parallel. If John is working on a spreadsheet, Mary does not have to wait for John to finish before she can do her own tasks. They are both independent entities.

Now, John loves donuts. Fortunately for John, his company always stocks donuts in the company fridge. For John, his thought process is quite simple, and looks like this :

if John sees at least one donut in the fridge
    then John washes his hands
    then John grabs a donut and eats it

And this is fine and dandy. When John is alone, this course of logic always works. He sees a donut in the fridge, he washes his hands, and he eats it. It always works without fail.

However, one day, Mary walks into the room. And Mary also likes donuts. When John looks into the fridge and sees that there is exactly one donut left, he hurriedly goes to wash his hands. However, when he returns, he finds that the donut is gone! But that shouldn’t have been possible. John had checked that at least one donut existed before he decided to wash his hands to eat it!

So what happened? It turns out that Mary had taken the donut in the time that John had spent washing his hands. This is the problem with multithreading, which is the act of programming with more than one thread.

 


The Problem with Multithreading


While it may be true that John’s logic always succeeds when he is alone (single-threaded), the same cannot be said when Mary is with him (multithreaded). This is because when objects are shared between multiple threads, multiple different threads are able to modify the object at the same time. In this case, the fridge is the object, and both John and Mary are able to modify the contents of the fridge at any time.

This is a huge problem, because now, functions that had been unit tested or proven to work on a single thread no longer work when multithreaded. As shown above, doing secure checks like putting in defensive if statements is entirely ineffective, because the object can be modified right after your if statement is completed due to the fact that multithreading allows everything to be done in parallel.

So how can John prevent Mary from modifying the contents of the fridge until he is done using it? The answer is fairly simple — when John accesses the fridge, he needs to make sure that he is the only one who has control of it until he is done whatever he is doing. There are multiple ways to do this, but the most common is to use something called a “lock”.

This new multithreaded system with locks has three simple rules :

1. Whoever controls the lock will have sole control over the object for as long as they possess the lock.
2. An object should only have one lock.
3. Any thread waiting on a locked object can either throw an exception or wait for it to be unlocked. In which case, it will be passed by some priority order (usually first come first serve, like a queue).

Let’s observe why these three rules are true.

 


Rule Number One


“Whoever controls the lock will have sole control over the object for as long as they possess the lock.”

The first rule is true because that is how a lock is defined. But for a more intuitive approach, you can think of the lock as “binding” itself to its owner. As long as the owner of the lock does not relinquish control, the object will forever be locked. This also means that if a thread controlling a locked object gets stuck in an infinite loop, that locked object will be locked forever.

Of course, while you could manually unlock the object, there is no way to determine if an object will be locked forever, because you can’t (in general) determine whether or not a thread is stuck in an infinite loop (see Halting problem).

 


Rule Number Two


“An object should only have one lock.”

Suppose an object could have two separate locks that give complete access over the object. That means that two different threads can access the object at the same time. Unfortunately, this is essentially no different than not having a lock at all, because now the object can be freely edited at any time by anyone. We are back to the exact same problem that we initially started with!

This means that an object must have one and only one lock, and only one owner for that lock.

 


Rule Number Three


“Any thread waiting on a locked object can either throw an exception or wait for it to be unlocked. In which case, it will be passed by some priority order (usually first come first serve, like a queue).”

In general, it is bad practice to have your code throw an exception when it attempts to access a locked object. The more conventional approach is to create a queue for that locked object, like a wait list. This is so that if three threads wanted to access an object, they would access the object in the order that they called the lock.

For example, the execution order might look like this :


1. Thread 1 calls for the object. It is not locked, so Thread 1 gains control of the lock.
2. Thread 2 calls for the object. It is locked. Thread 2 is put on the queue (wait list)
3. Thread 1 performs some operation using the object.
4. Thread 1 relinquishes control of the lock.
5. Thread 3 calls for the object. It is unlocked at this very exact moment, but because there is a queue for this object, the lock goes to the first element in the queue, which is Thread 2. Thread 3 is now put on the queue.
6. Thread 2 performs some operation using the object.
7. Thread 2 relinquishes control of the lock.
8. Thread 3 automatically gains control of the lock, as Thread 3 is next on the queue.
9. Thread 3 performs some operation using the object.
10. Thread 3 relinquishes control of the lock.


As you can see, when multiple threads want access to a locked object, they are accessed in first-come-first-serve order. However, this does not necessarily mean that Thread 2 just has to twiddle its thumb and do nothing. Thread 2 does not have to be blocked while it waits — it can do other computations while it waits for its access to the locked object.

This is important for when the desired operation could take a long time. For example, imagine if Thread 1 wanted access to an object controlling the webcam for your computer. Thread 1 could be using this object for a long time, and if Thread 2 and Thread 3 both sat idly doing nothing, we would experience a severe loss in performance.

 


Conclusion


In this article, you’ve learned that multithreading is difficult because if two threads access an object at the same time, there’s no guarantee that the object will give you the desired behavior. To fix this problem, you simply have to use a lock, which will prevent other threads from accessing the object. When a thread is done performing operations with a locked object, it relinquishes the lock, and the next thread gets control of the lock.

In the next Simply Explained post, I’ll be talking about Futures, which is a programming construct used to obtain a value that does not yet exist (for example, because the object you wanted to use to get that value was locked). It allows you to write non-blocking code, because after you create a Future, your thread can do other tasks until the Future makes a callback, telling you that the value now exists.

Happy coding, and enjoy your new-found multithreading powers.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s