Programmers and Inferiority Complex

In a way, ever since I first started programming, I’ve been diving into a deep, dark, rabbit hole.

I started with simple programs, and progressively moved onto more advanced topics. I still remember when I was mind-blown by a recursive solution for getting the n’th Fibonacci term.

public int fibonacci(int n)  {
  if(n == 0) {
    return 0;
  } else if(n == 1) {
    return 1;
  } else {
    return fibonacci(n - 1) + fibonacci(n - 2);
  }
}

The idea of recursion was so foreign to me that I sat there dumbstruck and analyzed those nine lines of code for hours before I figured out how it worked.

I was so excited that I moved onto harder and more interesting problems, and then onto even harder challenges.

Eventually, I felt that I could solve any problem. I hadn’t yet seen anything involving software development, web development, or mobile development. I thought I didn’t need to learn anything else, that I was done learning, and that I would be able to resolve any problem with my meager and tiny puddle of knowledge.

And then I saw the other side.

It started with Eclipse. Then JUnit, Gson, LWJGL, Swing, JavaFX, Android Studio, Maven, Gradle, Guava, Google Maps API, and the list goes on ad infinitum.

[Image: How programming seemed after being exposed to Android]

Paradoxically, the more I knew about programming, the more I realized that I didn’t know anything at all. Suddenly, I felt small and overwhelmed in an endless ocean of libraries, frameworks, APIs, IDEs, technologies, and languages. There was so much to learn, and I hadn’t discovered even 0.1% of all the technology stacks that existed.

I thought that if I learned more, I would feel even more confident in my abilities, that I would become a guru and master at my craft. But I was wrong. The more I learned, the less confident I felt. For every programming mystery I solved, two more sprang up, until there became infinitely many questions to answer.

[Image: Confidence had a negative relationship to how much I knew]

I sincerely thought I would never catch up to my peers, and that everyone would tower over me no matter how much I learned.

And so, to compensate for my inferiority, I dedicated even more hours to programming. But no matter how much I learned, no matter how many new languages, concepts, or libraries I discovered and grasped, I still felt inadequate.

Eventually, the hours I spent programming spiraled out of control. I was holding such unrealistically high standards for myself that I burned myself out.

Things became even worse when I discovered StackOverflow.

As I was browsing through the questions, even with all of my supposed “knowledge”, I could only answer maybe 1 out of every 100 questions on StackOverflow. I believed at the time that I was stupid, and that programming maybe wasn’t the right choice for me. That there were too many new technology stacks, and that they changed too quickly.

What I failed to realize was that those people with 10k+ reputation on StackOverflow had been specializing in that small subset of frameworks or libraries for years, and I was somehow under the wild notion that I could stand my ground against them after having only worked with the framework for a month or two.

But here’s the deal. No one knows every library. No one knows every framework. No one knows every API. It’s impossible to know it all. When I tried to absorb all the information possible, to become a guru in everything, I came crashing back down. I only worried about what I didn’t know, and failed to recognize all of the things I did know.

In short, I had a sort of paradoxical Dunning-Kruger effect. I would learn new concepts and apply them immediately, and that left me feeling pretty confident.

But as soon as I found a single programmer who knew even one thing more than I did, the Dunning-Kruger effect suddenly switched into an inferiority complex. The two jumped back and forth. Either I had too much confidence in my abilities and bit off more than I could chew, or I had too little confidence and was afraid that I would be exposed as a fraud. In short, the vast amount of material in programming makes it difficult to compare yourself to others objectively.

If you, a programmer, talked to a person who knew absolutely nothing about programming, you would blow their mind, and they might say something like, “You’re really smart! How do you know so much?” They would assume you were a genius with 200 IQ.

Similarly, if you didn’t know anything about NoSQL, and you decided to talk to someone who did, you would feel the same way. You would think that the other person was much better than you as a programmer, when in fact, they only knew something innovative that you didn’t.

Even the best programmers won’t know everything. Chances are that someone like Guido van Rossum, creator of Python, knows very little about Android. But van Rossum doesn’t spend all of his time lamenting the fact that he doesn’t know Android development. He understands that there are people better than him at making websites or designing mobile apps.

And that’s okay.

At its core, programming is just the use of a computer to solve problems. Van Rossum saw that programming was difficult for beginners to learn, and wanted to create a programming language that would be easy to learn and as readable as English. And so he did.

It doesn’t matter whether you can solve every problem that exists, or that someone else can solve a problem that you can’t. It matters that you can solve problems that matter to you. You’re not a sham just because you can’t make a website, or because you don’t know how to write a test suite with Selenium. You have your own strengths, and your own skill sets that not everyone else will have.

Whether or not you have learned the latest hot programming trend or the latest Android API is not necessarily important. Don’t burn yourself out and throw yourself into the pits of insanity just to stay (excessively) ahead on the technology frontier.

While it may seem tempting to learn another language, another framework, another API, “just in case” you need it, there’s a certain point at which you are crossing the line between sanity and insanity. Yes, you could put in an extra ten hours here, and work on several pet projects with some fancy new technology stacks, but is it really worth it?

“Well yes! Of course it is! Real programmers spend all of their waking hours programming, 24/7, day and night, even after 8 hours of work!” you might hear others say.

But that’s complete baloney. Spending more hours programming to “prove” that you’re a good programmer benefits no one. In the end, you will only program worse due to stress and burnout. And honestly, spending all your time programming is bad for both your physical and mental health.

In a nutshell, the problem of both the inferiority complex and the impostor syndrome that follows it essentially boils down to unnecessary comparisons.

Stop comparing yourself to others. You are not Linus Torvalds. You are not Guido van Rossum. You are not James Gosling. You are not that developer who codes 80+ hours a week because he’s a “real programmer”.

So relax with the comparisons. In the end, comparing yourself to other programmers is pointless at best. The only programmer you need to compare yourself to is the programmer you were yesterday.

How URL Routing in Django Works

A URL is a uniform resource locator. It contains the address that links to a resource such as an HTML page.

For example, “https://henrydangprg.com” is a URL that links to the HTML page that contains this website. A single website, like this, can have many other URLs formed by adding a forward slash (“/”) and a path after the domain name.

If you wanted to access the about page on this website, you would add “/about/” to the end of the home page’s URL. It can be visualized like a tree.

You start with the base website, and you have other possible URLs accessible by adding a forward slash and some word.

  • https://henrydangprg.com/
    • about/
    • contact/
    • infinitely-many-other-possibilities/
      • which-can-contain-other-links/
        • containing-potentially-even-more-links/

In theory, you can have something like :

https://henrydangprg.com/foo/bar/foobar/foobarbar/foofoo-ad-infinitum/

How Does It Work in Django?

In Django, the premise is exactly the same. Inside each project is a urls.py file.

You’ll see something along the lines of this :

from django.conf.urls import url
from django.contrib import admin

urlpatterns = [
  url(r'^admin/', admin.site.urls),
]

Here, you can see that Django handles the URL routing with regular expressions. The ‘r’ before a string marks it as a raw string literal. This means that Python will not convert escape sequences like ‘\n’ into a new line, and will instead keep the backslash and the letter as is.
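If the raw string part feels abstract, a tiny experiment in the Python shell makes it concrete. This is only an illustration of string literals; the variable names are made up, and nothing here is Django-specific.

```python
# A normal literal turns the escape sequence into a single newline character.
normal = '\n'

# A raw literal keeps both characters: a backslash followed by the letter n.
raw = r'\n'

print(len(normal))   # 1
print(len(raw))      # 2
print(raw == '\\n')  # True
```

Regular expressions use backslashes heavily, so writing patterns as raw strings means you never have to double them up.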

You can play around with the urlpatterns list, and add new URLs to test that it works.

For example, if we change it to :

urlpatterns = [
  url(r'^admin/', admin.site.urls),
  url(r'^foobar/', admin.site.urls),
]

Then foobar will become a possible extension to your website’s URL. Assuming you’re using localhost and a port of 8000, it would be 127.0.0.1:8000/foobar/ to access your new URL.

If you want to go down two layers deep, like 127.0.0.1:8000/foobar/foo, you should create a new app.

Let’s make a new app called “foobar”.

django-admin startapp foobar

Modify the urlpatterns list inside your project’s urls.py file. We are now going to add foobar’s urls.py file into the project’s urls.py.

from django.conf.urls import include, url
from django.contrib import admin

urlpatterns = [
  url(r'^foobar/', include('foobar.urls')),
  url(r'^admin/', admin.site.urls),
]

Essentially, any time we visit 127.0.0.1:8000/foobar/, Django will see that we want to access “foobar”, and will check foobar’s urls.py for the next portion of the URL.

Now, we have to add URLs into foobar. Go into the foobar directory and create a urls.py file, since startapp does not generate one.

from django.conf.urls import url
from django.contrib import admin

from . import views

urlpatterns = [
  url(r'^$', admin.site.urls),
  url(r'^foo/', admin.site.urls),
]

I don’t suggest you actually make every URL link to admin.site.urls, but for the sake of simplicity, we will stick to using that.

The ‘^$’ simply indicates that if there is nothing left to match, then load the admin page. In this case, it would be 127.0.0.1:8000/foobar/ because there is nothing after “foobar/”, which is as far as our project’s urls.py matched.

We also added “foo” to our urlpatterns, which means we can now visit 127.0.0.1:8000/foobar/foo, allowing us to add extensions to our URL.
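To see why the nested urls.py only needs the leftover part of the URL, here is a rough sketch in plain Python. It is not Django’s actual resolver, just an illustration that assumes each pattern list is tried in order and that an included list receives whatever follows the matched prefix. The names resolve, PROJECT_PATTERNS, and FOOBAR_PATTERNS, as well as the string targets, are made up for this example.

```python
import re

# Hypothetical stand-in for foobar/urls.py. The empty pattern is written ^$.
FOOBAR_PATTERNS = [
    (r'^$', 'foobar index'),
    (r'^foo/', 'foo page'),
]

# Hypothetical stand-in for the project's urls.py.
PROJECT_PATTERNS = [
    (r'^foobar/', FOOBAR_PATTERNS),  # plays the role of include('foobar.urls')
    (r'^admin/', 'admin site'),
]

def resolve(path, patterns):
    """Return the first target whose pattern matches path, or None."""
    for regex, target in patterns:
        match = re.search(regex, path)
        if match is None:
            continue
        if isinstance(target, list):
            # Strip the matched prefix and hand the rest to the nested list.
            found = resolve(path[match.end():], target)
            if found is not None:
                return found
            continue
        return target
    return None

print(resolve('foobar/', PROJECT_PATTERNS))
print(resolve('foobar/foo/', PROJECT_PATTERNS))
print(resolve('missing/', PROJECT_PATTERNS))
```

The path is written without a leading slash here, mirroring how the patterns in the article start right at “foobar/”. The second call only succeeds because the nested list sees “foo/” rather than the full “foobar/foo/”.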

You can add virtually any URL you want, and as many levels of URLs as needed. You don’t even have to add new apps for each new extension. However, you will have to write a lot of duplicated code if you do that.


#With new apps 

urlpatterns = [
  url(r'^foobar/', include('foobar.urls')),
  url(r'^admin/', admin.site.urls),
]

#foobar.urls
urlpatterns = [
  url(r'^$', admin.site.urls),
  url(r'^foo/', admin.site.urls),
]

#Without new apps, anti-DRY

urlpatterns = [
  url(r'^foobar/$', admin.site.urls),
  url(r'^foobar/foo$', admin.site.urls),
  url(r'^admin/$', admin.site.urls),
]

You can see that you would have to write out the full URL each time. If you had 100 URLs, all 100 would have to be crammed into this single urlpatterns list, and if it goes 10 layers deep, you would have to write it all out each time.

Conclusion

URL routing with Django is absurdly easy, provided that you know a little bit of regex. However, if you are struggling with regular expressions, you can click here for an interactive regular expressions tester.
Best of luck with learning Django, and happy coding!

 

Null Island – The Most Famous Island that Doesn’t Exist

It’s been a long and tiresome week of work, and you want a vacation.

You search online for some decent vacation spots.

Hawaii? Nah. Canada? Already been there.

But hold on, something catches your eye. Null Island? Thousands of visitors? And look at that population density! All those posts from (0,0)?

And so you set out on an arduous and dangerous journey to Null Island, only to find this.

[Image: Null Island buoy. Photo from Wikipedia]

Yep. Null Island, the most popular tiny island you’ve found in your life, doesn’t exist.

What’s Going on Here?

Null Island, located at the geographic coordinates (0,0), is a byproduct of bad programming.

How did it happen?

Some developers who were creating a GPS system decided that, if a person’s coordinates couldn’t be found, the coordinates would be set to (0,0) instead of (Null, Null).

In pseudocode, it went down something like this :

if user's x coordinate is Null
  set user's x coordinate to 0

if user's y coordinate is Null
  set user's y coordinate to 0

Not only is this kind of code an anti-pattern, but for something such as position tracking, this can have serious consequences.

For example, in Wisconsin, many voters were in locations that did not have registered coordinates. However, in order to vote, they needed to use a new geocoding system that tied their house address to a geographical coordinate.

Unfortunately, the Wisconsin voters were given a geographical coordinate of (0,0), or Null Island. In a state with a population of nearly 6 million, the problem was quickly discovered and fixed, but it is yet another example of how bad programming can wreak havoc.

To prevent mistakes like these, ask yourself if what you’re doing makes sense in a different context.

Here are some similar scenarios, using the same logic that caused the Null Island issue.

if user's phone number is Null
  set user's phone number to 000-000-0000

if user's pin number is Null
  set user's pin number to 0000

if user's credit card number is Null
  set user's credit card number to
    0000 0000 0000 0000

In scenarios where null and zero mean the same thing, you can safely replace null with zero. If your wallet is null, it means you don’t have any money. Your balance is 0. If your friend count is null, it means you have no friends, so your friend count is 0.

However, if the zero is significant, you cannot replace null with zero. You can’t set the user’s phone number to 000-000-0000. Similarly, if you have an expensive watch, and the price is null because it’s not for sale, that doesn’t mean your watch costs $0.00. That would imply that you’re giving it away for free.
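For the coordinate case, one way to keep “unknown” separate from “zero” in Python is to let the value be None and make every consumer handle that case explicitly. This is a minimal sketch, not how any real geocoder works; describe_location and the Coordinates alias are hypothetical names.

```python
from typing import Optional, Tuple

# Hypothetical alias: (latitude, longitude) in degrees.
Coordinates = Tuple[float, float]

def describe_location(coords: Optional[Coordinates]) -> str:
    """Format a location, keeping 'unknown' distinct from (0, 0)."""
    if coords is None:
        # No coordinates were found, so say so instead of inventing a point.
        return "location unknown"
    latitude, longitude = coords
    return f"{latitude:.4f}, {longitude:.4f}"

print(describe_location(None))
print(describe_location((0.0, 0.0)))
```

With this shape, a missing address never silently lands in the Gulf of Guinea. The two cases produce visibly different results.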

Conclusion

Null Island is a silly coding mistake that carries significant consequences, even today.

That’s not to say that every single mistake you make will affect an entire state’s voter population, but it can happen.

If you want to avoid creating the next equivalent of Null Island, leave your user’s input as is instead of attempting to change it. Null means null. Leave it as null.

Five Great Practices for Safer Code

You’re sitting at your desk, glaring at your monitor, but it glares back at you with equal determination.

Every change you make introduces new bugs, and fixing a bug causes another bug to pop up.

You don’t understand why things are randomly breaking, and the lines of code just increase every day.

However, by coding in a rigorous and specific fashion, you can prevent many of these issues simply by being slightly paranoid. This paranoia can save you hours in the future, just by dedicating a few extra seconds to include some additional safeguards.

So without further ado, let’s jump right into the top five tips for safer code.

1. Stop Accepting Garbage Input


The common phrase “Garbage in, Garbage out” is one that rings strongly with many programmers. The fact is, if you accept garbage input, you’re going to pass out garbage output. If your code has any modularity at all, then something like this will likely happen :

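def foo(data):
  return do_stuff(data)  # placeholder for real work

def bar(data):
  return do_other_stuff(data)  # placeholder for real work

# Double quotes, because the text itself contains an apostrophe.
garbage_input = "Hi. I'm garbage input."

some_variable = foo(bar(garbage_input))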


As you call foo and bar and other functions, all of which depend on garbage_input, you find that everything has turned into garbage. As a result, functions will start throwing errors a few dozen passes down the line, and things will become very difficult to debug.
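One way to stop garbage at the door is to validate at the boundary, before any other function sees the value. The sketch below assumes a made-up input, a percentage typed in by a user, and a hypothetical helper called parse_percentage.

```python
def parse_percentage(raw):
    """Convert text such as ' 42 ' to an int from 0 to 100, or raise ValueError."""
    value = int(raw.strip())  # int() raises ValueError on non-numeric text
    if not 0 <= value <= 100:
        raise ValueError(f"percentage out of range: {value}")
    return value

print(parse_percentage(" 42 "))

try:
    parse_percentage("Hi. I'm garbage input.")
except ValueError as error:
    print("rejected:", error)
```

Everything downstream can now assume it is holding a sane integer, and the error shows up where the bad input entered rather than a few dozen calls later.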

Another common mistake is attempting to correct the user’s input in potentially ambiguous cases, which leads to the second tip.

2. Don’t Try to Correct Garbage Input


Let’s take an example scenario :

Imagine you had a box that showed a value from 0 to 1 on a display, depending on the number the user passed in.

One day, you suddenly get a value of 1.01, a value slightly higher than the maximum. Now, this should raise a red flag for most programmers. However, some programmers resort to doing the following :

def calculateValue(temperature):
  return do_calculations(temperature)  # placeholder for the real computation

def getBoxValue(temperature):
  value = calculateValue(temperature)  # compute once and reuse
  if value > 1:
    return 1
  elif value < 0:
    return 0
  else:
    return value

The technique shown above is known as clamping, which is basically restricting the value to a certain range. In this case, it is clamped to 0 and 1. However, the problem with the above example is that it is now impossible to debug the code.

If the user passed in bad input, you would get a clamped answer, instead of an error, and if the calculateValue function was buggy, you would never know. It could be slightly inflating the value, and you would still never know, because the values would be clamped.

As an exaggerated example, if calculateValue returned 900,000,000, all you would see is “1”. Instead of embracing and fixing bugs, this tactic sweeps them under the rug in the hopes that no one will notice.

A better solution would be :

def calculateValue(temperature):
  return do_calculations(temperature)  # placeholder for the real computation

def getBoxValue(temperature):
  value = calculateValue(temperature)  # compute once and reuse
  if value > 1 or value < 0:
    raise ValueError('Output is greater than 1 or less than 0.')
  return value

If your code is going to fail, then fail fast and fix it fast. Don’t try to polish garbage. Polished garbage is still garbage.

3. Stop Double Checking Boolean Values in If Statements


Many programmers already adhere to this principle, but some do not.

Since Python prevents the bug caused by double checking a boolean value, I will be using Java, as the bug can only happen in languages where assignment is possible in if statements.

In a nutshell, suppose you write this :

boolean someBoolean = true;

if(someBoolean == true) {
  System.out.println("Boolean is true!");
} else {
  System.out.println("Boolean is false!");
}

In this case,

if(someBoolean == true)

is exactly equivalent to :

if(someBoolean)

Aside from being redundant and taking up extra characters, this practice can cause horrible bugs, as very few programmers will bother to glance twice at an if statement that checks for true/false.

Take a look at the following example.

boolean someBoolean = (1 + 1 == 3);

if(someBoolean = true) {
  System.out.println("1 + 1 equals 3!");
} else {
  System.out.println("1 + 1 is not equal to 3!");
}

At first glance, you would expect it to print out “1 + 1 is not equal to 3!”. However, on closer inspection, we see that it prints out “1 + 1 equals 3!” due to a very silly but possible mistake.

By writing,

if(someBoolean = true)


The programmer accidentally set someBoolean to true instead of comparing someBoolean to true, causing the wrong output.

In Python, a plain assignment with = inside an if condition will not work. Guido van Rossum explicitly made it a syntax error due to the prevalence of programmers accidentally writing assignments in if statements instead of comparisons.
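You can check this behavior without crashing your own script by handing the source text to Python’s built-in compile function. The helper name is_valid_python is made up for this sketch.

```python
def is_valid_python(source):
    """Return True if the source text parses, False on a SyntaxError."""
    try:
        compile(source, "<example>", "exec")  # parses only, runs nothing
        return True
    except SyntaxError:
        return False

print(is_valid_python("if flag == True:\n    pass\n"))  # True
print(is_valid_python("if flag = True:\n    pass\n"))   # False
```

The comparison parses fine, while the accidental assignment is rejected before a single line runs.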

4. Put Immutable Objects First In Equality Checks


This is a nifty trick that piggy backs off the previous tip. If you’ve ever done defensive programming, then you have most likely seen this before.

Instead of writing :

if(obj == null) {
  //stuff happens
}

Flip the order such that null is first.

if(null == obj) {
  //stuff happens
}

Null is a literal, not a variable, meaning you can’t assign anything to it. If you accidentally write null = obj, Java will refuse to compile the code.

As a result, you can prevent the silly mistake of unintentional assignment during equality checks. In Java, accidentally writing if(obj = null) usually fails to compile as well, because the condition evaluates to an object reference where a boolean is expected.

However, if you are passing around methods inside the if statement, it can become dangerous, particularly methods that will return a boolean type. The problem is doubly bad if you have overloaded methods.

The following example illustrates this point :

final int CONSTANT_NUM = 5;

public boolean foo(int x){
  return x%2 != 0;
}

public boolean foo(boolean x){
  return !x;
}

public void compareVals(int x){
  if(foo(x = CONSTANT_NUM)){
    //insert magic here
  }
}

In this example, the programmer expects foo to be passed a boolean stating whether or not x is equal to a constant number, 5.

However, instead of comparing the two values, x is set to 5, so the int overload of foo is called instead of the boolean one. If x were 5, the correct comparison would pass true to foo(boolean) and return false, but the accidental assignment passes 5 to foo(int), which returns true instead.

5. Leave Uninitialized Variables Uninitialized


It doesn’t matter what language you use, always leave your uninitialized variables as null, None, nil, or whatever your language’s equivalent is.

The only exception to this rule is booleans, which should almost always be set to false when initialized. The exception is for booleans with names such as keepRunning, which you will want to set initially to true.

In Java’s case,

int x;
String y;
boolean z = false;

For Python in particular, if you have a list that has not been initialized yet, make sure that you do not set it to an empty list.

The same also applies to strings.

Do this :

some_string = None
some_list = None

Not this :

some_string = ''
some_list = []

There is a world of a difference between a null/None/nil list, and an empty list, and a world of a difference between a null/None/nil string, and an empty string.

An empty value means that the object was assigned an empty value on purpose, and was initialized.

A null value means that the object doesn’t have a value, because it has not been initialized.

In addition, it is good to have null errors caused by uninitialized objects.

It is unpleasant to say the least when an uninitialized string is set to “” and is prematurely passed into a function without being assigned a non-empty value.

As usual, garbage input will give you garbage output.
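The difference is easiest to see when a function treats the two cases differently. In this sketch, describe_tags is a hypothetical helper: None means the tags were never loaded, while an empty list means they were loaded and there simply are none.

```python
def describe_tags(tags):
    """Summarize a list of tag names, refusing to guess when it was never set."""
    if tags is None:
        # Fail loudly: nothing ever assigned a real value.
        raise ValueError("tags were never initialized")
    if not tags:
        return "no tags"
    return ", ".join(tags)

print(describe_tags([]))
print(describe_tags(["python", "django"]))

try:
    describe_tags(None)
except ValueError as error:
    print("caught:", error)
```

Had the variable been defaulted to an empty list, the missing load step would have quietly produced “no tags” and the bug would have gone unnoticed.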

Conclusion


These five tips are not a magical silver bullet that will prevent you from introducing any bugs at all in the future. Even if you follow these five tips, you won’t suddenly have exponentially better code.

Good programming style, proper documentation, and following common conventions for your programming language come first. These little tricks will only marginally decrease your bug count. However, they also only take a few extra seconds of your time, so the overhead is negligible.

Sacrificing a few seconds of your time for slightly safer code is a trade most people would take any day, especially if it can increase production speed and prevent silly mistakes.

Bogosort : Moving At The Pace of a Random Snail

Bogosort is a sorting algorithm, much like quick sort and merge sort, with a twist.
It has an average case performance of O((n+1)!).

Bogosort is known by many other names, including stupid sort, monkey sort, and permutation sort.

Knowing this critical information, let’s dive into the meat of this sorting algorithm.

How Does It Work?


Bogosort works by taking an input list, and checking if it is in order. If it is, then we are done, and the list is returned. This means that Bogosort has a best case performance of O(n).

If the list isn’t in order, then we will shuffle the list until we get a list that is sorted. Unfortunately, since the list is shuffled randomly, there is no guarantee that it ever lands in order, so Bogosort technically has an unbounded worst case performance, or O(∞).

 

Implementation


Bogosort’s implementation is short and concise.

import random

def Bogosort(list):
    while not isSorted(list):
        random.shuffle(list)
    return list

def isSorted(list):
    if(len(list) <= 1):
        return True

    if(list[1] >= list[0]):
        return isSorted(list[1:])
    else:
        return False

Simple and crisp. Let’s try it on some inputs.
Note that I used Python’s time module to print out the time, which is quick and dirty, but it does the job.

Bogosort([5,4,3,2,1])
Time = 0.000202894210815

Okay. Now let’s try it again on an input size of 10.

Bogosort([10,9,8,7,6,5,4,3,2,1])
52.8398268223

Since it is completely randomized, running this will give different times each time it is run, but the ballpark figure will at least be somewhat similar.

Normally, I’d try on a larger input size, but since the growth rate is factorial, an input size of 11 will take much longer than an input size of 10 did.

In fact, in our example, running on an input size of 10 took over 260,000 times longer than when run on an input size of 5.

However, since the numbers are random, it is impossible to determine exactly how much longer it will take, but it should be somewhere around that range.
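For a rough feel of the growth, you can count shuffles instead of seconds. Assuming every element is distinct and every shuffle is uniformly random, each shuffle has a 1 in n! chance of landing on the sorted order, so the expected number of shuffles is n!. The helper expected_shuffles below is a made-up name for that estimate. It ignores the cost of each shuffle and each sortedness check, so it is a back-of-the-envelope figure rather than a timing prediction.

```python
from math import factorial

def expected_shuffles(n):
    """Expected shuffles for n distinct items, assuming uniform random shuffles."""
    return factorial(n)

for n in (5, 10, 11):
    print(n, expected_shuffles(n))

# How many times more shuffles a list of 10 needs compared to a list of 5.
print(expected_shuffles(10) // expected_shuffles(5))   # 30240

# Going from 10 to 11 elements multiplies the expected work by 11.
print(expected_shuffles(11) // expected_shuffles(10))  # 11
```

Any single run can land far from these averages, which is why the measured times above are only a ballpark.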

Conclusion


If you are looking for a horrendously slow sorting algorithm for any reason, Bogosort is the perfect candidate. It works by randomly shuffling the list until it gets a sorted list, which means that its average performance is going to be O((n+1)!).

Of course, Bogosort should not be taken as a serious sorting algorithm, and is intended to be used as a comparison to other sorting algorithms like quick sort and merge sort.

Unless your goal is to create slow code, stay far far away from this monstrosity.

 

Why You Shouldn’t Validate Emails with Regex

Foo, a programmer working at FooBar Inc., is happily working on a registration form, when he thinks to himself,

“Hmm. I should probably check if the email is valid.”

Somewhere along the line, Foo connects the words “email” and “valid” with “regular expressions”.

But as the great Jamie Zawinski once stated,

Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems.

 

Implementing Regular Expressions to Validate Email Addresses


 

Now, for the sake of demonstration, let’s look into the thought processes of Foo.

First, Foo thinks, “Well, what makes an email address valid?”

He quickly scribbles down a list.

  1. Email must contain an “@” symbol.
  2. Email must contain a “.” symbol.
  3. Email must not contain spaces.
  4. Email can have the character sets [a-zA-Z]
  5. Email can have the number sets [0-9]

Foo also knows that the order must always go as follows :

  1. Word
  2. “@” symbol
  3. “.” symbol
  4. Domain Name

Okay. Seems simple enough, right?

Foo writes the following regex :

^[a-zA-Z0-9]+@[a-zA-Z0-9]+\.[a-zA-Z0-9]+$

And it works flawlessly! At first, anyway.
It turns out that a regex like this won’t account for emails like “henry.dang@henrydangprg.com” or “henry-dang@henrydangprg.com”.

The solution? As long as it’s [a-zA-Z0-9.-], it should be fine, up until the “@” symbol.

^[a-zA-Z0-9.-]+@[a-zA-Z0-9]+\.[a-zA-Z0-9]+$

So far so good! Looks like everything’s working. But hold on, Foo’s friend tries to sign up with his email, “henry.dang@henry.dang.prg.com”, which is an absolutely valid email. Now Foo has to compensate for multiple periods after the “@” symbol!

But even after Foo fixes that, he realizes that his input could be infinitely large if someone strung together infinite periods! (EX : henry.dang@henry.henry.henry.henry.henry.henry.henry… and so on)

But wait! There’s more! The user could have a plus sign in his email, but not in the domain. And wait! What about foreign characters? And apostrophes? And all those other wild characters that hardly anyone would use in an email, but would still be valid if they really wanted to use it?

The fact of the matter is that Foo will most likely be unable to account for every single possible letter and combination. There are simply too many, and attempting to match all of them will simply bring in angry customers who have some esoteric symbol in their email that you failed to account for.
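To watch the whack-a-mole happen, here is a small sketch that runs Foo’s two attempts against the addresses mentioned so far. The patterns are anchored with $ here (an assumption on my part) so that the whole address must match, and the names FIRST_TRY, SECOND_TRY, and check are made up for the example.

```python
import re

# Foo's two attempts, anchored at both ends so the entire address must match.
FIRST_TRY = re.compile(r'^[a-zA-Z0-9]+@[a-zA-Z0-9]+\.[a-zA-Z0-9]+$')
SECOND_TRY = re.compile(r'^[a-zA-Z0-9.-]+@[a-zA-Z0-9]+\.[a-zA-Z0-9]+$')

def check(pattern, address):
    """Return True if the pattern accepts the address."""
    return pattern.match(address) is not None

addresses = [
    "henry@henrydangprg.com",
    "henry.dang@henrydangprg.com",
    "henry-dang@henrydangprg.com",
    "henry.dang@henry.dang.prg.com",
    "henry+dang@henrydangprg.com",
]

for address in addresses:
    print(address, check(FIRST_TRY, address), check(SECOND_TRY, address))
```

The first pattern only accepts the plainest address, and the second still rejects the last two. Every fix lets a few more addresses through while locking some other legitimate user out.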

 

So What’s the Solution?


 

The solution is simple. Don’t use regex. It is not the right tool for the job. All you need to do is check if the user has an “@” and a “.” in their email. Anything else is extraneous and will lead to some user emailing you about how they can’t register with their email.

There is no point in attempting to check the infinite possibilities for an email. Send the user a confirmation email. If their email is valid, they will receive an email. If not, then their email was invalid, and they should change it.
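The loose check described above fits in a couple of lines. This is only a sketch of that sanity check, with a made-up function name, and it deliberately does not try to decide whether the address is real.

```python
def looks_like_email(address):
    """Loose sanity check: the text contains an '@' and a '.' somewhere."""
    return '@' in address and '.' in address

print(looks_like_email("henry.dang@henry.dang.prg.com"))  # True
print(looks_like_email("henry+dang@henrydangprg.com"))    # True
print(looks_like_email("not an email"))                   # False
```

Anything that passes still has to survive the confirmation email, which is the part that actually proves the address works.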

For the stubborn or curious user, there is a solution available here. Clocking in at a whopping 6424 characters, this monstrous and unreadable regular expression is the last thing you want to use in your code.

Color Detection in Python with OpenCV

Image processing may seem like a daunting and scary task, but it’s actually not as terrible as some people make it out to be. In this tutorial, we will be doing basic color detection in OpenCV version 2.4.13 with Python 3.5. Note that you will also need to install NumPy to run the code in this article.

How Does Color Work on a Computer?


On a computer, color can be represented in many formats. However, in this tutorial, we will be strictly concerned with only BGR (Blue, Green, Red) and HSV (Hue Saturation Value).

With BGR, a pixel is represented by 3 parameters, blue, green, and red. Each parameter usually has a value from 0 – 255. For example, a pure blue pixel on your computer screen would have a B value of 255, a G value of 0, and a R value of 0. Your computer would read this and say, “Ah. This pixel is 255 parts blue, 0 parts green, and 0 parts red.”

With HSV, a pixel is also represented by 3 parameters, but it is instead Hue, Saturation and Value.

Unlike BGR, HSV does not use the primary colors to represent a pixel. Instead, it uses hue, which is the color or shade of the pixel.

The saturation is the intensity of the color. A saturation of 0 is a shade of gray (white at full brightness), and a saturation of 255 is maximum intensity. Another way to think about it is to imagine saturation as the colorfulness of a certain pixel. Value is the simplest of the three, as it is just how bright or dark the color is.

HSV can be imagined like a three dimensional cylinder, as seen in the picture below.

[Image: HSV color solid cylinder. Photo taken from Wikipedia’s HSL and HSV article.]

Converting BGR to HSV


Since we will be using HSV, you will need a BGR to HSV converter because OpenCV uses a different HSV scale from popular image editors like Gimp.

First, copy the following code into your favorite text editor and save it as converter.py. The lower and upper bound part will be explained later.

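import sys
import numpy as np
import cv2

# Command-line arguments arrive as strings, so convert them to integers.
blue = int(sys.argv[1])
green = int(sys.argv[2])
red = int(sys.argv[3])

# cvtColor expects an image, so wrap the color in a 1x1 BGR image.
color = np.uint8([[[blue, green, red]]])
hsv_color = cv2.cvtColor(color, cv2.COLOR_BGR2HSV)

# Use a plain int so the +/- 10 below cannot wrap around as a uint8.
hue = int(hsv_color[0][0][0])

print("Lower bound is : [" + str(hue - 10) + ", 100, 100]")
print("Upper bound is : [" + str(hue + 10) + ", 255, 255]")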

Now, we need an image to do color detection on. Download the image below and place it in the same directory as converter.py.

[Image: circles.png]

We have an image, and a BGR to HSV converter. That’s all we need to get started, so let’s jump into the actual image processing.

 

Let’s Get to the Code Already!


Fire up your favorite text editor and save a new file called “image.py” in the same directory as the circles.png file.

First, we need to grab our imports and load the image in OpenCV.

import cv2
import numpy as np

img = cv2.imread('circles.png', 1)

The 1 means we want the image in BGR, and not in grayscale.

As stated before, we will be using HSV instead of BGR, so we need to convert our BGR image to a HSV image with the following line.

hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

Great! Now that the picture is in HSV, we need something called a “lower range” and an “upper range” for the hue that we are searching for. The lower range is the minimum shade of red that will be detected, and the upper range is the maximum shade of red that will be detected. In our case, let’s search for the red circle at the top left. To do so, we will need to obtain the RGB numbers for the red circle.

I personally prefer Gimp, so I will be using that for the color picker feature. Simply use the color picker and click on the red circle, and you will have copied it. Now, click on the red shade that you copied. (see photo below)

[Image: color-picker]

After clicking on that, you should see the following screen :

[Image: color-bgr.png]

We can see that red equals 237, green equals 28, and blue equals 36. We will be using these numbers with the converter to automatically generate the respective lower range and upper range HSV values for OpenCV. (Note that this method is inaccurate when the color is less pure or murky.)

Remember that the HSV values shown in the photo are different from the ones in OpenCV. The scaling is different, so you cannot use the values Gimp gives you for OpenCV.
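To make the scaling difference concrete, here is a sketch that uses only Python’s standard colorsys module instead of OpenCV. It assumes the image editor shows hue from 0 to 360 and saturation and value from 0 to 100, and that OpenCV’s 8 bit HSV uses hue from 0 to 179 and saturation and value from 0 to 255. The function name bgr_to_hsv_scales is made up, and OpenCV’s own rounding may differ slightly from round().

```python
import colorsys

def bgr_to_hsv_scales(blue, green, red):
    """Return the same color as (editor-style HSV, OpenCV-style HSV)."""
    # colorsys works in RGB order with every channel scaled to 0..1.
    h, s, v = colorsys.rgb_to_hsv(red / 255.0, green / 255.0, blue / 255.0)
    editor_scale = (round(h * 360), round(s * 100), round(v * 100))
    # The modulo keeps a hue that rounds up to 180 inside the 0..179 range.
    opencv_scale = (round(h * 180) % 180, round(s * 255), round(v * 255))
    return editor_scale, opencv_scale

editor_hsv, opencv_hsv = bgr_to_hsv_scales(36, 28, 237)
print("Editor scale:", editor_hsv)
print("OpenCV scale:", opencv_hsv)
```

For the red circle, the OpenCV-style hue comes out as 179, roughly half of the editor-style hue. That halving is why the numbers from the color dialog can’t be pasted straight into OpenCV.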

Open up your terminal or command line and cd into the directory with the converter.py file and run the following :


python3 converter.py 36 28 237

Note that the order is in BGR, not RGB. After you run the script, it should output that the lower range = [169, 100, 100] and the upper range = [189, 255, 255].

We will now use NumPy to create arrays to hold our lower and upper range.

lower_range = np.array([169, 100, 100], dtype=np.uint8)
upper_range = np.array([189, 255, 255], dtype=np.uint8)

The “dtype=np.uint8” simply means that the array holds unsigned 8 bit integers, which makes sense, because saturation and value go up to 2^8 – 1 = 255. (OpenCV’s 8 bit hue only runs from 0 to 179, so it fits as well.)

Finally, with the lower range and the upper range found, we can create a mask for our image.

mask = cv2.inRange(hsv, lower_range, upper_range)

cv2.imshow('mask',mask)
cv2.imshow('image', img)

while(1):
  k = cv2.waitKey(0)
  if(k == 27):
    break

cv2.destroyAllWindows()

A mask is a black and white image with the same dimensions as the original. In this case, we check every pixel of the hsv image for colors that are between the lower range and upper range. Pixels that match are white (255) in the mask, and everything else is black (0).
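If you are curious what inRange is doing conceptually, the following pure Python sketch applies the same idea to a tiny 2 by 2 “image” stored as nested lists. It is far slower than OpenCV and is not how OpenCV is implemented; in_range, make_mask, and the sample pixels are all made up for illustration.

```python
def in_range(pixel, lower, upper):
    """True if every channel of the pixel lies within the inclusive bounds."""
    return all(low <= channel <= high
               for channel, low, high in zip(pixel, lower, upper))

def make_mask(image, lower, upper):
    """Build a mask with 255 where a pixel is in range and 0 elsewhere."""
    return [[255 if in_range(pixel, lower, upper) else 0 for pixel in row]
            for row in image]

# A made-up 2x2 HSV image: two reddish pixels, one green, one washed out.
tiny_hsv = [
    [(179, 200, 200), (60, 200, 200)],
    [(170, 50, 255), (175, 100, 100)],
]

mask = make_mask(tiny_hsv, (169, 100, 100), (189, 255, 255))
print(mask)  # [[255, 0], [0, 255]]
```

The washed out pixel has a hue inside the range, but its saturation of 50 falls below the lower bound of 100, so it stays black in the mask.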

After that, we can display both the mask and the image side-by-side.

The final loop just states that the program will wait until the user presses the “esc” key (which has a key code of 27) before it quits and destroys every OpenCV window.

If you’ve followed up to this point, you should end up with a mask that only has filled in white pixels for where the red circle was.

[Image: final-result]

And there you have it! You just did color matching in OpenCV. We found an upper and lower bound for the shade of red that we were looking for, and created a mask that only had white pixels filled in for wherever there was a red that matched.