The Curious npm Package With Over 60M Downloads

The node package manager, also known as npm, is a crucial part of the Javascript library ecosystem. Many of the most popular JS libraries and frameworks, such as ReactJS, JQuery, AngularJS, etc., are primarily downloaded from npm.

In fact, there’s one curious little npm package with over 60 million downloads, a package so incredibly useful and revolutionary that nearly every JS developer has installed this, or one of its dependents at least once in their lives. Have you ever used WebPack or ReactJS? Because both of those packages are dependents of this aforementioned mysterious package.

And the name of that revolutionary package? It’s is-odd. A package whose only purpose is to tell you whether a number is odd or not.

So What Else Does Is-Odd Do?

You’re probably thinking that there’s no way a package whose only job is to tell you if a number is odd or not, could possibly accrue 60 million downloads. Surely, it must do something else.

Fortunately, the source code never lies.


const isNumber = require('is-number');

module.exports = function isOdd(value) {
  const n = Math.abs(value);
  if (!isNumber(n)) {
    throw new TypeError('expected a number');
  }
  if (!Number.isInteger(n)) {
    throw new Error('expected an integer');
  }
  if (!Number.isSafeInteger(n)) {
    throw new Error('value exceeds maximum safe integer');
  }
  return (n % 2) === 1;
};

Aside from doing type checking to ensure that something is actually a number, it quite literally only runs (n % 2 == 1)

And that’s it. Over 60 million downloads, to run a single line of code.

“But what about all of the type checking?”. The type checking is a non-issue, because if it was ever a problem, then that means your code has an edge case that makes nearly no sense. For example, how would it ever be possible for you to accidentally check if a string is an even number, and not have this mistake get caught somewhere else in your code, like the input/data-fetching step?

Furthermore, if you seriously anticipate that the type might be wrong, then you would also have to wrap your code in a try catch statement. If you’re still not convinced, we can attempt to extend this philosophy by deprecating the “+” operator in JavaScript and replacing it with this function:

const isNumber = require('is-number');

module.exports = function add-numbers(value1, value2) {
  if (!isNumber(value1)) {
    throw new TypeError('expected a number for first input');
  }
  if (!isNumber(value2)) {
    throw new TypeError('expected a number for second input');
  }
  return value1 + value2
};

Now, anytime you want to add two numbers, you can’t just do value1 + value2. You’d have to do this instead:

try {
  add-numbers(value1, value2)
} catch(err) {
  console.log("Error! " + err);
}

But there’s a glaring problem here. With the is-odd package, we can check if a number is odd, but what if it’s even? Which of these three things would you do?

  1. Simply write (n % 2 == 0)
  2. The opposite of odd is even, so just do !isOdd(n)
  3. Create an entirely new npm package, complete with a test suite, TravisCL integration, and an MIT License.

Both 1 and 2 are the obvious sensible options, and yet, option 3, which is the aptly-named is-even package, was the option of choice.

So we’ve created an entirely new npm package, which has its own set of dependents. And this package has over 100,000 weekly downloads! What’s in the source code, then?

var isOdd = require('is-odd');

module.exports = function isEven(i) {
  return !isOdd(i);
};

A one-liner, with no logic other than reversing the result of another package’s function. And it’s dependency is is-odd!

So what exactly is wrong with having all of these tiny and largely useless dependencies? The more dependencies your project has, especially if those dependencies are not from verified and secure sources, the more likely you are to face security risks.

Like that one time a popular npm package spammed everyone’s build logs with advertisements, causing npm to ban terminal ads, or perhaps that other scandal where a core npm library tried to steal people’s cryptocurrencies.

Dependencies should be useful, non-trivial, and secure, which is why packages like is-even and is-odd don’t make any sense from a developmental standpoint. They’re so incredibly trivial (one-liners), that even adding them to your project is just a massive security and developmental risk with zero upside. Unfortunately, is-odd is firmly cemented in the history of npm, and most major packages include it as an essential dependency. There is no escape from single-line npm packages anytime in the foreseeable future.

How To Reduce Vision and Image Processing Times

Have you ever tried writing a program to analyze or process images? If so, you’re likely no stranger to the fact that analyzing large numbers of images can take forever. Whether you’re trying to perform real-time vision processing, machine learning with images, or an IoT image processing solution, you’ll often need to find ways to reduce the processing times if you’re handling large data sets.

All of the techniques listed in this article take advantage of the fact that images more often than not have more data than needed. For example, suppose you get a data set full of 4K resolution full-color images of planes. We’ll use this image below to track our optimization steps.

Removing Colors

There are many situations in which color is necessary. For example, if you’re trying to detect fresh bloodstains in an image, you normally wouldn’t turn an image into grayscale. This is because all fresh bloodstains are red, and so you would be throwing away critical information if you were to remove the color from an image.

However, if color is not necessary, it should be the first thing that you remove from an image to decrease processing times.

The reason removing color from an image decreases processing time is because there are fewer features to process, where we’ll say a feature is some measurable property.

With RGB (red, green, blue, ie; colored images), you have three separate features to measure, whereas with grayscale, you only have one. Our current plane image should now look like this:

Using Convolution Matrices

A convolution matrix, also known as a mask or a kernel, is a 3×3 or 5×5 matrix that is applied over an entire image. For this article, we will examine only 3×3 matrices.

For a 3×3 matrix, we select a 3×3 square in the image, and for each pixel, we multiply that pixel by its corresponding matrix position. We then set the pixel in the center of that 3×3 square to the average of those 9 pixels after the multiplication.

If you wanted this to output visually, you can simply set a pixel to 0 if it’s less than 0, and 255 if it’s greater than 255.

Immediately, you might realize that if we have to select a 3×3 square in the original image, then our convolution matrix would be useless if we selected the top left pixel. If the top left pixel is selected, then you wouldn’t be able to create a 3×3, since you would only have 4 pixels from the 3×3 (ie; you’d have a 2×2) and would be missing the remaining 5 pixels.

There are a wide variety of ways to handle these cases, although we won’t cover them in any depth in this article. For example, You could duplicate the 2×2 four times, by rotating the 2×2 around the center pixel to fill in the missing pixels, or you could just trivially set the missing pixels to 0 (results may be poor if you do this though).

There are massive lists of convolution matrices that can do all sorts of things from sharpening, blurring, detecting vertical lines, and detecting horizontal lines. Here’s our plane after applying a convolution matrix for detecting horizontal lines. Specifically, this matrix is [(-1, -1, -1), (2, 2, 2), (-1, -1, -1)]

Similarly, here’s the result after applying a convolution matrix for detecting vertical lines. The matrix for this one is [(-1, 2, -1), (-1, 2, -1), (-1, 2, -1)].

You might be wondering, “But how does this help me? It doesn’t reduce processing times at all!”. And you’re right. This only makes your processing time longer. However, notice that once you use convolution to extract out the high-level details you want, like edges, your image now has a lot of the excessive noise removed. For example, in the image above, you can see that the sky is no longer in the image.

This means that we’ve isolated the important parts of the images, which allows us to safely reduce the size of the resulting matrix without a huge loss in detail.

SIDE NOTE: You may be wondering why we can’t just downsize the image before we perform any processing steps on it. The reason for this is that if you downsize the image right away, you will almost always lose important detail. Additionally, downsizing an image can create artifacts, and if you are looking for particularly small details, like a 2-4 pixel pattern in a large image, you will almost certainly lose that detail when you scale down the image. This is why you should capture those details first before scaling down.

Pooling

In a nutshell, pooling is a technique to reduce the size of a matrix. You pool after you apply your convolutions, because each time you pool, you will lose some features.

Generally, each cycle of pooling will decrease the number of features in your image by some multiplicative constant. It’s trivial to see that if you continuously pool your image over and over again, you will eventually lose too much detail (like if you pooled until you just had a single 1×1 matrix).

Pooling works by first selecting an arbitrarily sized square. Let’s say you want to use a 4×4 square. The goal of pooling is to take this 4×4 square in a matrix, and reduce it to a single 1×1 matrix. This can be done in many ways. For example, max pooling is when you take the maximum value in that 4×4 matrix, average pooling is when you average all the values of the matrix, and min pooling is when you take the minimum value from the matrix.

As a rule of thumb, you will want to use max pooling since that captures the most prominent part of the 4×4 matrix. For example, in edge detection, you would want to use max pooling because it would downsize the matrix while still showing you the location of the edges.

What you would not use is min pooling, because if there is even a single cell where no edge was detected inside a 4×4 matrix that is otherwise full of edges, the pooling step would leave you with a value of 0, indicating that there was no edge in that 4×4 matrix.

For a better understanding of why you should pool, consider the fact that a 4K image is a 3840 x 2160 image, which is 8,294,400 individual features to process. Suppose we can process ten 4K images a second (82,940,000 features a second). Let’s compare the original 3840 x 2160 representation versus a 480 x 270 pooled representation.

# Images3840 x 2160 image (time)480 x 270 image (time)
101 second0.015625 seconds
1,00016.67 minutes1.56 seconds
1,000,00011.57 days26.04 minutes
1,000,000,00031.71 years18.0844 days

At ten 4K images a second, it would take over 30 years to process a million images, whereas it would only take 18 days if you had done pooling.

Conclusion

When processing images, especially high-resolution images, it’s important that you shrink down the number of features. This can be done through many methods. In this article, we covered converting an image to grayscale, as well as techniques such as convolution to extract important features, and then pooling to reduce the spatial complexity.

In this article, we compared the difference between pooling and not pooling, and found that the difference of analyzing a million 4K grayscale image without pooling would take 31 years, versus 18 days if we had pooled it down to a 480 x 270 image. However, not turning the images into grayscale can also have a noticeable effect.

As a final food for thought, if you had performed none of the optimizations mentioned in this article, analyzing a million full-color 4K resolution images with convolutions would take nearly an entire century, versus a measly 18 days if you had turned them into grayscale and then performed convolution and pooling.

In other words, with no optimizations, your image processing would take so long, that you could be rolling in your grave, and your program still wouldn’t be done running.