December 12, 2019
4 min read βοΈ
Fell for the ol' default JavaScript sorting algorithm trap when working with numbers. π²
It's been a little while since my last post (about a month and a half). This wasn't intentional since I have a lot of things in flight right now. That being said, I do have a lot of things in the pipeline for upcoming posts.
Tonight I made a script to automate the boring stuff when creating a blog post. It's a small-ish node driven command line script that scaffolds out a new blog post using a template. In my website's repo, like most (if not all) Gatsby driven static sites, my posts are located under content/blog. The directory names for each are prefixed with a zero-padded index, followed by a kebab-cased path (which is generally similar to the title and happens to be what makes up the slug).
For example, 0005-git-commit-m-write-more
, which maps to this post.
For each new post I make, I simply increment that prefix number for superficial sorting within my content/blog
directory.
While I do order them by date on my site, I like seeing my directory listed in posted order, since without the number prefix, the order is alphabetical.
While alphabetical is fine, I'm gaming the system since numbers are "alphabetically before" letters.
Given the above context, my next blog post will be prefixed with "0009"
.
Who Wants to be a JavaScriptonaire?
So, what did I fall for? Well, to explain, let's play a game. The stakes are high. The prize... knawwwledge. Let's consider the following code snippet:
const list = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1];
const sortedList = list.sort();
What will the result of the sortedList
variable be?
- (A):
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
- (B):
[1, 10, 2, 3, 4, 5, 6, 7, 8, 9]
- (C):
"Bush did 911"
- (D): None of the above
I'll wait... No peeking π. Or, you can cheat and scroll down to the answer π
Array.prototype.sort(of?)
Let's go to my favorite site south of Reddit, MDN:
The
sort()
method sorts the elements of an array in place and returns the sorted array. The default sort order is ascending, built upon converting the elements into strings, then comparing their sequences of UTF-16 code units values.
Emphasis mine.
Did you catch it?
While there are plenty of things to talk about here (like sorting in place), let's focus on what happens when you don't provide your comparator function.
Let's .reverse()
and replay that back:
The default sort order is ascending, built upon converting the elements into strings, then comparing their sequences of UTF-16 code unit values.
So, the following list of numbers from ten down to one will be converted into strings when sorted. Let's pretend we can see pseudo-intermediary steps, so that:
const list = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1];
const sortedList = list.sort();
turns into something like:
const list = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1];
// convert list of numbers to strings
const listAsStrings = ['10', '9', '8', '7', '6', '5', '4', '3', '2', '1'];
// sort the list, convert back into numbers
const sortedList = listAsStrings.sort().map((number) => parseInt(number, 10));
Again, the default .sort()
behavior is:
comparing their sequences of UTF-16 code units values
which means that a comparison between two strings in the list is done character by character (really, code point by code point) until one of them is considered alphabetically before the other, e.g. "abc"
comes alphabetically before "xyz"
.
The Correct Answer Is...
If you answered (A), you're...
...INCORRECT!
If you answered (B), you're...
...CORRECT! sadly... π’
Surely you can't be serious.
The Problematic Code
The reason why this bit me is that I initially wrote the code to calculate the next prefix number using .sort()
.
Here's the old way I wrote it:
function newPostNumber() {
const postNumbers = fs
.readdirSync(blogsPath) // read the content/blog directory, like `ls`
.map((filename) => parseInt(filename.split('-')[0], 10)) // get the prefix, as a number
.filter(Boolean) // remove any falsey things, like things that don't match our pattern
.sort() // sort the list in order
.reverse(); // reverse the list so I can pick the first one (last number)
// increment the last number used by one
const postNumber = postNumbers[0] + 1;
// pad the post number with up to 3 zeros (4 number slots)
return `${postNumber}`.padStart(4, '0');
}
I used
.sort()
.
So, the list of numbers was converted to strings and then... well, you know the rest.
After "0009"
, I would have been stuck generating "0010-..."
directories! π³
The Fix
To fix the above code, all I needed to do was provide my own comparator
function:
- .sort()
+ .sort((lhs, rhs) => lhs - rhs)
This comparator function returns true when the first argument is less than the second, forcing a swap between the two. In other words, we're incrementing in ascending order, what we originally wanted.
function newPostNumber() {
const postNumbers = fs
.readdirSync(blogsPath)
.map((filename) => parseInt(filename.split('-')[0], 10))
.filter(Boolean)
.sort((l, r) => l - r)
.reverse();
const postNumber = postNumbers[0] + 1;
return `${postNumber}`.padStart(4, '0');
}
Conclusion
Look, I get it. It works for strings. The default sorting algorithm is pretty generic. But maybe that's the problem?
In the wise words of the Letterkenny gang:
JavaScript, you figure it out.