A TDD implementation of Node.js Path module

Saturday, January 20, 2024

How many TDD practitioners does it take to change a light bulb? They don’t change the bulb; they refactor the darkness!

Introduction

I have asked ChatGPT to tell me some jokes about test-driven development (TDD) and out of 12 generated answers none included unicorns. That’s a bit of a disappointment considering how similar they are. Everyone is a big advocate of TDD and how great it is but I am yet to find a team that practices it consistently.

Ironically, I’ve chosen to join the advocates group by documenting a step-by-step TDD implementation of the Node.js Path module. But before we dive in, you might be wondering, ‘What is test-driven development?’ I have asked ChatGPT to answer this question for you, and here is the result:

Test-Driven Development (TDD) is a software development approach in which tests
are written before the actual code implementation. The TDD process typically
follows these steps:

1. **Write a Test**: Before writing any code, a developer creates a test that
defines a specific function or improvement of the software.

2. **Run the Test**: The newly created test is executed, and it should fail
because the corresponding code to fulfill the requirements hasn't been
implemented yet.

3. **Write Code**: The developer writes the minimum amount of code necessary to
pass the test. The primary goal is to make the test pass, not to create a
fully-featured solution.

4. **Run Tests**: After writing the code, all tests, including the new one, are
executed to ensure that the recent changes haven't broken any existing
functionality.

5. **Refactor Code**: If necessary, the code is refactored to improve its
structure, readability, or efficiency. At each step, the tests are rerun to make
sure that the changes haven't introduced errors.

6. **Repeat**: Steps 1-5 are repeated iteratively for each new piece of
functionality.

TDD is designed to promote a more robust and maintainable codebase by ensuring
that every piece of code has corresponding tests and that new features or
changes do not negatively impact existing functionality.

Pretty straight forward, right? TDD not only fosters a more robust and maintainable codebase but also mitigates the impact of “implementation bias” (status quo bias). This cognitive bias occurs when a programmer is influenced by existing code while crafting a new unit of code.

Consider this scenario: I’ve created a function tasked with parsing and processing an array of strings (we will actually write that later on in the path.resolve implementation). If I initially write the code without accounting for certain corner cases, chances are high that I’ll overlook those same scenarios when crafting the corresponding tests. Also, another programmer aiming to enhance code coverage may be willing to write unit tests for my implementation. Their focus could be on ensuring each statement is executed at least once, potentially achieving full unit coverage without a comprehensive understanding of the implementation details. Consequently, they might overlook the same corner cases that eluded me as the original author.

Additionally, having a test-first approach will ensure less coupling between your components, and a better design of public interfaces.

Project setup

The project is configured as a basic JavaScript module without any build steps. The initial commit introduces the necessary GitHub workflow to execute the pipeline, establishes a straightforward test setup utilizing the built-in Node.js test runner, and uses a simple forwarding mechanism for the node:path module (as a starting point):

// lib/b-path.js
import path from "node:path";

export default path;

The TDD approach

We are now ready to add our first tests and make sure they fail by swapping our forwarding implementation with an empty one. Here is how that looks like:

// tests/b-path.test.js
import { describe, it } from "node:test";
import assert from "node:assert/strict";
import bPath from "../lib/b-path.js";

describe("b-path", () => {
  describe("POSIX", () => {
    it("path.delimiter should provide the path delimiter", () => {
      assert.strictEqual(bPath.delimiter, ":");
    });

    it("path.sep should provide the path separator", () => {
      assert.strictEqual(bPath.sep, "/");
    });
  });
});

// lib/b-path.js
const posix = {
  posix: null,
  win32: null,
};

posix.posix = posix;

export default posix;

Note that we will only focus on the POSIX implementation and completely ignore the Win32 API. With that in mind, we can now move to Step 3 and write the actual code, making the tests pass. This is as simple as adding the two missing properties to our posix object.

+
+

// lib/b-path.js
const posix = {
  delimiter: ":",
  sep: "/",
  posix: null,
  win32: null,
};

path.isAbsolute(path)

Moving on to path.isAbsolute, we’ll once again start with an empty implementation and write some tests based on the official documentation.

+
+

// lib/b-path.js
const posix = {
  isAbsolute: () => {},

  delimiter: ":",
  sep: "/",
  posix: null,
  win32: null,
};

 
 
 
 
 
 
 
 
 
 
 
 
+
+
+
+
+
+
+
+
+
+
+
+
+
+

// tests/b-path.test.js
import { describe, it } from "node:test";
import assert from "node:assert/strict";
import bPath from "../lib/b-path.js";
describe("b-path", () => {
  describe("POSIX", () => {
    it("path.delimiter should provide the path delimiter", () => {
      assert.strictEqual(bPath.delimiter, ":");
    });
    it("path.sep should provide the path separator", () => {
      assert.strictEqual(bPath.sep, "/");
    });

    // input | expected result
    [
      ["", false],
      [".", false],
      ["foo/", false],
      ["foo.bar", false],
      ["/foo/bar", true],
      ["/baz/..", true],
    ].forEach(([input, result]) => {
      it(`path.isAbsolute("${input}") should return '${result}'`, () => {
        assert.strictEqual(bPath.isAbsolute(input), result);
      });
    });
  });
});

Step 3 rightfully emphasizes that “the primary objective is to ensure the test passes, rather than creating a fully-featured solution”. As you can see, I haven’t written any tests to address scenarios where the input is not a string and we must raise a TypeError. We can incorporate those tests a bit later, but for now, let’s focus on making our existing tests pass.

 
 
+
+
+
+
+
+
+
+

// lib/b-path.js
const posix = {
  /**
   * The isAbsolute() method determines if input path is an absolute path.
   * @param {string} path - input path
   * @returns {boolean}
   */
  isAbsolute: (path) => {
    return !!path && path.charCodeAt(0) === posix.sep.charCodeAt(0);
  },

  delimiter: ":",
  sep: "/",
  posix: null,
  win32: null,
};

Running all the tests (Step 4) will now output the following results:

> b-path@0.0.0 test
> node --test tests

▶ b-path
  ▶ POSIX
    ✔ path.delimiter should provide the path delimiter (0.745045ms)
    ✔ path.sep should provide the path separator (0.164463ms)
    ✔ path.isAbsolute("") should return 'false' (0.186978ms)
    ✔ path.isAbsolute(".") should return 'false' (0.125469ms)
    ✔ path.isAbsolute("foo/") should return 'false' (0.250108ms)
    ✔ path.isAbsolute("foo.bar") should return 'false' (0.204638ms)
    ✔ path.isAbsolute("/foo/bar") should return 'true' (1.390664ms)
    ✔ path.isAbsolute("/baz/..") should return 'true' (0.16196ms)
  ▶ POSIX (4.772938ms)

▶ b-path (6.41931ms)

ℹ tests 8
ℹ suites 2
ℹ pass 8
ℹ fail 0
ℹ cancelled 0
ℹ skipped 0
ℹ todo 0
ℹ duration_ms 104.492059

The not-so-happy path

As mentioned earlier, our current code and tests only cover the “happy path” where the input is provided as expected. Now, let’s address the “not-so-happy path” and consider various input types. As always, we’ll start with the tests first.

 
 
 
 
 
 
 
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+

// tests/b-path.test.js
import { describe, it } from "node:test";
import assert from "node:assert/strict";
import bPath from "../lib/b-path.js";
describe("b-path", () => {
  describe("POSIX", () => {
    // ...

    [undefined, null, NaN, 42, BigInt(42), {}, [], Symbol(), false].forEach(
      (input) => {
        it(`path.isAbsolute should throw TypeError if input type is ${typeof input}`, () => {
          assert.throws(
            () => {
              bPath.isAbsolute(input);
            },
            {
              name: "TypeError",
              message: `The "path" argument must be of type string. Received ${typeof input}`,
            }
          );
        });
      }
    );
  });
});

And then proceed with the actual implementation.

 
+
+
+
+
+
+
+
+
+
+
+
+
+
+
 
 
 
 
 
 
 
+

// lib/b-path.js
/**
 * The assertIsString() function verifies whether the input is a string,
 * throwing a TypeError otherwise.
 * @param {*} input
 * @throws {TypeError}
 */
function assertIsString(input) {
  if (typeof input !== "string") {
    throw new TypeError(
      `The "path" argument must be of type string. Received ${typeof input}`
    );
  }
}

const posix = {
  /**
   * The isAbsolute() method determines if input path is an absolute path.
   * @param {string} path - input path
   * @returns {boolean}
   */
  isAbsolute: (path) => {
    assertIsString(path);
    return !!path && path.charCodeAt(0) === posix.sep.charCodeAt(0);
  },

  // ...
};

path.format(pathObject)

We continue our reverse engineering process with path.format. This method takes a pathObject as input and returns a path string based on the rules specified in the docs. We’ll start with the “happy path” where the input is an object and we’ll deal with the rest later. Once we have completed our implementation, we can swap it back with the ‘forwarding mechanism’ from the project setup and check that we have the right functionality.

Documenting every single step here would occupy too much space, instead, I have tagged the commits with pre-release tags to get a better overview. The very first version is somehow naive but it does cover most of the cases, which is a great starting point.

We should now handle the following scenario: “If only root is provided or dir is equal to root then the platform separator will not be included”. To do so, we need to make some changes to our tests.

 
 
 
 
-
+
+
+
 
-
+
 
-
+
 
-
+

// input | expected result
[
  [{}, ""],
  [{ root: "/root" }, "/root"],
  [{ root: "/root", name: "file", ext: ".txt" }, "/root/file.txt"],
  // only root is present so no separator is included
  [{ root: "/root", name: "file", ext: ".txt" }, "/rootfile.txt"],
  [{ root: "/root/", name: "file", ext: ".txt" }, "/root/file.txt"],
  // should add the extension "." if missing
  [{ root: "/root", name: "file", ext: "txt" }, "/root/file.txt"],
  [{ root: "/root/", name: "file", ext: "txt" }, "/root/file.txt"],
  // should ignore 'name' and 'ext' if 'base' is present
  [{ root: "/root", base: "base.sh", name: "file", ext: ".txt" }, "/root/base.sh"],
  [{ root: "/root/", base: "base.sh", name: "file", ext: ".txt" }, "/root/base.sh"],
  // should ignore 'root' if 'dir' is present
  [{ root: "/root", dir: "/dir", base: "file.txt"}, "/dir/file.txt"],
  [{ root: "/root/", dir: "/dir", base: "file.txt"}, "/dir/file.txt"],
].forEach(([input, result]) => {
  it(`path.format(${JSON.stringify(
    input
  )}) should return '${result}'`, () => {
    assert.strictEqual(bPath.format(input), result);
  });
});

We may attempt to correct our implementation by adjusting the return statement to return result.join("");. However, this adjustment would lead to a failure in our dir test:

✖ failing tests:

✖ path.format({"root":"/root/","dir":"/dir","base":"file.txt"}) should return '/dir/file.txt' (2.433298ms)
  AssertionError [ERR_ASSERTION]: Expected values to be strictly equal:
  + actual - expected

  + '/dirfile.txt'
  - '/dir/file.txt'
         ^
      at TestContext.<anonymous> (file:///home/elvis/eadomnica/b-path/tests/b-path.test.js:62:16)
      at Test.runInAsyncScope (node:async_hooks:206:9)
      at Test.run (node:internal/test_runner/test:580:25)
      at Suite.processPendingSubtests (node:internal/test_runner/test:325:18)
      at Test.postRun (node:internal/test_runner/test:649:19)
      at Test.run (node:internal/test_runner/test:608:10)
      at async Suite.processPendingSubtests (node:internal/test_runner/test:325:7) {
    generatedMessage: true,
    code: 'ERR_ASSERTION',
    actual: '/dirfile.txt',
    expected: '/dir/file.txt',
    operator: 'strictEqual'
  }

We can address this issue by ensuring that dir always includes a trailing slash, much like how we handled the absence of a dot in the extension. Here’s what our updated solution would look like:

/**
 * The path.format() method returns a path string from an object.
 * This is the opposite of `path.parse()`.
 * @param {object} pathObject
 * @param {string} pathObject.root
 * @param {string} pathObject.dir
 * @param {string} pathObject.base
 * @param {string} pathObject.name
 * @param {string} pathObject.ext
 * @returns {string}
 */
format: (pathObject) => {
  const result = [];

  const root = pathObject.root || "";
  let dir = pathObject.dir || "";
  if (dir && dir.charCodeAt(dir.length - 1) !== posix.sep.charCodeAt(0)) {
    dir = `${dir}${posix.sep}`;
  }
  const base = pathObject.base || "";
  const name = pathObject.name || "";
  let ext = pathObject.ext || "";
  if (ext && ext.charCodeAt(0) !== ".".charCodeAt(0)) {
    ext = `.${ext}`;
  }

  const dirName = dir || root;
  const baseName = base || `${name}${ext}`;

  if (dirName) result.push(dirName);
  if (baseName) result.push(baseName);

  return result.join("");
},

We’ll now proceed to substitute our current implementation with the ‘forwarding mechanism’ to confirm its correctness. Please note that the assertIsString function doesn’t yield the same TypeError message, so we’ll need to comment out those specific tests. Additionally, I’ll take this opportunity to write more tests and cover some other corner cases.

…

This is slightly unexpected, only Node.js v20.x tests are passing. Nonetheless, I find it okay since v20.x transitioned to a Long Term Support version last October.

As for the extra tests, I went ahead and included a few more test cases. What caught me off guard was that a trailing slash is added to the dir property even if it already exists. The only scenario where this doesn’t occur is if root is equal to dir, as specified in the docs. I had anticipated that the resulting path would be normalized but it doesn’t seem so. Consequently, I had to modify my implementation to match the original behavior by making the following change:

-
+

const root = pathObject.root || "";
let dir = pathObject.dir || "";
if (dir && dir.charCodeAt(dir.length - 1) !== posix.sep.charCodeAt(0))
if (dir && dir !== root) {
  dir = `${dir}${posix.sep}`;
}
const base = pathObject.base || "";

The full diff of the TDD approach for path.format (including the input check) can be seen here. Keep in mind that the commit at the bottom of the page is in fact our top commit.

path.normalize

Before we continue with path.parse and friends (path.basename, path.dirname, path.extname), let’s address one of the elephants in the room: path.normalize.

We can start by modifying the input sanity check test from path.isAbsolute to include our new method. With that out of the way, we can now think of the best approach in handling such a complex function.

I must admit that I have already done this exercise last year, but I have deleted my repository, so I will have to do this all over again. If I remember correctly, the approach I took was to split the string by the path.sep and use a reduce function to compute the normalized path. While this will work, I cannot stop and think about the amount of intermediate objects created and the time spent traversing the input multiple times. Can we do the normalization in one go? Well, I guess we can, with the good-old for loop.

…

Alright! The implementation so far is not as crazy as I would have thought, but there are definitely some unhandled corner cases that I will be adding later. At this point, there are a total of 20 commits, one of which includes a small refactoring. This is by far the biggest advantage of test-driven development: you can refactor your code as you see fit while having the reassurance that it will still function correctly. And, on top of that, we preserved 100% code coverage.

After adding the promised corner cases, there was a need for some small adjustments, mainly around how to handle navigating up from the top directory. But wait, what about some paths that make absolutely no sense? Something like:

["...", "..."], // makes no sense
[".../.", "..."], // same
[".../..", "."], // same

We are reverse engineering the existing behavior so I guess I will have to go ahead and fix those as well. Even with such a non-sense corner case, our final implementation doesn’t look very complex. Of course, as the author of the code I am extremely biased on this. And actually, as I pasted the code here, I realized I can do one more improvement. This is the final final version, I promise (unless you find a bug).

/**
 * Normalize a string path, reducing '..' and '.' parts.
 * When multiple slashes are found, they're replaced by a single one;
 * when the path contains a trailing slash, it is preserved.
 *
 * @param path string path to normalize.
 * @throws {TypeError} if `path` is not a string.
 */
normalize: (path) => {
  assertIsString(path);

  if (path.length === 0) {
    return ".";
  }

  if (path.length === 1) {
    return path;
  }

  const hasTrailingSep = path.charAt(path.length - 1) === posix.sep;
  const absolute = posix.isAbsolute(path);
  const fragments = [];
  let word = "";
  let dots = 0;

  const handleFragment = () => {
    if (dots > 2) {
      fragments.push(Array(dots).fill(".").join(""));
    }

    if (dots === 2) {
      if (fragments.length && fragments[fragments.length - 1] !== "..") {
        // navigate up but don't pop an existing ".."
        fragments.pop();
      } else if (!absolute) {
        // top reached and relative path,
        // absolute path is handled in the if (fragments.length === 0) below
        fragments.push("..");
      }
    }

    if (word.length) {
      fragments.push(word);
    }
  };

  for (let i = 0; i < path.length; i += 1) {
    // duplicated sep, skip
    if (
      path.charAt(i) === posix.sep &&
      i - 1 >= 0 &&
      path.charAt(i - 1) === posix.sep
    ) {
      continue;
    }

    if (path.charAt(i) === ".") {
      dots += 1;
      continue;
    }

    if (path.charAt(i) === posix.sep) {
      handleFragment();

      word = "";
      dots = 0;
      continue;
    }

    word += path.charAt(i);
  }

  handleFragment();

  if (fragments.length === 0) {
    if (absolute) {
      return posix.sep;
    }

    return hasTrailingSep ? "./" : ".";
  }

  let result = fragments.join(posix.sep);
  if (hasTrailingSep) {
    result += posix.sep;
  }

  return result;
},

For comparison, I have found a package that contains the posix-only implementation called path-browserify, which is an extract from Node.js v8.11.1 codebase. This is how that looks like:

// Resolves . and .. elements in a path with directory names
function normalizeStringPosix(path, allowAboveRoot) {
  var res = '';
  var lastSegmentLength = 0;
  var lastSlash = -1;
  var dots = 0;
  var code;
  for (var i = 0; i <= path.length; ++i) {
    if (i < path.length)
      code = path.charCodeAt(i);
    else if (code === 47 /*/*/)
      break;
    else
      code = 47 /*/*/;
    if (code === 47 /*/*/) {
      if (lastSlash === i - 1 || dots === 1) {
        // NOOP
      } else if (lastSlash !== i - 1 && dots === 2) {
        if (res.length < 2 || lastSegmentLength !== 2 || res.charCodeAt(res.length - 1) !== 46 /*.*/ || res.charCodeAt(res.length - 2) !== 46 /*.*/) {
          if (res.length > 2) {
            var lastSlashIndex = res.lastIndexOf('/');
            if (lastSlashIndex !== res.length - 1) {
              if (lastSlashIndex === -1) {
                res = '';
                lastSegmentLength = 0;
              } else {
                res = res.slice(0, lastSlashIndex);
                lastSegmentLength = res.length - 1 - res.lastIndexOf('/');
              }
              lastSlash = i;
              dots = 0;
              continue;
            }
          } else if (res.length === 2 || res.length === 1) {
            res = '';
            lastSegmentLength = 0;
            lastSlash = i;
            dots = 0;
            continue;
          }
        }
        if (allowAboveRoot) {
          if (res.length > 0)
            res += '/..';
          else
            res = '..';
          lastSegmentLength = 2;
        }
      } else {
        if (res.length > 0)
          res += '/' + path.slice(lastSlash + 1, i);
        else
          res = path.slice(lastSlash + 1, i);
        lastSegmentLength = i - lastSlash - 1;
      }
      lastSlash = i;
      dots = 0;
    } else if (code === 46 /*.*/ && dots !== -1) {
      ++dots;
    } else {
      dots = -1;
    }
  }
  return res;
}

//...

function normalize(path) {
  assertPath(path);

  if (path.length === 0) return '.';

  var isAbsolute = path.charCodeAt(0) === 47 /*/*/;
  var trailingSeparator = path.charCodeAt(path.length - 1) === 47 /*/*/;

  // Normalize the path
  path = normalizeStringPosix(path, !isAbsolute);

  if (path.length === 0 && !isAbsolute) path = '.';
  if (path.length > 0 && trailingSeparator) path += '/';

  if (isAbsolute) return '/' + path;
  return path;
},

You can be the judge of which version is easier to grasp.

path.parse(path)

To be continued…