What does code coverage measure?

Friday, November 3, 2023

This week, I was having a conversation with another engineer, and they had an interesting remark. They claimed that devops should have some control over a project pipeline workflow in case we impose a code coverage threshold. Otherwise, “an engineer would be able to just lower it to pass the pipeline”. There are a few issues with this claim:

The change to lower the coverage threshold would show in the review, so it is not very easy to sneak it in.
An engineer can do much more damage than cheating the code coverage threshold in an environment governed by mistrust.
If the entire dev team is “evil”, even if they don’t have control over the config for the threshold, they can still easily cheat the code coverage instead.

What does code coverage measure?

Code coverage measures lines of code executed, not lines of code validated, as we might mistakenly expect. To cheat, one can write some unit tests that are providing just the right input (in order to cover as many lines of code as possible) without any consideration for the results of the code execution. Take the following silly function as an example.

// foo.js
function foo(...args) {
  switch (args.length) {
    case 0:
      throw new Error("Expected at least 1 param");
    case 1:
      throw new Error("Neh, 2 would do");
    case 2:
      return 2;
    case 3:
      return undef.something;

    default:
      return -1;
  }
}

module.exports = foo;

We can easily cover each branch of the switch statement by calling foo with 0, 1, 2, 3, and 4 arguments. We can do this in a simple for loop.

for (let i = 0; i < 5; i++) {
  const args = Array.from(Array(i).keys()); // [0..i]
  foo(...args);
}

Here is how the entire test file looks like. We guarded our function calls in a try/catch, and we are only doing a dummy assertion at the end.

// foo.test.js
const foo = require("./foo");

test("test foo", () => {
  // call test function ignoring its execution
  for (let i = 0; i < 5; i++) {
    try {
      const args = Array.from(Array(i).keys()); // [0..i]
      foo(...args);
    } catch {}
  }

  expect(true).toBeTruthy();
});

To make everyone’s life easier, I put all the code together on my GitHub page. After installing the dependencies and running npm run test -- --coverage, we will get a report similar to the one below. We have indeed covered all the branches of the switch statement, while all we asserted is if true is a value that is coerced to true when a boolean is expected.

// foo.js
function foo(...args) {
  switch (args.length) {
    case 0:
      throw new Error("Expected at least 1 param");
    case 1:
      throw new Error("Neh, 2 would do");
    case 2:
      return 2;
    case 3:
      return undef.something;

    default:
      return -1;
  }
}

module.exports = foo;

Nice cheat! We didn’t even care that undef.something will throw an Uncaught ReferenceError: undef is not defined exception, all the errors were caught and the test passed.

How does code coverage work?

When we pass --coverage to the test runner (jest in my case), our code is first preprocessed to include some extra code which will do the coverage line counting. Here is how a very simplified version of that would look like.

const cov = Array(18).fill(0);// foo.js
function foo(...args) {
cov[2]++;switch (args.length) {
    case 0:
cov[4]++;throw new Error("Expected at least 1 param");
    case 1:
cov[6]++;throw new Error("Neh, 2 would do");
    case 2:
cov[8]++;return 2;
    case 3:
cov[10]++;return undef.something;

    default:
cov[13]++;return -1;
  }
}

cov[17]++;module.exports = foo;
module.exports.__cov__ = cov;

We can now simulate our test coverage with the same loop we used earlier.

const foo = require("./foo-cov");

for (let i = 0; i < 5; i++) {
  try {
    const args = Array.from(Array(i).keys()); // [0..i]
    foo(...args);
  } catch {}
}

console.log(JSON.stringify(foo.__cov__));

Et voilà, it printed the same array that I’ve used to configure the code coverage component used earlier in the article.

import CodeBlock from "../../components/CodeBlock.astro";
export const coverage = [0, 0, 5, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1]

<CodeBlock info={coverage} showLineNumbers={true} print={(d) => `${d}x`}>
```javascript
// foo.js
function foo(...args) {
  switch (args.length) {
    case 0:
      throw new Error("Expected at least 1 param");
    case 1:
      throw new Error("Neh, 2 would do");
    case 2:
      return 2;
    case 3:
      return undef.something;

    default:
      return -1;
  }
}

module.exports = foo;
```
</CodeBlock>

Is code coverage useless then?

No, it is not. Code coverage is a great tool to help you understand how well you are doing on testing. But, like with any type of measurements, you should take its results with a grain of salt. For instance, snapshot testing could lead to false positives in terms of line of code covered but not properly verified. Also, imposing a ridiculous threshold (like 98% coverage) will cause more damage that it will help. Developers will have to start exposing aspects of the code that were supposed to be encapsulated, just for the sake of reaching the necessary mark needed for the pipeline to pass.