What does code coverage measure?
This week, I was having a conversation with another engineer, and they had an interesting remark. They claimed that devops should have some control over a project pipeline workflow in case we impose a code coverage threshold. Otherwise, “an engineer would be able to just lower it to pass the pipeline”. There are a few issues with this claim:
- The change to lower the coverage threshold would show in the review, so it is not very easy to sneak it in.
- An engineer can do much more damage than cheating the code coverage threshold in an environment governed by mistrust.
- If the entire dev team is “evil”, even if they don’t have control over the config for the threshold, they can still easily cheat the code coverage instead.
What does code coverage measure?
Code coverage measures lines of code executed, not lines of code validated, as we might mistakenly expect. To cheat, one can write some unit tests that are providing just the right input (in order to cover as many lines of code as possible) without any consideration for the results of the code execution. Take the following silly function as an example.
// foo.js
function foo(...args) {
switch (args.length) {
case 0:
throw new Error("Expected at least 1 param");
case 1:
throw new Error("Neh, 2 would do");
case 2:
return 2;
case 3:
return undef.something;
default:
return -1;
}
}
module.exports = foo;
We can easily cover each branch of the switch
statement by calling foo
with
0
, 1
, 2
, 3
, and 4
arguments. We can do this in a simple for
loop.
for (let i = 0; i < 5; i++) {
const args = Array.from(Array(i).keys()); // [0..i]
foo(...args);
}
Here is how the entire test file looks like. We guarded our function calls in a
try/catch
, and we are only doing a dummy assertion at the end.
// foo.test.js
const foo = require("./foo");
test("test foo", () => {
// call test function ignoring its execution
for (let i = 0; i < 5; i++) {
try {
const args = Array.from(Array(i).keys()); // [0..i]
foo(...args);
} catch {}
}
expect(true).toBeTruthy();
});
To make everyone’s life easier, I put all the code together on my GitHub
page. After installing the
dependencies and running npm run test -- --coverage
, we will get a report
similar to the one below. We have indeed covered all the branches of the
switch
statement, while all we asserted is if true
is a value that is
coerced to true
when a boolean
is expected.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
5x 1x 1x 1x 1x 1x 1x
// foo.js
function foo(...args) {
switch (args.length) {
case 0:
throw new Error("Expected at least 1 param");
case 1:
throw new Error("Neh, 2 would do");
case 2:
return 2;
case 3:
return undef.something;
default:
return -1;
}
}
module.exports = foo;
Nice cheat! We didn’t even care that undef.something
will throw an
Uncaught ReferenceError: undef is not defined
exception, all the errors
were caught and the test passed.
How does code coverage work?
When we pass --coverage
to the test runner (jest in my
case), our code is first preprocessed to include some extra code which will do
the coverage line counting. Here is how a very simplified version of that would
look like.
const cov = Array(18).fill(0);// foo.js
function foo(...args) {
cov[2]++;switch (args.length) {
case 0:
cov[4]++;throw new Error("Expected at least 1 param");
case 1:
cov[6]++;throw new Error("Neh, 2 would do");
case 2:
cov[8]++;return 2;
case 3:
cov[10]++;return undef.something;
default:
cov[13]++;return -1;
}
}
cov[17]++;module.exports = foo;
module.exports.__cov__ = cov;
We can now simulate our test coverage with the same loop we used earlier.
const foo = require("./foo-cov");
for (let i = 0; i < 5; i++) {
try {
const args = Array.from(Array(i).keys()); // [0..i]
foo(...args);
} catch {}
}
console.log(JSON.stringify(foo.__cov__));
Et voilà, it printed the same array that I’ve used to configure the code coverage component used earlier in the article.
import CodeBlock from "../../components/CodeBlock.astro";
export const coverage = [0, 0, 5, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1]
<CodeBlock info={coverage} showLineNumbers={true} print={(d) => `${d}x`}>
```javascript
// foo.js
function foo(...args) {
switch (args.length) {
case 0:
throw new Error("Expected at least 1 param");
case 1:
throw new Error("Neh, 2 would do");
case 2:
return 2;
case 3:
return undef.something;
default:
return -1;
}
}
module.exports = foo;
```
</CodeBlock>
Is code coverage useless then?
No, it is not. Code coverage is a great tool to help you understand how well you are doing on testing. But, like with any type of measurements, you should take its results with a grain of salt. For instance, snapshot testing could lead to false positives in terms of line of code covered but not properly verified. Also, imposing a ridiculous threshold (like 98% coverage) will cause more damage that it will help. Developers will have to start exposing aspects of the code that were supposed to be encapsulated, just for the sake of reaching the necessary mark needed for the pipeline to pass.