PHPUnit and C.R.A.P index

Is it really a warning? Generally the threshold for warnings is set much higher than 1 (perhaps around 30). There is a good SO post here that shows how the number is calculated. There seems to be a few hardcoded values in my phpunit setup for CRAP of 30.

According to Alberto Savoia, the creator of the CRAP index:

"The C.R.A.P. (Change Risk Analysis and Predictions) index is designed to analyze and predict the amount of effort, pain, and time required to maintain an existing body of code."

The minimum CRAP number will be the cyclomatic complexity for code with 100% coverage. The idea being that changes to complex code are more likely to create problems than changes to simple code.

CRAP is only an indicator. On it's own it's about as useful as "How long is a piece of string?" except in addition to being a question with an indeterminate answer, it's also an answer with an indeterminate question.

If you know what it's measuring then you can use it as a very basic indicator of complexity. To make more use of it, you need a fair bit of experience to compare implementations. After that you ideally want comparisons between implementations of the same thing. After that you need to know intimately about the code being tested and chances are if you do, you have better insight than the CRAP score.

The higher it is the higher the probability there is to improve it on a few fronts such as testability (including efficiency) and points of change. However, it's not until scores of over 8000 or 9000 that probability of something being absolute CRAP starts to approach certainty. Something as basic as processing the nodes from a parsed XML document for a function that can't be improved in any decisive manner can easily hit complexities into the hundreds while being perfectly fine.

It's a bit like spending money. For a given purpose you might have to spend a minimum amount. It could be a million or it could be a thousand but regardless of the purpose we tend to assume that the higher the spending the chance of it being excessive. But perhaps it needs to be high, perhaps you're buying a yacht. Naively forcing numbers down isn't just making the same mistake in the other direction but is genuinely dangerous. 71 people were burnt to death or asphyxiated in Grenfell Tower due to this catastrophic error in thinking, believing that something could be best achieved purely going by the numbers alone.

You shouldn't assume that reducing CRAP improves testability or maintainability. Quite often a high CRAP is simply reporting a measure of mandatory complexity. You can technically reduce CRAP, gaming the numbers, while decreasing testability, maintainability and readability. You can only improve these things by actually improving them. CRAP isn't even a reliable measure of improvement. Sometimes CRAP might go down after an improvement. Sometimes it might go up. The problem is with metrics games is people often just displace the problem or hide the thing being measured as an indicator of complexity.

A common example is to use a map instead of a switch or if statements. I tend to do this myself religiously. We forget however that we're displacing the complexity. In this case we can, we have a library with a map utility that's maintained and can be relied upon. If you include that map function compared to a few if statements, an overall measure of complexity would be through the roof. When you don't have such a utility at your disposal, you need to be very careful how you go about reducing complexity. For example, if you go about it trying to eliminate if statements and for loops altogether, I wish you good luck with that.

Cyclomatic complexity does tend to reflect quite well if the speed of tests can be improved. This applies trivially to cases such as having two if statements in a function. If the state the second relies upon differs based on if the first matches or not then you have to run the first redundantly (4 times instead of 2). When you combine code often the possible permutations goes up non-linearly. If you have eight functions that take a boolean and return a boolean then you can test each one individually to get 8 times 2 (boolean has two possible input values) tests (16 tests). However if you combine all of those functions then there are 256 different possible combinations of input. Cyclomatic complexity can help indicate where this might be the case.

It also gives some indication of how many tests you really need in terms of function times parameters. If you have a function with one boolean parameter and one if statement based on it then you need at least two tests to get at least full code coverage. Two boolean parameters and two if statements then either four or three depending if the ifs are nested. The number of combinations you might need to test increases by the power of two just for each if added after an if in the worst case.

This can create a conflict, forcing you to fragment code prematurely, as in runtime you may never actually have that issue. You generally shouldn't worry about that until tests start to consume a lot of resources or actually start to become disproportionately unwieldy. In that case you wouldn't rely on CRAP but understanding of the code's execution and benchmarks.

CRAP can get it wrong as it makes a fairly naive guess about complexity. You're getting closer to a pessimistic or worst case estimate with it. I'm looking at a piece of code that's got a high CRAP but it's not able to tell between having if($constantInScope)etc;etc;if($constantInScope)etc; or if($varA)etc;etc;if($varB)etc;.

If a single function genuinely has hundreds, thousands or millions of possible outcomes in terms of execution path then that is probably a good sign that there are issues testing it. that might be inescapable. Conversely, it might be far easier to test than indicated. The limited ability for cyclomatic complexity to measure that might be why CRAP also appears to include a counter weight of coverage. If you tested it and got a lot of coverage then it probably isn't that hard to test as the cyclomatic complexity thought, especially keeping in mind that it's possible to make things impossible to fully test in terms of execution paths alone by necessity, albeit very rare.

A simple example of why CRAP is useless, unroll your loops and replace if statements with mathematical statements and the like to reduce CRAP.

The minimum CRAP score is 1, not 0. This is because the algorithm for CRAP is

CRAP(m) = comp(m)^2 * (1 – cov(m)/100)^3 + comp(m)

and the minimum cyclomatic complexity (comp) value for a function is one. So the problem is not in phpunit, but whatever is flagging a CRAP of 1 as a problem.

In general, you want to set your CRAP threshold somewhere around 5, anywhere lower and you may as well just use a simple code coverage metric (and shoot for 100%) since the complexity factor barely weighs in. A CRAP of >= 30 means that no amount of testing can make your method not crappy.

Cyclomatic complexity can generally (but there is more than one definition) be hand calculated as:

add 1 point for the function call
add 1 point for every loop
add 1 point for every branch

PHPUnit and C.R.A.P index

Tags:

Php

Phpunit

Metrics

Phpundercontrol

Related

Recent Posts