The blocks in code coverage with VS2010
The technical term for a block in code coverage is a basic block. To crib directly from the Wikipedia entry:
The code in a basic block has one entry point, meaning no code within it is the destination of a jump instruction anywhere in the program, and it has one exit point, meaning only the last instruction can cause the program to begin executing code in a different basic block. Under these circumstances, whenever the first instruction in a basic block is executed, the rest of the instructions are necessarily executed exactly once, in order.
A basic block is important in code coverage because we can insert a probe at the beginning of the basic block. When this probe is hit, we know that all of the following instructions in that basic block will be executed (due to the properties of a basic block).
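To make the probe idea concrete, here is a rough hand-instrumented sketch of testfunction, with one probe at the start of each of the four blocks identified in the disassembly further down. The names g_hit, probe, kBlockCount, blocks_covered, and instrumented_testfunction are all hypothetical, and the source-level placement is only an approximation: a real coverage tool instruments the compiled binary, not the source.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical probe table: one flag per basic block (illustration only).
const std::size_t kBlockCount = 4;
static bool g_hit[kBlockCount] = { false };

// Hypothetical probe: records that the given block was entered.
static void probe(std::size_t block)
{
    assert(block < kBlockCount);
    g_hit[block] = true;
}

// testfunction with one probe per block. Because each block runs
// start to finish once entered, a single probe at its start is enough.
int instrumented_testfunction(int input)
{
    int result;
    probe(0);              // Block 0: function entry and the comparison
    if (input > 0) {
        probe(1);          // Block 1: condition true branch
        result = 1;
    } else {
        probe(2);          // Block 2: condition false branch
        result = 0;
    }
    probe(3);              // Block 3: shared epilog
    return result;
}

// Counts how many blocks have been entered so far.
std::size_t blocks_covered()
{
    std::size_t n = 0;
    for (std::size_t i = 0; i < kBlockCount; ++i) {
        if (g_hit[i]) ++n;
    }
    return n;
}
```

With this sketch, calling instrumented_testfunction(-1) alone marks blocks 0, 2, and 3 (three of four); a further call with 1 marks block 1 as well.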
Unfortunately, with compilers (and especially with optimizations), it's not always apparent how source code maps to basic blocks. The easiest way to tell is to look at the generated assembly. For example, let's look at your original main and testfunction:
For main, I see the assembly below (interleaved with the original source). Similarly to what Peter does here, I have noted where the basic blocks start.
int main()
{
013B2D20 push ebp <--- Block 0 (initial)
013B2D21 mov ebp,esp
013B2D23 sub esp,40h
013B2D26 push ebx
013B2D27 push esi
013B2D28 push edi
testfunction(-1);
013B2D29 push 0FFFFFFFFh
013B2D2B call testfunction (013B10CDh)
013B2D30 add esp,4 <--- Block 1 (due to call)
testfunction(1);
013B2D33 push 1
013B2D35 call testfunction (013B10CDh)
013B2D3A add esp,4 <--- Block 2 (due to call)
}
013B2D3D xor eax,eax
013B2D3F pop edi
013B2D40 pop esi
013B2D41 pop ebx
013B2D42 mov esp,ebp
013B2D44 pop ebp
013B2D45 ret
We see that main has three basic blocks: one initial block, and the other two because of the function calls. Looking at the code, this seems reasonable. testfunction is a little tougher. Just looking at the source, there appear to be three blocks:
- The entry to the function and logic test (input > 0)
- The condition true branch (return 1)
- The condition false branch (return 0)
However, because of the actual generated assembly, there are four blocks. I'm assuming you built your code with optimizations disabled. When I build with VS2010 in the Debug configuration (optimizations disabled), I see the following disassembly for testfunction:
int testfunction(int input)
{
013B2CF0 push ebp <--- Block 0 (initial)
013B2CF1 mov ebp,esp
013B2CF3 sub esp,40h
013B2CF6 push ebx
013B2CF7 push esi
013B2CF8 push edi
if (input > 0) {
013B2CF9 cmp dword ptr [input],0
013B2CFD jle testfunction+18h (013B2D08h)
return 1;
013B2CFF mov eax,1 <--- Block 1 (due to jle branch)
013B2D04 jmp testfunction+1Ah (013B2D0Ah)
}
else {
013B2D06 jmp testfunction+1Ah (013B2D0Ah) <--- Not a block (unreachable code)
return 0;
013B2D08 xor eax,eax <--- Block 2 (due to being jump target of jle @ 013B2CFD)
}
}
013B2D0A pop edi <--- Block 3 (due to being jump target from 013B2D04)
013B2D0B pop esi
013B2D0C pop ebx
013B2D0D mov esp,ebp
013B2D0F pop ebp
013B2D10 ret
Here, we have four blocks:
- The entry to the function
- The condition true branch
- The condition false branch
- The shared function epilog (cleaning up the stack and returning)
Had the compiler duplicated the function epilog in both the condition true and condition false branches, you would only see three blocks. Also, interestingly, the compiler inserted a spurious jmp instruction at 013B2D06. Because it's unreachable code, it's not treated as a basic block.
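One way to reproduce the count of four is the textbook "leader" approach: a block starts at the function entry, at every jump target, and right after every jump, call, or return; blocks that cannot be reached from the entry are then discarded. The sketch below applies that to a hand-transcribed model of the testfunction listing above. Insn, Kind, count_reachable_blocks, and testfunction_listing are hypothetical names, and treating a call as a block boundary simply follows the main listing above. This is an assumption-laden illustration, not a description of how the VS2010 instrumentation is actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <vector>

// Hypothetical, simplified instruction model (not a real disassembler).
enum Kind { Plain, CondJump, Jump, Call, Ret };

struct Insn {
    unsigned addr;    // address of the instruction
    Kind     kind;    // how the instruction affects control flow
    unsigned target;  // jump target address (0 when not a jump)
};

// Counts the basic blocks reachable from the first instruction.
std::size_t count_reachable_blocks(const std::vector<Insn>& code)
{
    if (code.empty()) return 0;

    // Map each address to its position in the listing.
    std::map<unsigned, std::size_t> index;
    for (std::size_t i = 0; i < code.size(); ++i) index[code[i].addr] = i;

    // Leaders: the entry, every jump target, and whatever follows a
    // jump, call, or return.
    std::set<unsigned> leaders;
    leaders.insert(code[0].addr);
    for (std::size_t i = 0; i < code.size(); ++i) {
        const Insn& in = code[i];
        if (in.kind == CondJump || in.kind == Jump) {
            assert(index.count(in.target) == 1);  // target must be in the listing
            leaders.insert(in.target);
        }
        if (in.kind != Plain && i + 1 < code.size()) leaders.insert(code[i + 1].addr);
    }

    // Walk the blocks from the entry, following each block's successors.
    std::set<unsigned> seen;
    std::vector<unsigned> work(1, code[0].addr);
    while (!work.empty()) {
        unsigned start = work.back();
        work.pop_back();
        if (!seen.insert(start).second) continue;  // block already visited

        // Advance to the last instruction of this block.
        std::size_t i = index[start];
        while (code[i].kind == Plain && i + 1 < code.size() &&
               leaders.count(code[i + 1].addr) == 0) {
            ++i;
        }
        const Insn& last = code[i];
        if (last.kind == CondJump || last.kind == Jump) work.push_back(last.target);
        const bool falls_through = (last.kind != Jump && last.kind != Ret);
        if (falls_through && i + 1 < code.size()) work.push_back(code[i + 1].addr);
    }
    return seen.size();
}

// Hand-transcribed from the Debug disassembly of testfunction above.
std::vector<Insn> testfunction_listing()
{
    return {
        {0x013B2CF0, Plain, 0}, {0x013B2CF1, Plain, 0}, {0x013B2CF3, Plain, 0},
        {0x013B2CF6, Plain, 0}, {0x013B2CF7, Plain, 0}, {0x013B2CF8, Plain, 0},
        {0x013B2CF9, Plain, 0},
        {0x013B2CFD, CondJump, 0x013B2D08},  // jle to the false branch
        {0x013B2CFF, Plain, 0},              // mov eax,1
        {0x013B2D04, Jump, 0x013B2D0A},      // jmp to the shared epilog
        {0x013B2D06, Jump, 0x013B2D0A},      // spurious jmp, never reached
        {0x013B2D08, Plain, 0},              // xor eax,eax
        {0x013B2D0A, Plain, 0}, {0x013B2D0B, Plain, 0}, {0x013B2D0C, Plain, 0},
        {0x013B2D0D, Plain, 0}, {0x013B2D0F, Plain, 0},
        {0x013B2D10, Ret, 0},
    };
}
```

For this listing the leader pass finds five candidate starts (013B2CF0, 013B2CFF, 013B2D06, 013B2D08, 013B2D0A), and the reachability walk drops 013B2D06, leaving the four blocks described above.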
In general, all of this analysis is overkill, since the overall code coverage metric will tell you what you need to know. This answer was just to highlight why the number of blocks isn't always obvious or what you'd expect.
According to MSDN on Code Coverage Data Overview:
Code coverage data is calculated for code blocks, lines of code, and partial lines if they are executed by a test run. A code block is a code path with a single entry point, a single exit point, and a set of instructions that are all run in sequence. A code block ends when it reaches a decision point such as a new conditional statement block, a function call, exception throw, enter, leave, try, catch, or a finally construct.
Main Block:
- Method entry
- testfunction
- testfunction
Testfunction block:
- Method entry
- If / Else
- Return
- Method call