<h1>LCOV</h1>
<h3>
LCOV provides a very valuable starting point on how to improve
test quality, especially when visualizing large amounts of code.

LCOV provides a web-based view of source code to see which parts of
the code have or have not been tested.
</h3>
<p>
By Jeremy C. Reed
reed@
</p>

<p>
This article introduces gcov, the GCC coverage testing tool, and
LCOV, a nice front-end for collecting coverage data and creating
navigable HTML reports. Note that both are GCC-specific.
</p>

<p>
These tools may be used to help someone new to some source code to
better understand it, help the developer by identifying places in
the code that could be optimized, and help the quality assurance
of the code by showing the parts of the code that aren't exercised
by tests.  In particular, tests often are written to show that
common or normal usage of the program works as intended, while a
coverage report may show that tests are also needed for input
errors, corner cases, or extreme settings.  The reporting is also
useful to verify that good test-driven development (TDD) is followed.
</p>

<p>
The gcov tool provides useful output, but quickly correlating a
lot of its details is difficult and time-consuming. The
LCOV suite makes it easier and quicker to analyze and understand
the coverage data.
</p>

<p>
The gcov tool (and library) is commonly included with the main
GCC package (such as the Red Hat gcc RPM or the Debian gcc package).
On some systems, the library may also be provided in a separate
library or development package for gcc.
</p>

<p>
The gcov statistics are based on line-by-line recordings of when
lines of source code are executed. You may want to consider using
inline functions instead of multi-line macros, because a macro's
expansion is counted only against the line where it is used; even
then, the results may be unexpected. Also, so that the results are
useful, follow a programming style that has only one statement per line.
</p>

<p>
You may also want to compile without optimization to make sure that
compiler improvements don't hide places you may want to better
optimize. (For example, you may need to turn off the gcc -O
optimizations.)
</p>

<p>
To generate the gcov data, compile your program with two GCC options:
-fprofile-arcs -ftest-coverage.
</p>

<p>
The -fprofile-arcs gcc switch tells the compiler to add code so
the program will keep track of how many times lines of code or code
blocks are executed.  As the running program exits, it will save
this record to a file with the extension ".gcda", one for each
source file. These outputs are saved in the same locations as the
corresponding object files. Note that on multiple or concurrent
runs the .gcda files are not overwritten but updated with the new
execution counts.
</p>

<p>
The -ftest-coverage gcc switch is used to create notes files that
gcov uses to show the program coverage.  These files are created
at compile time with the file extension of ".gcno".
</p>

<p>
(You may use your other compiler flags as desired, but note again,
that some optimizations may skew the line-by-line coverage reports.)
</p>

<p>
The instrumentation code is provided in the libgcov library, commonly
included with GCC. So use the gcc switch -lgcov to tell the linker
to link with it.
</p>

<p>
A short-cut for these three switches is the GCC --coverage switch.
For common autoconf/automake based builds, you may use the following
to prepend the switch to the additional compiler flags:
</p>

<pre>
  ./configure CFLAGS="--coverage"
</pre>

<p>
Then build the program with "make".  In some build environments
you may need to modify the Makefile or build specification to
introduce the compiler switch (and to optionally turn off compiler
optimizations).
</p>

<p>
When the program is run, either manually or as part of a test suite,
the corresponding .gcda file or files are created: one for the main
executable and one for each of the separate object files it was
made from.  Note that you can run the program from an installed
location (it doesn't have to be in the build tree) and it will
still know where to put its .gcda files with the original object
files.
</p>

<p>
Run gcov in the directory containing the .gcno and .gcda files.
Its main argument is the name of the source code file to show the
analysis for. If the object file has a different name, then for
the argument use the base name without the ".o" or ".gcda" suffix.
</p>

<p>
When running the gcov tool to see the summary, it also creates the
corresponding .gcov output file.  (If you don't want the file
created, use the --no-output gcov switch.)
</p>

<p>
The "branch" term for gcov indicates that the code has different
code paths it may take; for example, when using the if or switch/case
constructs, while or for loops, or using a function.
</p>

<p>
The "call" term corresponds with functions to indicate if they
were used and returned from, or if they were never executed.
</p>

<p>
Two useful gcov switches are:
</p>

<p>
--function-summaries displays the names of each function and
the percentage of the lines executed in that function.
</p>

<p>
--branch-probabilities shows the percentages of branches executed,
including those taken at least once, and of calls executed. This
will add these details to the .gcov output and provide a summary
to the console.
</p>

<p>
Combine those two gcov switches to see the further information
per function.
</p>

<p>
The .gcov output file is human-readable with each line containing this
format:
</p>

<pre>
 number of executions for the line : the line number : the code from that line
</pre>

<p>
If no code exists in that line (for example, it is blank or only a comment),
the number of executions will be represented with a dash (-).
If the line was never run, then it will have "#####" (five hash marks).
(Line number 0 is used for a preamble for gcov.)
</p>
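
<p>
Because the format is so regular, it is easy to post-process. The sketch
below lists the line numbers that were never executed. The file
sample.c.gcov is a hand-written excerpt in the format described above,
not real gcov output.
</p>

```shell
# Illustrative only: sample.c.gcov is a hand-written excerpt.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > sample.c.gcov <<'EOF'
        -:    0:Source:sample.c
        -:    1:#include <stdio.h>
        -:    2:
        1:    3:int main(int argc, char **argv)
        -:    4:{
        1:    5:    if (argc > 1)
    #####:    6:        printf("hello, %s\n", argv[1]);
        -:    7:    else
        1:    8:        printf("hello, world\n");
        1:    9:    return 0;
        -:   10:}
EOF

# Split on ":", trim the padding, and report lines marked as never run.
unexecuted=$(awk -F: '
    {
        count = $1; line = $2
        gsub(/ /, "", count)
        gsub(/ /, "", line)
        if (count == "#####")
            print line
    }
' sample.c.gcov)
echo "never executed: line(s) $unexecuted"
```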

<p>
When using the --branch-probabilities switch, additional lines
starting with "branch", "call", or "function" are included; they
indicate, for example, whether that branch, call, or function was
executed.
</p>

<p>
Using the single-file gcov textual output does not scale well for
huge projects (such as those with thousands of source files).
Third-party tools are available to better manage or visualize the
results, including gcovr and ggcov.  For this article, we will
explore using LCOV.
</p>

<p>
LCOV, originally released in 2002, was developed and is maintained
by the Linux Test Project.  The LCOV webpage and official downloads
are at <a href="http://ltp.sourceforge.net/coverage/lcov.php">http://ltp.sourceforge.net/coverage/lcov.php</a>
</p>

<p>
The common packaging systems also include an "lcov" package,
so install it using yum or apt-get or your favorite installer.
It depends on Perl and provides five executables and
corresponding manual pages.
</p>

<p>
Many popular projects such as PHP, Mozilla Firefox, elfutils, Samba,
and WINE have used LCOV to analyze and improve their testing.  Some
projects provide make targets or build configurations to build with
the gcov support and to generate coverage reports.  For example,
ISC Kea (DHCP servers) and the popular GNU Coreutils (which provides
many fundamental Unix tools) provide make targets to generate test
coverage reports.  But for this article, we will use command-line
examples directly using the LCOV suite.
</p>

<p>
The four main steps are:
</p>

<p>
1) build the source code using GCC with the special
--coverage compiler options;
</p>

<p>
2) run the executable or run the tests (such as with "make check");
</p>

<p>
3) use the lcov tool to analyze the gcov outputs for the programs
just run and save the details to an LCOV trace file; and
</p>

<p>
4) use the genhtml tool to create the report webpages using the source
code and trace file.
</p>

<p>
For example (with console outputs not provided here):
</p>

<pre>
    $ ./configure CFLAGS="--coverage"
    $ make
    $ make check
    $ lcov --capture --directory . --output-file coverage.info
</pre>

<p>
After building the code with the gcov instrumentation and running
the program (in this case by using the tests), the lcov step
processes the gcov data files in the current directory (.) and
subdirectories (by default) to generate its LCOV trace file. (Note
that the output file uses the extension ".info" by LCOV convention;
be sure not to overwrite documentation in the Info format by
using the same name.) Depending on the size of the project, this
may produce hundreds of lines of console output as it processes the
files.
</p>

<pre>
    $ genhtml --output-directory coverage-html --legend coverage.info
</pre>

<p>
This genhtml step creates a directory containing a few images (used
to create the bar graphs) and a CSS stylesheet, and generates the
various HTML webpages for the overview pages and for each source
code file referenced by the named trace file. The --legend switch
simply provides the color (and branching symbol) explanations on
the webpages. At the end of the many lines of console output,
genhtml will also display a summary of the overall coverage rates.
</p>

<p>
Then use a web browser to review the results starting on the
top summary page:
</p>

<pre>
    $ firefox coverage-html/index.html
</pre>

<p>
It is also usable via a console browser like elinks or lynx.
</p>

<p>
Note that the lcov --directory switch is important. LCOV was initially
designed for investigating kernel code, so to change the default to
evaluate your userspace code, use that switch to point to where your
local gcov .gcda data files are. In addition, you may need the
--base-directory switch to specify the directory when the
source code is in a different location than the object files.
</p>
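
<p>
The trace file itself is plain text, so it can be inspected without
generating HTML. The sketch below summarizes line coverage per source
file from the SF, LF (lines found), and LH (lines hit) records. The
file sample.info and the paths in it are hand-written for illustration;
a real trace file contains many more records.
</p>

```shell
# Illustrative only: sample.info is a hand-written trace file excerpt.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > sample.info <<'EOF'
TN:
SF:/tmp/demo/src/date.c
DA:10,1
DA:12,1
DA:13,0
LF:3
LH:2
end_of_record
TN:
SF:/tmp/demo/src/ls.c
DA:5,4
DA:6,4
DA:7,2
DA:9,1
LF:4
LH:4
end_of_record
EOF

# Print one summary line per source file record.
summary=$(awk -F: '
    $1 == "SF" { file = substr($0, 4) }
    $1 == "LF" { found = $2 }
    $1 == "LH" { hit = $2 }
    $0 == "end_of_record" {
        printf "%s: %d of %d lines hit (%.1f%%)\n", file, hit, found,
               (found > 0 ? 100 * hit / found : 0)
    }
' sample.info)
echo "$summary"
```

<p>
A quick check like this can be handy in scripts, while genhtml remains
the better choice for browsing the results.
</p>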

<p>
Figure 1 is an example LCOV-generated code coverage report.  It
represents the coverage of the src/ directory in GNU Coreutils 8.25
after running its many tests. The legend explains the colors for
the overview:  red for poor coverage, yellow for acceptable or okay
coverage, and green for good coverage. The top right gives total
counts for the number of lines and functions (and optionally,
branches). The Hit column indicates how many of those lines or
functions (or branches) the gcov instrumentation saw executed at
least once.
</p>

<img src="lcov-figure1.png" alt="LCOV Code coverage report summary for Coreutils source" />

<p>
The main table, sorted by the source filenames by default,
shows the hits and totals for the lines and functions (and optionally
branches) per file.  A simple colored bar graph makes it easy to
quickly visualize the results.
</p>

<p>
Multiple webpages are created for each report overview.  Clicking
on the arrows after the column name will display the fields also
sorted starting with the worst line or function (or branches)
coverage to quickly see what needs work.  (Scroll down to the bottom
to see best coverage.)
</p>

<img src="lcov-figure2.png" alt="LCOV example 1 of untested feature in Coreutils date command" />

<p>
Figures 2 and 3 show part of Coreutils' date.c after running
its tests. It is an example of a "date" feature that is untested
(display of a file's last modification time). The blue
lines are source code lines that were executed (or hit)
and the red lines are source code lines that weren't used (or
not hit).
</p>

<img src="lcov-figure3.png" alt="LCOV example 2 of same untested feature in Coreutils date command" />

<p>
LCOV has many configurations to define how the report webpages are
generated and look. Your customizations may be placed in your home
dot-file at ~/.lcovrc or in the system-wide /etc/lcovrc (or different
location depending on how your LCOV suite is installed).  See the
lcovrc(5) manual page for details on its many configurations.
These settings may also be set on the lcov and genhtml command
lines by using the --rc switch.
</p>

<p>
Depending on your installation of LCOV, you may also have branching
details on the generated webpages. To make sure this is
done, set lcov_branch_coverage=1 in your ~/.lcovrc dot-file
or use --rc lcov_branch_coverage=1 on both your lcov and genhtml
command lines.
</p>

<img src="lcov-figure4.png" alt="LCOV source code coverage report branches summary for Coreutils ls.c" />

<p>
Figures 4 and 5 are for a source code report that includes the
branching details. This example is for the popular ls tool
when running it with its -ltr switches. (This example does
not include running the coreutils tests for ls.) Notice in the top right
summary that the 1664 line source code file contains 1460 branching
possibilities.  Branch lines in the source code view indicate
if the branch is taken (+), not taken (-), or simply not executed (#).
On line 3878 in the Figure 5 example, it shows the [ - + - -] branch
data which indicates for that switch/case construct that only the
second case statement is matched. Line 3895's branch data [ - + ] shows
that the if statement's condition wasn't successful so it didn't
branch into its following code.
</p>

<img src="lcov-figure5.png" alt="LCOV source code coverage showing branching example (in ls.c)" />

<p>
While LCOV defaults to 90% or better as the green high coverage
and 75% as the yellow medium coverage, you may have other rates
that match your development goals or requirements. You may use the
genhtml_hi_limit and genhtml_med_limit configurations to adjust
that; for example:
</p>

<pre>
   $ genhtml --output-directory coverage-html --legend \
    --rc lcov_branch_coverage=1 --rc genhtml_hi_limit=80 \
    --rc genhtml_med_limit=50 coverage.info
</pre>

<p>
(Note that anything below that 50% medium limit is the red low rates.)
</p>
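
<p>
If the same settings are wanted for every report, they could instead be
kept in a configuration file. The sketch below writes one to a temporary
location so that it does not touch an existing ~/.lcovrc; the file
location is an assumption for illustration, and the setting names are
the ones used in this article.
</p>

```shell
# Illustrative only: the configuration is written to a temporary file.
rcfile=$(mktemp)

cat > "$rcfile" <<'EOF'
# Include branch coverage in trace files and reports
lcov_branch_coverage = 1
# Green at 80% and above, yellow from 50%, red below that
genhtml_hi_limit = 80
genhtml_med_limit = 50
EOF

cat "$rcfile"

# With LCOV installed, the file could then be passed explicitly, for example:
#   genhtml --config-file "$rcfile" --output-directory coverage-html coverage.info
```

<p>
Copying the same lines into ~/.lcovrc would make them the default for
both lcov and genhtml without any extra command-line switches.
</p>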

<p>
If you want to make further changes to the HTML presentation, copy
the gcov.css file to a new file name and make your CSS stylesheet
modifications (such as colors, font sizes and faces, table formatting,
etc.). Then use the genhtml --css-file switch to point to your new
stylesheet file.
</p>

<p>
If you need to generate a new report from new data, you can reset the
execution counts with the lcov --zerocounters switch. Use it with the
"--directory ." argument to delete all the gcov .gcda data files
in the current directory and its subdirectories.  Note again that
these files are created when the gcov-instrumented program is run.
</p>

<p>
A common issue with using LCOV is that the gcov data doesn't
match the binary. If you updated your source or rebuilt binaries,
previously existing .gcno notes files or .gcda data files may not
match. The lcov run may indicate the files are corrupted or are
mismatched. In addition to removing the .gcda files, you may
want to remove out-of-date .gcno files too.
</p>
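
<p>
When lcov is not at hand, a rough manual equivalent is to locate and
remove the data files with find. The sketch below works on a mock build
tree of empty files (all names are made up) so that it is safe to try;
in a real tree, point find at your build directory instead.
</p>

```shell
# Illustrative only: a mock build tree with empty placeholder files.
set -e
builddir=$(mktemp -d)
mkdir -p "$builddir/src" "$builddir/lib"
touch "$builddir/src/main.o" "$builddir/src/main.gcno" "$builddir/src/main.gcda"
touch "$builddir/lib/util.o" "$builddir/lib/util.gcno" "$builddir/lib/util.gcda"

# Preview which run-time data files would be discarded.
find "$builddir" -name '*.gcda' -print

# Remove them, leaving the compile-time notes and object files alone.
find "$builddir" -name '*.gcda' -exec rm -f {} +

remaining=$(find "$builddir" -name '*.gcda' | wc -l)
echo "remaining .gcda files: $remaining"
```

<p>
The same pattern with '*.gcno' finds the notes files. Since those are
only written at compile time, removing them means the affected sources
must be recompiled before new coverage data can be reported.
</p>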

<p>
You may want to zero the counters and generate different reports for
manually run tests, unit tests, system tests, and combinations
of tests so you can compare the results and better understand
your different test coverages.
</p>

<p>
Other common genhtml options include:
</p>

<p>
--title to provide a brief name or identification for the report.
</p>

<p>
--frames which creates a narrow graphic image showing an unreadable
overview of the source code used as a clickable map (on the left
side of the webpage) to quickly jump to some section in the
corresponding source code. (For example, clicking in the middle of
the overview image will take you to the middle of the source code view.)
</p>

<p>
Note you may have multiple trace files used for generating a single
report. Trace files may also be combined for multiple runs
or pruned of specific paths or source files you don't want
included in the report. Running an instrumented program and
generating a report for it may include other source code
from your system or other dependencies that you are not interested in
and may clutter or over-complicate your report and skew your
statistics. The following is an example of removing the files
in a subdirectory from the previously-created trace file
and then generating the website report using the new trace file.
</p>

<pre>
   $ lcov --remove coverage.info lib/selinux/\* --output-file report.info
   $ genhtml --output-directory coverage-html --legend report.info
</pre>

<p>
(Notice the escaped \* for the pattern of the files to match to exclude.)
</p>

<p>
Again the main purposes for LCOV are to review what code was
exercised by tests and to better understand the flow of the code.
By running the program (not using tests) for simple scenarios, a
developer can follow in the LCOV report the code path actually used
to associate its basic features with the source.
</p>

<p>
LCOV provides a very valuable starting point on how to improve
test quality, especially when visualizing large amounts of code.
Code coverage output is useful for sharing between management,
test development teams, and the code development teams.
For more information about LCOV, be sure to read the manual pages.
</p>

<p>
------------------
</p>

<p>
Jeremy C. Reed has over ten years of experience in software
Quality Assurance for open source and proprietary software.
He is a board member for The NetBSD Foundation.
The author has used LCOV for over seven years
on various Linux and BSD platforms in researching existing test
cases and for identifying testing needs for BIND, Kea, ISC DHCP, nmsg, wdns,
mtbl, and other software suites. 
He is currently authoring books about pfSense
and the history of Berkeley Unix. His homepage is at
<a href="http://reedmedia.net/~reed/">http://reedmedia.net/~reed/</a> .
</p>

<p>
(This article was authored in September 2016.)
</p>