Saturday, 4 September 2010

Memory Usage Graphs with ps and Gnuplot

When developing the import from F-Spot feature in Shotwell, a user who tested the patch found out that there was a bit of a memory leak. After finding the cause, I produced a patch to fix it but I also wanted to identify what the difference was between the development trunk and the patch. So here's how I did it.

Gathering Data

The first step was to gather relevant memory usage data. For this I needed a repeatable test and perform that test both with the trunk build and the patch build. As I had a test F-Spot database, that proved quite straightforward:

  1. Delete the Shotwell database,
  2. Build the trunk version,
  3. Import the test F-Spot database using the trunk build,
  4. Delete the Shotwell database again,
  5. Build the patched version,
  6. Import the test F-Spot database using the patched build.

With the test process sorted, I needed to gather memory data during steps 3 and 6. That's easily done using the ps command in a loop and sending the output to files. So, for the trunk build, I just started this command in a terminal before starting Shotwell and stopped it once finished:

$ while true; do
ps -C shotwell -o pid=,%mem=,vsz= >> mem-trunk.log
sleep 1
done

The one for the patched version is virtually the same:

$ while true; do
ps -C shotwell -o pid=,%mem=,vsz= >> mem-patch.log
sleep 1
done

Note the the equal sign (=) after each field specification tells ps not to output the column header. So at the end of this, you end up with two files that contain 3 columns of data each: the PID, the percentage of memory used and the total virtual memory used for the process at intervals of one second. In both cases, I let Shotwell run idle for a few seconds at the end of the import before closing it to ensure that everything had stabilised.

Next, I checked how many lines of data I had in each file:

$ wc -l mem-*.log

And truncated the longer file to the length of the sorter one, just to make sure I had the same number of data points for both.

Creating the Graph

After that, I wanted to create a single graph that included four lines: VSZ and %MEM for both trunk and patch. And I wanted to output the result to a PNG file. Gnuplot can do all of this, you just need to know how to set its myriad of options. So here's the Gnuplot script in details.

set term png small size 800,600
set output "mem-graph.png"

Gnuplot works with the concept of terminals. So the first line tells it to use the special terminal called png with a small font and a size of 800 pixels wide by 600 pixels high. The second line is fairly self-explanatory: output to the given file.

set ylabel "VSZ"
set y2label "%MEM"

The two sets of values I am interested in have very different ranges. VSZ is a number of bytes and will have values in the hundreds of thousands if not millions, while %MEM is a percentage so will have a value somewhere between 0 and 100. So to make sure that both types of graphs fit in the output, I will use the ability that Gnuplot has to use left and right Y axes with different ranges: VSZ will go on the default Y axis (left, called y), while %MEM will go on the other one (right, called y2). So I set the labels for both.

set ytics nomirror
set y2tics nomirror in

As the right Y axis is not used by default, I need to enable it and to set where the tics go. To do that, I first disable the mirror option on the left Y axis and enable tics on the right Y axis by telling Gnuplot that their position will be in.

set yrange [0:*]
set y2range [0:*]

The last piece of setup is to customise the range on both axes. By default, Gnuplot will adjust the range so that there is as little white space as possible above or below the graph. But in this case, I want both sets of graphs to start at zero so that I can have a better idea of total memory used.

The next bit is quite long so I will start by explaining the instruction for a single graph before bringing all four together.

plot "mem-trunk.log" using 3 with lines axes x1y1 title "Trunk VSZ"

In the line above, I tell Gnuplot to take its data from the third column in the file called mem-trunk.log. The with lines section specifies that I want a line graph. The axes x1y1 specifies that I want it to be drawn against the first X axis and the first Y axis (the default, but here for completeness). And the last bit specifies what I want the title for this graph to be. Then it's just a case of plotting all four graphs in a single plot command separated by commas. So here's the full script:

set term png small size 800,600
set output "mem-2334-graph.png"
set ylabel "VSZ"
set y2label "%MEM"
set ytics nomirror
set y2tics nomirror in
set yrange [0:*]
set y2range [0:*]
plot "mem-trunk.log" using 3 with lines axes x1y1 title "Trunk VSZ", \
     "mem-patch.log" using 3 with lines axes x1y1 title "Patch VSZ", \
     "mem-trunk.log" using 2 with lines axes x1y2 title "Trunk %MEM", \
     "mem-patch.log" using 2 with lines axes x1y2 title "Patch %MEM"

Make sure that there is absolutely no white space between the backslash characters and the end of lines in the plot command otherwise Gnuplot will complain. Save the script to a file called mem.gnuplot and run it:

$ gnuplot mem.gnuplot

And here is the output I got, which shows the improvement in memory usage between trunk and patch:

Memory usage graph

3 comments:

stuaxo said...

Nice... theres a few things here too

http://live.gnome.org/MemoryReduction_2fTools

Unknown said...

@stuaxo: yes indeed. Valgrind in particular is a very useful tool. In this case, the source of the leak was rather obvious: a couple of Gdk.Pixbuf for each imported photo left languishing around and never used again. Measuring the impact of the fix was actually a lot more fun that the fix itself :-)

Anonymous said...

Thanks, useful stuff.