Channel: Technical Blog by THE NET-A-PORTER GROUP » dakkar

Finding memory leaks

One of our programs was leaking memory. Not much, but enough that Tech Ops were not going to allow us to put it into production. Fair enough, I wouldn’t allow it either, if I were on-call.

So I did the obvious: started looking for the leak. This is not as easy as I’d like.

First I tried Test::LeakTrace, which gives lots of information, but:

  1. It gives too much information
  2. It slows things down unbearably

As an example of the slowness: a test that usually runs in less than one minute took about a week when run under Test::LeakTrace. Since I planned to run several tests multiple times, it was clearly not a viable option.

The second thing I tried was looking at /proc/self/stat to see how much memory the process was using. The plan of attack was:

  1. Run some test code 10 times
  2. Measure memory
  3. Run some test code 20 times
  4. Measure memory
  5. Etc…
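
As a rough illustration of that plan, here is a minimal sketch (Linux only) that reads the virtual size and resident set size from /proc/self/stat between batches of runs. The helper name proc_memory and the string-pushing loop standing in for "some test code" are my own placeholders, not the code from the original investigation.

```perl
use strict;
use warnings;

# Return (vsize in bytes, rss in pages) for the current process.
# Reads /proc/self/stat, so this only works on Linux.
sub proc_memory {
    open my $fh, '<', '/proc/self/stat'
        or die "cannot read /proc/self/stat: $!";
    my $line = <$fh>;
    close $fh;

    # Field 2 (the command name) is wrapped in parentheses and may contain
    # spaces, so only split what comes after the last ')'.
    my ($rest) = $line =~ /^.*\)\s+(.*)$/s;
    my @fields = split ' ', $rest;    # $fields[0] is field 3 (state)

    return @fields[20, 21];           # field 23 (vsize), field 24 (rss)
}

my @keep;    # placeholder workload: hold on to some strings
for my $count (10, 20, 30) {
    push @keep, ('x' x 1024) for 1 .. $count;
    my ($vsize, $rss) = proc_memory();
    printf "%3d iterations: vsize=%d bytes, rss=%d pages\n",
        $count, $vsize, $rss;
}
```

The numbers it prints move in page-sized steps rather than in step with the loop counts.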

This did not work: I was expecting to see a linear increase in used memory, but in fact I saw random numbers. Perl’s allocator is clever, and the kernel’s allocator is clever, and I’m not clever enough to figure out what they’re doing.

So I started looking at perlguts, perldebguts, perlhacktips, and other scary documentation files. They talk about “SV allocation logging”, “memory profiling”, and so on. But getting those requires recompiling Perl. Was I brave enough?

Well, normally I wouldn’t be, but PerlBrew makes compiling a Perl almost easy. I’ll save you the three failed attempts (I found the configuration switches difficult to understand), and show a compressed version of the script I ended up using:

#!/bin/bash

perlbrew switch perl-5.14.2
perlbrew uninstall debug-perl
perlbrew install perl-5.14.2 -n -j5 --as debug-perl \
   -DDEBUGGING -DPERL_MEM_LOG -DDEBUG_LEAKING_SCALARS \
   -DPERL_MEM_LOG -Dusedebugging -Dusemymalloc
perlbrew switch debug-perl

perlbrew install-cpanm

cpanm -n <<EOF
Acme::MetaSyntactic
Alien::ActiveMQ
App::Ack
…
parent
true
version
EOF

cd /tmp
rm -rf Data-Rx*
tar zxvf ~/src/CPAN/Data-Rx-0.007.tar.gz
cd Data-Rx*
patch -p1 < ~/src/CPAN_distroprefs/Data-Rx-0.007.patch
perl Makefile.PL
make install

cd ~/src/catalyst-engine-stomp/
perl Makefile.PL
make install

cd ~/src/Data-MultiValued/
dzil install

# etc etc, for our in-house modules

cd

This allowed me to have a working Perl with all the dependencies I needed. Still, things like PERL_MEM_LOG were not working, and the values returned by Devel::Peek were not exactly clear to me.

Asking on #london.pm revealed that the memory logging facilities were removed from Perl a long time ago, and that nobody knows how to properly read the values from Devel::Peek. So I took some guesses, and wrote this program:

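#!/usr/bin/env perl
use 5.014;    # enables say, //, and each/keys on arrays
use strict;
use warnings;
use Devel::Peek;    # provides mstats_fillhash; needs a perl built with -Dusemymalloc
use MyTest;

{
# pre-alloc some memory, so that measuring does not itself allocate
my %report;
my @diffs = (100) x 100;

sub measure {
    my (%args) = @_;
    my $code    = $args{code}    // sub {};
    my $cleanup = $args{cleanup} // sub {};
    my $loops   = $args{loops}   // [1];

    # run once before measuring, so one-off allocations are not counted
    $code->();

    # baseline: allocated minus free, my best guess at "memory in use"
    mstats_fillhash(%report);
    $diffs[0] = $report{total} - $report{totfree};

    keys @$loops;    # reset the iterator used by each
    while (my ($i, $count) = each @$loops) {
        say "$i: looping $count times";

        $code->() for 1 .. $count;
        $cleanup->();

        mstats_fillhash(%report);
        $diffs[$i + 1] = $report{total} - $report{totfree};

        say " diff: ", $diffs[$i + 1] - $diffs[$i];
        say '';
    }

    # summary: growth per batch, and growth per single call
    for my $i (1 .. @$loops) {
        printf "% 3d (% 5d times): % 10d % 10.1f\n",
            $i, $loops->[$i - 1],
            $diffs[$i] - $diffs[$i - 1],
            ($diffs[$i] - $diffs[$i - 1]) / $loops->[$i - 1];
    }
}
}

measure
    code => sub {
        MyTest->test_it;
    },
    loops => [ 10, 20, 30, 40 ];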

This, finally, got me a roughly linear increase in memory usage. Then, it was a matter of bisecting the code paths inside the test, checking which changes made the diffs go to 0.

In the end, it was Benchmark::Timer that was allocating memory. Yes, I know, it’s designed to work that way, and I have no-one to blame but myself for using a library without reading all its code.
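
To show why a timing library can look like a leak in a long-running process, here is a small sketch of a timer that keeps every sample it records. TinyTimer is a hypothetical stand-in written for this illustration; it is not Benchmark::Timer's actual code, and it only assumes that such a library retains per-call data so that it can report statistics later.

```perl
use strict;
use warnings;
use Time::HiRes ();

{
    # Hypothetical stand-in for a timing library (NOT Benchmark::Timer itself).
    package TinyTimer;

    sub new {
        my ($class) = @_;
        return bless { started => {}, samples => {} }, $class;
    }

    sub start {
        my ($self, $tag) = @_;
        $self->{started}{$tag} = Time::HiRes::time();
        return;
    }

    sub stop {
        my ($self, $tag) = @_;
        my $t0 = delete $self->{started}{$tag};
        return unless defined $t0;
        # every elapsed time is kept, so this array grows by one per call
        push @{ $self->{samples}{$tag} }, Time::HiRes::time() - $t0;
        return;
    }

    sub count {
        my ($self, $tag) = @_;
        return scalar @{ $self->{samples}{$tag} || [] };
    }
}

my $timer = TinyTimer->new;
for my $calls (10, 20, 30) {
    for (1 .. $calls) {
        $timer->start('request');
        $timer->stop('request');
    }
    printf "after %2d more calls: %d samples retained\n",
        $calls, $timer->count('request');
}
```

The retained sample count goes 10, 30, 60: it grows linearly with the number of calls, which is the same shape as the diffs reported by the measuring program.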

Anyway, I’ve removed Benchmark::Timer from the code (I wasn’t using its results anyway), and now the program can go to production. It only took me a week…
