#!/usr/bin/ruby

=begin
Purpose:
    Shows non-zero differences between two snapshots (dumps) of /proc/slabinfo,
    which is useful for discovery of kernel memory leaks.

Story:
    Processes on one LVS cluster node kept getting killed by the oom-killer while
    other nodes with the same software setup (but slightly different hardware)
    ran just fine. I tried protecting the system's vital processes with
    OOM_DISABLE, only to find everything else (the unprotected processes) killed
    by the oom-killer and the machine left in an unusable state. After a brief
    consultation with Rik van Riel, author of the oom-killer, I was advised to
    watch slabinfo for changes, because the machine had over 800MB of slab cache
    (non-swappable kernel memory [buffers]) allocated at the time the oom-killer
    started its crusade.

    This script was used to detect changes in the slab cache and to pinpoint the
    culprit (btw, it was a scsi_cmd_cache leak).

Author: Wejn <wejn at box dot cz>
Thanks to: Rik van Riel <riel at redhat dot com>
License: GPLv2 (without the "or later" option)
Requires: Ruby
TS: 20060328175500

Examples of use:

p3 slabinfo # ./slabdiff.rb ac 20060328-101914 /proc/slabinfo  | tail -n 5
reiser_inode_cache: 907752
skbuff_fclone_cache: 1028736
size-512: 1368576
size-8192: 2105344
scsi_cmd_cache: 18525696

-> scsi_cmd_cache has grown by 18MB between snapshots -- might be a LEAK!
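
How the numbers are computed (a minimal sketch, not part of the script itself):
each data line of a version 2.1 dump starts with the cache name, active_objs,
num_objs and objsize, and the script multiplies an object count by objsize to
get bytes. The sample line below uses made-up numbers, and slab_bytes is a
hypothetical helper name chosen for illustration only.

```ruby
# Illustrative only: SAMPLE_LINE holds invented values, and slab_bytes is a
# hypothetical helper that mirrors the arithmetic done in load_slab.
SAMPLE_LINE = "scsi_cmd_cache 100 120 384 10 1 : tunables 54 27 8 : slabdata 12 12 0"

def slab_bytes(line, active)
    # Leading fields: name, active_objs, num_objs, objsize; the rest is unused.
    name, active_objs, num_objs, objsize, _rest = line.strip.split(/\s+/, 5)
    count = active ? active_objs.to_i : num_objs.to_i
    [name, count * objsize.to_i]
end

p slab_bytes(SAMPLE_LINE, true)   # => ["scsi_cmd_cache", 38400]  (100 * 384)
p slab_bytes(SAMPLE_LINE, false)  # => ["scsi_cmd_cache", 46080]  (120 * 384)
```

The "ac" mode therefore tracks memory held by objects in use, while "al" also
counts objects that are allocated in slabs but currently free.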
=end

if ARGV.size != 3
    $stderr.puts "Usage: #{File.basename($0)} <[ac]tive|[al]located> <file1> <file2>"
    $stderr.puts "\twhere <file[12]> is a /proc/slabinfo dump"
    exit 1
end

active = true

case ARGV.shift
when "active", "ac"
    # no action
when "allocated", "al"
    active = false
else
    $stderr.puts "Error: you must select one of: active (ac), allocated (al)"
    exit 1
end

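# Parses a /proc/slabinfo dump (format version 2.1) and returns a hash that
# maps each cache name to its size in bytes: active_objs * objsize when
# +active+ is true, num_objs * objsize otherwise.
def load_slab(filename, active)
    slab = {}
    # Block form of File.open closes the file even if an exception is raised.
    File.open(filename, 'r') do |content|
        # to_s guards against an empty file, where gets returns nil.
        unless content.gets.to_s.strip == 'slabinfo - version: 2.1'
            raise "unsupported version"
        end
        content.each do |ln|
            # Skip the column-header comment and any blank lines.
            next if ln =~ /^\s*(#|$)/
            name, active_objs, num_objs, objsize, _rest = ln.strip.split(/\s+/, 5)
            slab[name] = (active ? active_objs.to_i : num_objs.to_i) * objsize.to_i
        end
    end
    slab
end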

old = load_slab(ARGV.first, active)
new = load_slab(ARGV.last, active)

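# Returns new minus old (in bytes) for every cache present in either snapshot;
# a cache missing from one snapshot counts as 0 bytes there.
def slabdiff(old, new)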
    diff = {}
    (old.keys + new.keys).uniq.each do |k|
        diff[k] = (new[k] || 0) - (old[k] || 0)
    end
    diff
end

# Print non-zero differences in ascending order, so the largest growth is last.
slabdiff(old, new).sort_by { |_name, delta| delta }.each do |name, delta|
    puts "#{name}: #{delta}" unless delta.zero?
end