Friday, 21 September 2012

Process Watchman

Update: If the command being watched finishes before the maximum time, the watcher still blocks and will not exit until the time runs out. Still needs work! (Read: It doesn't work XD)

I was looking for a process monitor, something like monit or god, but those two are meant for keeping a process alive or restarting it when it misbehaves. What I wanted was a simple process watcher that kills a process if it misbehaves or runs out of time, and returns the complete or partial output of the process it was watching.

For this I quickly wrote a small tool for *nix machines.
It is quite hacked together, so I must apologise for the mess. This is its first version, and it was written as fast as I could manage, without any prior design. There are numerous ways this tool could be improved.

Currently it accepts 3 args: a command, an integer number of seconds and a float number of megabytes of RAM. Only the first argument is read as the command, so there is probably a problem with commands that are more than one word long (which is most of them) unless the whole command is quoted. Experimentation and improvements are needed.
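One possible way around the multi-word problem, as a minimal sketch: treat the last two arguments as the limits and join everything before them into the command. `parse_watch_args` is a hypothetical helper, not part of the tool as posted, and it assumes the limits always come last.

```ruby
# Hypothetical helper (not in the original tool): split an argument list
# into [command, seconds, megabytes].
# Assumption: the last two arguments are always the time and RAM limits.
def parse_watch_args(argv)
  return nil if argv.length < 3 # need at least one command word plus two limits

  # Everything before the last two arguments is treated as the command.
  *command_words, seconds, megabytes = argv
  [command_words.join(' '), seconds.to_i, megabytes.to_f]
end

# Possible use at the top of a script:
#   @command, @time_limit, @ram_limit = parse_watch_args(ARGV)
```

With this, an unquoted invocation such as `ruby script.rb 10 64.5` would give the command "ruby script.rb", 10 seconds and 64.5 megabytes. Quoting the command on the shell side still works, since a quoted command arrives as a single word.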

#Author: Greg Myers
#Date: 30/08/12
@command = ARGV[0]
@time_limit = ARGV[1].to_i #seconds
@ram_limit = ARGV[2].to_f #megabytes

def watch_time(sec)
  begin
    x = Thread.current #x will contain whatever runs in yield
    y = Thread.start { #y watches and causes x to throw when timeout.
      begin
        sleep(sec)
      rescue => e #Forward any error raised in y while waiting to x
        x.raise e
      else #This executes if no exceptions happened until now
        x.raise "Process ran out of time to execute."
      end
    }
    return yield(sec)
  ensure
    if y
      y.kill
      y.join
    end
  end
end

def watch_ram(megabytes, pid)
  begin
    x = Thread.current #x will contain whatever runs in yield
    y = Thread.start { #y polls the rss of pid and causes x to throw when it exceeds the limit.
      begin
        loop {
          rss_use = `ps -o rss= -p #{pid}`.to_i #use ps to get rss of pid
          raise "Hit Ram Limit #{megabytes*1024}kb, with #{rss_use}kb" if megabytes*1024 < rss_use
          sleep(0.5)
        }
      rescue => e #Forward the limit error (or any other error raised in y) to x
        x.raise e
      end
    }
    return yield
  ensure
    if y
      y.kill
      y.join
    end
  end
end

def get_payload_child_pid(pid)
  pipe = IO.popen("ps -ef | grep #{pid}")
  pipe.readlines[2] =~ /\w+\s+(?<child_pid>\d+)\s+\d+/ #ps -ef columns are UID, PID, PPID. Always line 3, Line 1 = spawn cmd, Line 2 = IO.popen, Line 3 = ruby, Line 4 = grep
  pipe.close
  return $~[:child_pid]
end

if @command
  if @time_limit > 1
    if @ram_limit > 0
      watch_time(@time_limit){
        require 'pty'
        PTY.spawn("#{@command} 2>&1") do |r,w,p|
          child_pid = get_payload_child_pid(p)
          watch_ram(@ram_limit, child_pid){ loop { puts r.gets } }
        end
      }
    else
      puts "Invalid memory limit #{ARGV[2]}"
    end
  else
    puts "Invalid time limit #{ARGV[1]}"
  end
else
  puts "Invalid command #{ARGV[0]}"
end
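The blocking described in the update may come from the read loop never noticing that the command has finished. A read loop that stops at end of output avoids that. What follows is a minimal sketch, not a fix of the code above: it swaps in a plain pipe (`IO.popen`) and the standard `timeout` library in place of the PTY and the watcher threads, leaves out the memory limit, and takes the command as an array of words. `run_with_time_limit` is a hypothetical name.

```ruby
require 'timeout'

# Hypothetical helper (not in the original tool): run a command given as an
# array of words, collect its combined stdout/stderr, and kill it if it runs
# longer than `seconds`. Returns [output_so_far, timed_out].
def run_with_time_limit(command_words, seconds)
  output = String.new
  timed_out = false

  # Array form runs the command without a shell, so io.pid is the command itself.
  IO.popen(command_words, err: [:child, :out]) do |io|
    begin
      Timeout.timeout(seconds) do
        # gets returns nil at end of output, so the loop ends
        # as soon as the command finishes.
        while (line = io.gets)
          output << line
        end
      end
    rescue Timeout::Error
      timed_out = true
      begin
        Process.kill('KILL', io.pid) # stop the command before the pipe is closed
      rescue Errno::ESRCH
        # the command exited on its own just after the timeout fired
      end
    end
  end

  [output, timed_out]
end
```

Something like `run_with_time_limit(%w[ruby script.rb], 10)` would then return as soon as the script exits, or after roughly ten seconds with whatever output had been read up to that point.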
I will be putting this on GitHub publicly shortly.
