I'm not totally familiar with how the server is set up, but <a href="http://mmonit.com/monit/">monit</a> would be a good last resort for automatically restarting Apache whenever its memory usage goes over a threshold.<div>

<br></div><div>Forgive me if this is a non-sequitur to the discussion :)<br><br><div class="gmail_quote">On Tue, Feb 19, 2013 at 4:06 PM, Ben Kochie <span dir="ltr">&lt;<a href="mailto:ben@nerp.net" target="_blank">ben@nerp.net</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Ok, here's what I've done so far.<br>

<br>

I've added memory cgroup support to most of the init scripts that matter.<br>

<br>

There's now a script for checking memory use by process tree:<br>

get-cgroup-memory-use.sh<br>

<br>

11216 KiB - /cgroup/bind9/memory.usage_in_bytes<br>

680 KiB - /cgroup/clamsmtp/memory.usage_in_bytes<br>

4252 KiB - /cgroup/posfix/memory.usage_in_bytes<br>

69432 KiB - /cgroup/mysql/memory.usage_in_bytes<br>

80396 KiB - /cgroup/mailman/memory.usage_in_bytes<br>

175448 KiB - /cgroup/clamav-daemon/memory.usage_in_bytes<br>

51168 KiB - /cgroup/apache2/memory.usage_in_bytes<br>

<br>

The last line output by the script is the system total.<br>

1048708 KiB - /cgroup/memory.usage_in_bytes<br>
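<br>
For anyone who wants to poke at the same numbers, here is a rough sketch in Python of what a listing like this boils down to. It is not the actual get-cgroup-memory-use.sh; the function names are made up for illustration, and it assumes a cgroup v1 memory hierarchy mounted at /cgroup, as in the paths above.<br>

```python
import os

USAGE_FILE = "memory.usage_in_bytes"  # cgroup v1 memory controller file


def read_usage_kib(path):
    """Return the byte count stored in a usage file, converted to KiB."""
    with open(path) as f:
        return int(f.read().strip()) // 1024


def cgroup_memory_use(root="/cgroup"):
    """Return (KiB, path) pairs for each child cgroup, with the root total last."""
    rows = []
    for entry in sorted(os.listdir(root)):
        path = os.path.join(root, entry, USAGE_FILE)
        if os.path.isfile(path):
            rows.append((read_usage_kib(path), path))
    # The usage file at the top of the hierarchy covers the whole system.
    total = os.path.join(root, USAGE_FILE)
    if os.path.isfile(total):
        rows.append((read_usage_kib(total), total))
    return rows


if __name__ == "__main__":
    # Only prints anything on a host that really has /cgroup mounted.
    if os.path.isdir("/cgroup"):
        for kib, path in cgroup_memory_use():
            print(f"{kib} KiB - {path}")
```

Unlike the listing above, this sketch sorts the child cgroups by name, but it keeps the system total as the final row.<br>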

<br>

I've manually set a memory limit of 500MB for apache2.<br>
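<br>
A minimal sketch of what setting such a limit amounts to under cgroup v1: write a byte count to memory.limit_in_bytes in the cgroup's directory. The helper name is hypothetical, and reading "500MB" as 500 MiB is an assumption.<br>

```python
import os

MIB = 1024 * 1024


def set_memory_limit(cgroup_dir, limit_bytes):
    """Write a hard memory limit into a cgroup v1 memory controller directory."""
    path = os.path.join(cgroup_dir, "memory.limit_in_bytes")
    with open(path, "w") as f:
        f.write(str(limit_bytes))
    return path


# Hypothetical usage, as root, on a host with the apache2 cgroup present:
#   set_memory_limit("/cgroup/apache2", 500 * MIB)
```

A limit written this way does not survive a reboot, so the init script would have to reapply it.<br>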

<br>

This in theory should catch apache2 going over memory limits.  We can improve the memory cgroup settings in the future.<br>

<br>

The one difficulty is that this does not work well for upstart jobs, since the scripting for upstart is crappy.<span class="HOEnZb"><font color="#888888"><br>

<br>

-ben</font></span><div class="HOEnZb"><div class="h5"><br>

<br>

On Tue, 19 Feb 2013, Ben Kochie wrote:<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Fucking crap.  I've started testing using memory cgroups to limit apache memory use to keep it from blowing up the machine.<br>

<br>

I'm also testing adjustments to the oom killer to make apache the more likely target.<br>
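<br>
One way such an adjustment could look, as a sketch only: it assumes the knob is /proc/PID/oom_score_adj (the message does not say which interface is being tested), and the helper name and the proc_root parameter are hypothetical.<br>

```python
import os


def set_oom_score_adj(pid, value, proc_root="/proc"):
    """Write an OOM score adjustment for one process.

    The kernel accepts values from -1000 (exempt from the OOM killer)
    to 1000 (most preferred victim).
    """
    if value not in range(-1000, 1001):
        raise ValueError("oom_score_adj must be between -1000 and 1000")
    path = os.path.join(proc_root, str(pid), "oom_score_adj")
    with open(path, "w") as f:
        f.write(str(value))
    return path


# Hypothetical usage, as root, for each apache2 worker PID:
#   set_oom_score_adj(pid, 500)
```

Child processes inherit the value at fork, so setting it on the parent before it spawns workers should cover the whole tree.<br>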

<br>

I'll likely create some scripts to automatically deal with this.<br>

<br>

-ben<br>

<br>

On Tue, 19 Feb 2013, Andy Isaacson wrote:<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

On Tue, Feb 19, 2013 at 01:10:15PM -0800, Jonathan Lassoff wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Something's awfully slow with the wiki again.<br>

<br>

Halp.<br>

</blockquote>

<br>

Page loads were taking 10 seconds each.<br>

<br>

memcached got OOMkilled.  Restarted, loads back down to 300 ms again.<br>

<br>

Thanks for noticing.  Can we get an external latency monitoring system<br>

up and running again?  Both TCP and HTTPS GET latency would be useful.<br>
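<br>
As a starting point, the two measurements could be sketched like this with only the Python standard library. The function names are made up for illustration, and this is a one-shot probe, not a monitoring system; scheduling, alerting, and running it from an external host are all left out.<br>

```python
import socket
import time
import urllib.request


def tcp_connect_latency(host, port, timeout=5.0):
    """Return seconds taken to resolve host and complete a TCP connect."""
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout):
        return time.monotonic() - start


def http_get_latency(url, timeout=10.0):
    """Return (status, seconds) for one GET, including reading the body."""
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()
        status = response.status
    return status, time.monotonic() - start


# Hypothetical usage against the wiki:
#   tcp_connect_latency("www.noisebridge.net", 443)
#   http_get_latency("https://www.noisebridge.net/")
```

urlopen raises an exception on 4xx and 5xx responses and on timeouts, so a real monitor would need to catch those and record them as failures.<br>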

<br>

-andy<br>

_______________________________________________<br>

Rack mailing list<br>

<a href="mailto:Rack@lists.noisebridge.net" target="_blank">Rack@lists.noisebridge.net</a><br>

<a href="https://www.noisebridge.net/mailman/listinfo/rack" target="_blank">https://www.noisebridge.net/mailman/listinfo/rack</a><br>

<br>

</blockquote>


</blockquote>


</div></div></blockquote></div><br></div>