I'm not totally familiar with how the server is set up, but <a href="http://mmonit.com/monit/">monit</a> would be a good last resort to have apache automatically restarted whenever memory usage goes over a threshold.<div>
<br></div><div>Forgive me if this is a non sequitur to the discussion :)<br><br><div class="gmail_quote">On Tue, Feb 19, 2013 at 4:06 PM, Ben Kochie <span dir="ltr">&lt;<a href="mailto:ben@nerp.net" target="_blank">ben@nerp.net</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Ok, here's what I've done so far.<br>
<br>
I've added memory cgroup support to most of the init scripts that matter.<br>
<br>
There's now a script for checking memory use by process tree:<br>
get-cgroup-memory-use.sh<br>
<br>
11216 KiB - /cgroup/bind9/memory.usage_in_bytes<br>
680 KiB - /cgroup/clamsmtp/memory.usage_in_bytes<br>
4252 KiB - /cgroup/posfix/memory.usage_in_bytes<br>
69432 KiB - /cgroup/mysql/memory.usage_in_bytes<br>
80396 KiB - /cgroup/mailman/memory.usage_in_bytes<br>
175448 KiB - /cgroup/clamav-daemon/memory.usage_in_bytes<br>
51168 KiB - /cgroup/apache2/memory.usage_in_bytes<br>
<br>
The last line output by the script is the system total.<br>
1048708 KiB - /cgroup/memory.usage_in_bytes<br>
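<br>
A minimal sketch of how a report like the one above could be produced, assuming a cgroup v1 memory hierarchy with one directory per service. This is not the actual get-cgroup-memory-use.sh; the function names are made up for illustration, and only the memory.usage_in_bytes file name and the /cgroup path come from the output above.<br>
<pre>
```python
import os

USAGE_FILE = "memory.usage_in_bytes"


def read_kib(path):
    """Read a byte count from a cgroup file and convert it to whole KiB."""
    with open(path) as f:
        return int(f.read().strip()) // 1024


def cgroup_memory_report(root):
    """Return (KiB, path) pairs: one per child cgroup, then the root total last."""
    rows = []
    for entry in sorted(os.listdir(root)):
        path = os.path.join(root, entry, USAGE_FILE)
        if os.path.isfile(path):
            rows.append((read_kib(path), path))
    total = os.path.join(root, USAGE_FILE)
    if os.path.isfile(total):
        rows.append((read_kib(total), total))
    return rows


def format_report(rows):
    """Format rows in the same 'N KiB - path' style as the output above."""
    return ["%d KiB - %s" % (kib, path) for kib, path in rows]
```
</pre>
Printing each line of format_report(cgroup_memory_report("/cgroup")) would give a listing in the same shape, with the hierarchy total last. The sketch only looks one directory level deep.<br>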
<br>
I've manually set a memory limit of 500 MB for apache2.<br>
<br>
This in theory should catch apache2 going over memory limits. We can improve the memory cgroup settings in the future.<br>
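<br>
For reference, a rough sketch of what setting such a limit amounts to on a cgroup v1 memory controller: writing a byte count to the group's memory.limit_in_bytes file. The helper name is hypothetical, and it assumes "500 MB" means 500 * 1024 * 1024 bytes, which the message does not actually say.<br>
<pre>
```python
import os


def set_memory_limit(cgroup_dir, limit_mb):
    """Write a memory limit to cgroup_dir/memory.limit_in_bytes.

    Assumption: limit_mb is in units of 1024 * 1024 bytes.
    Returns the number of bytes written as the limit.
    """
    limit_bytes = limit_mb * 1024 * 1024
    path = os.path.join(cgroup_dir, "memory.limit_in_bytes")
    with open(path, "w") as f:
        f.write(str(limit_bytes))
    return limit_bytes
```
</pre>
On the real machine this would be called as set_memory_limit("/cgroup/apache2", 500) and would need root. Once the group hits its limit, the kernel tries to reclaim memory inside the group and can OOM-kill within it, rather than across the whole machine.<br>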
<br>
The one difficulty is that this does not work well for upstart jobs, since the scripting for upstart is crappy.<span class="HOEnZb"><font color="#888888"><br>
<br>
-ben</font></span><div class="HOEnZb"><div class="h5"><br>
<br>
On Tue, 19 Feb 2013, Ben Kochie wrote:<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Fucking crap. I've started testing using memory cgroups to limit apache memory use to keep it from blowing up the machine.<br>
<br>
I'm also testing adjustments to the oom killer to make apache the more likely target.<br>
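<br>
One knob the kernel exposes for this is the per-process oom_score_adj file under /proc, which takes values from -1000 to 1000, with higher values making a process a more likely OOM-killer target. Whether that is the adjustment being tested here is an assumption; the helper below is a hypothetical sketch, not the script in use.<br>
<pre>
```python
import os


def set_oom_score_adj(pid, adj, proc_root="/proc"):
    """Write adj to proc_root/PID/oom_score_adj.

    Higher values make the process a more likely OOM-killer target.
    proc_root is a parameter only so the sketch can be tried on a scratch directory.
    """
    if adj not in range(-1000, 1001):
        raise ValueError("oom_score_adj must be between -1000 and 1000")
    path = os.path.join(proc_root, str(pid), "oom_score_adj")
    with open(path, "w") as f:
        f.write(str(adj))
    return path
```
</pre>
Raising the value for each apache worker PID would bias the OOM killer toward apache. Lowering a value below its current setting generally needs root.<br>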
<br>
I'll likely create some scripts to automatically deal with this.<br>
<br>
-ben<br>
<br>
On Tue, 19 Feb 2013, Andy Isaacson wrote:<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
On Tue, Feb 19, 2013 at 01:10:15PM -0800, Jonathan Lassoff wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Something's awfully slow with the wiki again.<br>
<br>
Halp.<br>
</blockquote>
<br>
Page loads were taking 10 seconds each.<br>
<br>
memcached got OOMkilled. Restarted, loads back down to 300 ms again.<br>
<br>
Thanks for noticing. Can we get an external latency monitoring system<br>
up and running again? Both TCP and HTTPS GET latency would be useful.<br>
<br>
-andy<br>
_______________________________________________<br>
Rack mailing list<br>
<a href="mailto:Rack@lists.noisebridge.net" target="_blank">Rack@lists.noisebridge.net</a><br>
<a href="https://www.noisebridge.net/mailman/listinfo/rack" target="_blank">https://www.noisebridge.net/mailman/listinfo/rack</a><br>
<br>
</blockquote>
<br>
</blockquote>
</div></div></blockquote></div><br></div>