How to make Drupal hog CPU cycles
Jan. 29th, 2006 11:24 pm
Well, I had a fun weekend. It all started on Friday:
Web Host: Dude, your Drupal installation is taking up too much CPU time.
Me: How much?
WH: It's spinning the CPU for 60-90 seconds on a single page load.
Me: Oops.
WH: Here are the times and URLs it's happened.
Me: Hmm... those correspond to my crontabs...
So, after some more investigation, and after placing some debugging hooks into poormanscron, I determined that the search module's search_cron() is chewing up lots of CPU time and even timing out in some cases.
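For anyone curious what that kind of debugging hook looks like, here is a minimal sketch in Python rather than PHP. It is not the actual poormanscron code; the function name time_hooks and the injectable clock parameter are made up for illustration. It just runs each cron hook in turn and records how long each one took, which is enough to spot the one eating the CPU.

```python
import time
from typing import Callable, Dict


def time_hooks(
    hooks: Dict[str, Callable[[], None]],
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, float]:
    """Run each hook once and return elapsed seconds keyed by hook name.

    `hooks` and `clock` are hypothetical names for this sketch; `clock`
    is injectable so the timing logic can be checked without waiting.
    """
    timings: Dict[str, float] = {}
    for name, hook in hooks.items():
        start = clock()
        try:
            hook()
        finally:
            # Record the elapsed time even if the hook raises.
            timings[name] = clock() - start
    return timings


if __name__ == "__main__":
    # Stand-in hooks; a real setup would call the modules' cron functions.
    report = time_hooks({
        "search_cron": lambda: time.sleep(0.05),
        "other_cron": lambda: None,
    })
    for name, seconds in report.items():
        print(f"{name}: {seconds:.3f}s")
```

Logging those numbers on every cron run should make it fairly obvious which module is responsible for the slow page loads.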
So, I told the search module to only index 10 pages per run instead of 100. That lowered the load, but it's still taking up to a minute in some runs. I did some more experimenting and discovered that sometimes the module makes thousands of iterations, presumably to index a bunch of words from a really big document. This means that my next step is to take some of the really large documents on my website and break them down into smaller sections to make for shorter cron runs.
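The splitting idea can be sketched roughly like this. It is a Python illustration only, not something Drupal does for you; the function name split_into_sections and the 500-word default are assumptions picked for the example. It groups paragraphs into sections that stay under a word limit, so each section could become its own smaller page.

```python
from typing import List


def split_into_sections(text: str, max_words: int = 500) -> List[str]:
    """Group blank-line-separated paragraphs into sections of at most
    `max_words` words each.

    A single paragraph longer than `max_words` is kept whole as its own
    section rather than being cut mid-paragraph.
    """
    sections: List[str] = []
    current: List[str] = []
    count = 0
    for paragraph in text.split("\n\n"):
        words = len(paragraph.split())
        if words == 0:
            continue  # skip empty paragraphs
        if current and count + words > max_words:
            # Adding this paragraph would exceed the limit: close the section.
            sections.append("\n\n".join(current))
            current, count = [], 0
        current.append(paragraph.strip())
        count += words
    if current:
        sections.append("\n\n".join(current))
    return sections


if __name__ == "__main__":
    sample = "a b c\n\nd e\n\nf g h i"
    for number, section in enumerate(split_into_sections(sample, max_words=5), 1):
        print(f"--- section {number} ---")
        print(section)
```

Whether smaller pages actually shorten the cron runs depends on how the indexer batches its work, so treat this as something to try, not a guaranteed fix.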
The other thing I have yet to resolve is why, when the admin page says that 100% of the nodes are indexed, the counter goes back to 0% on the next cron run and everything gets indexed again. Since I have 3 other Drupal-powered sites that don't do this, I'm guessing that something got corrupted on mine. Whee fun. :-)
I'd also like to give mad props to my webhosting provider, Nearly Free Speech, for working with me on the problem. I was able to have some seriously geeky discussions with them while trying to figure out the cause of the problem. They totally rock when it comes to tech support.
And that's pretty much how my weekend went.
NearlyFreeSpeech.net
Date: 2006-01-30 06:22 am (UTC)
Hmm, an odd restriction on their service. Do they mean I can't use numeric addresses or I can't use IP-based restrictions at all? My website logfile allows access from home and office computers only.
Re: NearlyFreeSpeech.net
Date: 2006-01-30 03:45 pm (UTC)
> the laws of Texas.
Not much less than any other state, I imagine. That also assumes that they have all of their servers in one state, which they do not. :-)
> Hmm, an odd restriction on their service. Do they mean I can't use numeric
> addresses or I can't use IP-based restrictions at all? My website logfile
> allows access from home and office computers only.
Why not write support AT nearlyfreespeech DOT net and ask them? I suspect it has to do with their front-end caching, but they could tell you more.
As far as doing restrictions though, you can use Perl and PHP, so it's trivial to set up per-IP restrictions there. (and then modify them through the ssh account that every customer gets!)
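Here's a rough sketch of what a per-IP check could look like, written in Python for illustration (the same idea carries over to Perl or PHP). The function name is_allowed and the example addresses are made up, and how the script gets the real client address behind any front-end cache is an assumption you'd have to confirm with the host.

```python
import ipaddress
from typing import Iterable


def is_allowed(client_ip: str, allowed: Iterable[str]) -> bool:
    """Return True if `client_ip` falls inside any allowed address or network.

    `allowed` entries may be single addresses ("203.0.113.5") or CIDR
    networks ("192.0.2.0/24"). Unparseable client addresses are rejected.
    """
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        return False
    return any(
        addr in ipaddress.ip_network(entry, strict=False) for entry in allowed
    )


if __name__ == "__main__":
    # Example allowlist using documentation-reserved address ranges.
    home_and_office = ["192.0.2.0/24", "203.0.113.5"]
    for ip in ("192.0.2.10", "198.51.100.1", "not-an-ip"):
        print(ip, "->", "allow" if is_allowed(ip, home_and_office) else "deny")
```

The allowlist could live in a small text file, which would make it easy to edit over ssh.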
(no subject)
Date: 2006-01-30 08:53 am (UTC)
(no subject)
Date: 2006-01-30 01:14 pm (UTC)
(no subject)
Date: 2006-01-30 03:08 pm (UTC)
I'll have to consider them if they ever stop letting me host zorin.org at work. Wow, a hosting provider with brains and customer service. How rare these days...
-Z
Why wait?
Date: 2006-01-30 03:19 pm (UTC)
Why not do what I did? Create an account and fund it with $5. Then set up a website there just to store files on. (I first created filepile.claws-and-paws.com (http://filepile.claws-and-paws.com/) for that purpose.)
If you end up liking the service, you can always migrate more sites there in the future. I had filepile running just by itself for many months before moving my main website there.
(no subject)
Date: 2006-01-30 06:21 pm (UTC)
(no subject)
Date: 2006-01-30 06:26 pm (UTC)
You're the one who originally told me about them like 3 years ago, you know. :-)
(no subject)
Date: 2006-01-30 08:43 pm (UTC)