giza: Giza White Mage (Default)
At 9 AM this morning, we opened up the hotel registrations for Anthrocon 2011. Here are the performance graphs, with annotations:

Network traffic graph (Annotated) Memory usage graph (Annotated)

Netstat (Annotated) CPU usage graph (Annotated)

The arrows point to the spikes at 9 AM when I made the changes to the site live. Here's a breakdown of those stats in more detail:
  • Bandwidth went from a weekly average of 326 kilobits/sec to just over 5 megabits/sec.

  • The number of concurrent network connections went from ~41 to 350.

  • CPU usage went from 45% of one core to roughly 200% of one core (we have 4 cores).

  • Memory usage was largely unaffected. Yay for an async I/O webserver!
So the site held up just fine to the onslaught of users.

One thing I could have done better was the deployment. At about 8:30 AM, I replaced www.anthrocon.org with a static HTML page showing the status, and pointed another DNS name at our Drupal installation. Unfortunately, Drupal is a bit finicky about its $base_url variable, so I had to tweak that by hand, as well as tweak the webserver config by hand. I then went to our temporary URL, published the hotel pages, made sure everything was okay, and undid all of those changes to re-deploy the website. Unfortunately, I forgot to clear Drupal's cache, which meant some people did not see the new hotel page until as late as 9:02 AM. Okay, so it's only 2 minutes, but it was a silly mistake that shouldn't have happened.

Next year, I think I'll look into an "auto-publishing" system of some sort to automagically make the pages in question live at 9 AM. That should simplify things for me quite a bit, and maybe not even require me to actively do anything.
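A rough sketch of what I have in mind (assuming Drush is installed on the server; the path and the node ID of 123 are made up for illustration) would just be a cron job that publishes the node and clears the cache at 9:00 AM sharp:

# crontab entry -- path, schedule, and node ID are placeholders
0 9 * * * cd /var/www/anthrocon && drush php-eval '$n = node_load(123); $n->status = 1; node_save($n);' && drush cache-clear all

That would also take care of the cache-clearing step I forgot this morning.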

Other than these minor issues, the whole hotel thing went over pretty well this morning. I monitored Twitter for the next hour or so, and was able to tweet status updates to everyone as things happened.

As of this writing, the Courtyard is completely sold out, and Doubles at the Westin are sold out (Kings are still available). Rooms at the Doubletree are still available (along with suites!), as are rooms at The Omni William Penn.

(X-posted to [livejournal.com profile] anthrocon)
giza: Giza White Mage (Default)
I've been playing with Amazon EC2 over the weekend, and trying to set up popular software packages on it. I was able to get Munin set up, but every time I tried to load the Munin webpages (e.g. http://ec2-xxx-xxx-xxx-xxx.compute-1.amazonaws.com/munin/), I would get redirected to whatever the server_name directive was set to in Nginx. Usually it'd be something like "localhost", which was completely unhelpful.

I tried playing around with the server_name directive for a while, and putting in the DNS name of the machine would work. But since I only use this instance sparingly, and it gets assigned a different IP (and hostname) each time it starts up, I would have to keep updating the config file. Not a viable solution.

I tried doing some fancy rewrites to the $server_name variable, but since Amazon's EC2 servers have their own internal IP in 10.0.0.0/8, my browser would just get redirected to an IP address that was not routable outside of Amazon's network.

I finally found the right directive:


server_name_in_redirect off;


This tells Nginx to honor the Host: field from the initial HTTP request when it issues redirects, instead of determining the name of the server on its own from server_name. This fixed the issue just fine.
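For reference, here's roughly what the relevant chunk of my config looks like with that directive in place (a sketch only; the Munin path is the usual Ubuntu location and may differ on your setup):

server {
    listen 80;
    server_name localhost;
    server_name_in_redirect off;

    location /munin/ {
        # Munin writes its static HTML and graphs here on Ubuntu
        alias /var/cache/munin/www/;
    }
}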

Hopefully others will find this post and spend less time solving this problem than I did. :-)

(And yes, Ubuntu has official AMIs for EC2...)
giza: Giza White Mage (Default)
Looking back after this morning's stampede, I thought I'd share with folks how the webserver held up, since I know I am not the only geek out there. And, truth be told, I was a bit nervous myself, since I wasn't quite sure just how much traffic we would get and if the webserver would survive, or turn into a smoking crater.

Well, here's what we got:

Ethernet Traffic

The first hump is a manual backup I did last night. The second is the automatic backup that runs every morning, where the database and files are rsynced to a machine at another data center. The third hump at 9 AM was when we opened hotel reservations. 1.4 Megabits/sec doesn't look too bad, until you look at:

Active Connections

The 336 simultaneous connections were far more interesting. That's about 16 times the normal number of connections to the webserver.

So, what were the effects? Let's look at MySQL first:

MySQL Queries

There were nearly 1,000 queries per second, about 500 of which were served from the query cache. Between that and the queries that aren't hitting the MySQL cache at all, there's definite room for improvement. But before I look at the RAM situation, let's look at the CPU usage:

CPU Usage

Load Average

One of the cores was close to 100%, but there was virtually no I/O wait. The load average was also good to see--it was actually less than the load during the nightly backups. From a performance standpoint, both of these graphs look very good, as it means the disk was not the bottleneck. But why not? Well, here's the final piece of the puzzle:

Memory Usage

The RAM usage is what ties all of the other graphs together. By keeping the memory usage near constant, I was able to avoid hitting swap space, which would have incurred a huge performance penalty and quite possibly a "death spiral".

How did I keep RAM usage so low? Instead of running the Apache webserver, which requires a separate process for each connection, I ran the Nginx webserver. Unlike Apache, it uses asynchronous I/O to handle incoming requests. This approach scales much better than Apache, which creates a separate child process for each listener and chews up a lot of memory.
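For the curious, the part of the Nginx config that matters here is tiny. A sketch with illustrative numbers (not our actual production values):

# nginx.conf -- illustrative values only
worker_processes  4;              # roughly one worker per CPU core

events {
    worker_connections  1024;     # simultaneous connections each worker can juggle
}

A handful of workers multiplexing all of those connections is what keeps the per-connection memory cost so low.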

For comparison, the number of simultaneous connections peaked at "only" around 100 during last year's convention. We broke the old record by a factor of 3.

"And what we have learned?"

Even under the highest load to date, we were in no danger of running out of RAM. This means that I can (and probably should) allocate more memory to MySQL so that more queries are cached and overall performance is increased even more. There are also some more advanced caching modules that I intend to research, to see if we can cache straight off of the filesystem and avoid the database altogether. More on that as it happens.
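As a first step, that probably means something like the following in my.cnf (these numbers are guesses I'd want to benchmark first, not what we're running now):

# my.cnf -- illustrative values only
[mysqld]
query_cache_type  = 1       # cache SELECT results
query_cache_size  = 64M     # total RAM set aside for the query cache
query_cache_limit = 2M      # skip caching any single result bigger than this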
giza: Giza White Mage (Default)
If you run your own machine, you probably don't want to use Apache as a webserver. Its configuration is complex and arcane, and it sucks up memory like a Microsoft toaster. There are lighter, faster alternatives out there, such as Nginx or thttpd. But if you buy web hosting from someplace, you'll probably be forced into using Apache, and will be for the foreseeable future. That being said...

What is a rewrite rule?

A rewrite rule is a way that Apache can rewrite an incoming request "on the fly". In essence, a user can ask a webserver for one file, and the webserver can serve up a completely different file to the user instead.

Isn't this like a redirect?

No, a redirect is when the webserver tells a user, "that's not the file you want, you need to go over here", at which point the browser loads the new URL. Redirects aren't always desirable, especially if the files/PHP scripts/whatever are currently living in a temporary location.

Why would I want to use one?

Let's say you're installing an app that somebody else wrote, and the app lives under /very-long-application-name. You could use a rewrite rule so that users could go to /short-app-name, and Apache would rewrite that request behind the user's back to be /very-long-application-name. And the best part is that this is transparent to a properly built application.

Another example is maybe you're starting up a video "tube" site, and you want to have the smallest embed code possible for sharing videos. Problem is, the video player lives at http://mysite/app/version1.0/video/player.swf. You could use a rewrite rule so that http://mysite/player.swf is rewritten to the longer URL. And what's even better is that when you release version 2.0 of the video player, you can just update the rewrite rule, and everybody will start seeing version 2.0 of your player.

Is this really used in real life?

Yep. The best example I can think of is Drupal. When you load a URL from a Drupal-powered site, such as http://drupal.org/drupal-6.6, that is really rewritten by Apache to be http://drupal.org/?q=drupal-6.6. Regardless of which URL you load, Drupal sees that the variable $q is set to "drupal-6.6". It doesn't care which URL was used.

Okay, so how do I do it?

Ah, here's the good part. The first thing you need to do is put the following in your .htaccess file:
RewriteEngine On
If doing this gives you a "500 Server Error" when you try to load a webpage, go ask your web host to enable mod_rewrite. If they refuse, I would suggest moving to a webhost that doesn't suck.

Now, let's say you wanted to rewrite the path to your video player, as described in the example above:
RewriteRule ^player.swf$ /app/version1.0/video/player.swf
Note the "^" and "$" in the "left hand side" of that rewrite rule. That tells Apache to match ONLY the string "player.swf", and not "/path/to/player.swf". This will prevent an infinite loop wherein that string is matched over and over. (Infinite loops are bad, m'kay?)

Most Drupal configurations have something like this:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
That's some pretty advanced stuff there, and it uses a new directive, RewriteCond. RewriteCond lets you add one or more extra conditions on top of the matching done in the RewriteRule line. In this case, if the URI that the user attempts to load is neither an existing file nor an existing directory, the request is rewritten to index.php with the original path passed in as the q parameter.

I hope that this post was helpful, and saves you from going through some of the pain that I went through when learning how rewrite rules work. :-) If this wasn't enough pain for you, full documentation on Apache rewrite rules can be found on their official site at:

http://httpd.apache.org/docs/2.2/mod/mod_rewrite.html

Enjoy!
giza: Giza White Mage (Default)
Been setting up a new machine for the migration of Anthrocon's webserver. This should address some of the speed issues we've been having, as well as allow us to use SSL on the site. Plus, since I now have root access and things like crontab available, I can start doing nightly rsyncs of the data.
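The nightly rsync itself is just a one-liner in crontab. A sketch (the paths and hostname are placeholders, not our real setup):

# crontab: copy the site to the offsite box at 3:30 AM (paths/host are placeholders)
30 3 * * * rsync -az --delete /var/www/anthrocon/ backup@offsite.example.com:/backups/anthrocon/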

Fun fact: Nginx lets you chain SSL certificates merely by appending the extra certificates to the end of your site's certificate. No extra webserver configuration required!
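In other words, something like this is all it takes (the filenames are placeholders):

cat www.example.org.crt intermediate-ca.crt > www.example.org.chained.crt

...and then point Nginx at the combined file:

ssl_certificate      /etc/nginx/ssl/www.example.org.chained.crt;
ssl_certificate_key  /etc/nginx/ssl/www.example.org.key;

The order matters: your site's certificate comes first, followed by the intermediate(s).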

Been building the pre-registration system for 2009, too. I expect that to be finished in the next few weeks. Also, I will be releasing the final product under the GNU General Public License. I figure I can at least entertain others by letting them laugh at my code.

I slept poorly a few nights ago and must have done something to the muscles in my neck. This has resulted in me having headaches the last few days. Advil helps the symptoms, though.

Heading up north tomorrow to visit the family for a party. We're celebrating my kid sister's Masters Degree. Let her be the highest educated person in my family for a change. :-P

[Edit: Forgot to mention co-worker J. He wants to tag along with me to the next furry convention I attend, which would be FurFright. He has NO idea what he's in for! :-)]
giza: Giza White Mage (Default)
Having recently upgraded my machines to OS/X Leopard (which will be the subject of another post), I decided to try installing the program Duplicity onto my G4/G5 Macs so that I could use it in conjunction with rsync.net. Duplicity is a wrapper for rsync and rdiffdir that can be used to perform both full and incremental backups of your filesystem. It also supports many transfer methods, including FTP, scp, rsync, WebDAV, and Amazon S3. In short, it's a pretty badass backup system.

While there don't seem to be any major issues when trying to install it under Linux, installing it under OS/X Leopard is a bit... interesting. The purpose of this post is to more clearly define "interesting", document how to actually get Duplicity working, and save others the same issues that I had to deal with earlier in the afternoon.

First things first, make sure you've downloaded and installed Fink. Fink provides apt-get functionality to OS/X and gives you access to a whole bunch of UNIX software.

After downloading and untarring the package from the Duplicity website, try building the software with:
python ./setup.py build
If you're a lucky winner, you'll see a bunch of errors starting with this:
_librsyncmodule.c:25:23: error: librsync.h: No such file or directory
This means that librsync hasn't been installed. So go do that:
apt-get install librsync
Since the library will be installed in /sw (used by Fink to keep its stuff separate), you'll need to build Duplicity as follows:
python ./setup.py --librsync-dir=/sw build

Now install Duplicity:
sudo python ./setup.py install
Now try running it. Duplicity should promptly complain in the form of a traceback that ends with:
ImportError: dlopen(/Library/Python/2.5/site-packages/duplicity/_librsync.so, 2): Library not loaded: /sw/lib/libintl.1.dylib
Referenced from: /sw/lib/librsync.1.dylib
Reason: image not found
Some quick checking revealed that version 3 of libintl was installed by the libgettext3-shlibs package. Why was version 1 not installed? I don't know. Why does librsync require version 1? No idea.

At this point, we do the computer equivalent of a "Hail Mary" pass:
ln -s /sw/lib/libintl.3.4.3.dylib /sw/lib/libintl.1.dylib
Yes, I really did just tell my system that version 3 of that library can be used as version 1. Amazingly enough, it actually worked! In general, doing this sort of thing is not a good idea, since the program expects a library that is 2 major revisions older than what is actually installed. The fact that it does work, however, speaks well of the programmer who wrote the library.

So, let's try and run Duplicity again:
File "/Library/Python/2.5/site-packages/duplicity/gpg.py", line 22, in <module>
import GnuPGInterface, misc, log, path
ImportError: No module named GnuPGInterface
...somebody shoot me.

So anyway, now we have to go to the GnuPGInterface page and download that module. Assuming you downloaded and untarred it, here's how to install it:
python ./setup.py build
python ./setup.py install
Amazing. That was actually painless.

Let's try running Duplicity again:
[pardine:~/tmp/duplicity ] $ duplicity
Command line error: Expected 2 args, got 0
Enter 'duplicity --help' for help screen.
Success! Well, sorta. At least the program runs now. Documentation is here and the manpage is over here. I can also offer some advice that I learned:

- The first backup is a full backup. Subsequent backups are automatically incremental backups.

- Want to list files in the backup? Use the list-current-files command.

- If you don't want to use encryption, try the --no-encryption option. Personally, I'm not a big fan of encrypting backups, because if you lose your key, all your backups go poof.

- If you are using SSH/scp to back up and you don't have an ssh key set up, be sure to tell Duplicity to ask for your password with --ssh-askpass. Even if you set the environment variable FTP_PASSWORD with your passphrase, you'll need to leave this switch in.
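Putting all of that together, here's the sort of command line I ended up with (the local path and remote hostname below are placeholders, not my actual setup):

duplicity --no-encryption --ssh-askpass ~/Documents scp://user@server.example.com/backups/documents

duplicity list-current-files scp://user@server.example.com/backups/documents

The first command does the backup (full the first time, incremental after that); the second lists what's in the backup.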

Happy backups!
giza: Giza White Mage (Default)
 

So, a while back I wrote about external hard drives, partly because I was interested in performing backups. I also noticed some speed increases when I did backups to my external hard drive, so I decided to look into performing regular backups onto it in addition to the semi-regular backups I do onto DVDs.

Why back up to an external hard drive?

I don't have to waste a DVD every time I perform a backup, especially if I am making backups on a daily basis. Plus, the same amount of data can be backed up to my external hard drive in less time.

Note that I still back up to DVDs, since those are more durable and can easily be taken offsite. But those backups are done every few weeks at best, so backups to my external hard drive are done more frequently -- usually every few days.

What is backed up?

Various documents, my photography (I take a lot of pictures), source code for projects I am working on, my Moneydance financial data, and tarred/gzipped backups of websites that I manage.

What is not backed up?

Any movies and music that I have--since those files are static (i.e., they never change), I just burn them to DVD when I have enough content to actually fill a DVD. There's simply no need for me to keep backing them up over and over. Sure, it's nice to have multiple copies of this stuff, but I simply do not have that much of a need for my MP3s. (And it's not like I can't rerip them from my CD collection) Any pictures more than a year old are also burned to DVD and removed from my Pictures/ directory, since I am no longer working with them on a day-to-day basis.
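For the curious, the backups to the external drive themselves are nothing fancy; just rsync. A sketch (the volume name and directories are placeholders for my actual layout):

rsync -av --delete ~/Documents/ /Volumes/Backup/Documents/
rsync -av --delete ~/Pictures/  /Volumes/Backup/Pictures/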

It only goes downhill from here, with lots of technical stuff. I warned ya!
giza: Giza White Mage (Default)
Seems that one of our developers didn't fully test something on Friday. I know, because I ran into this earlier:

drwxr-x--- 2 exim exim 9502720 Sep 10 09:37 input
drwxr-x--- 2 exim exim 4153344 Sep 10 09:37 msglog


Those aren't files, those are directory structures that are nearly 15 Megs in total...

The normal size for a directory entry is 4096 bytes...
giza: Giza White Mage (Default)
 
So, those crazy guys from the Sysadmin of the Year website are now offering additional money for people to sing lyrics for their "Sysadmin Rock Star" theme song.

They have the original song and some of the entries thus far on their website at:

http://www.sysadminoftheyear.com/song

I've listened to the entries on that page, and well... let's just say that phrases like, "Filk music on PCP and mushrooms" come to mind.

No, [livejournal.com profile] filkertom, you do not need to be a sysadmin to enter. The contest is open to all. :-)
