Archive for January, 2009

The pain of backups, the sweetness of speed!

Tuesday, January 20th, 2009

I have written over and over that great things are born from trials and failure.  I have also written that very often when you need something done right you just do it yourself!  Although web hosting backups don’t count as one of life’s trials, they have been a pretty big thorn in my side.

As most of you know, we use Cpanel as the backend system for our control panel.  It works reasonably well for most functions, but one area where it falls flat on its face is backups.  We have been using our own system for a long time, and the performance is adequate but far from what we thought was ideal.  We tried several open source tools as well as a paid option from R1Soft (the WORST software I have ever had the great misfortune of trying).  Nothing was fast enough, and here’s why.

Our average /home partition on one of our servers has anywhere from 3 to 6 million files on it.  Let’s assume that only 10,000 files on that partition were modified in a 24 hour period.  You still have to scan the directories that contain those 3 to 6 million files just to find the 10,000 files that were modified.  Even without copying anything, just doing the stat (the scan) of all those files takes hours to complete.  However, what is FAR worse are the seeks that the block device (the hard drives) incurs while the scan is running.  It slows the system way down just to find the files to back up.  Then, once it has found the files that changed, it still has to copy them.  This is “just the way it is” on every system I know, including Solaris, Windows, and Linux.
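To make the cost concrete, here is a minimal sketch of the conventional approach described above.  It is not our backup system, just an illustration; the function name find_modified_since and its arguments are made up for this example.  Notice that it has to stat every file under the root, so the work grows with the total number of files rather than with the number that changed.

```python
import os


def find_modified_since(root, cutoff):
    """Walk `root` and return (modified_paths, files_scanned).

    `cutoff` is a Unix timestamp; a file counts as modified when its
    mtime is newer than the cutoff.  Hypothetical helper for illustration.
    """
    modified = []
    scanned = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # One stat call per file: this is the part that causes
                # the seeks, even when nothing is copied.
                st = os.lstat(path)
            except OSError:
                continue  # the file vanished between listing and stat
            scanned += 1
            if st.st_mtime > cutoff:
                modified.append(path)
    return modified, scanned
```

Calling find_modified_since("/home", time.time() - 86400) (after importing time) would return the files changed in the last 24 hours, but only after touching the metadata of every file on the partition.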

One day I was thinking (in the shower, of course, since 90% of all good ideas come to you in the shower): why not just have the Linux kernel dump the name of any file that was created or modified (any bytes written) at the time the file was modified, and use that as the list of files to back up?  The kernel already has this information when a file is updated and just throws it away.  There is virtually no overhead to do this, and it saves literally 90% of the time we would normally have spent on the system doing backups.  There are already similar hooks in the kernel to get this data through innovative techniques like inotify, but inotify is capable of a lot more than what we wanted and is consequently MUCH slower.

The problem was that I couldn’t find a single program that implemented this idea or even mentioned this technique anywhere on the web.  In situations like this I turn to our favorite in-house kernel hacker and demand magic.  In this case it took him one day to write a kernel patch that implemented this and stripped out files that didn’t matter for backup purposes, such as those under /proc or /dev and so forth.  So does it work?

YES!  What’s really neat is that it isn’t specific to Cpanel.  It works for any Linux filesystem such as XFS, EXT3, EXT4, Reiser, JFS, etc.  I think that most admins don’t fully realize the amount of time that is wasted and I/O that is consumed just in determining which files need to be copied.  This new backup method is literally 10x faster than what we had before and puts far less load on the server in the process.

So what do we do with it?  Well, after we clean up the kernel code a bit and make sure it is 100% rock solid, I will post the patch free of charge here on my site.  The patch simply dumps a list of files to be backed up to any file you specify.  You can then do whatever you want from that point.  We will also have a fully implemented backup for Cpanel that is completely compatible with their restore feature.  I have no pricing for it yet, but I will tell you this: I will charge you only 25% of the lowest price that you are quoted by R1Soft for their horrible software.  Meaning if you are paying $50 a client license, I will charge you $12.50.  Of course you are more than welcome to use the patch free of charge and implement your own system.  It is the fastest solution of any system I have ever tested (including, of course, R1Soft).
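For anyone planning to build their own system on top of the patch, here is a rough sketch of what consuming that list might look like.  The patch is not published yet, so the list format is an assumption on my part for this example (one absolute path per line, with repeats when a file is written more than once), and the names load_changed_files and EXCLUDED_PREFIXES are made up.  The prefix filter mirrors the /proc and /dev filtering mentioned above, in case you want a second safety net in user space.

```python
import os

# Paths that never matter for backups.  Hypothetical default list.
EXCLUDED_PREFIXES = ("/proc/", "/dev/")


def load_changed_files(list_path, excluded=EXCLUDED_PREFIXES):
    """Read a changed-file list and return unique paths worth backing up.

    Assumes one path per line (an assumption, not the patch's documented
    format).  Order of first appearance is preserved.
    """
    seen = set()
    result = []
    with open(list_path, encoding="utf-8", errors="surrogateescape") as fh:
        for line in fh:
            path = line.rstrip("\n")
            if not path or path in seen:
                continue  # blank line, or a file written more than once
            if path.startswith(excluded):
                continue  # pseudo-filesystem entry
            if not os.path.lexists(path):
                continue  # created and then deleted before the backup ran
            seen.add(path)
            result.append(path)
    return result
```

The returned list can then be handed to whatever copy tool you prefer.  The point is that the work is proportional to the number of changed files, not to the millions of files sitting on the partition.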

If you have any ideas that you think could make the product even faster, I am open to them.

Thanks,
Matt Heaton / President Bluehost.com

Consequences be damned…

Sunday, January 11th, 2009

I can’t take it anymore!  If I read/watch/listen to another news program that talks about how our government (read: me and every other taxpayer) here in the United States should bail out yet another industry, I’ll go nuts.  I guess I am just so far out of touch that the ideas that seem fundamental and core to me are now “out of date”.

How dare I believe that both government and the private sector should be responsible for their actions.  How dare I believe that consequences for your actions should always be accepted.  How dare I believe that people should be allowed to succeed or fail based on the merit of their ideas and their ability to effectively implement those ideas.  As I said before – These ideas are now “out of date” and are almost an insult to half the population in the United States.

To me it seems that we are weak and becoming weaker all the time.  The bar for what is acceptable is constantly being lowered to accommodate those who can’t meet expectations.  The consequences for our actions are constantly mitigated so that people won’t “suffer”.

If I tell one of my sons that the pot is hot and not to touch it, I hope he will learn from what I say.  If he doesn’t listen, he burns his hand and in the process learns two lessons.  He learns that I care for him enough to tell him how to avoid being hurt, and I guarantee that he finally learns that the pot is hot.  Do I want him to burn his hand?  Of course not.  Bad decisions can and should be painful to endure, but not taking your licks when you make a bad decision is worse.  You learn to be weak, and you avoid the natural consequences that act as the teacher of life’s valuable lessons.  Our society seeks to dodge consequences at every turn and in so doing rewards failure at the expense of the successful.

I feel very strongly that our government and many of our people look for a way out of their problems that bypasses responsibility completely.  Not every business should be saved.  Not every need of the people can or should be paid for by government.  Not every painful experience should be avoided.  We learn because of the mistakes that we make, and we change because of what we learn.  When we are constantly told that what we are doing isn’t wrong and that it isn’t our fault that we are in the position we find ourselves in, we can’t learn and move forward.

As far as I’m concerned we are headed backwards in a dramatic way, but what do I know?

Matt Heaton / President Bluehost.com