Peace of mind promises by hosting companies are often clever marketing with little substance

By Bruce Kirkpatrick
Tue, Apr 16, 2013 at 1:40PM

What I find with hosting companies is that downtime is usually related to software problems rather than the network or hardware.

When it comes to network or hardware, the hosts I've used (Rackspace, theplanet, Softlayer, liquidweb, 1and1, godaddy, Hivelocity) all handle things about the same. They all provide high-quality bandwidth, but they target different types of customers when it comes to software, customer service, and price. Some are too price-focused, such as 1and1 and godaddy, so they oversell their servers and you get poor performance and less customer service. Others are too customer-service oriented, like Rackspace, which tries to sell human support and fancy things like 100% SLAs above all else - they even sell outdated / inferior hardware to compensate for their labor costs. Then there are companies like Softlayer and Hivelocity, which focus on providing bleeding-edge hardware and tools that enable self-service and low costs. I currently use Hivelocity because I've learned to value self-service and low costs rather than big promises and clever marketing that underdeliver.

I think a 100% SLA on the network is meaningless.

Paying for the difference between a 100% SLA and a 99.9999% network with no SLA is not a good value in my opinion. We had no SLA with Softlayer (theplanet at the time) when a literal explosion in their power unit knocked a bunch of servers offline on a weekend. Our server was down for 1 full day, and they gave EVERYONE 2 months of free service - a $1000 value for me at the time. That was much more than I expected, especially since we didn't pay for any fancy uptime promises. They just cared about keeping their customers happy.

What is an SLA? When you get down to the details of a Service Level Agreement for network availability, you'll often find that an availability SLA only credits you for the hosting fees covering the time you were offline. The outage also has to be verified as their problem. If you are paying $1000 per month and their network goes down for 1 day, you'd probably only get a $33 credit unless they were willing to go out of their way to compensate you. Either way, it hardly matters how much they give you: if you are a serious business, you are probably losing more than what they credit you for being down an entire day.
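To make that arithmetic concrete, here is a minimal Python sketch of a pro-rated credit. It assumes a 30-day billing month and a credit that is strictly proportional to the time offline; real SLA contracts vary, and the function name and the $5000 daily loss figure are hypothetical, chosen only for illustration.

```python
def prorated_sla_credit(monthly_fee, hours_down, days_in_month=30):
    """Credit equal to the share of the monthly fee that covers the outage.

    Assumes a simple pro-rated credit; actual SLA terms differ per host.
    """
    hours_in_month = days_in_month * 24
    return round(monthly_fee * hours_down / hours_in_month, 2)


# The example from the text: $1000 per month, down for one full day.
credit = prorated_sla_credit(1000, 24)
print(f"SLA credit: ${credit:.2f}")  # SLA credit: $33.33

# Hypothetical figure: suppose a day offline costs the business $5000.
assumed_daily_loss = 5000
print(f"Loss not covered: ${assumed_daily_loss - credit:.2f}")  # Loss not covered: $4966.67
```

Whatever number you plug in for your own daily loss, the credit only scales with the hosting fee, not with what the outage costs you.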

No refund will come close to compensating you for what you would lose in reputation or money with significant downtime. Perhaps your staff are unable to do their jobs when the server is down, as in a call center operation. No hosting company is going to pay for your lost potential earnings and company costs. Their contracts are extremely careful to make sure of that.

In reality, most of the larger hosting companies seem to have a great track record with their network availability.  They are coming up with ways to charge you extra for things that they already do very well.

The hosting companies I work with have had maybe one minor incident a year. So why do people pay thousands extra per year for SLAs with companies who sell guarantees? I think a lot of business owners are desperate for any small amount of added peace of mind, and they fall prey to these gimmicks.

Self service is the only option for your custom applications

It doesn't matter whether I'm serving 10 million visitors a day or 100,000, it's still going to come down to me being careful and verifying other people's work.  The hardware is almost never an issue.

For example, I had a drive failure in a RAID array, which caused no downtime when it occurred. I simply had to schedule a maintenance window so that the new drive could be installed. The time it took to shut down the server, plug in the new drive, and reboot was all it took to resolve that problem. I was back up and running in minutes.

People go down more often for kernel updates and software configuration issues. The average CentOS or Windows installation reboots a dozen times a year. No server has perfect uptime. You could implement clustering to avoid this, but then you have to manage a lot more complexity. And whenever you migrate to new servers, you will probably experience some brief downtime or inconsistencies while data and DNS are migrating. No one can guarantee zero downtime with migrations like this.

Hardware RAID and hot-swap aren't so great - but they are profitable add-ons for hosting companies

Most people don't realize that software RAID in modern systems is just as good as hardware RAID in many ways. Yet a lot of people still pay for very expensive SAS hard drives. Let's keep in mind that most Solid State Drives (SSDs) use a SATA connection, so you don't get hot swap with them. Anyone who values high performance is not going to have perfect uptime when the SSDs they are using fail. So why pay for SAS technology? If you have software RAID, that should be good enough. Linux can do a live rebuild of a software RAID configuration with minimal performance loss.
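If you run Linux software RAID yourself, you also need to notice when an array loses a drive. The following is a minimal Python sketch, not a full monitoring tool, that scans the text of /proc/mdstat for arrays with fewer active members than expected. The sample text is made up for illustration, and the parsing assumes the usual layout where the status line such as `[2/1] [U_]` follows the array line.

```python
import re

ARRAY_RE = re.compile(r"^(md\w+)\s*:")
# Matches the "[expected/active] [UU]" part of an array's status line.
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")


def degraded_arrays(mdstat_text):
    """Return the names of md arrays that are missing at least one member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        array = ARRAY_RE.match(line)
        if array:
            current = array.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current:
            expected, active = int(status.group(1)), int(status.group(2))
            if active < expected or "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded


# Made-up sample; on a real server you would read open("/proc/mdstat").read().
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      976630336 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sda2[0]
      1953382400 blocks super 1.2 [2/1] [U_]

unused devices: <none>
"""

print(degraded_arrays(SAMPLE_MDSTAT))  # ['md1']
```

A check like this can run from cron and send an alert, so a failed drive gets replaced during a planned maintenance window rather than discovered after a second failure.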

Direct access to configuration is sometimes not available, but should be

A lot of hosts don't provide KVM or IPMI access to manage your server. This is why you get forced into expensive managed options. You simply don't have direct access to configure the BIOS or set up software RAID, firewalls, and other things. You should find a host that does provide access and learn how to manage these things yourself. The money saved will be significant.

I stopped wasting money on SAS hot-swap drive technology ($200 extra per month per server) once I realized that I could configure software RAID using Hivelocity's free shared KVM access. Since I started managing things myself, I've never had another issue that was caused by Softlayer or Hivelocity.

What I find is that when you ask the hosting company to do anything for you, they are more likely to do something wrong because they typically assign issues to a level 1 or level 2 tech to reduce costs. People who really don't know the technology very well are assigned to do many tasks on a routine basis. I've had tickets where I go back and forth with someone who gives me a boilerplate response, and then once I'm forwarded to a level 3 tech, I get the correct response quickly. This is a real problem that occurs throughout all hardware/software companies and I can't blame them. I'm just one customer paying a relatively small amount of money. They simply can't put high-quality staff on the front lines at all times.

I've seen multiple hosting companies do things wrong with their managed services

For example, many years ago Rackspace didn't configure the backups correctly on a server for which they were supposed to provide managed backup. When a hard drive crashed, we lost a short period of work and had to redo some of it. When you work at a company where people assume that other people are doing their job right, you are going to be in trouble. It wasn't my job to verify their work, and we're all busy.

Experiencing the weaknesses of hosting companies first-hand has led me to rely on no one else when it comes to setting up technology for my business. I manually set up and verify everything myself.

Linux is installed from a minimal configuration with a detailed record of every command line and configuration file I used.  

I maintain my own local backups and sync those with the Jungledisk server. I have tested the Jungledisk restore process.
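One way to test a restore, regardless of which backup service you use, is to restore into a scratch directory and compare checksums against the live data. This is a minimal Python sketch of that idea; it is not tied to Jungledisk, and the function names and sample file names are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def checksum_tree(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in root.rglob("*")
        if path.is_file()
    }


def restore_mismatches(original_dir, restored_dir):
    """Return relative paths that are missing, extra, or different after a restore."""
    original = checksum_tree(original_dir)
    restored = checksum_tree(restored_dir)
    return sorted(
        name
        for name in original.keys() | restored.keys()
        if original.get(name) != restored.get(name)
    )


# Demonstration with throwaway directories standing in for live data and a test restore.
with tempfile.TemporaryDirectory() as live, tempfile.TemporaryDirectory() as restored:
    Path(live, "site.cfm").write_text("<cfoutput>hello</cfoutput>")
    Path(live, "db.sql").write_text("CREATE TABLE t (id INT);")
    Path(restored, "site.cfm").write_text("<cfoutput>hello</cfoutput>")
    # db.sql was never restored, so it should be reported.
    print(restore_mismatches(live, restored))  # ['db.sql']
```

An empty list means the restore matches the source byte for byte. Files that change between the backup and the comparison will show up as mismatches, so this works best against a snapshot or a quiet directory.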

I use Alertra, an external monitoring service, so I don't have to trust that my hosting company is monitoring correctly. Alertra gives me an automated phone call when the server goes down. Did you know that many hosting companies will send you an email when your server goes down in the middle of the night asking for permission to resolve the issue? Many times they don't even bother to resolve the issue automatically, or they may wait an hour to see if you respond first.
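The core of any external monitor is simple: request the site from a machine outside the data center and raise an alert when several checks in a row fail. Here is a minimal Python sketch of that logic. It is a generic illustration, not Alertra's implementation or API, and the three-failure threshold and the function names are assumptions.

```python
import urllib.request


def is_up(url, timeout=10.0, opener=urllib.request.urlopen):
    """Return True when the URL answers with a 2xx or 3xx status.

    `opener` is injectable so the check can be exercised without a network.
    Connection errors, timeouts, and HTTP error statuses all count as down.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:  # URLError, HTTPError, and socket timeouts are OSError subclasses
        return False


def should_alert(results, threshold=3):
    """True when the most recent `threshold` checks all failed.

    Requiring consecutive failures avoids a phone call for a single blip.
    """
    recent = results[-threshold:]
    return len(recent) == threshold and not any(recent)


# History of check results, oldest first (True = site answered).
history = [True, True, False, False, False]
print(should_alert(history))  # True

# A real check would look like this, run from a machine outside your host:
# history.append(is_up("https://example.com/"))
```

The important design point is where this runs: a monitor hosted on the same network as the server goes down with it.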

I'd rather get a call in the middle of the night than depend on my hosting company to resolve a technical issue. I've explicitly asked hosting companies to call me immediately, but they usually don't. Also, if you have brought the server down on purpose, you'd have to tell them that in advance - it gets a bit more tedious to keep them in the loop. I really think managed services are inadequate because of this lack of immediate response to issues. I refuse to pay even $50 per month extra if they can't do it right. Alertra is $10 per month and I know immediately when something is wrong. I don't have humans mucking it up. Automation is critical. If I decide I need the hosting company to help because I'm away from the computer, I can call them and ask for help. This is better than them asking me if I want help. I usually don't even allow the hosting company to log in to the server.

I only use companies that give me direct control, such as KVM or IPMI access or some kind of control panel access. I have written down the steps to install, configure, and troubleshoot each thing, whether it is backups, the firewall, or the operating system. I know everything we have running and how to fix it. I think this is a minimum requirement for any web development company that provides hosting services. This is how I eliminated the need for companies like Rackspace, which can give you a false sense of security. I'm sure Hivelocity would get it wrong sometimes too if I trusted them to do everything. Hivelocity is great because they don't get in my way. They know how to serve customers like me better than anyone else.

ColdFusion and other less popular software are usually not supported by hosting companies

There is no way to outsource a ColdFusion problem. My previous employer used to demand help from Rackspace where none was available and I thought they were crazy. I told them over and over that we need to be server experts and know what is happening in detail with the server software.

A hosting company might arbitrarily upgrade your PHP version to 5.3 when your apps can only run on 5.2. You might not even realize this until a day or two later when customers start complaining about their site being down. Their "managed security updates" use built-in features that are automated by software like Plesk, cPanel, or the operating system, and they don't really apply in-depth knowledge when upgrading your server. You are better off not allowing the hosting company to perform any upgrades. You may just end up with substantial software-related downtime that they aren't liable for, because you already authorized them to do the security updates. You often can't let auto-updates occur for the web app software; you need to test things before something is installed. Your in-house people need to handle the software and operating system, and the hosting company is only there for hardware / network / abuse help. I've never seen a hosting company that has ColdFusion experts, and certainly not Railo experts. We were left to fix things ourselves, and now that's the only way I'd have it.
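A cheap safeguard against a surprise upgrade is a check that compares the installed version against the series your apps were tested on, and complains loudly when they differ. This is a minimal Python sketch under the assumption that versions are dotted numbers such as 5.2.17; the function names are hypothetical, and how you obtain the installed version string is left to your own setup.

```python
import re


def parse_version(text):
    """Turn the leading dotted number of a version string into a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if not match:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_compatible(installed, required_series):
    """True when `installed` falls inside the tested series (e.g. any 5.2.x)."""
    series = parse_version(required_series)
    return parse_version(installed)[: len(series)] == series


# The scenario from the text: apps tested on the 5.2 series only.
print(is_compatible("5.2.17", "5.2"))  # True
print(is_compatible("5.3.3", "5.2"))   # False
```

Run from the same place as your uptime checks, a failed comparison tells you about the upgrade before your customers do.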

Conclusion

Don't believe any marketing language put out by a hosting company. Gather real data on their promises and don't trust them to do things correctly. You could even visit the data center and see how they operate in more detail if you want to ensure you are making a good choice. Hosting companies do a great job most of the time, but it's irresponsible for you to assume they are perfect. You must hold them accountable by manually verifying everything they do for you. Force them to document what they have done, and be critical when they make a mistake to ensure it doesn't happen again. Develop expert knowledge within your company to eliminate dependencies on third-party companies for critical business functions that you are responsible for supporting. If the hosting company is providing the entire solution including customer service, then that may be a great option, and they probably manage that better since they understand the software they are using. Most of the time, though, hosting companies don't understand your software, and that leads to serious problems.

This approach to hosting my application has led to me saving thousands per year, having a more diverse skill-set, and improving the reliability of our services.

