Elliott C. Back: Internet & Technology

Dreamhost Sucks At Hosting

Posted in Hosting, Performance, Scalability, Uptime by Elliott Back on May 3rd, 2007.

I’ve concluded that Dreamhost sucks phenomenally at hosting websites that generate any kind of traffic. Sure, their $9.99 a month plans with massive savings coupons are enticing, but if you knew what you were getting yourself into, you’d stay away. Dreamhost sucks like you’d want to suck on a knife covered in chocolate–which isn’t very much.


Others tell the tale better than I can:

There are also two unbelievable sites which actually claim that Dreamhost doesn’t suck. Well, if you read the above articles, you’d understand that Nightmarehost is really a bad dream.

So far not a single blog has explained at a high technical level why Dreamhost can’t handle their customers. I’ve seen some vague hand-waving about overselling, but no one actually has numbers to back it up. Sure, when someone tells me it takes them 10 minutes to go from SSH login prompt to terminal I believe them, but it’s not good enough. We’re making serious accusations about quality of service; we had better be able to back them up. We need hard data.

I have a shared hosting account on one of their machines, sepulveda.dreamhost.com [205.196.222.24]. The ping is fairly responsive, but not exceptional. They get their bandwidth directly from Level3, so it’s good bandwidth:

%ping -n 100 sepulveda.dreamhost.com
Minimum = 77ms, Maximum = 116ms, Average = 87ms

Unfortunately, there are about 1200 users on my machine. I’ve seen industry guidelines that recommend far, far fewer than that, anywhere from a quarter to a tenth as many, for shared hosting services:

[sepulveda]$ cat /etc/passwd | wc -l
1199
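
One caveat on that count: wc -l counts every line in /etc/passwd, system accounts like root and daemon included, so the number of actual customers is somewhat lower. A rough sketch of a stricter count in Python, assuming regular accounts start at UID 1000 and that 65534 is reserved for "nobody" (common Linux defaults that may not match this machine); the function name and the sample data are made up for illustration:

```python
def count_accounts(passwd_text, min_uid=1000, max_uid=65533):
    """Return (all_accounts, regular_accounts) for passwd-format text.

    min_uid and max_uid are assumptions: many Linux distributions start
    regular users at UID 1000 and reserve 65534 for "nobody".
    """
    total = 0
    regular = 0
    for line in passwd_text.splitlines():
        fields = line.split(":")
        if len(fields) != 7 or not fields[2].isdigit():
            continue  # skip blank or malformed lines
        total += 1
        if min_uid <= int(fields[2]) <= max_uid:
            regular += 1
    return total, regular


# Hypothetical sample, not taken from sepulveda.
sample = """root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
alice:x:1000:1000:Alice:/home/alice:/bin/bash
bob:x:1001:1001:Bob:/home/bob:/bin/bash
nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin
"""
print(count_accounts(sample))  # (5, 2)
```

On a real machine the same function can be fed open("/etc/passwd").read().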

The machine itself appears to be a single dual-core Opteron with 4GB of RAM, which isn’t hefty by any means. It should be dual dual-core and have 16GB of RAM to be at all useful. Besides, RAM is cheap. If they did put in more RAM maybe they could realistically handle 1-2k users per machine! Here’s the proc info:

/proc/meminfo:

        total:    used:    free:  shared: buffers:  cached:
Mem:  4172861440 3993358336 179503104        0 24576000 2158034944
Swap: 6465036288 326541312 6138494976
MemTotal:      4075060 kB
MemFree:        175296 kB
MemShared:           0 kB
Buffers:         24000 kB
Cached:        2033020 kB
SwapCached:      74436 kB
Active:         810260 kB
Inactive:      1321512 kB
HighTotal:     3211200 kB
HighFree:        34404 kB
LowTotal:       863860 kB
LowFree:        140892 kB
SwapTotal:     6313512 kB
SwapFree:      5994624 kB

/proc/cpuinfo:

processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 15
model		: 35
model name	: Dual Core AMD Opteron(tm) Processor 175
stepping	: 2
cpu MHz		: 2194.592
cache size	: 1024 KB
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 1
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic
sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse
sse2 ht syscall mmxext lm 3dnowext 3dnow pni
bogomips	: 4377.80

processor	: 1
vendor_id	: AuthenticAMD
cpu family	: 15
model		: 35
model name	: Dual Core AMD Opteron(tm) Processor 175
stepping	: 2
cpu MHz		: 2194.592
cache size	: 1024 KB
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 1
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep
mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht
syscall mmxext lm 3dnowext 3dnow pni
bogomips	: 4377.80
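
A note on reading the meminfo dump above: a large share of the "used" memory is page cache, which the kernel can give back when programs need it, so a fairer picture subtracts buffers and cache. A small Python sketch using the kB figures printed above; the function name is my own, and the subtraction is the usual rule of thumb rather than anything Dreamhost reports:

```python
def memory_summary(meminfo_text):
    """Parse 'Key: value kB' lines and derive a few totals, all in kB."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        # Only keep lines of the form "Key:   12345 kB".
        if len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    cache = values["Buffers"] + values["Cached"]
    return {
        "programs_kb": values["MemTotal"] - values["MemFree"] - cache,
        "cache_kb": cache,
        "swap_used_kb": values["SwapTotal"] - values["SwapFree"],
    }


# The relevant lines from the dump above.
dump = """MemTotal:      4075060 kB
MemFree:        175296 kB
Buffers:         24000 kB
Cached:        2033020 kB
SwapTotal:     6313512 kB
SwapFree:      5994624 kB
"""
print(memory_summary(dump))
# {'programs_kb': 1842744, 'cache_kb': 2057020, 'swap_used_kb': 318888}
```

By this estimate a bit under half of the 4GB is held by running programs, about half is cache, and a little over 300MB has been pushed out to swap.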

It’s funny that a guy actually monitored his site with Pingdom for a week. He writes in a comment on the dreamhost status blog:

Sorry, guys, but your service is simply terrible.
Today, there were just 83.11% of uptime (3h23min offline until now!) – data obtained from Pingdom.com.
Since 4/9, there were just one day with 100% uptime.
In the sum of last ten days, I just had more than 9 hours of downtime (while my other servers had no more than 15 minutes).
Every day I see server problems in my server.
That is unacceptable!!!

Dreamhost and I have been having conversations for a while now about a site which gets 1-2k visitors and transfers 51GB of static content a day. I thought you might be interested in reading them. On 4/30/2007, I received this email from Brian S. about my site:

Connections to your domain ( static.imgfly.com ) crashed the shared apache service several times this morning. A connection limit has been placed on your site. Being on a shared server means you need to share the resources with other customers. Due to the heavy volume of traffic, other domains on the same service were not able to load. Once the traffic to your site has taperd off, we will gladly remove the connection limit. Please read the appropriate section of our Terms of Service and let us know if you have any questions.

dreamhost.com/tos.html

My only thought was dismal woe. If they don’t know how to configure Apache with the right connection and threading settings so it won’t crash, all is lost. Also, some of my DNS settings (I forget which exactly) were mangled in their blocking process, so I sent them a snarky email:

Unfortunately, you did far more than place a connection limit. You also edited my dns settings without my consent, which is strictly against any kind of ethical hosting policy. I had a CNAME on feifei.us pointing * to the domain so that subdomains would work; now it has mysteriously disappeared.

My account is advertised at having ~2.9TB of bandwidth a month; I pointed 50GB/day of static hosted content at my dreamhost account, which is about half of what I’m allotted, and you screw up my domain settings? You can’t handle half of what you promise?

I want to know exactly what kind of connection limit you’ve placed on my site, at a high technical level. I want to know why you can’t deliver even half of the bandwidth allotted to my account. This isn’t a high-level of bandwidth, it’s just a constant level of about 50GB a day, and it’s not even dynamic content, just static files.

At that time I was frantically re-routing and re-setting up the site, because all the DNS settings were lost, even ones pointing offsite and not at Dreamhost servers. They replied quite fairly to the email, I have to give them credit:

The CNAME was not removed as part of the connection limit. I’m sorry that it may have disappeared, but this was not as a result of the limit put in place and was instead likely a bug in the system. You’re free to put it back if you’d like, but let me know if you don’t notice the wildcard working and I’ll get it set up for you again. I’m sorry that you believe we removed it, but we did not. The domain you mentioned was not one that was touched.

Bandwidth and connections are two separate issues. While you can definitely use all of the bandwidth that we offer, the number of connections per second and concurrent connections that your site was receiving was causing the Apache service to crash. If we didn’t put the limit in place you wouldn’t be able to use any of the bandwidth we provide as your site would have continued to crash the service. This would prevent yours and other customer’s websites from appearing online.

Unfortunately you’ve since removed hosting for this so I can’t provide you any sort of information about the limit that was put in place. I’ve explained the bandwidth issue above. Please don’t hesitate to write back if you need help with anything else.
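
It helps to turn the figures in this exchange into sustained rates. A back-of-the-envelope sketch in Python, assuming decimal units (1 TB = 10^12 bytes), a 30-day month, and that all 1200 accounts on the machine carry the same 2.9TB allotment as mine, which I can’t confirm; the function name is just for illustration:

```python
SECONDS_PER_DAY = 86_400


def sustained_mbps(num_bytes, seconds):
    """Average rate in megabits per second needed to move num_bytes in seconds."""
    return num_bytes * 8 / seconds / 1e6


# Advertised allotment: about 2.9 TB per month.
allotment_mbps = sustained_mbps(2.9e12, 30 * SECONDS_PER_DAY)

# What the site actually pushed: about 50 GB per day.
actual_mbps = sustained_mbps(50e9, SECONDS_PER_DAY)

# Hypothetical worst case: every account uses its full allotment at once.
everyone_gbps = 1200 * allotment_mbps / 1000

print(f"allotment: {allotment_mbps:.2f} Mbps")        # allotment: 8.95 Mbps
print(f"actual:    {actual_mbps:.2f} Mbps")           # actual:    4.63 Mbps
print(f"all 1200 accounts: {everyone_gbps:.1f} Gbps")  # all 1200 accounts: 10.7 Gbps
```

So the allotment works out to roughly 9 Mbps around the clock (closer to 10 if it is counted in binary units), the site was using about half of that, and 1200 such accounts at full tilt would need more than ten times what a gigabit card can push.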

Of course, I don’t buy the bandwidth explanation. They would have to allow me enough connections to sustain roughly 9 to 10 Mbps, constantly, for me to use it up. With 1200 customers on their server, even with a gigabit ethernet card they couldn’t fulfill their contracts if everyone used all their bandwidth. Some overselling is, of course, necessary and acceptable. Dreamhost goes a little overboard. The next day I noticed that the new arrangement of sites I had set up wasn’t working well, so I sent them a quick checkup email:

There’s some serious issues with this site; it took 8 connection tries to even connect to the server.

The reply I got back was shocking, even to me (emphasis my own):

Your domain imgfly-static.feifei.us is again causing serious problems on the sepulveda webserver. The number of requests coming in is almost instantly crashing the apache instance. This is unacceptable, and since instead of working with us you simply moved your problematic site to another domain I’m going to ask you now to stop running whatever you’re currently hosting on imgfly-static.feifei.us immediately, and to not start anything like it on any user, domain, or server of ours, ever again.

I’ve disabled imgfly-static.feifei.us to preserve the stability of the server, please do not re-enable it.

I only moved my domains around because (a) I wanted them that way in the first place, and (b) the DNS disappeared at some point. I had thought their connection limit was account-wide, but apparently my fixing the DNS also undid their limiting. Today, even though they suck, I sent them a nice, long, clearly written, conciliatory email. I really do want to get my 3TB of bandwidth out of them, and if it takes some sucking up, so be it:

I feel like we’re misunderstanding each other–I’m not trying to subvert your sepulveda cluster, and I am trying to work with you. Last time there was an issue, either my CNAME or A record for the subdomain somehow got lost in the dns, so I added a wildcard CNAME to feifei.us. I was under the impression that connection limits were placed on the account to prevent it from affecting sepulveda, but whatever change I made must have invalidated that.

I’ve just arrived home from work and I’ve pointed the stream of traffic you can’t handle elsewhere until we can work out the configuration. I’ve enabled the subdomain again, but there won’t be anything running on it until I get the ok.

Let me explain the software solution I’m running. The domain ImgFly.com is hosted on a dedicated server I run offsite. Requests for actual image content are forwarded to imgfly-static.feifei.us, which serves them as static content if they have been cached by that server, otherwise makes an attempt to fetch the resource remotely from amazon S3 and cache it to disk. Approximately 100 photos are uploaded an hour, and 2.1 GB of data downloaded. This isn’t much. It’s mostly serving random static content to consumers.

When I SSH into sepulveda, I see some indications that the problems you’re seeing aren’t my fault:

[sepulveda]$ uptime
16:40:39 up 26 days, 2:48, 5 users, load average: 10.67, 10.77, 9.30

Sepulveda is a dual-dual-core opteron, but those load averages are still pretty high. When I SSH in I can barely get a single command to run. Clearly the server is overloaded to the point where all it can do is serve an extremely limited amount of information. I’m willing to work with you to best manage your server resources, but you need to let me know exactly what you can handle.

Here are some solutions that come to mind:

1) Move me (or just feifei.us) to a reasonably loaded server
2) Configure apache so it doesn’t crash, or, give me an .htaccess file with reasonable limiting I can use
3) Run feifei.us without mod_rewrite or the cache script, just on a completely static filesystem
4) Tell me exactly how many connections you are able to handle, and I will send you only that many

I’m sure you’ll have some good ideas as well. It’s disappointing to be offered 3TB of bandwidth a month which is an unmetered constant rate of 10 Mbs, but be unable to fully utilize it.

I’m sure this post will get lots of comments… if any of you know someone willing to host 2+ TB of bandwidth for < $100 a month, let me know. I’m almost to the point of paying for another dedicated server to manage this.

Update:

I guess telling Dreamhost that I’d turn off the site which was causing them problems and work with them to figure out a better way to host it was a bad idea, because I received this lovely email a few moments ago:

Hello,

I’ve disabled your account for failure to comply with my request. This
is a permanent account closure, I’ve refunded your last payment.

James

So my DNS and whatever miscellaneous files are there (I think my dad’s site!) are currently being held hostage. Considering they’re my property, unless I hear from Dreamhost in the next three hours, I will be calling my lawyers tomorrow. I want blood now.

Update 2:

At 4/04/2007 5:19 PM EST I received a lovely email from Dreamhost, explaining that they weren’t doing anything to address the fact that my domains and data are being held hostage:

I have gone ahead and forwarded this to James, he will get back to you as soon as he can. Please wait for his reply.

I have forwarded the to XX so that they can update your incident report with this info, They will get back to you with any information that they may have if it is necessary, if so please wait for their reply.

It is now 24 hours since my first of three requests in writing for them to release my data and domains to me. If you know someone at Dreamhost who’s friendly and sane enough to let me pick up my things and leave, you should let them know about this. Otherwise, this is going to be a whole lot more painful tomorrow.

Update 3:

Some kind soul submitted this story to Digg. Yay! Maybe if it gets enough attention Dreamhost will give me my domains and access back.

Update 4:

Hello Digg crowd. Maybe I wasn’t clear about things. This site isn’t running on a Dreamhost machine. Hellllllll no. It would be down right now if it was. My Dreamhost account, which contains only domains, my dad’s low volume blog, and maybe one other blog, is currently disabled. The rest of it’s on a dedicated server I run from Cari.net, a company I’ve never had any problems with.

Update 5:

Now they’re telling me to wait. This should take a tech all of 2 minutes to resolve. Just enable the account, wait for me to tell you I’m done with it, and then close it permanently:

It means that we have passed your message to the Tech that is responsible for disabling your site. While we understand your urgency to get this issue resolved, you will need to wait until the Rep is able to follow-up with you regarding the issue.

Update 6:

The Co-founder of Advection .NET emailed me and offered to help out. Really nice guy. If you need serious big time content-distribution and bandwidth, you should check out Advection .NET Global Media Hosting Network.

Update 7:

Ah, the sweetness of resolution. It’s a very long day and a half later, but I’m now almost again in possession of my files, databases, and domains. For some reason, the Dreamhost abuse team decided to zip up my files themselves. They probably didn’t trust me to download my files and be off again:

I’m sorry about the delay in this responce, it was in part due to the time needed to prepare all of your data.

I’ve tarred up your files and placed them at abuse.dreamhost.com/misc/user-content/xxx/ the login is xxx and the password is that which you used to login to the Webpanel. Your data has been split into three files, and will need to be concatenated before you can unzip it. Also there are your databases, they were spread between three database servers, so are in three different files. I will be removing the files after one week.

Here are the authorization codes for the domains you have registered with us. Once you’ve initiated a transfer out you can contact us again and we’ll approve the transfer.
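
Joining split archive pieces like that is just byte-for-byte concatenation in the right order, after which the result opens as a normal archive. A minimal Python sketch; the part file names in the comment are hypothetical, since the real ones aren’t shown here, and the function names are my own:

```python
import shutil
import tarfile
from pathlib import Path


def join_parts(parts, joined):
    """Concatenate the split pieces, in the order given, into one file."""
    with open(joined, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return Path(joined)


def list_members(archive):
    """List the file names inside a tar archive (compression is auto-detected)."""
    with tarfile.open(archive) as tar:
        return tar.getnames()


# Example usage with made-up file names:
#   joined = join_parts(["files.tgz.aa", "files.tgz.ab", "files.tgz.ac"], "files.tgz")
#   print(list_members(joined))
```

Listing the members before extracting is a cheap way to confirm the pieces were joined in the right order.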

If they had added a line to their two line email about terminating my account saying “We will provide your domains and files for transfer in __some timeframe__, please wait for our email” they could have averted this post.

Final Update:

I’ve finally gotten an email from someone in the know, a level 2 support manager. Hurray for moving up. His email is very nice:

I’m terribly sorry about the recent events that have transpired, it looks like we disabled you a little overzealously. I apologize for that. We’ve re-activated your account, however, it’s my understanding that you’re moving your domains off of our servers, which is completely understandable. If you’d still like to do that, we’ll be more than happy to refund that payment James offered.

As for the domain, imgfly.com, I’ve moved it over to your account, so you should be able to transfer. To save you time, I’ve included your auth code for transfer, should you need it. Now, if you do decide to stay, that’s great, however, we will need to discuss the stability issues you were having, however, I’ll save that, if the issue pops up if you decide to stay.

We do appreciate your willingness to work out the issue, and I apologize for the misunderstanding, and the overzealous nature on James’ part. If you’d like to discuss any other matters, please let me know at xxx@dreamhost.com, and I’ll be happy to speak with you.

I’m going to send him an email back about using imgfly-static.feifei.us as a cache node, because I really would like to use all my bandwidth up. Everything else is moving off, however.

Post-final Update:

I had thought that was the end of it, but I noticed this comment allegedly from Dreamhost’s Co-Founder and CTO, which includes the false statement:

3. He ignored our requests to not re-enable his website and did it anyway.

I tried to leave a comment explaining that I disabled my website per instructions, then put up an index page–which was somehow misconstrued by Dreamhost as ignoring their instructions. However, it’s been a few days and my comment has not been approved. Later comments have. I guess Dreamhost feels that they should have the right to slander me on their forum. I don’t feel the same way. If any Dreamhosters want to comment here, be my guest.

New News on Dreamhost

Recently 3,500 FTP accounts got hacked, including those of several high-profile websites like Cameron Moll’s. Still, there’s nothing on their official blog about this, as if they want to cover it all up.

Ten Steps to Valid HTML

Posted in Code, Computers & Technology, How to Blog, SEO, Search by Elliott Back on August 14th, 2005.

It’s important to write valid html, and even more important to try and generate it from your blogging software. If your software already generates most of your website’s output, then it’s simple to make sure your page validates. Right now, mine appears to, although there’s always a chance that something down the line will break. My CSS doesn’t validate because I have to use an IE box model hack.

Why produce valid xhtml? You can read a long essay, or just accept that web standards are a good thing, allowing shorter development time, less debugging, and better usability.

That said, here are ten of the most common xhtml errors:

  1. The use of a raw ampersand in a link query string. The w3c validator reports this as “cannot generate system identifier for general entity” because you’ve written what looks like a new entity, &xxxxxxx, instead of an encoded &amp; in the string. Replace all & with &amp; in URLs.
  2. The forgotten alt attribute. The w3c validator reports this as “required attribute “alt” not specified,” which means that every img tag you have must carry an attribute alt="something". So, what you need to do is change <img src="http://example.com" /> to <img src="http://example.com" alt="Example image" />.
  3. Missing end tags. The w3c validator reports this as “end tag for “img” omitted, but OMITTAG NO was specified” for your particular tag. What it means is that you used a singleton tag, that is, a tag that stands by itself and doesn’t have an inherent end tag, so you must use the xml style / delimiter to signify that the tag ends itself. So, instead of <img src="http://example.com" alt="Example image" > you would write <img src="http://example.com" alt="Example image" />.
  4. Incorrect nesting of lists. Please do not place lists inside a paragraph tag. The w3c reports this error as “document type does not allow element “ul” here; missing one of “object”, “applet”, “map”, “iframe”, “button”, “ins”, “del” start-tag.”
  5. Incorrect nesting of tags. Think of tags as a stack: as you add new tags to your text, you close the most recently opened one first, or you’ll get errors like this: “end tag for “strong” omitted, but OMITTAG NO was specified” and “end tag for element “strong” which is not open.” Instead, change <b><a href="http://example.com" ></b></a> to <b><a href="http://example.com" ></a></b>
  6. Oh, the horrors of flash. Did you know it’s really hard to embed flash properly? Luckily, the problem has been solved by people: www.alistapart.com/articles/flashsatay/ who basically took the Macromedia output and stripped it down. Sad, though, that they didn’t build up from the spec…
  7. Where’s the doctype? Again, ALA to the rescue with an informative article on document typing: www.alistapart.com/articles/doctype/. If your site doesn’t have a doctype, it can’t validate as an html document!
  8. Javascript events are lowercase. How many times have you followed standard coding conventions and written your onclick handler as an onClick handler? Probably too many to tell. Just make it lowercase, and that “there is no attribute “onClick”” error will go away!
  9. Using proprietary CSS extensions. Even if you’re tempted to use the word-wrap property on a blockquote or a right float, don’t. The Microsoft or Mozilla-only CSS extensions aren’t good down the road when you want to upgrade your site technology.
  10. Some things need a type, javascript for example. If you forget the type=”text/javascript” from your script declaration, or the type=”text/css” from your stylesheet, it won’t validate, for obvious reasons.
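
A few of these checks are easy to automate. Here is a rough Python sketch using the standard library’s html.parser, covering raw ampersands (1), missing alt attributes (2), and tag nesting (5). It is a toy, not a substitute for the w3c validator; the names escape_ampersands, MarkupLint, and lint are my own, the list of self-closing tags is partial, and the messages only loosely imitate the validator’s:

```python
import re
from html.parser import HTMLParser

# Tags that stand by themselves (a partial list, enough for this sketch).
VOID_TAGS = {"img", "br", "hr", "input", "meta", "link"}

# A raw "&" is one that does not already start an entity like &amp; or &#38;.
RAW_AMP = re.compile(r"&(?![A-Za-z][A-Za-z0-9]*;|#[0-9]+;|#x[0-9A-Fa-f]+;)")


def escape_ampersands(url):
    """Replace raw ampersands with &amp;, leaving existing entities alone."""
    return RAW_AMP.sub("&amp;", url)


class MarkupLint(HTMLParser):
    """Collects a few common markup problems while parsing."""

    def __init__(self):
        super().__init__()
        self.stack = []      # currently open tags, innermost last
        self.problems = []

    def handle_starttag(self, tag, attrs):
        names = {name for name, _ in attrs}
        if tag == "img" and "alt" not in names:
            self.problems.append("img is missing the required alt attribute")
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return  # <img ... /> also triggers an end-tag event; ignore it
        if tag not in self.stack:
            self.problems.append(f"end tag for {tag} which is not open")
            return
        # Close tags from the top of the stack until we reach the match.
        while self.stack:
            top = self.stack.pop()
            if top == tag:
                break
            self.problems.append(f"end tag for {top} omitted")


def lint(markup):
    parser = MarkupLint()
    parser.feed(markup)
    parser.close()
    return parser.problems


print(escape_ampersands("http://example.com/?a=1&b=2"))
print(lint('<b><a href="http://example.com"></b></a>'))
```

The nesting check is the stack idea from item 5 taken literally: an end tag has to match the most recently opened tag, and anything it skips over is reported as omitted.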

More to come tomorrow, when I wake up. And now, it’s all done.

How to hire the best

Posted in Computers & Technology, Deals & Savings, Education, Science by Elliott Back on August 4th, 2005.

The infamous Mark Jen has posted his take on Joel’s hiring essay. Basically, Joel makes the argument that hiring the absolute best programmers is the best thing for a software company, because superb programmers are investments that more than pay for themselves. It’s basically an argument of averages: everyone can build software, but the companies that can build great software are few and noticeable. To give a concrete example:

When everyone is making ugly square mp3 players, a stylish mp3 player with rounded edges and careful design will be king.

A coworker and I were discussing this yesterday and today. Obviously, when hiring candidates for positions, we want good ones. However, we go beyond the code of hiring the best of the best: we actually do what we say here. A candidate whom you can’t respect as an equal or better, who doesn’t appear to possess basic skills, or who is in any way lacking is simply not good enough. A company shouldn’t hire someone who limps over the corporate minimum bar just to fill a position.

Until there’s someone you find who can leap over a bar twice as high with ease, you don’t want to fill that position. So, don’t make your interviews easy. If you’re doing an interview, make it moderately challenging for someone of your level. Include a “screener” technical question that you think anyone with similar skills and general knowledge should be able to easily answer. Some good interview question choices include:

  • Tell me if there are two numbers in an array that sum to x
  • How do hashmaps work? How would you hash a string?
  • Generate permutations of x
  • Reverse a c string
  • Write a tree to linked-list function
  • Write an efficient recursive function to garbage collect memory
  • Describe how a compiler works.
  • Give an overview of DNS, TCP, filesystems, process scheduling, pipelining, or some other high-level CS topic
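
To give a sense of the level, here is one possible answer to the first question, sketched in Python. The function name is arbitrary, a candidate could use any language, and I’m assuming the two numbers must sit at different positions in the array:

```python
def has_pair_with_sum(numbers, x):
    """Return True if two elements at different positions sum to x.

    One pass with a set of values seen so far: O(n) time, O(n) extra space.
    """
    seen = set()
    for n in numbers:
        if x - n in seen:
            return True
        seen.add(n)
    return False


print(has_pair_with_sum([1, 4, 7, 10], 11))  # True (1 + 10, or 4 + 7)
print(has_pair_with_sum([1, 4, 7, 10], 3))   # False
```

A sort-and-scan answer from both ends is just as acceptable; what matters at this stage is that the candidate gets something correct down and can say what it costs.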

Once you’ve passed them through an easy coding question and another general question, you can start to interview them based on their resume, because you know that they’ve met a minimum requirement to do their job. If you’re impressed at the end, hire them. Otherwise, why bother? The negative cost of hiring someone who doesn’t impress you and your teammates is greater than the benefit of filling that vacant position.

Update:

I just noticed Shelley’s comment on this old hiring post. It reads:

That is the worst interview question I’ve heard of. It is guaranteed to discriminate in favor of a certain type of developer, and not necessarily a good one.

No wonder you people can’t find good engineers. You don’t know how to interview worth a damn. You’re looking for code monkeys, but interviewing engineers. I had a feeling this was what was happening when I talked with someone who interviewed at Microsoft and the same thing happened. Absolutely silly questions-and yes, very biased. Your HR department has done a poor job.

Asking somebody how to do code the strstr function. I’d hire the person who looked at you like you were daft and said, “I’d use the function built into the language. Now what _job_ is it you want me to do?”

I just have to add to the conversation, and point out that asking an interviewee to code a basic function like that is industry best practice. It’s the absolute lowest bar. Sure, if you actually can code, then these questions will seem ridiculous, but otherwise? You don’t hire a programmer who can’t write code, so you need to see if they can write code. Shelley would rather have interviews, I guess, that go like this:

Interviewer: So, you can code basic functions, do recursion, handle arrays, right?
Shelley: You bet I can! And more!
Interviewer: Fantastic–just had to check.
Shelley: Let’s move onto more interesting things…

Nope, it doesn’t work like that, because we can’t trust you to tell us the truth. Your abilities have to be assessed. Unfortunately, in another comment, Shelley goes on to say:

Any interview that resorts to having the interviewee code is a bad interview. Shows that your staff is too inexperienced to know how to interview.

She also makes a big hand-waving pseudoscientific argument about long term / short term memory with regards to coding. See, the thing is, the most basic part of this kind of job description is writing code. Sure, we create systems, do designs, model databases, and create relational object oriented structures, but then a software developer sits down and implements. Writes code. You wouldn’t believe how many people cannot write a function to reverse the elements of an array, in any language.

Here’s your challenge:

O readers, show your might. I’m going on vacation this weekend, but when I come back, I want efficient implementations of strstr, is_anagram, atoi for any base, and edit_distance. Log the time it takes you to write each one, too. Remember–these are basic interview “crawl over the bar” questions…
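
For reference, here is one set of answers sketched in Python. These are straightforward versions rather than the most efficient possible (strstr is the naive scan, not KMP or similar), and the exact signatures, including returning -1 for "not found" and supporting bases 2 through 36, are my own choices:

```python
from collections import Counter

DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def strstr(haystack, needle):
    """Index of the first occurrence of needle in haystack, or -1."""
    n, m = len(haystack), len(needle)
    for i in range(n - m + 1):
        if haystack[i:i + m] == needle:
            return i
    return -1


def is_anagram(a, b):
    """True if a and b contain exactly the same characters with the same counts."""
    return Counter(a) == Counter(b)


def atoi(text, base=10):
    """Parse an optionally signed integer written in the given base (2 to 36)."""
    if not 2 <= base <= 36:
        raise ValueError("base must be between 2 and 36")
    text = text.strip().lower()
    sign = 1
    if text[:1] in ("+", "-"):
        sign = -1 if text[0] == "-" else 1
        text = text[1:]
    if not text:
        raise ValueError("no digits to parse")
    value = 0
    for ch in text:
        digit = DIGITS.find(ch)
        if digit < 0 or digit >= base:
            raise ValueError(f"invalid digit {ch!r} for base {base}")
        value = value * base + digit
    return sign * value


def edit_distance(a, b):
    """Levenshtein distance, keeping only two rows of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


print(strstr("dreamhost", "host"))           # 5
print(is_anagram("listen", "silent"))        # True
print(atoi("-ff", 16))                       # -255
print(edit_distance("kitten", "sitting"))    # 3
```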
