Victus Spiritus

Priority One: Keep Your Site Up

21 Nov 2009

HighDynamicPeace

This is a no-brainer, Mark, you noob

It turns out you have to set everything up the right way or your service will crash like a championship domino display.

This week Victus Media's search and personalized ad tool was either dysfunctional or down. I had been happily hosting the plugin here for a few weeks so people could explore and try out our service. We made some major overhauls to data and function, and learned a little about stability along the way. But every minute the server was down or the database empty, I imagined a lost potential user. If you think about uptime this way, it really hurts to have your site down (like a kick in the groin). I want nothing more than for users to sample our site and think, "whoa!", and this week that wasn't close to happening.

While I was wrestling with Wubi/Ubuntu and finally VirtualBox (darn tiny Linux window), our server was effectively dead to the outside world. We have hashed out a solution for future problems: my having a mirror of our server environment running locally should help identify and resolve some issues before they make it to our live server. So far I've been chasing a good mirror of our server code (lead tech Tyler is always busy) and had some issues doing that in Windows with Rails. We need to do more testing with concurrent users as well as improve functionality. Later on we can work on a redundant backup server that kicks in if we have any load or technical issues.
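For the concurrent user testing, something as small as the following could be a starting point. This is only a rough sketch in Python, not what we actually run: the function names (fetch_status, simulate_users) and the idea of passing in a fetch function are my own assumptions for illustration. It fires a batch of simultaneous requests at one URL and reports how many failed and how long the slowest one took.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch_status(url, timeout=5.0):
    """Make one GET request and return (ok, seconds_taken).

    Any failure (refused connection, timeout, HTTP error status)
    counts as "not ok", since to a visitor the site is simply down.
    """
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            ok = 200 <= response.status < 400
    except Exception:
        ok = False
    return ok, time.monotonic() - start


def simulate_users(url, users=10, fetch=fetch_status):
    """Hit `url` from `users` threads at once and summarise the results.

    `fetch` is injectable (a hypothetical hook) so the summary logic
    can be exercised without touching a real server.
    """
    with ThreadPoolExecutor(max_workers=users) as pool:
        results = list(pool.map(fetch, [url] * users))
    failures = sum(1 for ok, _ in results if not ok)
    slowest = max(seconds for _, seconds in results)
    return {"users": users, "failures": failures, "slowest_seconds": slowest}
```

Pointed at a local mirror (say simulate_users("http://localhost:3000/", users=20), where the address is just a placeholder), a non-zero failure count or a large slowest time would be a hint that the problem should be chased locally before it reaches the live server.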

Site Stability

Any design methodology built on user feedback depends on usage to drive design decisions. Without usage, there is little rationale for developing new functional elements beyond instincts and hunches. While design doesn't require outside direction initially, it surely benefits from the knowledge gained from user habits and expectations. If 99 out of 100 new visitors love a particular element, you would be wise to maintain and even expand that function, as long as you don't break what users like about it.

User Trust and Bug Smashing

The best way to gain user and customer trust is to be reliable. If your service can stand up to a dedicated denial of service attack (beyond Google spiders ;)) with only mild delays, then you're in the golden zone.

If normal usage breaks your system, then you have to react fast to squash any bugs. The greater the number of users for your service, the larger the test base. A diverse set of browsers, systems, and expectations all work together to transform a concept into a sturdy and resilient product.

The quicker you can move your service from flaky to trustworthy, the sooner you can expand.