So you have started up, and now you are getting good traffic: thousands of users and counting.
Do you know that script kiddies are probing your servers with simple dictionary attacks on the SSH port? Do you know that once in a while there is a fatal application error in your PHP log (which may point to a bigger problem)? Do you know that the backup you are taking is not actually going to restore your DB? Do you know that every night at midnight one of the servers has a CPU spike?
It’s a good idea to catch serious problems like these early and to deploy tools that check for them proactively. In this session we will discuss some very basic things that you, as a CTO, MUST worry about and solve before they bite.
These are (in decreasing order of priority):
2. Monitoring/Availability/Load (External/System level)
3. Application errors
5. Source control
The discussion will be around tools, hands-on experience, and the tactical things you would do on a day-to-day basis to keep the lights on.
(This session is not intended for startups that already have hundreds of servers, but for someone who has between 2 and 20 servers in the pool.)
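To give a flavour of the day-to-day, tactical kind of check meant here, below is a minimal sketch of a script that counts fatal errors per day in a PHP error log. It is an illustration only, not a tool from the session: the function names `count_fatal_errors` and `days_over_threshold` are made up for this example, and the log line shape (`[05-Mar-2010 00:00:12] PHP Fatal error: ...`) is an assumption based on PHP's usual `error_log` format, so adjust the pattern to whatever your logs actually look like.

```python
import re
from collections import Counter

# Assumed line shape (PHP's usual error_log format), for example:
# [05-Mar-2010 00:00:12] PHP Fatal error:  Call to undefined function foo() in ...
# The first token inside the brackets is treated as the day.
FATAL_RE = re.compile(r"^\[(?P<day>[^\s\]]+)[^\]]*\]\s+PHP Fatal error:")


def count_fatal_errors(lines):
    """Return a Counter mapping each day string to its number of fatal errors."""
    counts = Counter()
    for line in lines:
        match = FATAL_RE.match(line)
        if match:
            counts[match.group("day")] += 1
    return counts


def days_over_threshold(counts, threshold=0):
    """Return a dict of the days whose fatal error count exceeds threshold."""
    return {day: n for day, n in counts.items() if n > threshold}
```

Since `count_fatal_errors` accepts any iterable of lines, an open file handle for your log can be passed to it directly, and the result of `days_over_threshold` could feed whatever alerting you already have (an email, a cron report). The threshold is a judgment call; even a single fatal error may be worth a look.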
Updated Mar 5: Slideshare URL
This post was submitted by indus_khaitan.