Over the past few weeks all of us here at BlipADeal have been working hard to get the best deals from around the world into one place. That work is now complete, and the team would like to declare that we have launched!
It has been a long, painful, sleepless, exciting and fun week all rolled into one. At the beginning of this *launch* week we had all the software modules written, the new infrastructure design completed and our go-live implementation checklist ready. All that was left was to build the new infrastructure, integrate the modules and go live!
It always sounds easy before you start and it always looks nicely planned on the whiteboard. You always imagine pressing the submit button on the DNS settings to cut over, but it NEVER EVER freaking works that way, does it? Something always goes wrong and you're forced to improvise your way through. I guess this is what makes the startup life so exciting and fun.
So let's go over the timeline of our launch week.
At the beginning of the week we sat down and confirmed that all of the modules we each had to complete for this release were done and dusted. Our crawling engines were tested and working individually, our changes to the front end were completed and our mobile applications had all the new functionality integrated with the new APIs. This was basically a checklist day to confirm that everything that had to be completed was indeed completed, and to tidy up anything that wasn't 100% as it should be.
We spent the next day testing all the new functionality on our pre-production servers and making sure that everything worked as expected. We did some final QA on our Android and iPhone mobile applications, did a final pre-prod release of the new website to some of our users for testing, and ran some performance tests on our crawling engine and our front end.
Everything was nicely checked off the list and confirmed A-OK. We pushed all of the final changes into git.
On Wednesday we built all of the new infrastructure. We provisioned all of the required servers and installed all the additional system software and configuration. We configured the databases, web servers and load balancers, and did a quick health check of all the servers. This phase was relatively painless and we had all the new servers configured and tested in under a day.
I just want to add that it's quite astonishing to see how quickly we can build server farms with the advent of technology such as AWS and similar services. It almost makes you forget about the times when you actually had to order a server, as opposed to now simply pressing a button!
On Thursday we loaded the new software onto the servers but found a few problems, so we were forced to move some of the code back to development and make changes. These problems were unexpected, so we were slightly delayed. We also found some late performance problems on the crawling engines, so we had to do some fine-tuning and rearrange some of the modules to distribute the load more uniformly. We left the servers running overnight to see how the performance was after the changes.
Friday was the day we were going to cut over, celebrate and take a chillaxing weekend monitoring the servers and making sure everything ticked over OK. As usual, this is not how it worked out. We discovered that there were some problems with some of the crawling engines, so we needed a way to gather statistics on their performance. We ended up developing that new functionality so that once the system went live we could get accurate analytics on how our crawling engines were performing. This development was unexpected and took the whole day to complete, but after it was finished we found that it was well worth the effort because tuning performance is now much easier.
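To give a rough idea of the kind of statistics we mean, here is a minimal sketch in Python. It is not our actual code: the `CrawlStats` class, its method names and the crawler labels are all made up for illustration, and it assumes each crawler can report how long a fetch took and whether it succeeded.

```python
import statistics
from collections import defaultdict


class CrawlStats:
    """Collects per-crawler fetch timings (hypothetical names, illustration only)."""

    def __init__(self):
        self._durations = defaultdict(list)  # crawler name -> list of seconds
        self._failures = defaultdict(int)    # crawler name -> failed fetch count

    def record(self, crawler, seconds, ok=True):
        """Store one fetch duration and count it as a failure if ok is False."""
        self._durations[crawler].append(seconds)
        if not ok:
            self._failures[crawler] += 1

    def summary(self, crawler):
        """Return basic figures for one crawler, or None if nothing was recorded."""
        durations = self._durations.get(crawler)
        if not durations:
            return None
        return {
            "fetches": len(durations),
            "failures": self._failures.get(crawler, 0),
            "mean_s": statistics.mean(durations),
            "max_s": max(durations),
        }
```

Even something this simple makes it easy to compare crawlers side by side and spot the one that is slow or failing before you start tuning.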
Friday's plan had also included finishing some data migration, backups, production pre-runs and some administration documentation for the submission of the mobile applications. None of that got done because of the unexpected issues.
Today, Saturday, was definitely going to be the day that we launched the new release. We started by doing a complete backup of the database, followed by our data migration, including running the ALTER scripts on the production tables. Most of the data migration steps worked as expected.
We dropped in the new code changes and did some parallel runs of the application, both from the web and on mobile devices. It was also key to test compatibility with the old mobile version that many users had already downloaded, as we did not want to release a new mobile version and API that left our current user base without a working version of the app.
Everything seemed to be fine. Even though we had some problems during the course of the week, everything had panned out pretty well up to this point. We had decided to do the cutover at 10:30 PM on the Saturday night, on the assumption that most people would be out having a good time in the city and not looking at deals!
After all the systems were up and running it was time to cut over the DNS and some Elastic IPs to the new production servers. The Elastic IPs were fine but there were problems with the DNS. We had intermittent behaviour which kept sending the site back to the old version at random times.
At this point the site was live but behaving unexpectedly, and only after 10 minutes of excruciating “what the hell is going on” remarks did we realise there was a very obscure DNS setting that we had missed, one that matters when load balancing is used. By this time it was about 2am, we were almost in a zombie-like state and the rest of the boys had to drive home because they had plans that Sunday.
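In hindsight, a small script could have told us more quickly that the hostname was still sometimes pointing at the old servers. The Python sketch below is an assumption about how such a check might look, not what we actually ran: the function names are made up, and it only reports addresses that are not in the list of new servers. It does not diagnose the DNS setting itself, and local DNS caching can hide intermittent answers, so treat a clean result with some caution.

```python
import socket


def resolve_ipv4(hostname):
    """Return the set of IPv4 addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(
        hostname, 80, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4 the sockaddr is an (address, port) pair.
    return {info[4][0] for info in infos}


def stale_addresses(hostname, new_ips, attempts=5, resolver=resolve_ipv4):
    """Resolve hostname several times and collect any address that is not a new server."""
    expected = set(new_ips)
    stale = set()
    for _ in range(attempts):
        stale |= resolver(hostname) - expected
    return stale
```

Calling `stale_addresses("www.example.com", {"203.0.113.10"})` (placeholder hostname and address) returns an empty set when every answer seen points at the new servers, and the offending addresses otherwise.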
In the end, after getting over these issues, we were overjoyed that the software had gone live and everything seemed to be working as expected. It was a hard slog, but when you see the software running it's always well worth the effort.
As mentioned earlier, you always imagine this big red button with “launch” written on it, or a single mouse click which *launches* everything ... BUT MAN ... it never works that way.