Evolution of Deployment at 7digital

# Why

If there is one thing that worries me, it’s The Big Rewrite. Someone decides that it’s easier to solve a problem from scratch than to refactor (or build on) the existing solution. My main fear is the knowledge that gets lost along with the previous implementation.

The developers involved have left; the reason something was done is forgotten; and yet somehow there will be time to redo it all and even to add the new requirements that are driving the work! (I do hope this is being driven by requirements and not someone’s personal preference.) See XKCD 844.

But every rule has an exception: sometimes the pros outweigh the cons considerably and I go for it. This is one of those rare times. Why did we rewrite the deployment for the team? Hang on, this is quite a list!

- There are more teams than ever - A single deployment framework is overly generic.
- Applications have very specific requirements - Webfront vs Bulk Downloader.
- Default install is huge and causing problems with updates - Teams now have their own machines in Live and can avoid this.
- The shoemaker’s son always goes barefoot - Teams add features to the deployment framework after testing only their own edge cases, or with testing strategies that are team specific. Everyone thinks their case is more important.
- The machine images are maturing - We no longer have IIS6, or variable ports, or variable disk names… most configuration is useless.
- The clusters grow fast - We can no longer do manual roll-backs, which are a bad idea in principle, and if we always roll forward then we don’t need to keep copies of deploys.
- The right tool for the right job - Using CI/dependency tools to manage deploy scripts between dozens of projects is slow and flaky. We shouldn’t need an hour to add a new host name binding because it needs to propagate to Master and Release.
- It’s hard to teach the deploy framework to new developers - This one really worries me: I need to tell stories of brave warriors in the old days to explain why we might need that deploy flag.
- The requirements weren’t correct - It was developed with a view to future extension and use in other companies. That is quite an interesting goal, but it wasn’t necessary, and we ended up with too much noise in the code.
- It does it all - SRP? Nope. The same tool runs tests and makes tea.
- The framework is dead - The open source parts of the framework haven’t been updated in over a year; the main developers lost interest, or maybe moved to something better.
- And finally, we lost our champions - All remaining hope is lost. The people who drove it are gone and no one defends the integrity of the framework; it gets hacked away every other day by a team’s individual agenda without a unified direction.

# The problem

So let’s start again. What do we need?

- A way to send files - We already have that on the current machines with SFTP.
- A way to change the configuration files - Every system knows how to copy, and we have SSH.
- A way to set up IIS - We got rid of all the old instances and we know that all images are the same; IIS7 has appcmd. It even filters apppools, sites and applications by name!
- A way to make sure everything worked - As seen in one of my previous posts, we added Status endpoints and these are becoming more stable; some of them are evolving into variations like Sanity Checks depending on use. We can hit them with a tool that is already available, curl, which can make the deploy fail when called with the -f flag.
- A way to update fast - Can we just have a separate project that gets pulled into the deploy with configuration instead of waiting for CI? Yes.

# The process

I started with a very hands-on approach: I’ve been working with the current deployment for years, so I kept adding features from the list in the simplest way possible.

Does it copy the build? Check. Does it set up IIS? Check. And so on. A specific application may have a different concept of IIS setup, but besides that they all interact with the machine in the same way. If one of the applications stands out, that should be treated as a smell: is this the right application? Is the machine set up properly?

# The ugly truth

I was scared when I realised how quickly I had a working example for one of the applications, for every environment, without the need to refactor or use a more complex framework. What I had built was the simplest possible solution: a batch script. An evil and ugly batch script. No variables, no variations; it is static per application, runs on every environment and performs the actions listed above. How about testing? True, we could refactor it for extensive testing, but the daily deploys are the true test.

Its strengths are its readability, its extremely small size and how easy it is to comprehend. Its simplicity and lack of features force strictness on the machines and environments. And you can get rid of it: you will know what it does in 30 minutes, so does it apply to your application, or do you have a better solution? JFDI.
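As a rough illustration of the shape of such a script (not the actual one, which was a batch file), the sketch below lists the four steps as a flat sequence of commands and stops at the first one that fails. It is written in Python only so it can be read and run in isolation. The application name, user, paths, SFTP batch file, status URL and the exact `appcmd` call are all hypothetical; the post does not give them.

```python
import subprocess

# Hypothetical names: the post does not give hosts, users, paths or site names.
APP = "webfront"
USER = "deploy"


def deploy_steps(host):
    """Return the whole deploy as a flat, ordered list of commands."""
    remote = f"{USER}@{host}"
    appcmd = "%windir%\\system32\\inetsrv\\appcmd.exe"
    return [
        # 1. Send files: upload the build with an SFTP batch file of "put" lines.
        ["sftp", "-b", f"{APP}.sftp", remote],
        # 2. Change configuration: a plain copy on the target machine, over SSH.
        ["ssh", remote,
         f"copy /Y C:\\deploy\\{APP}\\Web.config C:\\sites\\{APP}\\Web.config"],
        # 3. Set up IIS with appcmd (illustrative; each application differs here).
        ["ssh", remote,
         f"{appcmd} add site /name:{APP} /bindings:http/*:80: "
         f"/physicalPath:C:\\sites\\{APP}"],
        # 4. Make sure it worked: curl -f exits non-zero on HTTP errors (400+).
        #    The /status path is an assumed location for the Status endpoint.
        ["curl", "-f", f"http://{host}/status"],
    ]


def run_deploy(host, run=subprocess.call):
    """Run each step in order and stop at the first non-zero exit code."""
    for step in deploy_steps(host):
        code = run(step)
        if code != 0:
            return code  # the deploy fails as soon as one step fails
    return 0
```

Calling `run_deploy("some-host")` would execute the four commands for real, so the `run` parameter exists only to let the sequence be exercised with a stand-in. There is deliberately no branching per environment or per application: anything that needs one is the smell mentioned earlier.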

# The sentence

After a few weeks the results are in: everyone in the team knows how it works, whereas everyone still struggled with the previous framework. It can be changed very quickly and it’s easy to find out why it fails. The main advice to take from this is that not everything is a software problem. We don’t need to use the same language or tools that we use for our applications, nor do we need to be embarrassed that something is not sophisticated enough when it’s good enough.

# Where we are going you don’t need roads

Some future work that wasn’t done and was never implemented in the previous framework:

- Rethink the scripts in a Kata way - Drive the deployment framework with a Kata approach to keep it out of a personal scope and as simple as possible.
- Make local deploys easier - Get new developers a working VM with all the team’s applications working and passing all tests locally in minutes.
- Rethink the test runner - Currently tests rely on the same deploy framework, which is a sort of Swiss army knife.
- After rethinking the test runner, rethink the importance of local set-ups - Some test setups are not driven by the team’s requirements. For example, my own team, being responsible for content delivery, rarely uses the DB for writes and relies mostly on API calls.
- Cut down on all unnecessary installed tools.
- Make VMs part of the deployment strategy with tools like Vagrant and Puppet.
- OS agnostic - See what needs to run on Windows and what doesn’t. Unfortunately this is not just a matter of TCO and performance; these VM automation tools are also not as useful on Windows.