CI and CD with Packer, Chef, AWS (Devops Weekly #187)

Originally posted on Medium

Introduction

A few weeks ago I wrote an article about our current local setup with Chef and Test Kitchen, where I introduced the workflow and the stages of the process. Here I intend to explain the different steps in detail, along with a few of the reasons behind them.

Base images

The first stage is built around Packer, which we use with Chef Solo to build different versions of RHEL with the same setup.

Packer is installed locally with Homebrew. I don’t think this stage runs often enough to be worth automating.

The stage contains only a Packer cookbook, which is fully tested with ChefSpec.

This cookbook has a few OS version switches and sets up things like:

  • Timezone
  • Disk volumes
  • Users
  • Datadog for monitoring
  • Splunk for logging

Local environment

Local development is done with Test Kitchen and Chef Solo in an environment called local.

The cookbooks are organised between the cookbooks folder (installed cookbooks) and site-cookbooks (custom cookbooks).

All site-cookbooks are tested with ChefSpec.

We follow the wrapper cookbook method. First there are cookbooks that represent a service, e.g. our _tomcatservices cookbook builds on top of tomcat, adding application versioning, ways to warm up URLs and more. Then there is the actual wrapper cookbook, which mainly contains parameters for the underlying cookbook.

This way we have several cookbooks that represent instances with different applications or JVM options deployed over the same _tomcatservices base installation.
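A minimal sketch of that layering in plain Ruby, rather than real Chef attribute code: a base service cookbook supplies defaults and a wrapper states only what differs. The attribute names, version numbers and the deep_merge helper are all hypothetical, and Chef's real attribute precedence is richer than this.

```ruby
# Simplified model of the wrapper cookbook idea. This is NOT Chef's
# attribute precedence system; all names and values are hypothetical.

# Recursively merge overrides into defaults without mutating either hash.
def deep_merge(defaults, overrides)
  defaults.merge(overrides) do |_key, base, override|
    base.is_a?(Hash) && override.is_a?(Hash) ? deep_merge(base, override) : override
  end
end

# Defaults that a base service cookbook might define.
BASE_DEFAULTS = {
  'jvm' => { 'heap' => '512m', 'opts' => [] },
  'warm_up_urls' => [],
  'applications' => {}
}.freeze

# A wrapper cookbook reduced to its parameters.
APP4_WRAPPER = {
  'jvm' => { 'heap' => '2g' },
  'applications' => { 'application4' => '1.4.2' }
}.freeze

# The effective configuration of one unique instance.
APP4_INSTANCE = deep_merge(BASE_DEFAULTS, APP4_WRAPPER)
puts APP4_INSTANCE
```

Each additional instance type is then just another small hash of parameters over the same base.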

In the near future we plan to open source a few of the base cookbooks.

Test Kitchen suites are built around these wrapper cookbooks and represent unique instances that exist in the environments. Some of them include:

  • Tomcat instance — Applications 1,2,3
  • Tomcat instance — Application 4
  • MuleESB Console
  • Mongo Single Node

All these unique instances are tested with ServerSpec, including local configuration and application smoke tests, so a bad deploy will make the suite fail.

We also built a tools cookbook that installs utilities useful for development and ServerSpec, e.g. net-tools and NetCat. These are not installed in other environments by default.

Build and Test stage

Applications unit testing

On our CI system we build and unit test all applications, then push the new builds to our artefact repository.

Configuration unit testing

We run ChefSpec and FoodCritic on all site-cookbooks. Any ChefSpec failure breaks the build.

Single instances

On the same CI system, all the Test Kitchen suites described earlier run in a separate environment on AWS.

Every Chef commit and every application build triggers a Test Kitchen build of a specific instance, so a bad build or a bad instance configuration will make it fail.
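One way to picture that trigger is a lookup from the thing that changed to the suites that depend on it. The sketch below is plain Ruby with hypothetical suite, cookbook and application names. It is not how any particular CI system or Test Kitchen itself wires this up.

```ruby
# Hypothetical map from Test Kitchen suite to what it depends on.
SUITES = {
  'tomcat-apps-1-2-3' => {
    cookbooks: %w[tomcat_services tomcat_apps_123],
    applications: %w[application1 application2 application3]
  },
  'tomcat-app-4' => {
    cookbooks: %w[tomcat_services tomcat_app_4],
    applications: %w[application4]
  },
  'mongo-single-node' => {
    cookbooks: %w[mongo_single],
    applications: []
  }
}.freeze

# Return the suites to rebuild for a change.
# kind is :cookbooks (a Chef commit) or :applications (a new build).
def suites_to_rebuild(kind, name)
  SUITES.select { |_suite, deps| deps.fetch(kind, []).include?(name) }.keys
end

puts suites_to_rebuild(:applications, 'application4').inspect
puts suites_to_rebuild(:cookbooks, 'tomcat_services').inspect
```

A change to a shared base cookbook fans out to every suite built on it, while a new application build only touches the instances that deploy it.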

We also use ServerSpec to ensure all applications behave in a standard way before they get any further down the line, e.g. health checks or logging patterns.

Less configuration means less code in general, and less code means fewer bugs, so we work hard on standards instead of configuring every possible combination.

Base builds

Our base images are tested by Test Kitchen using vanilla AWS images with the Packer cookbook to ensure we don’t end up with base images that we cannot reproduce reliably.

We do not have an automated way to upload these base images, so we only use this build to stop the line if anything goes wrong.

Stack builds

Still in the integration stage, but after building single instances, we have stack builds. In these Test Kitchen suites we converge several instances that will not work separately, so that any mismatch between them is caught quickly. For example:

  • The MuleESB stack build creates a Console and an ESB, then connects and tests the two.
  • The Mongo Replica Set stack creates three Mongo instances configured as a replica set, then connects and tests them.
  • The AEM stack creates a Mongo single instance, AEM Author, AEM Publisher and AEM Dispatcher, then connects and tests all of it to ensure you can safely bring these together.

Building stacks with Test Kitchen brings up the problem of discovery: without Chef Server, an instance cannot know what other instances already exist.

Our first attempt to mock discovery across several Test Kitchen instances used Route53, but because of the slow updates we replaced it with temporary ELBs and fixed names that exist in the test environment.

We do this by adding a few properties with chef-solo-search and a _test_kitchensetup cookbook that depends on the AWS cookbook and registers the temporary instances.
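The fixed-name idea can be sketched as a lookup that uses a real search when one is available and falls back to well-known test names otherwise. This is plain Ruby for illustration only. The role names, host names and the discover helper are hypothetical, and it does not use the chef-solo-search API.

```ruby
# Hypothetical fixed names under which temporary instances are registered
# in the test environment.
FIXED_TEST_NAMES = {
  'mule_console' => 'mule-console.test.internal',
  'mongo' => 'mongo.test.internal'
}.freeze

# Resolve the address for a role.
# If a search callable is given (standing in for a real search backend),
# delegate to it; otherwise fall back to the fixed test names.
# Raises KeyError when no fixed name exists for the role.
def discover(role, search: nil)
  return search.call(role) if search

  FIXED_TEST_NAMES.fetch(role)
end

# Inside a stack test there is no search backend, so fixed names are used.
puts discover('mongo')

# Elsewhere a real lookup could be injected instead.
puts discover('mongo', search: ->(role) { "#{role}.example.internal" })
```

Because the cookbooks only ever ask for a role, the source of the answer can change without touching them.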

We are currently in the process of replacing some of the Chef Search with Consul which means we’ll be able to use it for all environments including these suites.

Local CloudFormation development (Sandbox)

After the stack tests we are confident that small groups of instances work correctly, so we push new versions of the cookbooks and applications into Chef Server.

These stable/pre-production cookbooks are used to develop CloudFormation templates with a smaller chance of hitting application or Chef bugs. This way we can focus entirely on CloudFormation issues by separating them from configuration management.

We test the Sandbox environment through monitoring and logging, and if it all works we copy that template into the Integration environment.

These templates are split into layers, which cover broader themes like Security Groups and DNS as well as the previous stack builds.

Integration environment

Configuration side

At this stage we have a tested CloudFormation template and cookbooks that have gone through all the previous stages. Together they represent our first attempt at a stable environment.

We create a new environment with a lifespan of around a week at the moment.

We use it to try out application deploys into a full environment, run backups and integrate all the different applications and instances as a whole for the first time. It represents a snapshot of Production, because application artefact versions are managed together with a CloudFormation template and cookbook versions.
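As a rough illustration of such a snapshot, the sketch below pins a template, cookbook versions and artefact versions together, and derives a candidate by changing a single version. It is plain Ruby (2.5 or later, for keyword_init). The Snapshot struct, its fields and every version number are hypothetical.

```ruby
# A snapshot pins everything that defines an environment at a point in time.
# Field names and version numbers are hypothetical.
Snapshot = Struct.new(:template, :cookbooks, :artefacts, keyword_init: true) do
  # Return a new snapshot with the given changes applied.
  # The original snapshot is left untouched.
  def with(template: self.template, cookbooks: {}, artefacts: {})
    self.class.new(
      template: template,
      cookbooks: self.cookbooks.merge(cookbooks),
      artefacts: self.artefacts.merge(artefacts)
    )
  end
end

INTEGRATION = Snapshot.new(
  template: 'stack-template-v12',
  cookbooks: { 'tomcat_services' => '2.3.0' },
  artefacts: { 'application4' => '1.4.2' }
)

# Try one new cookbook version against otherwise stable versions.
CANDIDATE = INTEGRATION.with(cookbooks: { 'tomcat_services' => '2.4.0' })

puts CANDIDATE.to_h
```

Keeping the three kinds of version in one place means a candidate environment differs from a known-good one by exactly the lines that were changed.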

Application side

On this side we run application integration tests, automated acceptance tests (Cucumber), performance tests and more.

Other environments

Our environment and role configurations are quite small. Most versioning is done at the cookbook level so, except for a few environment-specific overrides, you won’t see any variation between the environments that follow.

The other environments include QA, UAT and Staging (pre-production), but they all represent versions in time of the Integration environment.

Results

We have a setup that is very strict on standards and tests, but that lets us, with a couple of line changes, try every possible combination of:

  • A new application or application version on a tested and stable environment.
  • A new cookbook version on a tested and stable environment.
  • A new environment template with tested and stable applications and configuration.

A couple of highlights so far: being able to test a new OS version instantly in all possible instances, and upgrading Tomcat knowing that all applications still work.