Engineering A Continuous Delivery Pipeline - Charlotte Godley


Write-up of Charlotte Godley's talk about engineering a continuous delivery pipeline, from DevOpsDays Edinburgh 2017.

Continuous delivery into multiple Kubernetes environments.

Problem Domain

Ocado has many warehouses, all filled with robots that pick and store stock and get it ready for distribution. Because of the nature of this work, the warehouses can't tolerate the network latency of using the cloud, so everything runs internally on top of OpenStack.

What was bad about the previous approach?

Getting the big picture was hard: understanding what was in production, who owned it and which versions were actually deployed was problematic. Documentation was either non-existent or incomplete and outdated. One size doesn't fit all, and as teams are autonomous, trying to force rules onto them would prove problematic and counterproductive. There was also no post-deployment visibility.

How to get the big picture?

GitOps. By using git as the single source of truth you can know exactly what is in prod for any project at any point in time. Every commit to master kicks off a pipeline that builds and deploys the code, so production reflects whatever is on master at any point in time.

Pipeline stages:

  • Build and test the app
  • Build the deployment container
  • Apply the Kubernetes deployment
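The three stages above might look something like the following script. The registry, image and file names are hypothetical, and the tool invocations are echoed rather than executed so the sketch stays self-contained:

```shell
#!/usr/bin/env sh
# Sketch of the three pipeline stages, triggered on every commit to master.
# Registry, image and file names are hypothetical; commands are echoed
# rather than run.
set -eu

GIT_SHA=$(git rev-parse --short HEAD 2>/dev/null || echo "abc1234")
IMAGE="registry.example.com/myapp:${GIT_SHA}"

# Stage 1: build and test the application
build_and_test() {
    echo "make test"
}

# Stage 2: build and push the deployment container, tagged by commit SHA
build_container() {
    echo "docker build -t ${IMAGE} ."
    echo "docker push ${IMAGE}"
}

# Stage 3: apply the Kubernetes deployment that references the new image
deploy() {
    echo "kubectl apply -f k8s/deployment.yaml"
}

build_and_test
build_container
deploy
```

Tagging the image with the commit SHA is what makes "prod is what is on master" verifiable: every running container can be traced back to an exact commit.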

Problems and Solutions

This initial approach had a few problems. There were lots of pipelines to maintain. The credentials needed at certain stages raised security concerns. And commit references had to be manually copied and committed to the next repo in the pipeline.

The solution was more repos. Splitting the Kubernetes manifests, and the application's config, out from the application code reduced these problems. Some problems still persisted, and these were resolved with a bot that monitored the git repos and automatically propagated changes. Previously the workflow would be similar to this:

  • update code repo
  • update docker repo
  • update manifest repo
  • update env repo

After the bot's introduction the only manual step was updating the code repo; from there the bot took care of copying references forward through the remaining repos automatically.
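The bot's core step can be sketched with plain git: read the newest commit in the upstream repo and commit the updated image reference into the downstream repo. The repo layout, file name and image name here are all assumptions for illustration:

```shell
#!/usr/bin/env sh
# Hypothetical sketch of the reference-propagation bot's core step.
# The manifest file name and image name are illustrative assumptions.
set -eu

propagate() {
    code_repo=$1
    manifest_repo=$2
    # the reference to carry forward: the code repo's latest commit
    sha=$(git -C "$code_repo" rev-parse --short HEAD)
    # record the new image reference in the downstream repo
    printf 'image: registry.example.com/myapp:%s\n' "$sha" \
        > "$manifest_repo/image.yaml"
    git -C "$manifest_repo" add image.yaml
    git -C "$manifest_repo" commit -q -m "bump myapp to ${sha}"
}
```

Because the bot's change is itself a commit, the downstream repo keeps the same auditable history as everything else in the pipeline.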

Assisting with the big picture

  • Documentation - Write down all the things, preferably in one place. Have it peer reviewed with the same scrutiny as code. Keep it concise but with lots of relevant examples, and keep it up to date. Make it a reference guide, but don't duplicate upstream documentation.
  • Templating - Templating the deployment and manifest repos lets a team get started more quickly and easily. It reduces their time to deployment and lowers the overhead of getting a new app or service out there.
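A minimal sketch of what such templating could look like: a shared manifest template plus a couple of per-team values produces a deployable manifest. The placeholder names and the template itself are assumptions, not Ocado's actual templates:

```shell
#!/usr/bin/env sh
# Minimal manifest-templating sketch. Placeholder names (APP_NAME,
# APP_IMAGE) and the template body are illustrative assumptions.
set -eu

# substitute placeholders on stdin with the supplied values
render() {
    app_name=$1
    app_image=$2
    sed -e "s|{{APP_NAME}}|${app_name}|g" \
        -e "s|{{APP_IMAGE}}|${app_image}|g"
}

# example template a team would copy to bootstrap a new service
template='apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{APP_NAME}}
spec:
  template:
    spec:
      containers:
      - name: {{APP_NAME}}
        image: {{APP_IMAGE}}'

printf '%s\n' "$template" | render myapp registry.example.com/myapp:abc1234
```

A new service then only needs a name and an image to get its first deployment manifest, rather than hand-written Kubernetes YAML.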
  • Avoid hard and fast rules - Talk to developers and understand their needs. Put basic ground rules in place, but beyond that work closely with teams to evolve the process and make it work for them and for you.
  • Monitoring - Automated Slack alerts. Dashboards that link to documentation. End-to-end tests to predict where applications may hit problems when something changes or breaks in Kubernetes.
  • Manage access - Three roles; the majority of users have only the first. The second is granted as necessary, and the third is reserved for a limited set of individuals in emergency situations.
    1. read only
    2. developer
    3. admin
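The talk doesn't say how the three tiers are implemented; one plausible mapping on Kubernetes is to the built-in aggregate ClusterRoles `view`, `edit` and `admin`. The user and namespace names below are placeholders, and the kubectl command is echoed rather than executed:

```shell
#!/usr/bin/env sh
# Hypothetical mapping of the three access tiers onto Kubernetes RBAC.
# "view", "edit" and "admin" are the built-in aggregate ClusterRoles;
# users and namespace are placeholders. The command is echoed, not run.
set -eu

grant() {
    user=$1
    tier=$2   # read-only | developer | admin
    case "$tier" in
        read-only) role=view ;;
        developer) role=edit ;;
        admin)     role=admin ;;
        *) echo "unknown tier: $tier" >&2; return 1 ;;
    esac
    echo "kubectl create rolebinding ${user}-${role}" \
         "--clusterrole=${role} --user=${user} --namespace=myteam"
}

grant alice read-only
grant bob developer
```

Keeping the default tier read-only matches the talk's point: most people only ever need to see what is deployed, not change it.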

What problems still remain?

  • Tidying up - what's on the cluster is what's at the head of master; an addition is just a new merge and a rollback is just a revert, but how do you delete?
  • Testing the pipeline - the pipeline is mostly bash scripts, which are not the easiest thing to test.