Tagged "chaos testing"

Chaos Testing for Docker Containers

What follows is the text of my presentation, Chaos Testing for Docker Containers that I gave at ContainerCamp in London this year. I’ve also decided to turn the presentation into an article. I edited the text slightly for readability and added some links for more context. You can find the original video recording and slides at the end of this post.

Docker Chaos Testing

Intro

Software development is about building software services that support business needs. More complex businesses processes we want to automate and integrate with. the more complex software system we are building. And solution complexity is tend to grown over time and scope.

Network emulation for Docker containers

TL;DR

Pumba netem delay and netem loss commands can emulate network delay and packet loss between Docker containers, even on single host. Give it a try!

Introduction

Microservice architecture has been adopted by software teams as a way to deliver business value faster. Container technology enables delivery of microservices into any environment. Docker has accelerated this by providing an easy to use toolset for development teams to build, ship, and run distributed applications. These applications can be composed of hundreds of microservices packaged in Docker containers.

Pumba - Chaos Testing for Docker

Update (27-07-27): Updated post to latest v0.2.0 Pumba version change.

Introduction

The best defense against unexpected failures is to build resilient services. Testing for resiliency enables the teams to learn where their apps fail before the customer does. By intentionally causing failures as part of resiliency testing, you can enforce your policy for building resilient systems. Resilience of the system can be defined as its ability to continue functioning even if some components of the system are failing - ephemerality. Growing popularity of distributed and microservice architecture makes resilience testing critical for applications that now require 24x7x365 operation. Resilience testing is an approach where you intentionally inject different types of failures at the infrastructure level (VM, network, containers, and processes) and let the system try to recover from these unexpected failures that can happen in production. Simulating realistic failures at any time is the best way to enforce highly available and resilient systems.