Queen's School of Computing
Supervisor: Prof. Ahmed Hassan

Internal examiner: Prof. Yuan Tian
External examiner: Prof. Ali Etemad (ECE)
Chair: Prof. James Stewart


A study on the practices of composing multi-component applications
with Docker Compose and DockerHub images

Abstract

Docker is a tool used to encapsulate a software package with all of its dependencies and configurations into an isolated environment. Nowadays, developers compose multiple Docker images to form multi-component applications.

While prior studies of Docker examine the quality and evolution of a single Docker image, no study has explored the use of Docker in a multi-component setting. In this thesis, we first identify how multi-component applications are composed and maintained using Docker Compose. We then examine Docker images that are hosted on DockerHub, a commonly used online registry for Docker images.

From our first study of 4,103 open-source GitHub projects that use Docker Compose, we observe that 26.8% of the projects needlessly use Docker Compose to compose a single-component application, that multi-component applications stick to the basic options of Docker Compose and ignore advanced ones such as security- and monitoring-related options, and that multi-component applications rarely upgrade their Docker Compose version. We also observe that 77% of the studied applications use images from an online registry and that 95.2% of these registry images are hosted on DockerHub.

Hence, we investigate the available Docker images on DockerHub. We study 505 DockerHub images for five popular software systems (101 images for each system). We observe that DockerHub community images differ from their official image and from each other in terms of their installed libraries. We also observe that some community images are more resource-efficient and contain fewer security vulnerabilities than their official image. However, users might not find such images since they are not well-documented.

Our thesis suggests the need for tools and methodologies to help multi-component applications take full advantage of the capabilities and resources that are offered by Docker Compose and DockerHub.