MySQL high availability made charmingly easy

In a previous blog, we talked about patterns to run a database in a highly available manner. In this blog, we present our recipe for MySQL high availability. We will also explain how our solution interacts with K8s objects to provide some of its features.

Photo from: https://unsplash.com/photos/i5FsBOLsB50

Why use MySQL

MySQL is the most popular open-source database according to JetBrains’ 2022 survey and according to DB-ranking’s popularity index. The success of MySQL is fueled by its vibrant community and ease of use. For example, MySQL can run as a single process making it a breeze to deploy. While it is easy to run a single instance of MySQL, it is rarely a good idea as we will see next.

Why we need to run a highly available MySQL

According to Uptime Institute’s 2022 report, “over 60% of failures result in at least $100,000 in total losses”. Reports like ManageForce’s and Phenom’s estimated the cost of an outage to be around $500,000 per hour, on average. Running a single instance of a database is generally a recipe for high downtime. While deploying a single MySQL instance is quite easy, running a secure and highly available MySQL can be a daunting task. Let’s see how Canonical can help you deploy a highly available MySQL with ease and on the platform of your choice.

The open source ingredients

The following table provides an overview of the components we use to provide a highly available MySQL deployment:

Component	Role/Functions	Version (major)
MySQL server	Database server Clustering using InnoDB cluster	8.0
MySQL router	Connection pooling and failover	8.0/edge
Percona-xtrabackup	Backup and Restore	8.0
Charmed operator (or charm)	Operator for MySQL on top of K8s. It automates a number of management tasks like deployment and scaling (more on this later).	8.0/edge
Juju	Juju is an open source orchestration engine for software operators that enables the deployment, integration and lifecycle management of applications at any scale, on any infrastructure.	3.1

Let’s cover how to install some of these components that we will need to demonstrate the high availability guarantees provided by our operator.

The recipe

Let’s start our journey by installing the required dependencies and deploying a single MySQL instance.

Pre-requisites

First, we will install MicroK8s. MicroK8s is a lightweight CNCF-certified Kubenetes distribution.

snap install microk8s --channel=1.27-strict/stable

In order to reduce the need for sudo when interacting with microks8, let’s run the following commands:

sudo usermod -a -G microk8s $USER
sudo chown -f -R $USER ~/.kube

We need to restart our session for the above commands to take effect. After relaunching our terminal, we should enable the MicroK8s addons for storage and DNS using the following command:

sudo microk8s.enable hostpath-storage dns

We can then check if our Microk8s deployment is ready by running:

microk8s status --wait-ready
microk8s kubectl get all --all-namespaces

The first command should yield an output similar to the following:

microk8s is running
high-availability: no
  datastore master nodes: 127.0.0.1:19001
  datastore standby nodes: none
addons:
  enabled:
    dns              # (core) CoreDNS
    ha-cluster       # (core) Configure high availability...
    helm             # (core) Helm - the package manager ...
    helm3            # (core) Helm 3 - the package manager ...
    hostpath-storage # (core) Storage class ...
    storage          # (core) Alias to hostpath-storage ...
  disabled:
   ...

The second command should yield an output similar to the following:

NAMESPACE NAME READY STATUS
kube-system pod/coredns-7745f9f87f-dnlj7 1/1 Running
kube-system pod/calico-kube-controllers-6c99c8747f-l4g5v 1/1 Running
kube-system pod/calico-node-vzskh 1/1 Running
kube-system pod/hostpath-provisioner-58694c9f4b-sskwq 1/1 Running

NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S)
default service/kubernetes ClusterIP ... <none> 443/TCP
kube-system service/kube-dns ClusterIP .. <none> 53/UDP,53/TCP,9153/TCP

NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR
kube-system daemonset.apps/calico-node 1 1 1 1 1 kubernetes.io/os=linux

NAMESPACE NAME READY UP-TO-DATE AVAILABLE
kube-system deployment.apps/calico-kube-controllers 1/1 1 1
kube-system deployment.apps/coredns 1/1 1 1
kube-system deployment.apps/hostpath-provisioner 1/1 1 1

NAMESPACE NAME DESIRED CURRENT READY
kube-system replicaset.apps/calico-kube-controllers-6c99c8747f 1 1 1
kube-system replicaset.apps/coredns-7745f9f87f 1 1 1
kube-system replicaset.apps/hostpath-provisioner-58694c9f4b 1 1 1

The next step is to install Juju, our orchestrator engine for software operators:

sudo snap install juju --channel=3.1/stable

We then need to bootstrap Juju so that it uses MicroK8s as the backing cloud:

juju bootstrap microk8s micro

Finally, we can start deploying a single MySQL instance:

juju add-model mysql-demo
juju deploy mysql-k8s --channel 8.0/edge --trust

Please note that the first time you run the above commands, it might take Juju several minutes to download the required charm (a.k.a. operator) and dependencies. The subsequent commands should be faster.

In order to check for the status of the deployment, you can use the following command:

juju status
# ‘juju status --watch 1s’ will continuously monitor the output

For more details around what is happening in the background you can type:

juju debug-log --tail --replay

After a few minutes, (your mileage may vary), you should get an output similar to the following after typing juju status:

Model Controller Cloud/Region Version SLA Timestamp
mysql-demo micro microk8s/localhost 3.1.2 unsupported ...

App Version Status Scale Charm Channel Rev Address Exposed Message
mysql-k8s 8.0.32-... active 1 mysql-k8s 8.0/stable 75 ... no Primary

Unit Workload Agent Address Ports Message
mysql-k8s/0* active idle ... 79 Primary

Now that we managed to deploy our first Juju unit on top of a K8s distribution, it is probably the right time to explain how Juju objects are mapped to K8s objects.

Behind the scenes

In order to understand how Juju objects are mapped to K8s objects, we can type the following commands:

microk8s kubectl get all  --all-namespaces
microk8s kubectl get pods -n mysql-demo

We can notice that a new K8s namespace was created for the mysql-demo model. Within the namespace, we can see two pods (one for the Juju model operator and another for the mysql unit):

NAME                            READY   STATUS    RESTARTS      AGE
modeloperator-784fbf89f-pt5dj   1/1     Running   1 (26m ago)   36m
mysql-k8s-0                     2/2     Running   0             24m?

In order to get a hold of the components of a Juju unit from a kubernetes point of view, we can run the following command:

microk8s kubectl describe pod -n mysql-demo "mysql-k8s-0"

The interesting sections to examine are the following ones:

Controlled By:  StatefulSet/mysql-k8s
...
Containers:
  charm:
    Container ID:  containerd://...
    Image:         jujusolutions/charm-base:ubuntu-22.04
    Image ID:      docker.io/jujusolutions/charm-base@sha256:...
    Port:          <none>
    Host Port:     <none>
    Command:
      /charm/bin/pebble
...
  mysql:
    Container ID:  containerd://...
    Image:         registry.jujucharms.com/charm/.../mysql-image...
    Image ID:      registry.jujucharms.com/charm/.../mysql-image...
    Port:          <none>
    Host Port:     <none>
    Command:
      /charm/bin/pebble

We can make a few observations already:

Our containers are managed as K8s StatefulSets
All containers are managed by pebble, a service manager.
The Pod is composed of 2 containers: the workload one (MySQL server) and the operator one (a.k.a charm).

Let’s connect to our newly created pod to try to understand further how those containers are organised:

microk8s kubectl exec --stdin --tty -n mysql-demo "mysql-k8s-0" -- /bin/bashps -edf

Excluding the processes related to bash and our ps command, we see the following 2 entries:

root   1 0  0 14:03 ? 00:00:11 /charm/bin/pebble run ...
root  15 1  0 14:03 ? 00:00:14 /charm/bin/containeragent unit ...

We can guess that they correspond to the 2 containers spotted earlier in the describe pod output. Let’s try to connect to each one of them using the following commands sequence:

microk8s kubectl exec --stdin --tty -n mysql-demo -t "mysql-k8s-0" --container mysql -- /bin/bash
ps -edf
exit

microk8s kubectl exec --stdin --tty -n mysql-demo -t "mysql-k8s-0" --container charm -- /bin/bash
ps -edf
exit

The outputs of the ps commands will look like the following:

# First ps' output
root 1 0 0 22:02 ? 00:00:00 /charm/bin/pebble run ...
mysql 17 1 0 22:02 ? 00:00:00 /bin/sh /usr/bin/mysqld_safe
mysql 227 17 1 22:02 ? 00:00:08 /usr/sbin/mysqld --basedir=/usr

# Second ps' output
root 1 0 0 14:03 ? 00:00:11 /charm/bin/pebble run ...
root 15 1 0 14:03 ? 00:00:14 /charm/bin/containeragent unit ...

We can note that pebble is running as the init process (i.e. having pid 1). We can also easily recognise the mysql server process in the first output and the agent managing the operator code’s lifecycle in the second output. A nice visualisation of the above can be found on the Juju website.

Hopefully, you can now appreciate the amount of boilerplate code that Juju allows you to get rid of compared to manually running all the equivalent K8s commands. You can also probably appreciate a number of best practices implemented by Juju when using a K8s distribution:

Separate namespace for every model
Colocation within the same Pod of tightly coupled containers
Usage of a StatefulSets for stateful workloads as databases
Usage of a full fledged service manager, pebble, instead of simplistic init process (e.g. tini)
And more

Let’s now resume our recipe of highly available MySQL deployments.

Deploying a highly available MySQL

As we said earlier, running a single instance is not a good idea. So we will explore another neat feature of our charms, on-demand scaling. Adding replicas to MySQL is as simple as running the following command:

juju scale-application mysql-k8s 3

After some minutes, running juju status should yield an output similar to the following:

Model Controller Cloud/Region Version SLA Timestamp
mysql-demo micro microk8s/localhost 3.1.2 unsupported ...

App Version Status Scale Charm Channel Rev Address Exposed Message
mysql-k8s 8.0.32-0ubuntu0.22.04.2 active 3 mysql-k8s 8.0/stable 75 … no Primary

Unit Workload Agent Address Ports Message
mysql-k8s/0* active idle ... Primary
mysql-k8s/1 active idle ...
mysql-k8s/2 active idle ...

Let’s deploy the MySQL router next, using the following command:

juju deploy mysql-router-k8s --channel 8.0/edge --trust

After some minutes, running juju status should show a new application and a new unit:

...
App Version Status Scale Charm Channel Rev Address Exposed Message
mysql-router-k8s 8.0.32... blocked 1 mysql-router-k8s 8.0/candidate 46 ... no Missing relation: backend-database

Unit Workload Agent Address Ports Message
...
mysql-router-k8s/0* active idle ...

As you might have noticed, MySQL router shows blocked as the Status. Don’t worry, as this is expected. MySQL router is a database proxy and should therefore be configured to talk to the MySQL cluster and to a consuming application for the error message to disappear.

This is when Juju’s powerful integration abstraction (a.k.a. relation) comes into play. Integrations allows us to establish a communication link between two workloads (e.g. MySQL cluster and MySQL router) using the following simple command:

juju relate mysql-router-k8s mysql-k8s

After some moments, typing juju status –relations should provide with an output similar to the previous one with the following addition at the bottom:

Relation provider                    Requirer                             Interface           Type     
mysql-k8s:database                   mysql-router-k8s:backend-database    mysql_client        regular  
mysql-k8s:database-peers             mysql-k8s:database-peers             mysql_peers         peer     
mysql-router-k8s:mysql-router-peers  mysql-router-k8s:mysql-router-peers  mysql_router_peers  peer

Understanding the above is essential in mastering Juju relations. Most operators come with pre-built integration points to provide (the relation provider side) or consume a service (the relation requirer side). In the example above, we see that our operator for MySQL server comes with integration points for:

Other instances of MySQL servers that are expected to be part of the same cluster (i.e. database-peers). Relations that are intended to be used by the units of the same application are called peer relations.
Clients that intend to use MySQL server to store data (i.e. mysql_client). This relation is called a regular relation as it is intended to link one application to another.

In order to emulate an application that uses MySQL through the proxy (i.e. MySQL router), we will use the Data Integrator charm:

juju deploy data-integrator --channel edge --config database-name=test-database

Running a juju status should show a new data-integrator application. We will now relate the data integrator to MySQL router to finish setting up the scene for some basic checks:

juju relate data-integrator mysql-router-k8s

After some minutes, running juju status should yield an output without any error message (now that all expected relations are satisfied ). On my laptop the output looks like the following:

Model Controller Cloud/Region Version SLA Timestamp
mysql-demo micro microk8s/localhost 3.1.2 unsupported ...

App Version Status Scale Charm Channel Rev
data-integrator active 1 data-integrator edge 12
mysql-k8s 8.0.32-0ubuntu0.22.04.2 active 3 mysql-k8s 8.0/edge 85
mysql-router-k8s 8.0.33-0ubuntu0.22.04.2 active 1 mysql-router-k8s 8.0/edge 49

Unit Workload Agent
data-integrator/0* active idle
mysql-k8s/0* active idle
mysql-k8s/1 active idle
mysql-k8s/2 active idle
mysql-router-k8s/0* active idle

(Note that some columns were ommitted)

Time for some tests

Connecting to MySQL server

We will continue our journey by checking that MySQL server is responding to some simple queries. In addition to being a test application for a number of database operators developed by Canonical, the data integrator charm can also help us manage credentials for database users.

For example, the following command will create a user and display its credentials:

juju run data-integrator/leader get-credentials

We can learn few things from the above:

In addition to the standard Juju commands like deploy, scale-application and relate some charms come with specific actions like get-credentials for the data-integrator.
Any Juju application is guaranteed to have a unique leader that we can direct the actions to using the pattern juju run <app-name>/leader. This is useful when the application is composed of multiple units. While it might be tempting to think about the leader as the primary in a database cluster. The 2 concepts are similar but different. You can think of the leader unit as the one executing our actions while the database primary is the one receiving read/write traffic from its connected clients.

The latter command will , only one time, display a username and password in the following format:

mysql:
  database: test-database
  endpoints: mysql-router-k8s.mysql-demo.svc.cluster.local:6446
  password: <password>
  read-only-endpoints: mysql-router-k8s.mysql-demo.svc.cluster.local:6447
  username: <user-name>
ok: "True"

We can notice that our operator comes with 2 endpoints. The first one to be used for read and write traffic. The read-only is for read scaling and allows you to balance your read traffic on all the members of the MySQL cluster (and not only the primary as for the default one).

Running the following in our terminal gives us an idea how those objects are mapped in Microk8s:

microk8s kubectl get endpoints -n mysql-demo

The output on my laptop looks as follows:

NAME                         ENDPOINTS                                  
modeloperator                X.X.X.79:17071                             
mysql-k8s-endpoints          X.X.X.84,X.X.X.85,X.X.X.89                 
mysql-router-k8s-endpoints   X.X.X.88                                  
mysql-router-k8s             X.X.X.88:65535,X.X.X.88:6446,X.X.X.88:6447
data-integrator-endpoints    X.X.X.91                                  
data-integrator              X.X.X.91:65535                            
mysql-k8s-primary            X.X.X.84:3306                              
mysql-k8s-replicas           X.X.X.85:3306,X.X.X.89:3306                 
mysql-k8s                    X.X.X.84:65535,X.X.X.85:65535,X.X.X.89:65535

Using the above output, we can learn few things about the integration endpoints:

There is a dedicated endpoint to talk to the primary. We can assume it is the one used for read/write traffic.
There is a dedicated endpoint to talk to the replicas. We can assume it is the one used for read scaling.

Let’s now use the above credentials to connect to our MySQL cluster:

mysql -h <primary's ip> -u <user-name> -p<password> -e "show databases;"# In the above example, X.X.X.84 is the primary's ip.

The above command yields the following:

+--------------------+
| Database           |
+--------------------+
| information_schema |
| performance_schema |
| test-database      |
+--------------------+

MySQL high availability tests

High availability is not only about deploying several replicas of MySQL. It is also about providing automatic failover when a problem affects one of the instances. Let’s check what our charmed operator will do for us when we simulate a failure in one of the MySQL units.

We recommend, from now on, to execute the upcoming instructions in a new terminal/tab so that you can follow the behaviour of our charmed operator in the existing terminal.

We will simulate a first issue in the MySQL primary by doing the following:

microk8s kubectl exec --stdin --tty -n mysql-demo 
-t "mysql-k8s-<your primary unit's number>" --container mysql -- /bin/bash
ps -edf
kill -9 <mysqld_safe's pid> <mysqld's pid> 
ps -edf

The above commands will kill all MySQL server related processes emulating a failure affecting the primary instance.

Pebble should restart all the MySQL processes shortly after they are killed. As a client of MySQL, you will not notice any issue except a brief delay in processing time while the system is recovering.

We will now try to abruptly stop the Pod running the primary MySQL. This can emulate a crash of a server or a sudden network isolation of the primary:

microk8s kubectl delete pod mysql-k8s-<your primary unit's number> --force --grace-period=0

By checking juju status , you can see that an automatic failover happened and that our cluster self healed after only a few seconds.

Time for feedback

At Canonical, we are committed to open source software. Therefore, all of our charms are open-source and are available under the following links:

If you like MySQL as much as we do, please do not hesitate to submit feedback, propose a commit or contact us on mattermost to discuss your ideas and requests.

MySQL high availability made charmingly easy | Ubuntu

Why use MySQL

Why we need to run a highly available MySQL

The open source ingredients

The recipe

Pre-requisites

Behind the scenes

Deploying a highly available MySQL

Time for some tests

Connecting to MySQL server

MySQL high availability tests

Time for feedback

Related

Why use MySQL

Why we need to run a highly available MySQL

The open source ingredients

The recipe

Pre-requisites

Behind the scenes

Deploying a highly available MySQL

Time for some tests

Connecting to MySQL server

MySQL high availability tests

Time for feedback

Related

Related News

Developing Robust Integration of Linux and IoT Solutions

Monthly News – April 2024

Discover more from WIREDGORILLA