Galera behind NAT (mixed environment in k8s and docker)

Database plays an important role in the application development. To have a redundant and high available database, we can use Galera for mysql/mariadb. With Galera, the database can have replication across different server and the load can be divided into some servers. By that capability, we can have a reliable backend database for our application.

In this tutorial, we will deploy galera cluster inside kubernetes and docker. In normal circumstance, when we install galera inside kubernetes, the replication communication is using internal network. So we need to set galera to use external network so that galera cluster can communicate with mariadb outside the kubernetes. With this setup, we can combine galera node located in kubernetes, docker, private or public cloud.

In kubernetes, we need to bind the pod with the worker node so that the incoming and outgoing traffic delivered from the same node. Below is the illustration for the networking diagram.

Galera Layout

In this layout, pod A will always created in node 1 and reachable using ip 192.168.1.1. The same configuration also applied for pod B.

Deployment in Kubernetes

First we need to create service.yaml, secret.yaml, configmap.yaml and statefulsets.yaml. Below is the service.yaml, this service will be used in statefulsets configuration.

apiVersion: v1
kind: Service
metadata:
  name: db-headless
  namespace: database
spec:
  clusterIP: None
  clusterIPs:
  - None
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: mysql
    port: 3306
    protocol: TCP
    targetPort: mysql
  selector:
    app.kubernetes.io/name: mix-mariadb-galera
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
This configmap is used by the pod as configuration template:
apiVersion: v1
data:
  my.cnf: |
    [client]
    port=3306
    socket=/opt/bitnami/mariadb/tmp/mysql.sock
    plugin_dir=/opt/bitnami/mariadb/plugin

    [mysqld]
    default_storage_engine=InnoDB
    basedir=/opt/bitnami/mariadb
    datadir=/bitnami/mariadb/data
    plugin_dir=/opt/bitnami/mariadb/plugin
    tmpdir=/opt/bitnami/mariadb/tmp
    socket=/opt/bitnami/mariadb/tmp/mysql.sock
    pid_file=/opt/bitnami/mariadb/tmp/mysqld.pid
    bind_address=0.0.0.0

    ## Character set
    ##
    collation_server=utf8_unicode_ci
    init_connect='SET NAMES utf8'
    character_set_server=utf8

    ## MyISAM
    ##
    key_buffer_size=32M
    myisam_recover_options=FORCE,BACKUP

    ## Safety
    ##
    skip_host_cache
    skip_name_resolve
    max_allowed_packet=16M
    max_connect_errors=1000000
    sql_mode=STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_AUTO_VALUE_ON_ZERO,NO_ENGINE_SUBSTITUTION,NO_ZERO_DATE,NO_ZERO_IN_DATE,ONLY_FULL_GROUP_BY
    sysdate_is_now=1

    ## Binary Logging
    ##
    log_bin=mysql-bin
    expire_logs_days=14
    # Disabling for performance per http://severalnines.com/blog/9-tips-going-production-galera-cluster-mysql
    sync_binlog=0
    # Required for Galera
    binlog_format=row

    ## Caches and Limits
    ##
    tmp_table_size=32M
    max_heap_table_size=32M
    # Re-enabling as now works with Maria 10.1.2
    query_cache_type=1
    query_cache_limit=4M
    query_cache_size=256M
    max_connections=500
    thread_cache_size=50
    open_files_limit=65535
    table_definition_cache=4096
    table_open_cache=4096

    ## InnoDB
    ##
    innodb=FORCE
    innodb_strict_mode=1
    # Mandatory per https://github.com/codership/documentation/issues/25
    innodb_autoinc_lock_mode=2
    # Per https://www.percona.com/blog/2006/08/04/innodb-double-write/
    innodb_doublewrite=1
    innodb_flush_method=O_DIRECT
    innodb_log_files_in_group=2
    innodb_log_file_size=128M
    innodb_flush_log_at_trx_commit=1
    innodb_file_per_table=1
    # 80% Memory is default reco.
    # Need to re-evaluate when DB size grows
    innodb_buffer_pool_size=2G
    innodb_file_format=Barracuda

    ## Logging
    ##
    log_error=/opt/bitnami/mariadb/logs/mysqld.log
    slow_query_log_file=/opt/bitnami/mariadb/logs/mysqld.log
    log_queries_not_using_indexes=1
    slow_query_log=1

    ## SSL
    ## Use extraVolumes and extraVolumeMounts to mount /certs filesystem
    # ssl_ca=/certs/ca.pem
    # ssl_cert=/certs/server-cert.pem
    # ssl_key=/certs/server-key.pem

    [galera]
    wsrep_on=ON
    wsrep_provider=/opt/bitnami/mariadb/lib/libgalera_smm.so
    wsrep_sst_method=mariabackup
    wsrep_slave_threads=4
    wsrep_cluster_address=gcomm://
    wsrep_cluster_name=galera
    wsrep_sst_auth="root:"
    # Enabled for performance per https://mariadb.com/kb/en/innodb-system-variables/#innodb_flush_log_at_trx_commit
    innodb_flush_log_at_trx_commit=2
    # MYISAM REPLICATION SUPPORT #
    wsrep_replicate_myisam=ON

    [mariadb]
    plugin_load_add=auth_pam
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/instance: galera-db
    app.kubernetes.io/name: mariadb-galera
  name: mariadb-galera-configuration
  namespace: database
secret.yaml is used to define database password, the password is in base64 encoded format.
apiVersion: v1
data:
  mariadb-galera-mariabackup-password: cGFzc1dPUkQK
  mariadb-password: cGFzc1dPUkQK
  mariadb-root-password: cGFzc1dPUkQK
kind: Secret
metadata:
  labels:
    app.kubernetes.io/instance: galera-db
    app.kubernetes.io/name: mariadb-galera
  name: secret-mariadb-galera
  namespace: database
type: Opaque
The last config is statefulsets.yaml, to define pod creation.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app.kubernetes.io/instance: INS_NAME
    app.kubernetes.io/name: mix-mariadb-galera
  name: APP_NAME
  namespace: database
spec:
  podManagementPolicy: OrderedReady
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/instance: INS_NAME
      app.kubernetes.io/name: mix-mariadb-galera
  template:
    metadata:
      creationTimestamp: null
      labels:
        app.kubernetes.io/instance: INS_NAME
        app.kubernetes.io/name: mix-mariadb-galera
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchLabels:
                  app.kubernetes.io/instance: INS_NAME
                  app.kubernetes.io/name: mix-mariadb-galera
              namespaces:
              - database
              topologyKey: kubernetes.io/hostname
            weight: 1
      containers:
      - command:
        - bash
        - -ec
        - |
          exec /opt/bitnami/scripts/mariadb-galera/entrypoint.sh /opt/bitnami/scripts/mariadb-galera/run.sh
        env:
        - name: MY_POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: BITNAMI_DEBUG
          value: "false"
        - name: MARIADB_DEFAULT_PORT_NUMBER
          value: "3306"
        - name: MARIADB_GALERA_CLUSTER_NAME
          value: galera
        - name: MARIADB_GALERA_CLUSTER_ADDRESS
          value: gcomm://192.168.1.1:30002,192.168.1.2:30002,192.168.2.1:30002
        - name: MARIADB_ROOT_USER
          value: root
        - name: MARIADB_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              key: mariadb-root-password
              name: secret-mariadb-galera
        - name: MARIADB_DATABASE
          value: my_database
        - name: MARIADB_GALERA_MARIABACKUP_USER
          value: mariabackup
        - name: MARIADB_GALERA_MARIABACKUP_PASSWORD
          valueFrom:
            secretKeyRef:
              key: mariadb-galera-mariabackup-password
              name: secret-mariadb-galera
        - name: MARIADB_ENABLE_LDAP
          value: "no"
        - name: MARIADB_EXTRA_FLAGS
          value: --wsrep_provider_options=ist.recv_addr=NODE_IP:IST_PORT;ist.recv_bind=0.0.0.0:IST_PORT
            --wsrep_sst_receive_address=NODE_IP:SST_PORT --wsrep_node_incoming_address=NODE_IP
        - name: MARIADB_ENABLE_TLS
          value: "no"
        - name: MARIADB_GALERA_NODE_ADDRESS
          value: NODE_IP:GALERA_PORT
        image: docker.io/bitnami/mariadb-galera:10.6.7-debian-10-r17
        imagePullPolicy: IfNotPresent
        name: mariadb-galera
        ports:
        - containerPort: 3306
          hostIP: NODE_IP
          hostPort: MYSQL_PORT
          name: mysql
          protocol: TCP
        - containerPort: GALERA_PORT
          hostIP: NODE_IP
          hostPort: GALERA_PORT
          name: galera
          protocol: TCP
        - containerPort: IST_PORT
          hostIP: NODE_IP
          hostPort: IST_PORT
          name: ist
          protocol: TCP
        - containerPort: SST_PORT
          hostIP: NODE_IP
          hostPort: SST_PORT
          name: sst
          protocol: TCP
        - containerPort: GALERA_PORT
          hostIP: NODE_IP
          hostPort: GALERA_PORT
          name: galera-udp
          protocol: UDP
        readinessProbe:
          exec:
            command:
            - bash
            - -ec
            - |
              password_aux="${MARIADB_ROOT_PASSWORD:-}"
              if [[ -f "${MARIADB_ROOT_PASSWORD_FILE:-}" ]]; then
                  password_aux=$(cat "$MARIADB_ROOT_PASSWORD_FILE")
              fi
              exec mysqladmin status -u"${MARIADB_ROOT_USER}" -p"${password_aux}"
          failureThreshold: 3
          initialDelaySeconds: 30
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        livenessProbe:
          exec:
            command:
              - bash
              - '-ec'
              - >
                password_aux="${MARIADB_ROOT_PASSWORD:-}"

                if [[ -f "${MARIADB_ROOT_PASSWORD_FILE:-}" ]]; then
                    password_aux=$(cat "$MARIADB_ROOT_PASSWORD_FILE")
                fi

                exec mysql -u"${MARIADB_ROOT_USER}" -p"${password_aux}" -e "select *
                from mysql.wsrep_cluster_members;"
          failureThreshold: 30
          initialDelaySeconds: 320
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 3
        resources: {}
        securityContext:
          runAsNonRoot: true
          runAsUser: 1001
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /opt/bitnami/mariadb/.bootstrap
          name: previous-boot
        - mountPath: /bitnami/mariadb
          name: data
        - mountPath: /bitnami/conf/my.cnf
          name: mariadb-galera-config
          subPath: my.cnf
      dnsPolicy: ClusterFirst
      nodeName: NODE_NAME
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        fsGroup: 1001
        runAsUser: 1001
      serviceAccount: default
      serviceAccountName: default
      terminationGracePeriodSeconds: 30
      volumes:
      - emptyDir: {}
        name: previous-boot
      - configMap:
          defaultMode: 420
          name: mariadb-galera-configuration
        name: mariadb-galera-config
  serviceName: db-headless
  updateStrategy:
    type: RollingUpdate
  volumeClaimTemplates:
  - apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      creationTimestamp: null
      labels:
        app.kubernetes.io/instance: INS_NAME
        app.kubernetes.io/name: mix-mariadb-galera
      name: data
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 200Gi
      volumeMode: Filesystem
    status:
      phase: Pending
This statefulsets.yaml is only a template, we need to substitute some variable when implement it in kubernetes. Next, we deploy service, secret and configmap to kubernetes cluster.
kubectl apply -f service.yaml
kubectl apply -f secret.yaml
kubectl apply -f configmap.yaml
To substitute the variable in statefulsets.yaml, create this simple bash file, deploy.sh.
#!/bin/bash

INS_NAME=$1
APP_NAME=${INS_NAME}-mariadb-galera
NODE_IP=$2
MYSQL_PORT=30001
GALERA_PORT=30002
IST_PORT=30003
SST_PORT=30004
NODE_NAME=$3

cat statefulsets.yaml | sed "s/APP_NAME/${APP_NAME}/g" | sed "s/NODE_IP/${NODE_IP}/g" | sed "s/GALERA_PORT/${GALERA_PORT}/g" | sed "s/IST_PORT/${IST_PORT}/g" | sed "s/SST_PORT/${SST_PORT}/g" | sed "s/NODE_NAME/${NODE_NAME}/g" | sed "s/INS_NAME/${INS_NAME}/g" | sed "s/MYSQL_PORT/${MYSQL_PORT}/g" | kubectl apply --validate=false -f -
and then run the script using this command to deploy the pod. Note that the node name must be same with the node hostname. In above example k8s_node_1 and k8s_node_2. For initial start, in k8s_node_1, we need to change the MARIADB_GALERA_CLUSTER_ADDRESS to "gcomm://" in statefulsets.yaml, this tell mariadb to bootstrap the first pod. Then run:
./deploy.sh db1 192.168.1.1 k8s_node_1
After pod A completely running, join the pod B to galera cluster by running this command:
./deploy.sh db2 192.168.1.2 k8s_node_2

Deployment using Docker

Next step is to start docker in another server. Run this command:
docker run -d --name db3-mariadb-galera-0 \
  -e MARIADB_GALERA_CLUSTER_NAME=galera \
  -e MARIADB_GALERA_CLUSTER_ADDRESS=gcomm://192.168.1.1:30002,192.168.1.2:30002,192.168.2.1:30002 \
-e MARIADB_GALERA_MARIABACKUP_PASSWORD=passWORD \ -e MARIADB_ROOT_PASSWORD=passWORD \ -e MARIADB_GALERA_NODE_NAME=db3-mariadb-galera-0 \ -e MARIADB_REPLICATION_PASSWORD=passWORD \ -e MARIADB_GALERA_NODE_ADDRESS=192.168.2.1:30002 \
-e MARIADB_EXTRA_FLAGS="--wsrep_provider_options=ist.recv_addr=192.168.2.1:30003;ist.recv_bind=0.0.0.0:30003 --wsrep_node_incoming_address=192.168.2.1 --wsrep_sst_receive_address=192.168.2.1:30004" \
-p 30001:3306 -p 30004:30004 -p 30002:30002/tcp -p 30002:30002/udp -p 30003:30003 \
-v /var/lib/mysql:/bitnami/mariadb \ --health-cmd='mysql -uroot -ppassWORD -e "select * from mysql.wsrep_cluster_members;" || exit 1' \
--health-timeout=10s \ --health-retries=30 \ --health-interval=10s \ --restart always \ bitnami/mariadb-galera:latest
Don't forget to create folder /var/lib/mysql and give full permission to the folder for persistent storage.

Finishing touch

After all mariadb are connected each others, we must change again the MARIADB_GALERA_CLUSTER_ADDRESS value of pod A back to "gcomm://192.168.1.1:30002,192.168.1.2:30002,192.168.2.1:30002", change can be done by directly edit the statefulsets configuration using kubectl edit or reapply the deploy.sh command for pod A. This step is needed to make pod A to be able to join the cluster whenever the pod is restarted because of failure or pod migration.

Access the Database

The database can be accessed by connecting to any IP with port 30001, if you want to have single point of entry, you can define a load balancer that pointing to corresponding ip and port.
mysql -uroot -ppassWORD -h 192.168.1.1 -P 30001 -e "select * from mysql.wsrep_cluster_members;"

References

Comments

Popular Posts