Faster Galera SST times using [sst] options.

This is a follow-up to the article where I outlined how to use the power of Linux pipes and pigz to compress and decompress your data in parallel for faster streaming backups.
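
For context, the pattern from that post looked something like this (a minimal sketch, not a drop-in command; the user, host and target directory are placeholders, and it assumes mariabackup on the donor with mbstream on the receiving side):

# Stream a backup, compressing with 8 pigz threads on the source
# and decompressing with 8 threads on the destination
mariabackup --backup --stream=xbstream | pigz -p 8 \
  | ssh user@replica 'pigz -dc -p 8 | mbstream -x -C /var/lib/mysql-restore'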

Out of the box, the SST scripts used by Galera don’t try to optimize for speed. They are just trying to be robust and not fail at the job they are required to do. That job is to get a Galera node back in sync with the cluster. But we can do better…

So I looked into the options you can pass to the SST process by adding an [sst] section to your .cnf file. I tested them in a lab running the same OS and DB version as the production database we were supporting, to make sure the SST wouldn’t break.
And the results? We got a similar speed-up in SST duration to the one I had seen when streaming backups to build a backup replica.

The time difference when dealing with large datasets is huge.

Notes:

  • The options are added to a .cnf file on every DB server in the Galera cluster.
  • pigz needs to be installed on all DB servers in the Galera cluster (see the quick check after this list).
  • WARNING: Test this in a lab environment before rolling it out to production.
  • WARNING: Test this in a lab environment before rolling it out to production.
  • WARNING: Test this in a lab environment before rolling it out to production.
  • Getting the idea yet? Just test it first already.
  • These options will work with either:
    wsrep_sst_method=mariabackup
    or
    wsrep_sst_method=xtrabackup-v2
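
Since a missing pigz binary will make the SST fail, it is worth a quick check on each node first (a small sketch; the package names are what I would expect on common distros, and RHEL/CentOS may need the EPEL repo):

# Flag any node that does not have pigz on its PATH
command -v pigz >/dev/null 2>&1 || echo "pigz missing on $(hostname)"
# Debian/Ubuntu: sudo apt-get install -y pigz
# RHEL/CentOS:   sudo yum install -y pigz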

SST options to speed things up using pigz and parallel threads:

[sst]
compressor='pigz -p 8'
decompressor='pigz -dc -p 8'
inno-backup-opts="--parallel=8"

Explanation:

compressor='pigz -p 8' : compress the streaming backup with pigz using 8 threads
decompressor='pigz -dc -p 8' : decompress the streaming backup with pigz using 8 threads
inno-backup-opts="--parallel=8" : back up the database using 8 parallel copy threads
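
If you want to be sure the compressor and decompressor flags round-trip data cleanly before an SST depends on them, a quick local test does the trick (nothing Galera-specific here; the file paths are arbitrary):

# Round-trip 100 MB of random data through the same pigz flags used above
head -c 100M /dev/urandom > /tmp/sst_test.bin
pigz -p 8 < /tmp/sst_test.bin > /tmp/sst_test.gz
pigz -dc -p 8 < /tmp/sst_test.gz | cmp - /tmp/sst_test.bin && echo "round-trip OK"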

Documentation:

https://mariadb.com/kb/en/mariabackup-options/#-parallel
https://www.percona.com/doc/percona-xtrabackup/2.3/innobackupex/innobackupex_option_reference.html

Until next time.

Re-constructing directory structure on Linux

If you ever need to re-create a directory structure on a different Linux/Unix machine, you can just run this command.

# Generate a list of mkdir commands to re-create the directory structure from the current location

find . -type d | while IFS= read -r line; do echo "mkdir -p \"$line\""; done
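
If the target machine is reachable over SSH, you can skip the generated script and pipe the directory list straight across (a sketch; user@newhost and /target/path are placeholders, and -print0 with xargs -0 keeps paths containing spaces intact):

find . -type d -print0 | ssh user@newhost 'cd /target/path && xargs -0 mkdir -p'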

If you want to copy the files as well, just use scp or rsync.
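
For example, a single rsync will copy the directories and files in one go (again, the host and paths are placeholders):

# -a preserves permissions and timestamps and recurses; add -n first for a dry run
rsync -av ./ user@newhost:/target/path/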

The use case for these kinds of commands is greatly reduced nowadays: if you are using DevOps tools such as Puppet or Chef, they will do this kind of thing automagically out of the box. If you are running your databases on VMs (with the datafiles inside the VM), most of the time you can just clone the image and everything stays the same.
The aim of all these tools is to make the job of sysadmins and DBAs easier whilst producing an environment whose state is consistent and known.

Have Fun