Streaming backups in parallel using tee

So you need to build a new set of databases, perhaps in a new location or geographical zone. Management wants it done yesterday cause the newly updated service hit the front page of reddit and your website and its back-end databases are getting smashed.

The standard method would be to stream a backup from the master or a dedicated backup slave and build each new read only slave from that backup.

You tried streaming the backups in parallel using pssh because some random database blog said you can. https://dbadojo.com/2020/08/26/streaming-backups-in-parallel-using-pssh/

But that failed with a python memory error, you don’t want to use the split workaround and management are still screaming at you….so you search again…

Thinking of the basic requirements, you want one stream of data (from the backup) to go to multiple locations. So you want a linux utility which can redirect streams… enter the venerable tee command.
Tee will redirect a stream from a pipe to as many locations as you want, with the last stream going to a file or that webscale file called /dev/null.

Example of streaming a backup to two locations:

mariabackup --backup --slave-info --tmpdir=/tmp --stream=xbstream \
--parallel=4 --datadir=/var/lib/mysql 2>backup.log | tee \
>(ssh -q 192.168.56.112 -t "mbstream --directory=/var/lib/mysql -x --parallel=4") \
>(ssh -q 192.168.56.113 -t "mbstream --directory=/var/lib/mysql -x --parallel=4") \
> /dev/null

This command will split the streamed backup into two ssh commands which connect to two hosts, and run the mbstream -x command to create an unprepared backup in the datadir.
If you have more hosts, it is just a matter of adding redirections and the associated commands.

One space and performance improvement would be to add pigz to parallel compress before the tee and in each tee’ed ssh command.

Until next time.

 

Full example: Parallel streaming backup using tee.

-- Split standard input into separate parallel streams using tee

mariabackup --backup --slave-info --tmpdir=/tmp --stream=xbstream --parallel=4 --datadir=/var/lib/mysql 2>backup.log | tee >(ssh -q 192.168.56.112 -t "mbstream --directory=/var/lib/mysql -x --parallel=4") >(ssh -q 192.168.56.113 -t "mbstream --directory=/var/lib/mysql -x --parallel=4") > /dev/null

-- Use pssh to run prepare in parallel. This is a better use case for pssh

[root@db1 test_db]# pssh -i --host='192.168.56.112 192.168.56.113' "mariabackup --prepare --target-dir=/var/lib/mysql"
[1] 05:10:23 [SUCCESS] 192.168.56.112
Stderr: mariabackup based on MariaDB server 10.4.14-MariaDB Linux (x86_64)
[00] 2020-08-31 05:10:22 cd to /var/lib/mysql/
[00] 2020-08-31 05:10:22 This target seems to be not prepared yet.
[00] 2020-08-31 05:10:22 mariabackup: using the following InnoDB configuration for recovery:
[00] 2020-08-31 05:10:22 innodb_data_home_dir = .
[00] 2020-08-31 05:10:22 innodb_data_file_path = ibdata1:12M:autoextend
[00] 2020-08-31 05:10:22 innodb_log_group_home_dir = .
[00] 2020-08-31 05:10:22 InnoDB: Using Linux native AIO
[00] 2020-08-31 05:10:22 Starting InnoDB instance for recovery.
[00] 2020-08-31 05:10:22 mariabackup: Using 104857600 bytes for buffer pool (set by --use-memory parameter)
2020-08-31 5:10:22 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2020-08-31 5:10:22 0 [Note] InnoDB: Uses event mutexes
2020-08-31 5:10:22 0 [Note] InnoDB: Compressed tables use zlib 1.2.7
2020-08-31 5:10:22 0 [Note] InnoDB: Number of pools: 1
2020-08-31 5:10:22 0 [Note] InnoDB: Using SSE2 crc32 instructions
2020-08-31 5:10:22 0 [Note] InnoDB: Initializing buffer pool, total size = 100M, instances = 1, chunk size = 100M
2020-08-31 5:10:22 0 [Note] InnoDB: Completed initialization of buffer pool
2020-08-31 5:10:22 0 [Note] InnoDB: page_cleaner coordinator priority: -20
2020-08-31 5:10:22 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=2008409552
2020-08-31 5:10:22 0 [Note] InnoDB: Last binlog file './mysql-bin.000003', position 172310127
[00] 2020-08-31 05:10:22 Last binlog file ./mysql-bin.000003, position 172310127
[00] 2020-08-31 05:10:23 completed OK!
[2] 05:10:23 [SUCCESS] 192.168.56.113
Stderr: mariabackup based on MariaDB server 10.4.14-MariaDB Linux (x86_64)
[00] 2020-08-31 05:10:22 cd to /var/lib/mysql/
[00] 2020-08-31 05:10:22 This target seems to be not prepared yet.
[00] 2020-08-31 05:10:22 mariabackup: using the following InnoDB configuration for recovery:
[00] 2020-08-31 05:10:22 innodb_data_home_dir = .
[00] 2020-08-31 05:10:22 innodb_data_file_path = ibdata1:12M:autoextend
[00] 2020-08-31 05:10:22 innodb_log_group_home_dir = .
[00] 2020-08-31 05:10:22 InnoDB: Using Linux native AIO
[00] 2020-08-31 05:10:22 Starting InnoDB instance for recovery.
[00] 2020-08-31 05:10:22 mariabackup: Using 104857600 bytes for buffer pool (set by --use-memory parameter)
2020-08-31 5:10:22 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2020-08-31 5:10:22 0 [Note] InnoDB: Uses event mutexes
2020-08-31 5:10:22 0 [Note] InnoDB: Compressed tables use zlib 1.2.7
2020-08-31 5:10:22 0 [Note] InnoDB: Number of pools: 1
2020-08-31 5:10:22 0 [Note] InnoDB: Using SSE2 crc32 instructions
2020-08-31 5:10:22 0 [Note] InnoDB: Initializing buffer pool, total size = 100M, instances = 1, chunk size = 100M
2020-08-31 5:10:22 0 [Note] InnoDB: Completed initialization of buffer pool
2020-08-31 5:10:22 0 [Note] InnoDB: page_cleaner coordinator priority: -20
2020-08-31 5:10:22 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=2008409552
2020-08-31 5:10:22 0 [Note] InnoDB: Last binlog file './mysql-bin.000003', position 172310127
[00] 2020-08-31 05:10:23 Last binlog file ./mysql-bin.000003, position 172310127
[00] 2020-08-31 05:10:23 completed OK!

-- change ownership in parallel.

[root@db1 test_db]# pssh -i --host='192.168.56.112 192.168.56.113' "chown -R mysql:mysql /var/lib/mysql"
[1] 05:10:41 [SUCCESS] 192.168.56.112
[2] 05:10:41 [SUCCESS] 192.168.56.113

-- start databases in parallel.

[root@db1 test_db]# pssh -i --host='192.168.56.112 192.168.56.113' "systemctl start mariadb"
[1] 05:10:48 [SUCCESS] 192.168.56.112
[2] 05:10:48 [SUCCESS] 192.168.56.113

-- check the status of databases ... you guessed it, in parallel.

[root@db1 test_db]# pssh -i --host='192.168.56.112 192.168.56.113' "systemctl status mariadb"
[1] 05:10:53 [SUCCESS] 192.168.56.112
● mariadb.service - MariaDB 10.4.14 database server
Loaded: loaded (/usr/lib/systemd/system/mariadb.service; disabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/mariadb.service.d
└─migrated-from-my.cnf-settings.conf
Active: active (running) since Mon 2020-08-31 05:10:48 UTC; 4s ago
Docs: man:mysqld(8)
https://mariadb.com/kb/en/library/systemd/
Process: 4219 ExecStartPost=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
Process: 4163 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= || VAR=`cd /usr/bin/..; /usr/bin/galera_recovery`; [ $? -eq 0 ] && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1 (code=exited, status=0/SUCCESS)
Process: 4161 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
Main PID: 4186 (mysqld)
Status: "Taking your SQL requests now..."
CGroup: /system.slice/mariadb.service
└─4186 /usr/sbin/mysqld

Aug 31 05:10:48 db2 systemd[1]: Starting MariaDB 10.4.14 database server...
Aug 31 05:10:48 db2 mysqld[4186]: 2020-08-31 5:10:48 0 [Note] /usr/sbin/mysqld (mysqld 10.4.14-MariaDB-log) starting as process 4186 ...
Aug 31 05:10:48 db2 mysqld[4186]: 2020-08-31 5:10:48 0 [Warning] Could not increase number of max_open_files to more than 16384 (request: 32183)
Aug 31 05:10:48 db2 systemd[1]: Started MariaDB 10.4.14 database server.
[2] 05:10:53 [SUCCESS] 192.168.56.113
● mariadb.service - MariaDB 10.4.14 database server
Loaded: loaded (/usr/lib/systemd/system/mariadb.service; disabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/mariadb.service.d
└─migrated-from-my.cnf-settings.conf
Active: active (running) since Mon 2020-08-31 05:10:48 UTC; 4s ago
Docs: man:mysqld(8)
https://mariadb.com/kb/en/library/systemd/
Process: 4225 ExecStartPost=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
Process: 4169 ExecStartPre=/bin/sh -c [ ! -e /usr/bin/galera_recovery ] && VAR= || VAR=`cd /usr/bin/..; /usr/bin/galera_recovery`; [ $? -eq 0 ] && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1 (code=exited, status=0/SUCCESS)
Process: 4167 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
Main PID: 4192 (mysqld)
Status: "Taking your SQL requests now..."
CGroup: /system.slice/mariadb.service
└─4192 /usr/sbin/mysqld

Aug 31 05:10:48 db3 systemd[1]: Starting MariaDB 10.4.14 database server...
Aug 31 05:10:48 db3 mysqld[4192]: 2020-08-31 5:10:48 0 [Note] /usr/sbin/mysqld (mysqld 10.4.14-MariaDB-log) starting as process 4192 ...
Aug 31 05:10:48 db3 mysqld[4192]: 2020-08-31 5:10:48 0 [Warning] Could not increase number of max_open_files to more than 16384 (request: 32183)
Aug 31 05:10:48 db3 systemd[1]: Started MariaDB 10.4.14 database server.