Look out: crash bug when using innodb_track_changed_pages with O_DIRECT


If you are using innodb_flush_method = O_DIRECT (which is highly recommended for a bunch of reasons) together with innodb_track_changed_pages, the instance will crash as soon as you query any of the tables related to that feature (such as INFORMATION_SCHEMA.INNODB_CHANGED_PAGES).

innodb_track_changed_pages is a Percona system variable that makes incremental backups much faster.

If you are using Percona’s great backup and recovery software suite and running at least MySQL 5.5.27, you might have enabled this variable to speed up your incremental backups.

You may want to reconsider that choice and turn it off. Read on to see why.



I was working with a client a while back (12 months ago), looking to improve their incremental backup durations.

I turned on the feature and restarted the MySQL instance. As soon as I queried the table, the instance crashed.

After a couple of tests, the change was hastily rolled back and I raised this bug ticket with Percona.


Testing, preparing and getting a bug raised took time, but apart from Rick Pizzi also posting that he had a similar crash, there has been no activity on this. It is like the crickets chirping in the Simpsons.


Zero activity, zero response. The bug’s status has been New and unassigned since Jan 27th 2016.

Yes, this is an older version, Percona MySQL 5.5, but the bug also affects versions up to 5.7.

How many installations are running between 5.5 and 5.7? I would have thought the bulk of the installed base.

Goal of this post:

Hopefully, this article will prompt someone to take a look at the bug.

If it seems like some of my recent blog posts have been raising issues with Percona, it is because I use, support and recommend Percona MySQL databases and tools every day even though I don’t work at Percona. I love that Percona provide the tools and software they do. Having a crash bug open for more than 12 months is disappointing.

The bug in action:

mysql> select @@version;
| @@version       |
| 5.5.54-38.6-log | <== The latest 5.5 version
1 row in set (0.00 sec)

mysql> show global variables like 'innodb%track%';
| Variable_name              | Value |
| innodb_track_changed_pages | OFF   |
1 row in set (0.00 sec)

-- restart after adding variable to .cnf file

mysql> show global variables like 'innodb%track%';
| Variable_name              | Value |
| innodb_track_changed_pages | ON    |
1 row in set (0.00 sec)

| count(*) |
|        0 |
1 row in set (0.00 sec)

-- it works because there is nothing in the bitmap file

root@db1:~# ls -l /var/lib/mysql/
total 28704
-rw-r--r-- 1 root  root         0 Feb 18 03:01 debian-5.5.flag
-rw-rw---- 1 mysql mysql 18874368 Feb 18 03:06 ibdata1
-rw-rw---- 1 mysql mysql  5242880 Feb 18 03:06 ib_logfile0
-rw-rw---- 1 mysql mysql  5242880 Feb 18 03:01 ib_logfile1
-rw-rw---- 1 mysql mysql        0 Feb 18 03:06 ib_modified_log_1_0.xdb <=== BITMAP FILE
drwx------ 2 mysql root      4096 Feb 18 03:01 mysql
-rw-rw---- 1 mysql mysql      126 Feb 18 03:05 mysql-bin.000001
-rw-rw---- 1 mysql mysql      126 Feb 18 03:06 mysql-bin.000002
-rw-rw---- 1 mysql mysql      107 Feb 18 03:06 mysql-bin.000003
-rw-rw---- 1 mysql mysql       96 Feb 18 03:06 mysql-bin.index
-rw------- 1 root  root        11 Feb 18 03:01 mysql_upgrade_info
drwx------ 2 mysql mysql     4096 Feb 18 03:01 performance_schema
drwx------ 2 mysql root      4096 Feb 18 03:01 test

-- nothing wrong at this point... now change some rows

mysql> show databases;
| Database           |
| information_schema |
| mysql              |
| performance_schema |
| test               |
4 rows in set (0.00 sec)

mysql> use test;
Database changed
mysql> show tables;
Empty set (0.00 sec)

-- Don't use column names like this at home folks...

mysql> create table test (blah varchar(10));
Query OK, 0 rows affected (0.02 sec)

-- Fred the world renowned tester

mysql> insert into test values ('fred');
Query OK, 1 row affected (0.00 sec)

mysql> exit
root@db1:~# ls -l /var/lib/mysql/
total 28708
-rw-r--r-- 1 root  root         0 Feb 18 03:01 debian-5.5.flag
-rw-rw---- 1 mysql mysql 18874368 Feb 18 03:08 ibdata1
-rw-rw---- 1 mysql mysql  5242880 Feb 18 03:08 ib_logfile0
-rw-rw---- 1 mysql mysql  5242880 Feb 18 03:01 ib_logfile1
-rw-rw---- 1 mysql mysql     4096 Feb 18 03:08 ib_modified_log_1_0.xdb <=== BITMAP FILE
drwx------ 2 mysql root      4096 Feb 18 03:01 mysql
-rw-rw---- 1 mysql mysql      126 Feb 18 03:05 mysql-bin.000001
-rw-rw---- 1 mysql mysql      126 Feb 18 03:06 mysql-bin.000002
-rw-rw---- 1 mysql mysql      396 Feb 18 03:08 mysql-bin.000003
-rw-rw---- 1 mysql mysql       96 Feb 18 03:06 mysql-bin.index
-rw------- 1 root  root        11 Feb 18 03:01 mysql_upgrade_info
drwx------ 2 mysql mysql     4096 Feb 18 03:01 performance_schema
drwx------ 2 mysql root      4096 Feb 18 03:08 test

-- still works as innodb_flush_method isn't set

mysql> show global variables like 'innodb_flush_method';
| Variable_name       | Value |
| innodb_flush_method |       |
1 row in set (0.00 sec)

| count(*) |
|       17 |
1 row in set (0.01 sec)

-- OK now turn on O_DIRECT

root@db1:~# vi /etc/mysql/my.cnf 
root@db1:~# service mysql restart
 * Stopping MySQL (Percona Server) mysqld                                                                        [ OK ] 
 * Starting MySQL (Percona Server) database server mysqld                                                        [ OK ] 
 * Checking for corrupt, not cleanly closed and upgrade needing tables.

mysql> show global variables like 'innodb_flush_method';
| Variable_name       | Value    |
| innodb_flush_method | O_DIRECT |
1 row in set (0.00 sec)

-- Bingo! Houston we have a problem.

ERROR 2013 (HY000): Lost connection to MySQL server during query
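
If you want to check whether a server is configured with the combination shown above, here is a minimal sketch in Python. It only looks at the text of a cnf file, it assumes both settings live in the [mysqld] section, and it simply skips !include / !includedir lines, so treat it as a first pass rather than a definitive audit. The function names are my own, not part of any MySQL or Percona tool.

```python
import configparser


def load_mysqld_options(cnf_text):
    """Parse the [mysqld] section of a my.cnf style file into a dict.

    Assumption: !include / !includedir directives are skipped, not followed.
    """
    lines = [ln for ln in cnf_text.splitlines() if not ln.lstrip().startswith("!")]
    parser = configparser.ConfigParser(
        allow_no_value=True,  # my.cnf allows bare options such as "skip-name-resolve"
        strict=False,         # tolerate duplicate options and sections
        interpolation=None,   # do not treat % as special
    )
    parser.read_string("\n".join(lines))
    if not parser.has_section("mysqld"):
        return {}
    # MySQL accepts dashes and underscores interchangeably in option names.
    return {key.replace("-", "_"): value for key, value in parser.items("mysqld")}


def has_risky_combination(cnf_text):
    """True if O_DIRECT and innodb_track_changed_pages are both enabled."""
    opts = load_mysqld_options(cnf_text)
    flush = (opts.get("innodb_flush_method") or "").strip().upper()
    if "innodb_track_changed_pages" not in opts:
        return False
    tracking = opts["innodb_track_changed_pages"]
    # A bare option with no value counts as enabled.
    tracking_on = tracking is None or tracking.strip().lower() in {"1", "on", "true"}
    return flush == "O_DIRECT" and tracking_on
```

Usage: has_risky_combination(open("/etc/mysql/my.cnf").read()) returns True when both settings are switched on in that file. Settings changed at runtime or pulled in from included files will not be seen.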


MySQL Practice challenges part one.

Are you ready to accept the challenge? Really?

Can you prove that you have got what it takes to be an effective DBA?

Go grab the tests from https://github.com/dbadojo/test-mysql-restore and see if you do…


These tests are designed to test your ability to do basic restores and recoveries of a MySQL database.

Each restore gets progressively more complex.

The key feature is that once you have successfully restored the MySQL database, you will get the encryption key/passphrase to do the next test.

Use: ccrypt -d <filename>.cpt to decrypt the file.


Requirements:

VirtualBox and Vagrant, if you are using a virtual machine to run the MySQL instance.
MySQL 5.6
ccrypt, or an equivalent that can read ccrypt encrypted files.

A vagrantfile is provided as an example to spawn a simple virtualbox VM with 1Gig of memory to run the small MySQL database required for the tests.

Once you have proved you are awesome…

Once you have completed all the current tests, email dbadojo@gmail.com and we will keep you informed when the next set becomes ready.

For the uber awesome DBAs, if you have ideas for more tests, email them to dbadojo@gmail.com or comment here or send a merge request.

P.S. There are more and harder restores to come… stay tuned.

MySQL upgrade 5.6 with innodb_fast_checksum=1


My checklist for performing an in-place MySQL upgrade to 5.6.


In my previous post, I discussed the problem I had when doing an in-place MySQL upgrade from 5.5 to 5.6 when the database had been running with innodb_fast_checksum=1.

The solution was to use the MySQL 5.7 version of the tool innochecksum. Using this tool on a shutdown database, you can force the checksums on the innodb datafiles to be rewritten into either INNODB or CRC32 format.

Once the MySQL 5.6 upgrade is done, the 5.6 version of mysqld will be able to read the datafiles correctly and not fail with an error.

There is already plenty of good documentation on the MySQL website on how to upgrade from 5.x to 5.6.


My checklist for in-place upgrading to MySQL 5.6:

  1. Perform application and database performance testing on your test environment to make sure your application performance doesn’t get worse when running on MySQL 5.6.
  2. Make sure you have backups and have verified that they are good, i.e. you have actually restored databases from those backups.
  3. Check that all users have updated their passwords to use the new mysql password hash (plugin).
  4. Organize downtime in advance.
  5. If running with innodb_fast_checksum=1, proceed with steps to replace the fast checksums with INNODB or CRC32.
    Note: if you use CRC32, you will need to make sure your cnf file is updated for 5.6 to use innodb_checksum_algorithm = CRC32. This is because innodb_checksum_algorithm = INNODB is the default setting. See this post for a sample procedure.
  6. Run a quick search of all existing .cnf files to find any other system variables which have been removed and either replace or remove them.
  7. Run the in-place upgrade.
  8. Run mysql_upgrade, it will flag if it doesn’t need to be run again.
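
Step 6 in the checklist above (searching cnf files for removed variables) is easy to script. Here is a rough sketch in Python. The only variable name taken from this post is innodb_fast_checksum; the list you pass in is up to you and should come from the release notes for your versions. The function name is hypothetical.

```python
import re


def find_removed_variables(cnf_text, removed_names):
    """Return (line number, option name) for each removed option still set.

    removed_names is supplied by the caller; dashes and underscores are
    treated as the same character, as MySQL does for option names.
    """
    wanted = {name.replace("-", "_").lower() for name in removed_names}
    hits = []
    for lineno, line in enumerate(cnf_text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blanks, comments, section headers and !include directives.
        if not stripped or stripped[0] in "#;[!":
            continue
        # The option name is everything before the first "=" or whitespace.
        name = re.split(r"[=\s]", stripped, maxsplit=1)[0]
        name = name.replace("-", "_").lower()
        if name in wanted:
            hits.append((lineno, name))
    return hits
```

Run it over each cnf file you deploy (including the copies in your chef/puppet/ansible repo) and fix every hit before starting the upgrade.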


innodb_fast_checksum=1 and upgrading to MySQL 5.6

The Percona version of MySQL has been such a good replacement for the generic MySQL version that many of the features and options that existed in Percona have been merged into the generic MySQL.

Innodb_fast_checksum was an option added to improve the performance of checksums.

The system variable was replaced by innodb_checksum_algorithm in 5.6.

Unfortunately, when you go to upgrade from Percona 5.x to Percona (or generic MySQL) 5.6, an in-place upgrade will fail.

The error(s) will generally be mysqld complaining that it can’t read the datafile. This is because the 5.6 version can’t read fast checksums.

Example errors:

InnoDB: checksum mismatch in data file
InnoDB: Could not open

The recommended option is to follow the default upgrade process: use mysqldump to dump your data out and reload it after you replace the binaries.

For large datasets or servers suffering poor IO performance, the time it takes to do that, even using a parallel dump and load tool, is prohibitive.

So are you looking for a workaround?

How about a MySQL tool which has been around for a while, called innochecksum?

This tool can check your datafiles to make sure the checksums are correct or, in our case, force the checksums to be rewritten in a specific format. I was thinking: prep work is done, now it is just process work. But alas, the versions of innochecksum for 5.5 and 5.6 don’t support file sizes over 2 GB.

Luckily, innochecksum for 5.7 does support larger file sizes and, best of all, it works on old version datafiles too. For people hitting this article in the future, 5.7 at the time was just an RC (release candidate).

To use this method:

  1. Backup your db or have good backups.
  2. Organize downtime for your db (slave preferably so you aren’t affecting traffic)
  3. Shutdown mysql
  4. Repeat for each innodb datafile: example command: innochecksum -vS --no-check --write=innodb <path to innodb datafile>
  5. Replace innodb_fast_checksum = 1 with innodb_fast_checksum = 0 in your my.cnf (and chef/puppet/ansible repo)
  6. Restart mysql
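
Step 4 above is the tedious part when there are many datafiles. Here is a small Python sketch that builds the command list for you. It assumes the datafiles are the ibdata* files in the data directory plus any *.ibd files underneath it, and it only prints the commands (a dry run) so you can review them before running anything against a shut down database. The flags are the ones from the example command above; the function name is made up for this post.

```python
from pathlib import Path


def innochecksum_commands(datadir, algorithm="innodb"):
    """Build one innochecksum command (as an argument list) per datafile.

    Assumption: datafiles are ibdata* in the data directory plus every
    *.ibd file below it. Nothing is executed here.
    """
    root = Path(datadir)
    datafiles = sorted(root.glob("ibdata*")) + sorted(root.rglob("*.ibd"))
    return [
        ["innochecksum", "-vS", "--no-check", f"--write={algorithm}", str(path)]
        for path in datafiles
    ]


def print_dry_run(datadir, algorithm="innodb"):
    """Print the commands so they can be reviewed before being run by hand."""
    for command in innochecksum_commands(datadir, algorithm):
        print(" ".join(command))
```

For example, print_dry_run("/var/lib/mysql") lists one command per datafile. Pass algorithm="crc32" if you are going the CRC32 route.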

I will cover the whole procedure for upgrading from Percona MySQL 5.5 to Percona MySQL 5.6 in more detail in a later post.

Fun tool tip:

I have had to compile the MySQL 5.7 innochecksum for an older linux kernel running glibc older than 2.14, and it works fine as well. The biggest headache was sorting out cmake, boost etc to enable the compilation of the MySQL 5.7 source code.

Have Fun

Prewarm your EBS backed EC2 MySQL slaves

This is the story of cold blocks and mismatched instances and how they will cause you pain and cost you money until you understand why.

Most of the clients that we support run on the Amazon cloud, using either RDS or MySQL on plain EC2 instances with Provisioned IOPS (PIOPS) EBS for data storage.

As expected the common architecture is running a master with one or more slaves handling the read traffic.

A common problem is that after the slaves are provisioned (normally created from an EBS snapshot) they lag badly due to slow IO performance.

Unfortunately what tends to be lost in the “speed of provisioning new resources” fetish is some limitations in terms of data persistence layer (EBS).

If you are using EBS and you have created the EBS volume from snapshot or created a new volume you have to pre-warm the EBS volume otherwise you will suffer a bad (I mean seriously bad) first usage penalty.  Bad? I am talking up to 50% performance drop[1]. So that expensive PIOPS EBS volume you created is going to perform like rubbish every time it reads/writes a cold block.

The other thing which tends to happen is pairing PIOPS EBS with the wrong instance type (network performance). This is the classic networked storage problem: the network is the bottleneck. If your instance type has limited network performance, provisioning more PIOPS than the network can handle means you are wasting money on PIOPS you can’t use. It is a bit like the old days (of dedicated servers and SAN storage) where the SAN could deliver 200-300 MB per sec, but the 1 Gigabit network could only do 40-50 MB per sec.

Here is the real downside. Using the cloud you can provision new resources to handle peak load (in this case more MySQL slaves to handle read load) as fast as you can click, or faster using API calls, or even automagically if you have some algo forecasting the need for additional resources. But… the EBS is all cold blocks, so these new instances will be up and available in minutes, yet the IO performance will be poor until you either pre-warm the volume or the slave gets around to reading/writing all the blocks.

So the common solution is to pre-warm the blocks by using dd to read the whole EBS device to /dev/null, which warms each block as it is read.

eg: sudo dd if=/dev/xvdf of=/dev/null bs=1M

Consider how long this will take for any reasonably sized DB (200 GB) using an instance with a 1 Gigabit network.

200 GB read at 50 MB/sec = 200,000 MB / 50 MB/sec = 4,000 secs = 3,600 secs (1 hr) + 400 secs (6 mins 40 secs), so more than 1 hr.

So you or your algo provisioned a new EC2 instance for the database in minutes but either your IO will be rubbish for an extended period, or you wait more than 1 hr per 200GB to have the EBS pre-warmed.
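
To run the same back-of-the-envelope numbers for your own volume sizes, here is a tiny Python sketch of the calculation above. It assumes 1 GB = 1,000 MB (as the calculation in this post does) and a constant read rate, which real volumes won’t give you, so treat the result as a rough lower bound.

```python
def prewarm_seconds(volume_gb, throughput_mb_per_sec):
    """Seconds needed to read the whole volume once at a constant rate.

    Assumes 1 GB = 1,000 MB, matching the calculation in this post.
    """
    return volume_gb * 1000 / throughput_mb_per_sec


def format_duration(seconds):
    """Render a number of seconds as hours, minutes and seconds."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}h {minutes}m {secs}s"


# The example from this post: 200 GB at 50 MB/sec.
print(format_duration(prewarm_seconds(200, 50)))  # prints "1h 6m 40s"
```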

What are the solutions?

  1. Forecast further in advance depending on the size of your db (or any other persistent storage layer eg NoSQL etc)
  2. Use ephemeral storage and manage the increased risk of data loss in the event of instance termination.
  3. Break your DB or your application into smaller pieces aka micro services.[2]
  4. Pay more $ and have your databases stay around longer so waiting for an instance to be ready in the beginning is not a problem.

As you can expect, most businesses are happy with option 4. Pay more, leave instances around like they were dedicated servers (base load). Amazon is happy too.

Option 3 whilst requiring some thought (argh) and additional complexity is where the real speed of provisioning, dare I say it, agile nature of the cloud will bear the most fruit.

[1] http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-prewarm.html

[2] http://martinfowler.com/articles/microservices.html


mysqlslap howto

I noticed that people were hitting the site for information on how to run mysqlslap.

To help out those searchers, here is a quick mysqlslap howto

  1. Make sure you have MySQL 5.1.4 or higher. Download MySQL from the MySQL website.
  2. Make sure your MySQL database is running.
  3. Run mysqlslap, using progressively more concurrent threads:
    mysqlslap  --concurrency=1,25,50,100 --iterations=10 --number-int-cols=2 \
    --number-char-cols=3 --auto-generate-sql --csv=/tmp/mysqlslap.csv \
    --engine=blackhole,myisam,innodb --auto-generate-sql-add-autoincrement \
    --auto-generate-sql-load-type=mixed --number-of-queries=100 --user=root \

For detailed descriptions of each parameter see the MySQL documentation:


If you want to see how I used mysqlslap to test MySQL performance on Amazon EC2, here is the list of posts


MySQL Error: error reconnecting to master

Error message:

Slave I/O thread: error reconnecting to master
Last_IO_Error: error connecting to master


Check that the slave can connect to the master instance, using the following steps:

  1. Use ping to check the master is reachable. eg ping master.yourdomain.com
  2. Use ping with ip address to check that DNS isn’t broken. eg. ping
  3. Use the mysql client to connect from the slave to the master, e.g. mysql -u repluser -pREPLPASS --host=master.yourdomain.com --port=3306 (substitute whatever port you are connecting to the master on)
  4. If all steps work, then check that repluser (the slave’s replication user) has the REPLICATION SLAVE privilege, e.g. show grants for 'repluser'@'slave.yourdomain.com';


  • If steps 1 and 2 both fail, you have a network or firewall issue. Check with a network/firewall administrator, or check the logs yourself if you wear those hats.
  • If Step 1 fails but Step 2 works, you have a DNS or name resolution issue. Check that the slave can connect and resolve names using the mysql client or ssh/telnet/remote desktop.
  • If Step 3 fails, check the error reported. It will be either an authentication issue (login failed/denied) or an issue with the TCP port the master is listening on. A good way to verify that the port is open is to use: telnet master.yourdomain.com 3306 (or the port the master is listening on). If that fails, then one or more firewalls in the network are blocking that port.
  • If you get to step 4, everything looks fine and the slave reconnects on retrying, then you have probably had a temporary network failure, name resolution failure, firewall failure, or some combination of these.

Continuing Sporadic issues:

Get hold of the network and firewall logs.
If this is not possible, set up a script to periodically ping, connect, mysql connect and log the results over time to prove to your friendly network admin that there is a problem with the network.

How MySQL deals with it:

MySQL will try and reconnect by itself after a network failure or query timeout.

The process is governed by a few variables:


In a nutshell, once a MySQL slave hits a timeout (slave-net-timeout), it tries to reconnect, waiting the number of seconds in master-connect-retry between attempts, but only for the number of times specified in master-retry-count.
By default, a MySQL slave waits one hour before the first retry, and will then retry every 60 seconds up to 86,400 times. That is every minute for 60 days.

If the one hour slave-net-timeout is too long for your DR/Slave read strategy you will need to adjust it accordingly.
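
If you do change these settings, it is worth recalculating how long the slave will keep trying before it gives up. A quick Python sketch of that arithmetic, using the defaults quoted above as the default arguments (the function name is just for illustration):

```python
from datetime import timedelta


def retry_window(connect_retry_seconds=60, retry_count=86400):
    """Total time the slave keeps retrying before giving up.

    Defaults are the values quoted in this post: one attempt every
    60 seconds, 86,400 attempts.
    """
    return timedelta(seconds=connect_retry_seconds * retry_count)


print(retry_window().days)  # prints 60
```

For example, retry_window(10, 360) gives one hour of retries at 10 second intervals.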

Edit: 2011/02/02

Thanks to leBolide. He discovered that there is a 32 character limit on the password for replication.

Have Fun


P.S. If you liked this post you might be good enough to try these challenges