Lookout: crashbug using innodb_track_changed_pages with O_DIRECT


If you are using innodb_flush_method = O_DIRECT (which is highly recommended for a bunch of reasons) and innodb_track_changed_pages the instance will crash if you query any tables related to that feature.

So innodb_track_changed_pages is a Percona system variable which is used to make incremental backups much faster.

So if you are using Percona’s great backup and recovery software suite and running at least MySQL 5.5.27 you might have enabled this variable to make your incremental backups run faster.

You may want to reconsider your choice and turn it off … continue to read why…



I was working on a client a while back (12 months ago) and I was looking to improve their incremental backup durations.

When I turned on the feature and restarted the MySQL instance, as soon as I queried the table the MySQL instance crashed.

After a couple of tests, the change was hastily rolled back and I raised this bug ticket with Percona.


Testing, preparing and getting a bug raised took time, but apart from Rick Pizzi also posting that he had a similar crash, there has been no activity on this. It is like the crickets chirping in the Simpsons.


Zero activity, zero response, the bug is Status is new and unassigned since Jan 27th 2016.

Yes this is an older version MySQL Percona 5.5, but it also affects versions up to 5.7.

How many installations are running between 5.5 and 5.7? I would have thought the bulk of the installed base.

Goal of this post:

Hopefully, this article will prompt someone to take a look at the bug.

If it seems like some of my recent blog posts have been raising issues with Percona, it is because I use, support and recommend Percona MySQL databases and tools every day even though I don’t work at Percona. I love that Percona provide the tools and software they do. Having a crash bug open for more than 12 months is disappointing.

The bug in action:

mysql> select @@version;
| @@version       |
| 5.5.54-38.6-log | <== The latest 5.5 version
1 row in set (0.00 sec)

mysql> show global variables like 'innodb%track%';
| Variable_name              | Value |
| innodb_track_changed_pages | OFF   |
1 row in set (0.00 sec)

-- restart after adding variable to .cnf file

mysql> show global variables like 'innodb%track%';
| Variable_name              | Value |
| innodb_track_changed_pages | ON    |
1 row in set (0.00 sec)

| count(*) |
|        0 |
1 row in set (0.00 sec)

-- it works because there is nothing in the bitmap file

root@db1:~# ls -l /var/lib/mysql/
total 28704
-rw-r--r-- 1 root  root         0 Feb 18 03:01 debian-5.5.flag
-rw-rw---- 1 mysql mysql 18874368 Feb 18 03:06 ibdata1
-rw-rw---- 1 mysql mysql  5242880 Feb 18 03:06 ib_logfile0
-rw-rw---- 1 mysql mysql  5242880 Feb 18 03:01 ib_logfile1
-rw-rw---- 1 mysql mysql        0 Feb 18 03:06 ib_modified_log_1_0.xdb <=== BITMAP FILE
drwx------ 2 mysql root      4096 Feb 18 03:01 mysql
-rw-rw---- 1 mysql mysql      126 Feb 18 03:05 mysql-bin.000001
-rw-rw---- 1 mysql mysql      126 Feb 18 03:06 mysql-bin.000002
-rw-rw---- 1 mysql mysql      107 Feb 18 03:06 mysql-bin.000003
-rw-rw---- 1 mysql mysql       96 Feb 18 03:06 mysql-bin.index
-rw------- 1 root  root        11 Feb 18 03:01 mysql_upgrade_info
drwx------ 2 mysql mysql     4096 Feb 18 03:01 performance_schema
drwx------ 2 mysql root      4096 Feb 18 03:01 test

-- nothing wrong at this point... now change some rows

mysql> show databases;
| Database           |
| information_schema |
| mysql              |
| performance_schema |
| test               |
4 rows in set (0.00 sec)

mysql> use test;
Database changed
mysql> show tables;
Empty set (0.00 sec)

-- Don't use column names like this at home folks...

mysql> create table test (blah varchar(10));
Query OK, 0 rows affected (0.02 sec)

-- Fred the world renowned tester

mysql> insert into test values ('fred');
Query OK, 1 row affected (0.00 sec)

mysql> exit
root@db1:~# ls -l /var/lib/mysql/
total 28708
-rw-r--r-- 1 root  root         0 Feb 18 03:01 debian-5.5.flag
-rw-rw---- 1 mysql mysql 18874368 Feb 18 03:08 ibdata1
-rw-rw---- 1 mysql mysql  5242880 Feb 18 03:08 ib_logfile0
-rw-rw---- 1 mysql mysql  5242880 Feb 18 03:01 ib_logfile1
-rw-rw---- 1 mysql mysql     4096 Feb 18 03:08 ib_modified_log_1_0.xdb <=== BITMAP FILE
drwx------ 2 mysql root      4096 Feb 18 03:01 mysql
-rw-rw---- 1 mysql mysql      126 Feb 18 03:05 mysql-bin.000001
-rw-rw---- 1 mysql mysql      126 Feb 18 03:06 mysql-bin.000002
-rw-rw---- 1 mysql mysql      396 Feb 18 03:08 mysql-bin.000003
-rw-rw---- 1 mysql mysql       96 Feb 18 03:06 mysql-bin.index
-rw------- 1 root  root        11 Feb 18 03:01 mysql_upgrade_info
drwx------ 2 mysql mysql     4096 Feb 18 03:01 performance_schema
drwx------ 2 mysql root      4096 Feb 18 03:08 test

-- still works as innodb_flush_method isn't set

mysql> show global variables like 'innodb_flush_method';
| Variable_name       | Value |
| innodb_flush_method |       |
1 row in set (0.00 sec)

| count(*) |
|       17 |
1 row in set (0.01 sec)

-- OK now turn on O_DIRECT

root@db1:~# vi /etc/mysql/my.cnf 
root@db1:~# service mysql restart
 * Stopping MySQL (Percona Server) mysqld                                                                        [ OK ] 
 * Starting MySQL (Percona Server) database server mysqld                                                        [ OK ] 
 * Checking for corrupt, not cleanly closed and upgrade needing tables.

mysql> show global variables like 'innodb_flush_method';
| Variable_name       | Value    |
| innodb_flush_method | O_DIRECT |
1 row in set (0.00 sec)

-- Bingo! Houston we have a problem.

ERROR 2013 (HY000): Lost connection to MySQL server during query

MySQL Practice challenges part one.

Are you ready to accept the challenge? Really?

Can you prove that you have got what it takes to be an effective DBA?

Go grab the tests from https://github.com/dbadojo/test-mysql-restore and see if you do…


These tests are designed to test your ability to do basic restores and recoveries of a MySQL database.

Each restore gets progressively more complex.

The key feature is, once you are successfully restored the MySQL database you will get the encryption key/passphrase
to do the next test.

Use: ccrypt -d <filename>.cpt to decrypt the file.


Virtualbox and vagrant if using a virtual machine for running the MySQL instance.
MySQL 5.6
ccrypt or equivalent that can read ccrypt encrypted files.

A vagrantfile is provided as an example to spawn a simple virtualbox VM with 1Gig of memory to run the small MySQL database required for the tests.

Once you have proved you are awesome…

Once you have completed all the current tests, email dbadojo@gmail.com and we will keep you informed when the next set becomes ready.

For the uber awesome DBAs, if you have ideas for more tests, email them to dbadojo@gmail.com or comment here or send a merge request.

P.S. There are more and harder restores to come… stay tuned.

MySQL vs PostgreSQL: Part 1

I have watched the progress of PostgreSQL for a while now and read the recent updates for the 9.5 release. I was impressed at the amount and depth of features that are in PostgreSQL.

Coming originally from an Oracle DBA background I tend to appreciate what features are missing in MySQL. MySQL in some respects benefited from a network effect from being the M part in the LAMP stack. But that is pretty much history now.

So how much difference is there between MySQL and PostgreSQL?

Many of the search results whilst good on pagerank are dated and old.

Here is the good page from wikipedia,


Pretty good comparison page,


A presentation of modern SQL using PostgreSQL

After all this I decided to consult with the magic mirror…

Mirror, mirror on the wall.
Who is the fairest database of all?


Turns out MySQL still is the fairest, but all the main relational/SQL databases have trended down over time.

Javascript, Node.js ,JSON and hence JSON datastores such as MongoDB are trending up from a very low base. So NoSQL is still kicking along fine. News of the death of NoSQL are over-stated.

I came to the conclusion that back in ’em olden days of the web, back in ’04, the searches were far more technical. By 2015 those type of searches have been overwhelmed by searches such as fb and facebook.

So my question remains partially un-answered. Yes MySQL is still clearly the most searched for term in relation to SQL databases, JSON related stuff is trending up. But in terms of overall search traffic all database related searches are going down.
An interesting sidenote, both MySQL and PostgreSQL have added JSON datatypes recently, so both databases are trying to adapt to the changes driven by trend to interactive webpages controlled by javascript.

In the end I came to the conclusion that I need to spend more time with PostgreSQL specifically its biggest feature gap it had with MySQL until recently i.e. replication.

Until next time…

MySQL upgrade 5.6 with innodb_fast_checksum=1


My checklist for performing an in-place MySQL upgrade to 5.6.


In my previous post, I discussed the problem I had when doing an in-place MySQL upgrade from 5.5 to 5.6 when the database had been running with innodb_fast_checksum=1.

The solution was to use the MySQL 5.7 version of the tool innochecksum. Using this tool on a shutdown database, you can force the checksums on the innodb datafiles to be rewritten into either INNODB or CRC32 format.

Once the MySQL 5.6 upgrade is done, the 5.6 version of mysqld will be able to read the datafiles correctly and not fail with an error.

There is already plenty of good documentation on the MySQL website on how to upgrade from 5.x to 5.6.


My checklist for in-place upgrading to MySQL 5.6:

  1. Perform application and database performance testing on your test environment to make sure your application performance doesn’t get worse when running on MySQL 5.6.
  2. Make sure you have backups and verified that your backups are good aka you have restored databases from those backups.
  3. Check that all users have updated their passwords to use the new mysql password hash (plugin) Doc URL
  4. Organize downtime in advance.
  5. If running with innodb_fast_checksum=1, proceed with steps to replace the fast checksums with INNODB or CRC32.
    Note: if you use CRC32, you will need to make sure your cnf file is updated for 5.6 to use innodb_checksum_algorithm = CRC32. This is because innodb_checksum_algorithm = INNODB is the default setting. See this post for a sample procedure.
  6. Run a quick search of all existing .cnf files to find any other system variables which have been removed and either replace or remove them.
  7. Run the in-place upgrade.
  8. Run mysql_upgrade, it will flag if it doesn’t need to be run again.

I am trying something new with a poll. Enjoy.

innodb_fast_checksum=1 and upgrading to MySQL 5.6

The Percona version of MySQL has been such a good replacement for the generic MySQL version that many of the features and options that existed in Percona have been merged into the generic MySQL.

Innodb_fast_checksum was an option added to improve the performance of checksums.

The system variable was replaced by innodb_checksum_algorithm in 5.6.

Unfortunately, when you go to upgrade from Percona 5.x to Percona (or generic mysql) 5.6, an in-place upgrade will fail.

The error(s) will be generally mysql complaining it can’t read the file. This is because fast checksums can’t be read by the 5.6 version.

Example errors:

InnoDB: checksum mismatch in data file
InnoDB: Could not open

The recommended option is do the default upgrade process: use mysqldump to dump your data out and reload after you replace the binaries.

For large datasets or servers suffering poor IO performance, the time it takes to do that, even using a parallel dump and load tool is prohibitive.

So are you looking for a workaround?

How about a mysql tool which has been around for a while, called innochecksum.

This tool can check your datafiles to make sure the checksums are correct, or in our case, force the checksums to be written a specific way. I was thinking, prep work is done, now it is just process work. But alas, the versions of innochecksum for 5.5 and 5.6 don’t support files sizes over 2Gigabytes.

Luckily, innochecksum for 5.7 actually does support larger file sizes and best of all it works on old version datafiles too. For people hitting this article in the future, 5.7 at the time was just a RC (Release candidate).

To use this method:

  1. Backup your db or have good backups.
  2. Organize downtime for your db (slave preferably so you aren’t affecting traffic)
  3. Shutdown mysql
  4. Repeat for each innodb datafile: example command: innochecksum -vS –no-check –write=innodb <path to innodb datafile>
  5. Replace innodb_fast_checksum = 1 with innodb_fast_checksum = 0 in your my.cnf (and chef/puppet/ansible repo)
  6. Restart mysql

I will cover the whole procedure for upgrading from Percona MySQL 5.5 to Percona MySQL 5.6 in more detail in a later post.

Fun tool tip:

I have had to compile the MySQL 5.7 innochecksum for an older linux kernel running glibc older than 2.14, and it works fine as well. The biggest headache was sorting out cmake, boost etc to enable the compilation of the MySQL 5.7 source code.

Have Fun

Prewarm your EBS backed EC2 MySQL slaves

This is the story of cold blocks and mismatched instances and how they will cause you pain and cost you money until you understand why.

Most of the clients that we support run on the Amazon cloud using either RDS or running MySQL on plain EC2 instances using (Provisioned IOPS) PIOPS EBS for data storage.

As expected the common architecture is running a master with one or more slaves handling the read traffic.

A common problem is that after the slaves are provisioned (normally created from an EBS snapshot) they lag badly due to slow IO performance.

Unfortunately what tends to be lost in the “speed of provisioning new resources” fetish is some limitations in terms of data persistence layer (EBS).

If you are using EBS and you have created the EBS volume from snapshot or created a new volume you have to pre-warm the EBS volume otherwise you will suffer a bad (I mean seriously bad) first usage penalty.  Bad? I am talking up to 50% performance drop[1]. So that expensive PIOPS EBS volume you created is going to perform like rubbish every time it reads/writes a cold block.

The other thing which also tends to happen is mixing up the wrong instance (network performance) with the PIOPS EBS. This the classic networked storage, the network is the bottleneck. If your instance type has limited network performance, having a higher PIOPS than the network can handle means you are wasting money (on PIOPS) you can’t use. A bit like in the old days (of dedicated servers and SAN storage) where the SAN could deliver 200-300Mbytes per sec, but the 1 Gigabit network could only do 40-50Mbytes per sec.

Here is the real downside, using the cloud you can provision new resources to handle peak load (in the case more MySQL slaves to handle read load) as fast as you can click, or faster using API calls, or even automagically, if you have some algo forecast the need for additional resources. But… the EBS is all cold blocks, so these new instances will be up and available in minutes but the IO performance will be poor until you either pre-warm or the slave gets around to writing/reading all blocks.

So the common solution is to pre-warm the blocks using dd to read the EBS device (and warm the block) to /dev/null

eg: sudo dd if=/dev/xvdf of=/dev/null bs=1M

Consider how long this will take for any reasonable sized DB (200GBytes) using an instance with 1 Gigabit network.

200Gigabytes read at 50Mbytes/sec  = 200,000 Mbytes/50 = 4000 secs = 3600 (1hr) + 400 (6 mins 40 secs) =~ more than 1 hr.

So you or your algo provisioned a new EC2 instance for the database in minutes but either your IO will be rubbish for an extended period, or you wait more than 1 hr per 200GB to have the EBS pre-warmed.

What are the solutions?

  1. Forecast further in advance depending on the size of your db (or any other persistent storage layer eg NoSQL etc)
  2. Use ephemeral storage and manage the increased risk of data loss in the event of instance termination.
  3. Break your DB or your application into smaller pieces aka micro services.[2]
  4. Pay more $ and have your databases stay around longer so waiting for a instance to be ready in the beginning is not a problem.

As you can expect, most businesses are happy with option 4. Pay more, leave instances around like they were dedicated servers (base load). Amazon is happy too.

Option 3 whilst requiring some thought (argh) and additional complexity is where the real speed of provisioning, dare I say it, agile nature of the cloud will bear the most fruit.

[1] http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-prewarm.html

[2] http://martinfowler.com/articles/microservices.html


Transmissions resumed

Sorry for the recent loss of the site. Blogger just didn’t play nice with the new hosting provider, no matter what I did. Out with the old and in with the new.

So what is on the agenda after such as long period without posts.

Some ideas for future posts:

  1. Hosted Database solutions, the good, the bad and the ugly.
  2. Running databases on virtual machines/clouds.
  3. DBA toolsets, do custom scripts have a place in the world containing percona-tools
  4. DBAs, Is this the beginning of the end, the end of the beginning? or am I running out of clichés
  5. SQL is dead long live SQL.
  6. Interviews with DBAs. 10 questions answered by the finest DBAs
  7. Remote DBA work, is it the promised land?
  8. Databases and Machine learning, is the self-tuning database in sight?