Given I had already been through the procedure of setting up an MySQL 5.0 NDB Cluster on EC2, getting the MySQL 5.1 NDB Cluster installed was a reasonably straight forward task.
I will create a new HOWTO wiki from my screen dumps in the next day or so though the documentation provided by MySQL is very thorough.
Given I want to give a large MySQL 5.1 cluster a whirl, maybe with Cluster replication, as described in this documentation, I built and bundled the following AMIs (Amazon Machine Images)
- MySQL 5.1 NDB data node – built from the rpms
- MySQL 5.1 NDB Management node – built from rpms
- MySQL 5.1 NDB SQL/API node – built from rpms
- Complete MySQL 5.1 install.
I built number 4 as you can run the whole cluster to test everything off one box or in my case image. You could also run the management node and SQL node on the same box as well.
The biggest issue I found which is specific to running on EC2 is the hostname, which is allocated via DHCP. I am working on scripts to automate the updating of the /etc/hosts file on each box so that the configure.ini required on the management node and the /etc/my.cnf settings required on the data nodes can point at a name rather than a specific IP address or DNS name.
The work I do here can be replicated to handle the hostname stuff required for Openfiler, Oracle standbys and anything else requiring network connectivity.
For particular interest for users of EC2 is the lack of persistent storage on any specific instance. Plenty of people are looking at various solutions, I guess the interest in using MySQL cluster is that any one data node or two (if you are paranoid) could die so running enough data nodes specifically and also maybe sql nodes with management nodes on the same would provide a solution to this. Backing up to S3 or some form of persistent storage is the fail back if everything goes pear-shaped.
Replicating these virtual images to boxes with persistent storage alleviates some of the pain, however EC2 remains a great area to built your knowledge of various technologies available. Even if some of quirks of running under a virtual machine make things slightly different.
If there is interest I will release a public image of the various types of nodes.