
Wednesday, October 12, 2011

TIL: Lookout For DEFINER

The Issue


I haven't blogged in a while and I have a long TODO list of things to publish: the repository for the SNMP agent, the video and slides of my OSCON talk and a quick overview of MHA master-master support. In the meantime, here's a little fact I didn't know from the MySQL CREATE VIEW documentation:

Although it is possible to create a view with a nonexistent DEFINER account, an error occurs when the view is referenced if the SQL SECURITY value is DEFINER but the definer account does not exist.
How can this be possible?

The Problem

For a number of reasons we don't have the same user accounts on the master as we have on the slaves (ie: developers shouldn't be querying the master). Our configuration files include the following line:
replicate-ignore-table=mysql.user
So if we create a user on the master, the user definition doesn't go through the replication chain.

So a VIEW can be created on the master, but unless we run the proper GRANT statements on the slaves as well, the views won't work there. Example from one of our slaves (output formatted for clarity):

show create view view3\G
*************************** 1. row ***************************
                View: view3
         Create View: CREATE ALGORITHM=UNDEFINED 
               DEFINER=`app`@`192.168.0.1` 
               SQL SECURITY DEFINER VIEW `view3` AS select 
[...]

show grants for `app`@`192.168.0.1`;
ERROR 1141 (42000): There is no such grant defined 
for user 'app' on host '192.168.0.1'

The Solution

Once again Maatkit to the rescue, this time with mk-show-grants on the master:
mk-show-grants | grep 192.168.0.1
-- Grants for 'app'@'192.168.0.1'
GRANT USAGE ON *.* TO 'app'@'192.168.0.1' 
IDENTIFIED BY PASSWORD '*password_hash';
GRANT DELETE, EXECUTE, INDEX, INSERT, SELECT, 
SHOW VIEW, UPDATE ON `pay`.* TO 'app'@'192.168.0.1';
A simple copy from the master and paste onto the slave fixed it.
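If you have more than a couple of accounts to move, a rough sketch of that copy and paste (the hostname and file path are placeholders; prune the file first if you only need one account):
mk-show-grants > /tmp/master_grants.sql      # on the master
scp /tmp/master_grants.sql slave-host:/tmp/
mysql < /tmp/master_grants.sql               # on the slave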

Conclusion

Every now and then developers come to me with unusual questions. In this case it was: how come I can access only 2 out of the 3 views? In cases like these, it usually pays off not to overthink the issue and to look into the details. A SHOW CREATE VIEW on the 3 views quickly showed that one of them had a different host in the DEFINER. A quick read through the documentation and an easy test confirmed the mistake. That's why I have 3 mantras that I keep repeating to whoever wants to listen:
  • Keep it simple
  • Pay attention to details
  • RTFM (F is for fine)
It constantly keeps me from grabbing some shears and going into yak shaving mode.

Wednesday, August 17, 2011

MySQL HA Agent Mini HowTo

Why This Post


While testing Yoshinori Matsunobu's MHA agent I found that although the wiki has very complete documentation, it was missing some details. This article intends to close that gap and bring up some issues to keep in mind when you do your own installation. At the end of the article I added a Conclusions section; if you're not interested in the implementation details and just want to read my take on the project, feel free to jump straight to the end from here.

My Test Case


Most of our production environments can be simplified to match the MHA agent's simplest use case: 1 master with 2 or more slaves and at least one more slave in an additional tier:

Master A --> Slave B
          -> Slave C --> Slave D

As noted in the documentation, in this case the MHA agent will be monitoring A, B & C only. I found that unless you have a dedicated manager node, a slave on the 3rd tier (slave D above) is suitable for this role. All 4 servers were set up as VMs for my evaluation / tests, which makes it easier to simulate hard failure scenarios in a controlled environment. Once this is in place the fun begins.

1st Step: User Accounts


All the examples in the documentation use root to log into MySQL and the OS. I prefer to create specific users for each application, so I created a dedicated MySQL user for the MHA agent and used the Linux mysql user (UID/GID = 27/27 in RedHat / CentOS).

MySQL Credentials

Reviewing the code, I was able to determine that the agent needs to run some privileged commands (SET GLOBAL, CHANGE MASTER TO ..., FLUSH LOGS ..., SHOW SLAVE STATUS, etc.) and creates internal working tables to be used during the master failover. The easiest way to set it up was:
GRANT ALL PRIVILEGES ON *.* TO 'mha_user'@'ip address'
IDENTIFIED BY 'password';
This should be repeated on all 4 servers using the IP addresses for all the potential manager nodes. Yes, it would be possible to use wildcards, but I consider restricting access from specific nodes a safer practice.

The MySQL replication user needs to be set up to connect from any other server in the cluster, since any of the slaves in the group could be promoted to master and have the rest of them connect to it.
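A minimal sketch of that grant, assuming a 'repl' account and a 192.168.0.0/24 cluster subnet (both are placeholders); run it on every server so any of them can act as master after a failover:
mysql -e "GRANT REPLICATION SLAVE ON *.* TO 'repl'@'192.168.0.%' IDENTIFIED BY 'repl_password';"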

Linux User

As I mentioned before, I use the default RedHat / CentOS definition for the mysql user. Keep in mind that if you installed from the official Oracle packages (ie: RPMs), they may not follow this convention, which could result in mismatched UIDs/GIDs between servers. The UIDs/GIDs for the mysql user and group have to be identical on all 4 servers. If they aren't, you can use the following bash sequence/script as root to correct the situation:

#!/bin/bash 
# stop mysql
/etc/init.d/mysql stop
 
# Change ownership for all files / directories
find / -user mysql -exec chown -v 27 {} \;
find / -group mysql -exec chgrp -v 27 {} \;
 
# remove the old user / group (they get recreated with fixed IDs below)
# groupdel might complain about not being able to delete the group
groupdel mysql
userdel mysql 

# Add the new user / group
groupadd -g 27 mysql
useradd -c "MySQL User" -g 27 -u 27 -r -d /var/lib/mysql mysql
 
# restart MySQL
/etc/init.d/mysql start

Once the mysql user is properly set up, you'll have to create a password-less SSH key pair and authorize it on all the servers. The easiest way is to create it on one of them, append the public key to the authorized_keys file under /var/lib/mysql/.ssh and then copy the whole directory to the other servers.

I use the mysql user to run the scripts: in most distributions it can't be used to log in directly and there is no need to worry about file permissions, which makes it a safe and convenient choice.
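A minimal sketch of the key setup described above (server names are placeholders; adjust the home directory if your mysql user lives somewhere else):
sudo su - mysql -s /bin/bash
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa            # password-less key pair
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 700 ~/.ssh; chmod 600 ~/.ssh/authorized_keys
scp -r ~/.ssh serverB:/var/lib/mysql/               # repeat for the remaining servers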

2nd Step: Follow The Documentation to Install and Configure


Once all the users have been properly set up, this step is straightforward. Check the Installation and Configuration sections of the wiki for more details.

For the placement of the configuration files I deviated a little bit from the documentation, but not much:

  1. Used a defaults file, /etc/masterha_default.cnf, with access restricted to the mysql user since it includes the MHA agent password:
    -rw------- 1 mysql mysql 145 Aug 11 16:36 masterha_default.cnf
  2. The application settings were placed under /etc/masterha.d/ so they're easy to locate and don't clutter the /etc directory.
For simplicity, I didn't include any of the optional scripts and checks (ie: secondary check) in the configuration. You may want to check the documentation and source code of these scripts; some of them are not even code complete (ie: master_ip_failover). Unless you are implementing one of the more complicated use cases, you won't even need them. If you do, you'll need to write your own following the examples provided with the source code.
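For reference, here's a skeleton of what those two files end up looking like. The parameter names are from my reading of the MHA wiki, and the hostnames, paths and credentials are placeholders; check the Configuration section of the wiki for the full list of options.

/etc/masterha_default.cnf:
[server default]
user=mha_user
password=mha_password
ssh_user=mysql
repl_user=repl
repl_password=repl_password
manager_workdir=/var/lib/mysql/masterha

/etc/masterha.d/test.cnf:
[server1]
hostname=serverA
[server2]
hostname=serverB
[server3]
hostname=serverC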

Once you have everything in place, run the following checks as the mysql user (ie: sudo su - mysql):
  1. masterha_check_ssh: Using my configuration files the command line looks like:
    masterha_check_ssh --conf=/etc/masterha_default.cnf --conf=/etc/masterha.d/test.cnf
  2. masterha_check_repl: This test will determine whether the agent can identify all the servers in the group and the replication topology. The command line parameters are identical to the previous step.

Both should show an OK status at the end. All the utilities have verbose output, so if something goes wrong it's easy to identify the issue and correct it.

3rd Step: Run the Manager Script


If everything is OK, on the MHA node (Server D in my tests) run the following command as user mysql (ie: sudo su - mysql):

masterha_manager --conf=/etc/masterha_default.cnf --conf=/etc/masterha.d/test.cnf

Keep in mind that should the master fail, the agent will fail over to one of the slaves and stop running; this way it avoids split-brain situations. You will either have to build the intelligence into the application to connect to the right master after a failover, or use a virtual IP. In both cases you might need customized IP failover scripts. The documentation provides more details.

Read the section about running the script in the background to choose the method that best fits your practice.
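If you don't use a service manager, a common way to keep it running after you log out is the classic nohup combination (the log path is just an example):
nohup masterha_manager --conf=/etc/masterha_default.cnf --conf=/etc/masterha.d/test.cnf \
    < /dev/null >> /var/lib/mysql/masterha_manager.log 2>&1 &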

You will have to configure the notification script to get notified of the master failure. The failed server has to be removed from the configuration file before re-launching the manager script, otherwise it will fail to start.

You can then restart the failed server, set it up as a slave of the new master and reincorporate it into the replication group using masterha_conf_host.

Conclusion


This tool solves a very specific (and painful) problem: making sure all the slaves are in sync, promoting one of them and reconfiguring the remaining slaves to replicate off the new master, and it does so fairly quickly. The tool is simple and reliable and requires very little overhead. It's easy to see it is production ready.

The log files are pretty verbose, which makes it really easy to follow in great detail all the actions the agent took when failing over to a slave.

I recommend that any potential users start with a simple configuration and add the additional elements gradually until it fits their infrastructure needs.

Although the documentation is complete and detailed, it takes some time to navigate and to put all the pieces of the puzzle together.

I would like the agent to support master-master configurations. This way it would minimize the work to re-incorporate the failed server into the pool. Yoshinori, if you're reading this, know that I'll volunteer to test master-master if you decide to implement it.

Thursday, July 21, 2011

My MySQL SNMP Agent

Back in February I wrote an article titled A Small Fix For mysql-agent. Since then we made a few more fixes to the agent and included a Bytes Behind Master (or BBM) chart. For those who can't wait to get their hands on the code, here's the current version: MySQL SNMP agent RPM. For those who'd like to learn about its capabilities and issues, keep reading.

What to Expect From this Version


The article I quoted above pretty much describes the main differences with the original project, but we went further with the changes while still relying on Masterzen's code for the data collection piece.

The first big change is that we turned Masterzen's code into a Perl module; this way we can easily plug in a new version without having to do massive editing on our side.

The 2nd change is that we added code to calculate how many bytes behind the master a slave is, which should always be cross-checked with seconds behind master to get the full replication picture. When a slave is just a few bytes behind, the script calculates the difference straight out of the SHOW SLAVE STATUS information. If the SQL thread is executing statements from a binary log file older than the one being updated by the I/O thread, the script logs into the master to collect the sizes of the previous binary logs and make an accurate calculation of the delta.
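For the simple case where both replication threads are working on the same binary log file, the delta can be eyeballed straight from SHOW SLAVE STATUS. A rough sketch of the arithmetic, assuming credentials in ~/.my.cnf:
mysql -e 'SHOW SLAVE STATUS\G' | awk '
    $1 == "Master_Log_File:"       { io_file  = $2 }
    $1 == "Read_Master_Log_Pos:"   { io_pos   = $2 }
    $1 == "Relay_Master_Log_File:" { sql_file = $2 }
    $1 == "Exec_Master_Log_Pos:"   { sql_pos  = $2 }
    END {
        if (io_file == sql_file) print "bytes behind master:", io_pos - sql_pos
        else print "SQL thread is on an older binlog; add the sizes of the logs in between"
    }'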

For this change we hit another bug in the CentOS 5 SNMP agent, by which 64-bit counters were being truncated. The solution is to upgrade to CentOS 6 (not happening anytime soon, but that's another story) or to work around it. We chose the latter and display a variable flagging the value rollover. As far as we know, this is not needed on platforms other than CentOS 5.

By now I expect that many of you would have a question in your mind:

Why Not Branch / Fork?

Why provide an RPM instead of creating a branch or fork of the original project? There are many reasons, but I'll limit myself to a couple. I trust that before you write an enraged comment you'll keep in mind that this is a personal perception, which might be in disagreement with yours.

This code is different enough from the original that maintaining it as a branch of the original project would be too complicated. For example: we use a completely different SNMP protocol and turned the original code into a module. We don't have the resources to keep up with all of Masterzen's future patches and I wouldn't expect him to adopt my changes.

If we had created a fork (a new project derived from the original), I believe at this point it would divert attention from the original project or others like PalominoDB's Nagios plugin.

What's Next

We plan to continue maintaining this RPM driven by our specific needs and keep sharing the results this way. If at some point we see it fit to drive the merge into another project or create a new fork of an existing one, we'll do it.

I will be presenting the project at OSCON next week. If you're going to be around, please come to my talk, Monitoring MySQL through SNMP, and we can discuss issues like why use pass_persist, why not use INFORMATION_SCHEMA instead of the current method, or why not include your personal MySQL instrumentation pet peeve. I'd be glad to sit down with you and chat about it.

In the meantime, enjoy, provide feedback and I hope to get to know you at OSCON next Thursday.

Thursday, May 5, 2011

Some More Replication Stuff

Listening to the OurSQL podcast episode Repli-cans and Repli-can'ts got me thinking: what are the issues with MySQL replication that Sarah and Sheeri didn't have the time to include in their episode? Here's my list:

Replication Capacity Index

This is a concept introduced by Percona in last year's post Estimating Replication Capacity, which I revisited briefly during my presentation at this year's MySQL Users Conference. Why is this important? Very simple: if you use your slaves to take backups, they might be outdated and will fall further behind during the backups. If you use them for reporting, your reports may not show the latest data. If you use them for HA, you may not be able to start writing to the promoted slave until it has caught up.
Having said that, measuring replication capacity as you set up slaves is a good way to make sure that the slave servers will be able to keep up with the traffic on the master.
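One way to approximate that number on a slave you can afford to delay (a sketch, assuming credentials in ~/.my.cnf and a steady write load on the master) is to stop the SQL thread for a while, restart it and time the catch-up:
STOP_SECS=600
mysql -e 'STOP SLAVE SQL_THREAD;'
sleep $STOP_SECS
mysql -e 'START SLAVE SQL_THREAD;'
START=$(date +%s)
while [ "$(mysql -e 'SHOW SLAVE STATUS\G' | awk '$1 == "Seconds_Behind_Master:" {print $2}')" != "0" ]; do
    sleep 10
done
CATCHUP=$(( $(date +%s) - START ))
# the slave replayed STOP_SECS + CATCHUP seconds worth of master traffic in CATCHUP seconds
echo "replication capacity: $(echo "scale=2; ($STOP_SECS + $CATCHUP) / $CATCHUP" | bc)"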

More On Mixed Replication

The podcast also discussed how mixed replication works and pointed to the general criteria the server applies to switch between STATEMENT and ROW based logging. However, there is one parameter that wasn't mentioned and might come back to haunt you: the transaction isolation level. You can read all about it in the MySQL documentation (12.3.6. SET TRANSACTION Syntax), and in particular the InnoDB setting innodb_locks_unsafe_for_binlog.
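It's worth checking the three settings together on the master and on every slave; a quick way to do it, assuming credentials in ~/.my.cnf:
mysql -e "SHOW GLOBAL VARIABLES WHERE Variable_name IN
          ('binlog_format', 'tx_isolation', 'innodb_locks_unsafe_for_binlog');"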

Keep Binary Logs Handy

Today I found this article from SkySQL on Planet MySQL about Replication Binlog Backup, which is a really clever idea to keep your binary logs safe with the latest information coming out of the master. It offers a method of copying them without adding overhead to the MySQL server. Even if you purge binary logs automatically to free space using the expire_logs_days variable, you will still have the logs when you need them, for longer than the disk capacity on the master might allow.

Seconds Behind Master (SBM)

Again, another topic very well explained in the podcast, but here's another case where this number will show goofy values. Let's say you have a master A that replicates master-master with server B, and server C is a regular slave replicating off A. The application writes to A and B serves as a hot standby master.
When we have a deployment that requires DDL and/or DML statements, we break replication going from B to A (A to B keeps running to catch any live transactions) and apply the modifications to B. Once we verify that everything is working OK on B, we switch the application to write to B and restore replication going back to A. This offers a good avenue for rolling back in case the deployment breaks the database in any way (ie: rebuild B using the data in A). What we frequently see is that if the DDL/DML statements take about 30 min (1800 sec) on B, once we restore replication as explained, slave C will show outrageous numbers for SBM (ie: more than 12 hs behind; I really don't know how the SBM arithmetic works to explain this). So it's a good idea to complement slave drift monitoring with mk-heartbeat, which uses a timestamp to measure replication drift.
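The basic mk-heartbeat setup boils down to two commands; the flags below are from memory of the tool's documentation and the database name is a placeholder, so double-check the man page before copying them:
mk-heartbeat -D heartbeat_db --update &    # on the master: keep a timestamp row updated
mk-heartbeat -D heartbeat_db --check       # on the slave: report the delay based on that timestamp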

Conclusion

This episode of the OurSQL podcast is a great introduction to replication and its quirks. I also believe that MySQL replication is one of the features that made the product so successful and widespread. However, you need to understand its limitations if your business depends on it.

These are my $.02 on this topic, hoping to complement the podcast. I wanted to tweet my feedback to @oursqlcast, but it ended up being way more than 140 characters.

Tuesday, February 1, 2011

A Small Fix For mysql-agent

If you're already using an SNMP monitoring tool like OpenNMS, mysql-agent is a great way to add a number of graphs using Net-SNMP. However, mysql-agent has a small bug that drove me crazy. I will try to describe how I discovered it (and hence fixed it), since the process involved learning about SNMP, how to diagnose it and, eventually, once all the pieces came together, how simple it is to write your own agents.

Although versions are not that important, just for the sake of completeness we were using CentOS 5.5, MySQL 5.5.8 Community RPMs, Net SNMP version 5.3.22 and OpenNMS Web Console 1.8.7.

The Problem

I followed the directions on the mysql-agent blog only to find that I was facing the only open issue listed in mysql-agent's GitHub repository (spoiler alert: the solution is at the bottom). The setup has several components, which makes it difficult to diagnose:
  • mysql-agent
  • snmpd / agentx
  • OpenNMS server
Running snmpwalk on the MySQL host, as suggested in the mysql-agent article, worked fine (as far as we could tell). However, OpenNMS wasn't getting the data and the graphs weren't showing up.

It turns out that, once you've completed the OpenNMS configuration as described in the article, it's a good idea to run snmpwalk remotely as well, from the server running OpenNMS. You need to specify your MySQL hostname instead of localhost:
snmpwalk -m MYSQL-SERVER-MIB -v 2c -c public mysql-host enterprises.20267

In our case it failed. Unfortunately, the logs didn't offer much information and whatever was failing was happening inside agentx.

The Alternative

Since the NetSNMP Perl class hides a lot of the details of the Net-SNMP API, we decided to use an alternative method to write the agent: pass_persist. The beauty of this method is that you only need to write a filter script: SNMP requests come in through standard input (stdin) and the output is printed to standard output (stdout). As a consequence, the agent can be tested straight from the command line before implementing it. A nice article about pass_persist can be found here. The pass_persist protocol is fully documented in the snmpd.conf man page.
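To get a feel for the protocol before touching mysql-agent, here is a toy pass_persist filter in plain bash; it serves a single hard-coded OID with a made-up value and can be driven by hand exactly like the real agent in the steps below:
#!/bin/bash
# answers PING and one OID; anything else gets NONE
OID=".1.3.6.1.4.1.20267.200.1.1.0"
while read -r cmd; do
    case "$cmd" in
        PING) echo "PONG" ;;
        get|getnext)
            read -r oid
            if [ "$cmd" = "get" ] && [ "$oid" = "$OID" ]; then
                echo "$OID"; echo "string"; echo "42"
            else
                echo "NONE"      # unknown OID, or end of our tiny MIB
            fi ;;
        *) echo "NONE" ;;
    esac
done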

To follow this route we had to tweak the script a little. The tweaks included:
  • No daemonizing: since the script uses stdin/stdout, it needs to run interactively.
  • All values need to be returned as strings. It was the only workaround we found to deal with 64-bit values that otherwise weren't interpreted correctly.
  • stderr needs to be redirected to a file (add 2>/tmp/agent.log to the end of the command line) to avoid breaking the script's returned values while you run it interactively.
  • Use the SNMP::Persist Perl module to handle the SNMP protocol.
Once the changes were implemented (I promise to publish the alternative mysql-agent script after some cleanup), these are the steps I followed to test it (for now I'll leave the -v option out, along with the stderr redirection).
  1. Invoke the agent as you would've done originally, keeping in mind that now it'll run interactively. On your MySQL server:
    mysql-agent-pp -c /path/to/.my.cnf -h localhost -i -r 30
  2. Test that the agent is working properly (PING is what you type, PONG is the script's reply):
    PING
    PONG
  3. Does it actually provide the proper values?
    get
    .1.3.6.1.4.1.20267.200.1.1.0
    .1.3.6.1.4.1.20267.200.1.1.0
    Counter32
    21
    getnext
    .1.3.6.1.4.1.20267.200.1.1.0
    .1.3.6.1.4.1.20267.200.1.2.0
    Counter32
    16
Note that case is important: PING needs to be capitalized, while get and getnext need to be in lower case. Once you know it works, you'll need to add the pass_persist line to the snmpd.conf file and restart snmpd:
# Line to use the pass_persist method
pass_persist .1.3.6.1.4.1.20267.200.1 /usr/bin/perl /path/to/mysql-agent -c /path/to/.my.cnf -h localhost -i -r 30
Now execute snmpwalk remotely and if everything looks OK, you're good to go.

On our first runs, snmpwalk failed after the 31st value. We re-tried those specific values, and a few others after them, with get and getnext, and it became obvious that for some of them the responses weren't the expected ones.

The Bug and The Fix

So now, having identified the failing values, it was time to dig into the source code.

First, the data gathering portion, which fortunately is well documented inside the source code. I found ibuf_inserts and ibuf_merged as the 31st and 32nd values (note that with get you can check other values further down the list, which I did to confirm that the issue was specific to some variables and not a generic problem). A little grepping revealed that these values were populated from the SHOW INNODB STATUS output, which in 5.5 doesn't include the line expected by the program logic; hence, the corresponding values stayed undefined. A patch to line 794 of the original script fixed this particular issue by setting undefined values to 0.

794c794
<             $global_status{$key}{'value'} = $status->{$key};
---
>             $global_status{$key}{'value'} = (defined($status->{$key}) and $status->{$key} ne '' ? $status->{$key} : 0);
This fix can be used for both the original script and the new pass_persist one. I already reported it upstream on GitHub.

The original script still failed: OpenNMS issues getbulk requests (explained in the Net-SNMP documentation) that agentx fails to convert into getnext. This can be reproduced using snmpbulkwalk instead of snmpwalk. (Note: it took some tcpdump / wireshark tricks to catch the getbulk requests.) The current beta of the pass_persist version of mysql-agent has been in place for a while without issues.

Conclusion

I'm not going to detail the whole process since it was long and complicated, but I learned a few concepts along the way that I'd like to point out.

Look Around Before Looking for New Toys

If you're using OSS, you may already have in house most of what you need. This project started when we decided to use OpenNMS (already in place to monitor our infrastructure) and wanted to add to it the MySQL data we wanted to monitor closely. A simple Google search pointed us to mysql-agent right away.

Embrace OSS

All the tools we used in this case are open source, which made it extremely easy to diagnose the source code when pertinent, try alternatives, benefit from the collective knowledge, make corrections and contribute them back to the community. A full evaluation of commercial software, plus the interaction with tech support to get to the point where we needed a patch, would've been as involved as this one, and the outcome wouldn't have been guaranteed either. I'm not against commercial software, but you need to evaluate whether it adds any real value over the open source alternatives.

SNMP is Your Friend

Learning about the SNMP protocol, in particular the pass_persist method, was very useful. It took the mystery out of it, and writing agents in any language (even bash) is far from difficult. I'm looking forward to going deeper into MySQL monitoring using this technology.

I'm hoping this long post encourages you to explore the use of SNMP monitoring for MySQL on your own.

Credit: I need to give credit to Marc Martinez who did most of the thinking and kept pointing me in the right direction every time I got lost.

NOTE: I'm not entirely satisfied with the current pass_persist version of mysql-agent I have in place, although it gets the job done. Once I have the reviewed version, I plan ... actually promise to publish it either as a branch of the existing one or separately.

Friday, January 14, 2011

About InnoDB Index Size Limitations

This is mostly a reflection on a limitation in InnoDB that, in my opinion, has persisted for too long. I found it while reviewing the Amarok media player. The player uses MySQL in the backend, embedded or regular server, so it makes for a great source of real-life data.

The Issue

By default, Amarok uses MyISAM tables. This means that if it crashes or stops unexpectedly (a logout while playing music may cause this), the latest updates to the DB are all lost. So I've been looking into using InnoDB instead, to avoid losing my playlists or player statistics.

The Problem

The limitation that bothers me is this one: "Index key prefixes can be up to 767 bytes", which has been in place for several years.
Take this Amarok table for example:
CREATE TABLE urls (
    id int(11) NOT NULL AUTO_INCREMENT,
    deviceid int(11) DEFAULT NULL,
    rpath varchar(324) COLLATE utf8_bin NOT NULL,
    directory int(11) DEFAULT NULL,
    uniqueid varchar(128) COLLATE utf8_bin DEFAULT NULL,

    PRIMARY KEY (id),
    UNIQUE KEY uniqueid (uniqueid),
    UNIQUE KEY urls_id_rpath (deviceid, rpath),
    KEY urls_uniqueid (uniqueid)
) ENGINE=MyISAM AUTO_INCREMENT=314
DEFAULT CHARSET=utf8 COLLATE=utf8_bin
The result of an ALTER TABLE to convert it to InnoDB:
alter table urls engine=InnoDB;
ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes

The Rant

Note that the maximum key length is in bytes, not characters. So let's review the rpath column. This column stores the path to the media file in the catalog, which could easily be something like /very/long/path/to/find/a/file/with/music.mp3. If it only uses English alphabet characters the byte count stays low, but as soon as you start using multi-byte characters (ie: ñ, ç, ü, etc.) the length of the string in bytes starts to grow, since each of those characters takes 2 or more bytes in utf8. A simple query shows the difference:

select id, rpath, bit_length(rpath) / 8 as bytes, 
char_length(rpath) as chars 
from urls limit 1;
+----+-----------------------------------------+---------+-------+
| id | rpath                                   | bytes   | chars |
+----+-----------------------------------------+---------+-------+
|  1 | ./home/gnarvaja/Music/Dodododódodo.mp3 | 41.0000 |    39 |
+----+-----------------------------------------+---------+-------+

So how big can the index be in bytes?

I let MySQL answer this question for me. I created a similar test table with only a PRIMARY KEY and then recreated the index on rpath. Here's the command sequence and its output:
CREATE TABLE urls_test (
  id int(11) NOT NULL AUTO_INCREMENT,
  rpath varchar(324) COLLATE utf8_bin NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

CREATE INDEX rpath_idx ON urls_test (rpath);
Query OK, 0 rows affected, 2 warnings (0.32 sec)

show warnings;
+---------+------+---------------------------------------------------------+
| Level   | Code | Message                                                 |
+---------+------+---------------------------------------------------------+
| Warning | 1071 | Specified key was too long; max key length is 767 bytes |
| Warning | 1071 | Specified key was too long; max key length is 767 bytes |
+---------+------+---------------------------------------------------------+
2 rows in set (0.00 sec)

SHOW CREATE TABLE urls_test\G
*************************** 1. row ***************************
       Table: urls_test
Create Table: CREATE TABLE `urls_test` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `rpath` varchar(324) COLLATE utf8_bin NOT NULL,
  PRIMARY KEY (`id`),
  KEY `rpath_idx` (`rpath`(255))
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin
1 row in set (0.00 sec)

So we're down to 255 characters from the original 324: 767 bytes divided by the 3 bytes per character that utf8 may need yields a 255-character prefix. In this case (a music player) it may not be a significant loss, but in a world where applications should support an increasing number of international character sets, in particular Asian ones, this limitation could potentially become serious.

I'm not a source code expert, so I'm not sure what it would take to remove, or at least expand, the maximum key size. InnoDB's maximum key length (see the URL quoted at the beginning of the article) seems like a good number: 3500 bytes ... or I may be overestimating the need for key prefixes longer than 255 characters ... your call.

Friday, October 22, 2010

MySQL Enterprise Backup and The Meaning of Included

During the MySQL Users Conference, Edward Screven did a keynote presentation that made many of us feel warm and fuzzy about Oracle's future plans for MySQL. If you advance 16m 25s into the presentation, it even gives MySQL Enterprise customers something to rejoice about: "Backup is now included". He didn't say much more after that. Asking around at the conference in the days following this announcement, I couldn't get a straight answer about when and how it would be available for existing customers.

Now, 6 months later (give or take a couple of weeks), the MySQL Enterprise Features page shows no sign of the now included MySQL Enterprise Backup (the utility previously known as InnoDB Hot Backup) and there has been no other news supporting Edward's announcement anywhere else (if I'm wrong, please point to it with a comment).

Does anybody have any insight into what the definition of included is according to Oracle's dictionary? Maybe it's not included for existing customers and it will be once Oracle comes out with the new price list? That would surely make existing customers pretty unhappy.

Maybe there was no reason to feel warm and fuzzy after all. What is your take on this particular issue?

Wednesday, September 22, 2010

A Replication Surprise

While working on a deployment we came across a nasty surprise. In hindsight it was avoidable, but it never crossed our minds that it could happen. I'll share the experience so that when you face a similar situation, you'll know what to expect.

Scenario

To deploy the changes, we used a pair of servers configured to replicate with each other (master-master replication). There are many articles that describe how to perform an ALTER TABLE with minimal or no downtime using MySQL replication. The simple explanation is:
  1. Set up a passive master of the database whose schema you want to modify.
  2. Run the schema updates on the passive master.
  3. Let replication catch up once the schema modifications are done.
  4. Promote the passive master as the new active master.
The details to make this work will depend on each individual situation and are too extensive for the purpose of this article. A simple Google search will point you in the right direction.

The Plan

The binlog_format variable was set to MIXED. While production was still running on the active master, we stopped replication from the passive to the active master so we would still get all the DML statements on the passive master while running the alter tables. Once the schema modifications were over, we could switch the active and passive masters in production and let the new passive catch up with the table modifications once the replication thread was running again.

The ALTER TABLE statement we applied was similar to this one:
ALTER TABLE tt ADD COLUMN cx AFTER c1;
There were more columns after cx, and c1 was one of the first columns. Going through all the ALTER TABLE statements took almost 2 hours, so it was important to get the sequence of events right.

Reality Kicks In

It turns out that using AFTER / BEFORE or changing column types broke replication whenever the events were written to the binlog files in row-based format, which meant that we couldn't switch masters as planned until we had replication going again. As a result we had to re-issue an ALTER TABLE to revert the changes and then repeat them without the AFTER / BEFORE.

The column type change was trickier and could've been a disaster; fortunately it happened on a small table (~400 rows, which meant the ALTER TABLE took less than 0.3 sec). In this case we reverted the modification on the passive master and ran the proper ALTER TABLE on the active master. Had this happened with a bigger table, there would have been no alternative other than rolling back the deployment or dealing with the locked table while the modification ran.

Once this was done we were able to restart the slave threads, let them catch up, and everything was running as planned ... but with a 2 hr delay.

Unfortunately, using STATEMENT replication wouldn't work in this case for reasons that would need another blog article to explain.

Happy Ending

After the fact, I went back to the manual and found this article: Replication with Differing Table Definitions on Master and Slave. I guess we should review the documentation more often; the changes happened after 5.1.22. I shared the article with the development team, so next time we won't have surprises.

Friday, July 30, 2010

Simple Backup Server

I have not written an article in a while; I partially blame it on the World Cup and my day job. The time has come to share some of my recent experiences with a neat project to provide several internal teams with current MySQL backups.

When faced with these types of challenges, my first step is to look into OSS packages and how they can be combined into an actual solution. It helps me understand the underlying technologies and challenges.

ZRM Backup

I reviewed Zmanda's Recovery Manager for MySQL Community Edition in the Fall 2008 issue of MySQL Magazine. It remains one of my favorite backup tools for MySQL since it greatly simplifies the configuration and execution of MySQL backups, taking care of most of the details. Its flexible reporting capabilities came in handy for this project, as I'll explain later. Some of the key settings:


  • We included the hostname in the ZRM backup-set to make it easier to locate. Linux shell example:

    export BKUP_SET=`hostname -s`-logical


  • Following ZRM conventions, generate an HTML report in the main backup directory.

    mysql-zrm-reporter --where backup-set=$BKUP_SET --type html \
       --show backup-status-info >/dir/to/backup/$BKUP_SET/report.html


  • The actual backup files live under the subdirectories:

    /dir/to/backup/$BKUP_SET/<timestamp>/
    where /dir/to/backup could be mounted on a NFS server

    Please check the ZRM for MySQL documentation for details on its configuration and operation. Use the report format that best suits your needs; ZRM provides plenty of options and if none fits your needs exactly, you can still generate your own.

    lighttpd HTTP Server

    As a web server, lighty adds very little overhead, so it can run on the same MySQL server and the backups can be served directly from it. If this is not acceptable and the backups are stored on an NFS volume, lighty can be installed on the NFS server; the configuration will remain very similar to the one I describe here.

    For this example I’ll assume that the MySQL server host is named dbhost, in which case the /etc/lighttpd/lighttpd.conf file should include:

    server.modules  = ( 
                                    "mod_alias",
                                    "mod_access",
                                    "mod_accesslog" )
    
    ## Default root  directory set to the main backup set
    server.document-root = "/dir/to/backup/dbhost-logical/"
    
    ## Set one alias per backup type
    alias.url            = ( "/logical/"     => "/dir/to/backup/dbhost-logical/",
                             "/incremental/" => "/dir/to/backup/dbhost-incremental/" )
    
    ## Enable logs for  diagnosis
    server.errorlog      = "/var/log/lighttpd/error.log"
    accesslog.filename   = "/var/log/lighttpd/access.log"
    
    server.port          = 8088
    
    ## virtual directory listings enabled
    dir-listing.activate = "enable"
    
    ##  enable debugging to facilitate diagnosis
    debug.log-request-header    = "enable"
    debug.log-response-header   = "enable"
    debug.log-request-handling  = "enable"
    debug.log-file-not-found    = "enable"

    The server.document-root and alias.url settings should match the main directories for the ZRM backup sets.

    Make sure that the user and/or group defined for the lighttpd process has proper permissions to access the backup set directory tree. The backups will be available as directory listings at the following URLs: http://dbhost:8088/logical/ or http://dbhost:8088/incremental/. By clicking on the report.html file in those directories (see the previous section of the article), users can access the backup reports and verify whether any of them had errors. The files are also accessible using wget.
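    For example, a nightly cron job on another host could mirror the latest backup directory over HTTP; the timestamp directory name below is just a placeholder for whatever ZRM created for that run:
    wget -r -np -nH http://dbhost:8088/logical/20100730120001/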

    If you need tighter security, lighty supports https and LDAP authentication; the details are in its documentation and it takes less than 10 minutes to set up.

    monit Monitoring

    When you need a service to be running 24/7, monit is a great tool in the Linux arsenal to achieve it. It monitors a number of system and service variables and will start, stop and/or restart any service under a number of conditions. For this POC, the goal is to keep lighty running, restarting it after an eventual crash. We are using the following configuration:

    check process lighttpd
       with pidfile  "/var/run/lighttpd.pid"
       alert [email protected] NOT { instance,  pid, ppid, action }
       start program = "/etc/init.d/lighttpd start" with timeout  60 seconds
       stop program = "/etc/init.d/lighttpd stop" with timeout 30 seconds
       if 2 restarts  within 3 cycles then timeout
       if failed port 8088 protocol http with  timeout 15 seconds within 3 cycles then restart

    If the process crashes or port 8088 becomes unresponsive, monit will (re)start lighty automatically.

    Conclusion (sort of)

    Implementing all these tools on a new server takes less than 1 hour. During the first couple of implementations I learned a lot about how the packages interact, security issues, and users' needs and expectations. As the users started looking into the solution, they also came up with new use cases.

    While we look for a more suitable solution, these free (as in freedom and beer) packages provided us with the opportunity to learn while still achieving our goals and delivering results to our users. Now we know what to look for if we decide to evaluate an open core or commercial solution.
    Friday, May 14, 2010

    MySQL 5.1.46 With InnoDB Plugin Kicks Butt

    We were discussing the recommendations we issue each quarter around MySQL and the question of using the InnoDB plugin came up. We usually follow Planet MySQL closely, so we had read what the blogs had to say and it was all good, but we decided to provide our users with some data of our own. We used our own sysbench tests to get the information we needed.

    A Word About Benchmarks

    I don't trust most of the benchmarks that are published online because they really apply to the use case of whoever is writing the article. There are usually many factors that can influence them and I find it difficult to apply them as-is to our environment.

    I do trust the benchmarks published online as a reference on how to create and run our own. So this article is based on that premise. I recommend you do your own homework to verify the results for your own use cases.

    The Test

    Having said that, we ran sysbench against the official MySQL RPM with no special adjustments to the configuration file. We ran it once with the built-in InnoDB engine and re-ran it with the InnoDB plugin. This is the bash shell wrapper we use:
    #!/bin/bash
    # Sysbench MySQL benchmark wrapper
    for nthr in 1 8 16; do
       echo "($(date +%H:%M:%S)) -- Testing $nthr threads"
       sysbench --db-driver=mysql --num-threads=$nthr --max-time=900 --max-requests=500000 --mysql-user=user --mysql-password=password --test=oltp --oltp-test-mode=complex --oltp-table-size=10000000 prepare
       echo "($(date +%H:%M:%S)) -- Running test for $nthr threads"
       sysbench --db-driver=mysql --num-threads=$nthr --max-time=900 --max-requests=500000 --mysql-user=user --mysql-password=password --test=oltp --oltp-test-mode=complex --oltp-table-size=10000000 run | tee $(hostname -s)_$nthr.log
       echo "($(date +%H:%M:%S)) -- Cleaning up $nthr threads"
       sysbench --db-driver=mysql --num-threads=$nthr --max-time=900 --max-requests=500000 --mysql-user=user --mysql-password=password --test=oltp --oltp-test-mode=complex --oltp-table-size=10000000 cleanup
       echo "($(date +%H:%M:%S)) -- done ($nthr)"
    done
    I like to run a 1-thread test since it gives us an idea of the underlying raw performance. Based on other tests we have done, our systems' performance peaks somewhere between 8 and 16 concurrent threads, so for this test there was no point in running other configurations. You may replace "1 8 16" with the numbers you think will best represent your systems in production. All the tests are run locally; when testing across the network the numbers will vary based on your network performance.
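    If you want to reproduce the plugin runs, enabling the plugin in 5.1 is a my.cnf change along these lines; this is a sketch from memory of the 5.1 manual, so double-check the plugin list there before copying it:
    [mysqld]
    ignore-builtin-innodb
    plugin-load=innodb=ha_innodb_plugin.so
    default-storage-engine=InnoDB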

    The Actual Results

    So, without further ado, here are the results as reported by sysbench:

    Number of threads | No Plugin (Trx/sec) | Plugin (Trx/sec)
                    1 |              176.32 |           325.75
                    8 |              332.82 |           742.80
                   16 |              334.47 |           736.40

    The results in the No Plugin column are in line with what we got in tests of older 5.1.x releases.

    Conclusion

    MySQL v5.1.46 using InnoDB plugin kicks ass! I apologize for the language, but the numbers are pretty amazing. I hope you find this post useful.

    Monday, May 3, 2010

    Simple Backup Restore Trick

    I don't usually post these simple tricks, but this one came to my attention today; it's very simple and I have seen issues when people try to get around it. It answers the question: how do I restore my production backup to a different schema? It looks obvious, but I haven't seen many people think about it.

    Most of the time backups using mysqldump will include the following line:

    USE `schema`;

    This is OK when you're trying to either (re)build a slave or restore a production database. But what about restoring it to a test server in a different schema?

    The actual trick


    Using vi (or a similar editor) to edit the line will most likely result in the editor trying to load the whole backup file into memory, which might cause paging or even crash the server if the backup is big enough (I've seen it happen). Using sed (or similar) might take some time with a big file. The quick and dirty trick I like is:

    grep -v "USE \`schema\`" backup.sql | mysql -u user -p new_schema

    Adapt the mysql command options to your needs. It's necessary to escape the backticks (`), otherwise the shell might interpret it as you trying to execute schema and use the output as the actual schema name. Also, make sure that new_schema already exists on the server.

    This method is quick and dirty and leaves the original backup intact. I hope you find it useful.

    Friday, March 26, 2010

    My Impressions About MONyog

    At work we have been looking for tools to monitor MySQL and, at the same time, provide as much diagnostic information as possible upfront when an alarm is triggered. After looking around at different options, I decided to test MONyog from Webyog, the makers of the better-known SQLyog. Before we go on, the customary disclaimer: this review reflects my own opinion and in no way represents any decision that my current employer may or may not make in regards to this product.

    First Impression

    You know what they say about the first impression, and this is where MONyog started off on the right foot. Since it is an agent-less system, you only need to install the RPM or untar the tarball on the server where you're going to run the monitor and launch the daemon to get started. How much faster or simpler can it be? But in order to start monitoring a server you need to do some preparation on it: create a MONyog user for both the OS and the database. I used the following commands:

    For the OS user run the following command as root (thank you Tom):
    groupadd -g 250 monyog && useradd -c 'MONyog User' -g 250 -G mysql -u 250 monyog && echo 'your_os_password' | passwd --stdin monyog
    For the MySQL user run:
    GRANT SELECT, RELOAD, PROCESS, SUPER on *.* to 'adm_monyog'@'10.%' IDENTIFIED BY 'your_db_password';
    Keep in mind that passwords are stored in the clear in the MONyog configuration database, so defining a dedicated MONyog user helps to minimize security breaches. Although for testing purposes I decided to go with a username/password combination to SSH into the servers, it is possible to use a key, which would be my preferred setting in production.

    The User Interface

    The system UI is web driven using Ajax and Flash, which makes it really thin and portable. I was able to test it without any issues using IE 8 and Firefox on Windows and Linux. Chrome presented some minor challenges, but I didn't dig any deeper since I don't consider it stable enough and didn't want to get distracted with what could've been browser-specific issues.

    In order to access MONyog you just point your browser to the server where it was installed, with a URL equivalent to:
    http://monyog-test.domain.com:5555 or http://localhost:5555
    You will always land in the List of Servers tab. At the bottom of this page there is a Register a New Server link that you follow to start adding servers at will. The process is straightforward and at any point you can retrace your steps to make corrections as needed. Once you enter the server information with the credentials defined in the previous section, you are set. Once I went through the motions, the first limitation became obvious: you have to repeat the process for every server; although there is an option to copy from previously defined servers, it can become a very tedious process.

    Once you have the servers defined, to navigate the actual system you check the servers you want to review, select the proper screen from a drop-down box at the bottom of the page and hit Go. This method seems straightforward, but at the beginning it is a little bit confusing and it takes some time to get used to.

    Features

    MONyog has plenty of features that make it worth trying if you're looking for monitoring software for MySQL. Hopefully by now you have it installed and ready to go, so I'll comment from a big-picture point of view and let you reach your own conclusions.

    The first feature that jumps right out at me is its architecture, in particular the scripting support. All the variables it picks up from the servers it monitors are abstracted into JavaScript-like objects, and all the monitors, graphics and screens are based on these scripts. On the plus side, this adds a lot of flexibility to how you can customize the alerts, monitors, rules and Dashboard display. On the other hand, this flexibility presents some management challenges: customizing thresholds, alerts and rules per server or group of servers, and backing up the customized rules. None of these challenges is a showstopper and I'm sure MONyog will come up with solutions in future releases. Since everything is stored in SQLite databases and the repositories are documented, any SQLite client and some simple scripting are enough to take backups and work around the limitations.

    The agent-less architecture requires the definition of users to log into the database and the OS in order to gather the information it needs. The weak point here is that the credentials, including passwords, are stored in the clear in the SQLite databases. A way to secure this is to properly limit the GRANTs for the MySQL users and to SSH using a DSA key instead of a password. Again, no showstopper for most installations, but it needs some work from Webyog's side to increase the overall system security.

    During our tests we ran into a bug in the SSH library used by MONyog. I engaged their technical support, looking forward to evaluating their overall responsiveness. I have to say it was flawless: at no point did they treat me in a condescending manner, they made the most of the information I provided upfront and never wasted my time with scripted, useless diagnostic routines. They had to provide me with a couple of binary builds, which they did in a very reasonable time frame. All in all, a great experience.

    My Conclusion

    MONyog doesn't provide any silver bullet or obscure best-practice advice. It gathers all the environment variables effectively and presents them in an attractive and easy-to-read format. Although it's closed source commercial software, the architecture is quite open through scripting and well-documented repositories, which provides a lot of flexibility for customizations and expansions to fit any installation's needs. For installations with over 100 servers it might be more challenging to manage the server configurations, and the clear-text credentials may not be viable for some organizations. If these 2 issues are not an impediment, I definitely recommend any MySQL DBA to download the binaries and take it for a spin. It might be the solution you were looking for to keep an eye on your set of servers while freeing some time for other tasks.

    Let me know what you think, and if you plan to be at the MySQL UC, look me up to chat. Maybe we can invite Rohit Nadhani from Webyog to join us.

    Monday, March 8, 2010

    Speaking At The MySQL Users Conference

    My proposal has been accepted, yay!

    I'll be speaking on a topic that I feel passionate about: MySQL Server Diagnostics Beyond Monitoring. MySQL has limitations when it comes to monitoring and diagnostics, as has been widely documented in several blogs.

    My goal is to share my experience from the last few years and, hopefully, learn from what others have done. If you have a pressing issue, feel free to comment on this blog and I'll do my best to include the case in my talk and/or post a reply if the time allows.

    I will also be discussing my future plans for sar-sql. I've been silent about this utility mostly because I've been implementing it actively at work. I'll post a road map shortly based on my latest experience.

    I'm excited about meeting many old friends (and most now fellow MySQL alumni) and making new ones. I hope to see you there!

    Friday, February 12, 2010

    Log Buffer #178, a Carnival of the Vanities for DBAs

    Dave Edwards offered me the chance to write this week's Log Buffer, and I couldn't help but jump at the opportunity. I'll dive straight into it.

    Oracle

    I'll start with Oracle, the dust of the Sun acquisition has settled, so maybe it's time to return our attention to the regular issues.

    Let's start with Hemant Chitale's Common Error series and his Some Common Errors - 2 - NOLOGGING as a Hint explaining what to expect from NOLOGGING. Kamran Agayev offers us an insight into Hemant's personality with his Exclusive Interview with Hemant K Chitale. My favorite quote is:

    Do you refer to the documentation? And how often does it happen?

    Very frequently. Most often the SQL Reference (because I don’t — and do not intend to – memorise syntax. Syntax has to be understood rather than memorized). Also, the RMAN Reference (known as the “Backup and Recovery Reference”) and the Database Reference.
    At least I'm not the only one forgetting the exact syntax of every command.

    Chen Shapira offers us her thoughts on diagnostics in Automated Root Cause Analysis, and I have to agree with her that sometimes it is best to be offered good visualization tools rather than cut-and-dried solutions and recommendations.
     
    Miladin Modrakovic explains how to avoid an Oracle 11g vulnerability: Oracle Exploit Published 11g R2. Gary Myers makes his own contribution about security issues with 10g and 11g in Exploits and revoking the risks of revoking PUBLIC.

    As a MySQL DBA I've heard the question "What is the right pronunciation?" many times, and purists would say 'es-que-el', as the ANSI standard specifies. But before there were any standards, there was SEQUEL. I've heard the real story many times. Iggy Fernandez's article does a pretty good job summarizing it in Not the SQL of My Kindergarten Days, quoting some references for those who would like to dig into the details.

    During the weeks leading to the final approval of Sun's acquisition by the EU, there was a lot of speculation about MySQL's destiny under Oracle. I'm sure that many of the MySQL community members who kept their cool did so because they knew that Ken Jacobs would most likely have a say in it. So when the news of his resignation was published, I'm sure those people (myself among them) started scratching their heads and wondering about MySQL's future as well. Matt Asay's news article on CNet, Oracle loses some MySQL mojo, offers great insight on the issue, including quotes from Pythian's own Sheeri Cabral. There are plenty of other articles on the issue in Planet MySQL's feed.

    MySQL

    Continuing in the context of Oracle's acquisition and Ken's resignation, Bridget Bothelo's article MySQL users hope for the best, prep for the worst speculates about what is in the mind of those who run MySQL in production. If you are interested in the different releases and branches, you'll find plenty of blogs this week starting with Jay Jensen's question When should we expect the next stable MySQL release beyond 5.1? and Ronald Bradford's FOSDEM 2010 presentation Beyond MySQL GA: patches, storage engines, forks, and pre-releases – FOSDEM 2010.

    Life goes on and there is still plenty of action in the MySQL community. As Colin Charles reminds us in his MySQL Conference Update: Grid is up, go promote and register!, this should be an interesting year. On the storage engine and tools front, it's worth checking InfiniDB's impressive performance numbers in InfiniDB load 60 Billion SSB rows trended for storage engine developments and RE: HeidiSQL 5.0 Beta available in the tools segment.

    Finally to end the MySQL section with some more mundane issues, here is a collection of articles with mysqldump related scripts and tools: Restore from mysqldump --all-databases in parallel and MyDumpSplitter-Extract tables from Mysql dump-shell script. No list of articles on backups would be complete without asking Can you trust your backup?.

    Today we were talking at work about Perl vs Python for scripting. Me, I'm a Perl 'gangsta' (see the PostgreSQL section). Traditionally MySQL has had a pretty bad Python driver, but Geert Vanderkelen is working on correcting that. If you're a Python fan, check his FOSDEM 2010 presentation at FOSDEM: 'Connecting MySQL and Python', handout & wrap-up.

    SQL Server

    Aaron Bertrand published 2 really interesting articles that could apply to other databases as well: When bad error messages happen to good people and Injection is not always about SQL, with a funny (if it weren't for the Prius parked in my driveway) example at the end.

    The 2010 MVP Summit is coming up and Thomas LaRock offers his 2010 MVP Summit Preview. His insight applies to other similar events (are you reading, MySQL UC attendees?).

    In my experience, date and time representation and manipulation are tricky in databases; these 2 articles offer some tips: Convert FILETIME to SYSTEM time using T-SQL and Dan Guzman's Ad-Hoc Rollup by date/time Interval.

    I'm really bad at judging the value of SQL Server articles, so I'm going to choose the easy way and trust Adam Machanic's T-SQL Tuesday #002: The Roundup to provide a few technical references.

    PostgreSQL

    Apparently the PostgreSQL community needs their own "Geert" (see the reference in the MySQL section), based on what I've read in Damn it feels good to be a (perl) gangsta and Josh Berkus' Postgres needs a new Python driver. Are you up to the challenge? In that case, step up to the plate; that's what Open Source is all about.

    The PostGIS group had an important announcement, PostGIS 1.5.0 out and PLR working on Windows 8.3-8.4 installs, which the author calls "perhaps the best release ever", so make space on your disk and in your schedule and take it for a spin.

    Baron Schwartz offers an interesting view in How PostgreSQL protects against partial page writes and data corruption. It offers great insight from a well-known MySQL guru.

    Last but not least, End Point's people have determined the PostgreSQL 9.0 release date with mathematical precision in PostgreSQL version 9.0 release date prediction; make sure you read the article and get ready for it.

    I hope I kept you reading up to this point and see you around in the blogosphere.

    Wednesday, February 3, 2010

    Using MariaDB with MySQL Sandbox

    A few days back MariaDB announced their first GA release (see Released: MariaDB 5.1.42), so it is time to start testing it, and there is no better way to test any MySQL version in a controlled environment than MySQL Sandbox. However, Sandbox relies on the fact that the tarball and the tarball target directory are prefixed with mysql, which is not true for MariaDB. So here are the 2 tricks I had to use to make it work out of the box.

    These steps describe how to create a single sandbox; the tips can be extrapolated to any other configuration. Also, I am trying to avoid renaming any files and/or directories so as to leave the packages as close to the original as possible.

    Step 1: Use A Symlink For The Tarball

    The make_sandbox script will then think it's manipulating a MySQL tarball. Assuming that the default directory is where you have the tarball:
    ln -sv mariadb-5.1.42-Linux-x86_64.tar.gz mysql-5.1.42-Linux-x86_64.tar.gz
    make_sandbox /home/gnarvaja/Downloads/mysql-5.1.42-Linux-x86_64.tar.gz --sandbox_directory=maria_5.1.42

    Make the adjustments needed to your own platform and version.

    This make_sandbox run is going to fail since it expects a subdirectory named ./mysql-5.1.42-Linux-x86_64, which doesn't exist because we used a MariaDB tarball.

    Step 2: Use A Symlink For The MariaDB Binaries Directory

    For the same reason as above, now create a symlink for the directory to where the tarball was extracted and re-run make_sandbox:

    ln -sv mariadb-5.1.42-Linux-x86_64 mysql-5.1.42-Linux-x86_64
    make_sandbox /home/gnarvaja/Downloads/mysql-5.1.42-Linux-x86_64.tar.gz --sandbox_directory=maria_5.1.42

    Remember to always include the --sandbox_directory option to avoid name conflicts in case you want to compare MariaDB with the corresponding MySQL release.

    This time the installation will succeed and you'll be ready to start your testing.
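
    If you find yourself doing this often, the whole dance can be wrapped in a small shell script. The following is only a sketch based on the commands above; the version, paths and sandbox directory name are the ones from this example and need to be adjusted to your setup:

    #!/bin/sh
    # Sketch: create a MariaDB sandbox by making the tarball look like a MySQL one.
    # All names and paths below come from the example in this post.
    VERSION=5.1.42-Linux-x86_64
    TARBALL_DIR=/home/gnarvaja/Downloads
    cd $TARBALL_DIR
    # Trick 1: symlink the tarball under the name make_sandbox expects
    ln -sv mariadb-$VERSION.tar.gz mysql-$VERSION.tar.gz
    # First run extracts the tarball into ./mariadb-$VERSION and then fails (expected)
    make_sandbox $TARBALL_DIR/mysql-$VERSION.tar.gz --sandbox_directory=maria_5.1.42 || true
    # Trick 2: symlink the extracted directory and run make_sandbox again
    ln -sv mariadb-$VERSION mysql-$VERSION
    make_sandbox $TARBALL_DIR/mysql-$VERSION.tar.gz --sandbox_directory=maria_5.1.42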

    Conclusion

    I tried to install using the original tarball name with different options, and the process failed with different error messages. I looked into the make_sandbox code and saw some dependencies that would've taken me some time to figure out and fix. This method can be considered a hack, but it gets you up and running in no time.

    Giuseppe, if you happen to see this blog, I'll be glad to test a patch when you have it.

    Thursday, January 7, 2010

    sar-sql Has A Wiki

    I finally settled on a wiki for sar-sql using Ubuntu's own wiki. Right now it only has my regular installation procedure. I will enhance it and keep adding items as my time allows. I hope it will help shape the future of the script.

    Enjoy it with responsibility.

    PS: I use the Kubuntu format because it is my desktop of choice.

    Monday, December 14, 2009

    A Hard Look Into Replication

    For some time now I've been struggling with a slave that invariably stays behind its master. I have been looking at every detail I can possibly think of, and in the process discovered a number of replication details I wasn't aware of until now. I haven't found much information about them in the documentation, but they can affect the way you look at your slaves.

    Seconds Behind Master

    This is the first value to look at when evaluating replication; most of the monitoring systems I know of rely on it. According to the manual:
    When the slave SQL thread is actively running
    (processing updates), this field is the number of
    seconds that have elapsed since the timestamp of the
    most recent event on the master executed by that thread.
    On fast networks this is, most of the time, an accurate estimate of the replication status, but every so often you'll see this value jump into the tens of thousands of seconds and, not a minute later, fall back to 0. In a chain of master and slaves, the number on the last slave measures how far behind it is from the master at the top of the chain. Under heavy load on the top master, it can even go back and forth wildly. Because of this, I've learned not to trust this value alone. It is a good idea then to compare other variables as well. For example: Master_Log_File / Read_Master_Log_Pos vs. Relay_Master_Log_File / Exec_Master_Log_Pos. The second pair points to the last statement executed on the slave in relation to the master's binary log file (keep in mind that the statements are actually being executed from the relay log file). The first pair points to the latest statement read from the master and being copied into the relay log. Checking all these variables in context will tell you the real status of the slaves.
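
    As a quick illustration, this is roughly the kind of manual check I mean; the field names come from SHOW SLAVE STATUS, and the comments describe how I read them:

    SHOW SLAVE STATUS\G
    -- Then compare, by hand or in a script:
    --   Seconds_Behind_Master                        -> the usual estimate, take it with a grain of salt
    --   Master_Log_File / Read_Master_Log_Pos        -> latest event read from the master (I/O thread)
    --   Relay_Master_Log_File / Exec_Master_Log_Pos  -> latest event executed on the slave (SQL thread)
    -- If the two pairs drift far apart, the slave is still executing old events,
    -- no matter what Seconds_Behind_Master says.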

    Sidenote: These are the variables in the slave snapshot in sar-sql; let me know which ones you monitor to make sure your slaves are healthy.

    Binary Log Format

    This item is important and comes down to which format you choose for replication. In the case I am working on, it was set to STATEMENT. An initial look revealed that the master had bursts of very high traffic, after which the slaves started lagging behind significantly. Most likely (still trying to prove this) because a number of big INSERTs and UPDATEs are processed at the same time on the master, and are inevitably serialized on the slaves. Without going into the details, switching to ROW solved most of the delays.

    Although binlog_format is a dynamic variable, the change will not take effect right away. It will only be applied to newly created threads/connections, which means that if you have connection pooling in place (very common with web applications), it might take a while until the change actually happens. If you want to force the change as soon as possible, you will have to find a mechanism friendly to your particular environment to regenerate the connections. A minimal example of the change itself follows.
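
    This is a minimal sketch of the variable change, assuming an account with enough privileges; the persistent part still has to go into the configuration file:

    -- Global default, picked up only by connections created from now on
    SET GLOBAL binlog_format = 'ROW';
    -- Optionally switch the current session as well
    SET SESSION binlog_format = 'ROW';
    -- Verify
    SHOW GLOBAL VARIABLES LIKE 'binlog_format';
    -- To survive a restart, also add binlog_format = ROW to my.cnf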

    Another issue that came up is that, in a replication tree, the binary log format of the top master is used across the whole chain, no matter what the binlog_format variable establishes for the slaves in the middle of the chain.

    Status Variables and Logs

    As you may know, SHOW GLOBAL STATUS includes a number of counters that track how many times each command type was issued. So Com_Insert will tell you how many INSERTs were issued since the server came up. That is, without counting the replication thread. So you may issue thousands of INSERTs on the master, and while Com_Insert will be updated accordingly there, it won't change on the slave. Very frustrating when I tried to evaluate whether the INSERT rate on the slave matched the rate on the master. The general log has a similar issue: it won't record any statements executed by the slave threads.
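
    A sketch of the comparison that tripped me up; the table name is made up, and the point is only that the slave's counter barely moves even though the rows keep arriving through replication:

    -- Run on the master and on the slave, then compare the counters
    SHOW GLOBAL STATUS LIKE 'Com_insert';
    -- On the slave the counter ignores the replication SQL thread, so if you
    -- want to confirm the rows actually arrived, count them instead
    SELECT COUNT(*) FROM test.some_table;  -- hypothetical table name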

    Conclusion

    Although I understand that these limitations come from the way MySQL replication works, they do frustrate me since they really limit the type of tests and diagnostics that can be set up to find what's causing the issues on these servers.

    I have to point out that MySQL Sandbox is an invaluable tool to test the different replication scenarios with minimum preparation work.

    Tuesday, December 8, 2009

    sar-sql New Alpha Release

    I just uploaded a new tarball for sar-sql containing a few bug fixes and overall code improvements. I also added options to get a partial snapshot of SHOW SLAVE STATUS and SHOW MASTER STATUS. I chose only a few columns to avoid overcomplicating the project.

    I plan one more round of heavy code changes, but no new features until I can stabilize the code enough to release it as beta.


    Feel free to visit the project page in Launchpad to comment on the Blueprints, report new bugs and participate through the Answers section.

    Thank you very much to Patrick Galbraith, who provided some ideas on the best way to solve some of the coding issues.

    Enjoy the download.

    Wednesday, December 2, 2009

    About CSV Tables

    Like most MySQL users, I have often ignored what I'd like to call the minor storage engines. MyISAM, InnoDB and NDB (aka MySQL Cluster) are well covered in several articles; but what about engines like CSV or ARCHIVE? As part of some internal projects, we have been playing around with these 2 with some interesting results. In this article I'll concentrate on CSV.

    Scenario

    Currently we have a few servers storing historical data that will eventually be migrated into Oracle. Two things need to happen before we can finally decommission them: 1) export the data to CSV so it can be imported in bulk into Oracle, and 2) keep the data online so it can be queried as needed until the migration is finalized. I thought it would be interesting if we could solve both issues simultaneously, so I decided to try the CSV engine. Here's a description of the process.

    Understanding CSV Engine

    The CSV engine, as far as I can tell, was an example storage engine included with earlier MySQL versions to illustrate the storage engine API. Since the data is stored in plain text files, there are some limitations that need to be considered before using it (a quick check follows the list):
    1. No support for indexes
    2. No NULL columns are allowed
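
    A minimal sketch of what the second restriction looks like in practice; the table and column names are made up for the example:

    -- This fails, roughly with "The storage engine for the table doesn't support nullable columns"
    CREATE TABLE csv_bad (id INT, note VARCHAR(20)) ENGINE=CSV;
    -- This works: every column is declared NOT NULL
    CREATE TABLE csv_ok (id INT NOT NULL, note VARCHAR(20) NOT NULL) ENGINE=CSV;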

    Exporting Data To CSV

    So, my first step was to determine what it would take to export data from a regular table to CSV. These are the basic steps:

    1. Create the seed for the CSV table based on an existing table

    CREATE TABLE test_csv LIKE test;

    If you feel adventurous, use CREATE TABLE ... SELECT ..., in which case you may be able to skip the next 2 steps. The engine for the new table will be redefined at the end of the process.

    2. Get rid of the indexes

    ALTER TABLE test_csv DROP PRIMARY KEY, DROP KEY test_idx;

    Modify this statement to include all existing indexes.

    3. Get rid of NULL columns

    ALTER TABLE test_csv MODIFY test_char VARCHAR(10) NOT NULL DEFAULT '', MODIFY test_date TIMESTAMP NOT NULL DEFAULT '0000-00-00';

    The DEFAULT values need to be reviewed so the application doesn't mistake these placeholders for real values; they are standing in for NULL. Numeric values can be tricky since you may not find a suitable replacement for NULL. Adapt this to your particular case.

    4. Convert to CSV

    ALTER TABLE test_csv ENGINE CSV;

    This step will create an empty CSV file in the schema data directory.

    5. Export the data

    INSERT INTO test_csv SELECT * FROM test WHERE ...

    This allows you to export a portion of the data from an existing table into the CSV table/file.

    At this point your data is all stored in a CSV file called test_csv.CSV under the data subdirectory that corresponds to the schema, and the table can be queried like any other regular MySQL table, which is what I was looking for at the beginning of the project.
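
    As a quick sanity check, this is roughly what you should see on disk; the datadir path and schema name are assumptions based on the examples in this post:

    ls -l /var/lib/mysql/test/test_csv.*
    # Expect something like:
    #   test_csv.frm  -> table definition
    #   test_csv.CSV  -> the plain text data file
    #   test_csv.CSM  -> CSV engine metadata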

    You could even update the table. And if you need to load the data into any other application, just copy the file.

    Keep in mind that we are talking about regular text files, so they are not adequate for large numbers of rows or frequent write operations.

    Importing Data From CSV

    If you have to import some CSV data from another application, as long as the data is formatted properly, you can just create an empty table with the right columns and then copy the CSV file into place under the proper table name. Example:

    In MySQL:
    use test
    CREATE TABLE test_import LIKE test_csv;
    In the OS shell:
    cp data.csv /var/lib/mysql/test/test_import.CSV
    Now if you do: SELECT * FROM test_import LIMIT 2; it should show you the data on the first 2 lines of the CSV file.

    Import Data Example

    Many banks allow you to export the list of transactions from your online statement as a CSV file. You could easily use this file as explained above to consult and/or manipulate the data using regular SQL statements.
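
    A minimal sketch of that idea, following the same pattern as the import example above; the column names, file name and datadir path are made up and depend on what your bank actually exports:

    In MySQL:
    use test
    CREATE TABLE bank_statement (
      tx_date     DATE          NOT NULL,
      description VARCHAR(100)  NOT NULL,
      amount      DECIMAL(10,2) NOT NULL
    ) ENGINE=CSV;
    In the OS shell:
    cp statement.csv /var/lib/mysql/test/bank_statement.CSV

    Back in MySQL, a query like SELECT tx_date, SUM(amount) FROM bank_statement GROUP BY tx_date; gives you your spending per day straight off the text file (a FLUSH TABLES bank_statement; might be needed so the server picks up the new file).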

    Conclusion

    The CSV engine provides a very flexible mechanism to move data in and out of regular text files, as long as you have proper access to the data directories. You can easily generate these tables from existing data. You can also easily manipulate the data with any editor and (re)import it at will. I see it as an invaluable tool to move information around, especially in development and testing environments.

    Do you have interesting use cases? I'd like to hear about them.