2010-01-26

Nested Transactions in MySQL

Transactions are awesome. They can keep your database in a consistent state, without resorting to nasty business logic to cleanse the input first.

A couple months ago, I switched jobs and now I'm working as a Web Developer with a side of System Administration. This has recently meant that I've gotten up close and personal with some of the nastier quirks of MySQL. One of the things I wish it supported better is transactions.

The default storage engine doesn't support them at all, but we mostly use InnoDB tables anyway, so we get "transactions" for free. I use quotes here because these aren't quite the same as what I would expect. For example,

(editor's note: the default storage engine in MySQL 5.5+ is InnoDB, which does support transactions)

mysql> BEGIN;
Query OK, 0 rows affected (0.00 sec)

mysql> CREATE TABLE foo (...);
Query OK, 0 rows affected (0.00 sec)

mysql> ROLLBACK;
Query OK, 0 rows affected (0.00 sec)

Rolling back a transaction should undo everything in the transaction, right?

Not quite so in MySQL. Observe:

mysql> SHOW TABLES LIKE 'foo';
+----------------------+
| Tables_in_test (foo) |
+----------------------+
| foo                  |
+----------------------+
1 row in set (0.00 sec)

Why do I now have a table `foo`? MySQL COMMITs a transaction, and puts you back into auto-commit mode whenever you add or drop a table. This is a good caveat to know, as you can do destructive things thinking you're safely in a transaction when you're not.

Unfortunately, this means the following doesn't work as intended:

mysql> BEGIN;
Query OK, 0 rows affected (0.00 sec)

mysql> ALTER TABLE `foo` ADD FOREIGN KEY `bar_id` REFERENCES `bar` (`bar_id`);
Query OK, 5 rows affected (0.02 sec)

mysql> DELETE * FROM `bar`;
Query OK, 3 rows affected (0.01 sec)

mysql> ROLLBACK;
Query OK, 0 rows affected (0.00 sec)

Here, I have ninja-truncated `bar`. and by "ninja," I mean, "sh*t, where's my backup of that database?"

ALTER TABLE works by first renaming the original table, then creating a new table with the original name and new schema, and then inserting all the records again. This means there's a hidden CREATE TABLE statement lurking in every ALTER TABLE. Which commits the transaction, and puts me back into auto-commit mode. Even though it's between a BEGIN and a ROLLBACK, the DELETE isn't in a transaction at all. And this can spell trouble.

Is there a way around all this? Sure. Is it easy? Not so much.

The naive approach runs about like this:

<?

class MyPDO extends PDO {
  protected $transactions;

  function __construct($dsn) {
    $this->transactions = 0;
    parent::__construct($dsn);
  }

  /**
   * Push a transaction on the stack.
   * @return bool
   */
  function beginTransaction() {
    if (!$this->transactions) {
      if (!parent::beginTransaction()) return false;
    }
    ++$this->transactions;
    return true;
  }

  /**
   * Commit if we're the outermost transaction.
   * @return bool
   */
  function commit() {
    --$this->transactions;
    if (!$this->transactions) {
      return parent::commit();
    }
    return true;
  }

  /**
   * Roll back if we're the outermost transaction.
   * @return bool
   */
  function rollback() {
    --$this->transactions;
    if (!$this->transactions) {
      return parent::rollback();
    }
    return true;
  }
}

This follows the basic contract for PDO transactions. Of course, an actual implementation would have to recognize when database-altering statements destroy your transaction; probably by querying the database's current auto-commit mode setting after every query.

Also, note that this makes the outermost transaction win. In the code I see at work, this makes sense, because things generally follow the contract:

<?

try {
  // function with query that may fail
} catch (PDOException $e) {
  $conn->rollback();
  throw $e;
}

The problem is that our save methods do this same thing. And most of them call other objects' save methods. So nested transactions are the rule, not the exception.

Notice that throw $e;? I think that's what keeps the system from coming crashing down in a raging inferno; instead it's just a small, constant flame...biding its time...waiting to doom us all.

This works fine for a regular stack of calls, but fails utterly once you throw in looping constructs. For example, removing a node from the middle of a tree:

<?
try {
  foreach($this->childen as $c) {
    $c->setParent($this->parent);
    $c->save();
  }
  $this->delete();
} catch (Exception $e) {
  $conn->rollback();
}

All these steps need to occur atomically. But, the save() method starts and commits its own transaction, so the database isn't guaranteeing atomicity (hint: that's the reason for transactions in the first place). What happens when the 3rd child fails? Children 1 and 2 are already committed, so the database is now in an inconsistent state. That's not a happy place to be, and the logic to prevent it could break horribly because of concurrency issues. Remember, the <? and ?> tags mean this goes on the internets; there are going to be concurrency issues to deal with.

So to fix it, we'll have to hack PDO. There needs to be a single transaction, and the outermost transaction has to "win" instead of the innermost. MySQL just doesn't support these semantics out of the box, otherwise it'd be a non-issue.

I'm just glad that our app doesn't have any schema-altering queries in it...

:wq

No comments:

Post a Comment