2010-04-22

Copy-on-write Semantics on DABL

I've been thinking of a way to do this for a while, and finally got some time to hack out a prototype yesterday.

First, the problem statement. We have a piece of data (street address) that needs to be associated with multiple entities. Basically, I want a many:many relationship between them, but with a catch. Once the address is attached to an entity, it should be manageable on a per-entity basis.

The obvious solution is to make a copy of the data and just attach the copy to the second entity. That's fine, but it's not unrealistic to have 1,000 entities with the same information. That means 1,000 copies of the data, just in case it might change.

So, I implemented a copy-on-write save() method for DABL, our pet ORM/MVC.

(editor's note: DABL is now hosted on github)

The basic idea is that we need to generate a new ID for the record, then pass that new record off to a subset of the related records. If all the related records are to be modified, there's nothing to be gained (except revision history; if you want that, it's trivial) from creating a new record. Rather than letting the database update all the related records, we have to do it ourselves since we only want to CASCADE some of them.

For the source of my prototype pet project, see DABL cow-save() or fork my nonblocking-random repository on github. The basic database setup is a Userdata table which has many Users. Userdata holds the equivalent of GECOS information (firstname, lastname), while User holds the login information (username, password). The production version of this has much more data in the related table and is written in Propel, which makes it significantly uglier.

:wq

No comments:

Post a Comment