So I was trying to set up some data to go into a partitioned table in postgres and given our architecture relies on hibernate I thought it would be nice to be able to be consistent and use hibernate to push the data into the partition and read it out. I also wanted the partitioning table creation to be handled more or less automatically.
Setting up the partitioning in postgres was fairly easy. I created the master table FOO and an insert trigger on foo that calls a pl/pgsql function insert_foo. The FOO table is partitioned by date, so in insert_foo I take the date of the record to be inserted (NEW.datecreated) and use that to build up the name of the partition table I really want to insert it in: FOO_082008. I then use that table name an build a string that contains an insert command (carefully using quote_literal on the values to be safe) and EXECUTE that command. I catch the undefined_table exception which is thrown when the date rolls over to a new month for which we don’t yet have a table. In the exception handling code I dynamically create the table and rerun the original dynamic insert.
This actually all works quite well. The problem is hibernate, or more specificly hibernate helpfully trying to check for errors for you. Basically when you tell hibernate to save a Foo object it runs the insert on FOO. The trigger catches that and instead inserts the data into FOO_MMYYYY and returns null so no further processing by the database is done and the jdbc driver returns saying it inserted zero rows into the FOO table, which is technically true, and hibernate freaks out because it is expecting that 1 row should have been saved. That is reasonable enough, but annoyingly there is no way to tell hibernate you really expect zero rows. The exception that is thrown is a fairly generic HibernateException, so the only way to catch and swallow this one particular case would be to text match on the error message. We all know what a terrible idea that is, so we are a little SOL.
There are two things that seem like they would work with hibernate. One, is to use a postgres RULE instead of a pl/pgsql FUNCTION to do the partitioning. RULEs basically rewrites the sql you are going to run, so from the jdbc driver point of view you should get back that you did, in fact, save 1 row to FOO_MMYYYY. However I’ve never used rules and from what I can gather from my checking out the less that totally awesome documentation on the subject, it doesn’t seem like I can do the same level of magic table creation. You would have to maintain the rule so that each month you added a new if/then check to save the data to the appropriate table for the month and create the new table for the month. Even if you did that once a year and pre-created a years worth of tables it is still maintenance and someone could still screw it up. (Quite easily given my experience with DBAs The other option is fairly hacky but does work. If insert_foo returns NEW instead of NULL then the insert operation continues just like before the trigger was activated and the jdbc driver reports 1 row saved and hibernate is happy. Of course the problem is that we now have one copy of the data in FOO and one in FOO_MMYYYY, that’s no good as all FOO_MMYYYY table inherit from FOO so all queries on FOO will return duplicate results. So to get around that you can make a table FOO_IN that is the same definition of FOO. In the hibernate mapping you map FooIn to FOO_IN and add a trigger on FOO_IN to call insert_foo. You modify insert_foo to return NEW and to delete from FOO_IN, this all results in a copy of the data going into FOO_MMYYYY and another going into FOO_IN, which is deleted the next time anything is inserted. Of course you can’t use hibernate to read from FOO_IN since there is nothing there. So you create another mapping for a class FOO_OUT that is the same as FOO_IN but maps to the table FOO. This is a little redundant but you only have to do it once. You can make FooIn and FooOut inherit from FooBase and use that in places the data could be read in or out.
If there were someway to do a DELETE in postgres that doesn’t cascade to the child tables you could get away with one mapping and insert_foo could return NEW and also delete from the FOO table. That is a little problematic as you would always have 1 duplicate row in the master table, but I can’t figure out how to actually do that, so it isn’t much of an issue.
Of course depending on what you are doing you can also use straight sql, but kind of annoying to do the whole mix and match with database. Anyway maybe there is a better way but I couldn’t work it out after a day of poking around.
Here is an example of using date to do the partitioning in a postgres pl function:
CREATE OR REPLACE FUNCTION dw_foo_insert_trigger()
RETURNS TRIGGER AS $$
dateTable := ‘foo_’ || to_char(NEW.transaction_date, ‘MMYYYY’);
— you could also probably do NEW.* if you don’t care about column order changing.
cmd := ‘INSERT INTO ‘ || foo || ‘(id, code, transaction_date, override_id, product_code)’ ||
‘ VALUES (‘ || quote_literal(NEW.id) || ‘,’ ||
quote_literal(NEW.code) || ‘,’ ||
quote_literal(NEW.transaction_date) || ‘,’ ||
quote_literal(NEW.override_id) || ‘,';
— join strings together with null values results in a fail, so check for those explicitly
IF (NEW.product_code IS NULL) THEN
cmd := cmd || ‘null’ || ‘,';
cmd := cmd || quote_literal(NEW.product_code) || ‘,';
WHEN undefined_table THEN
— creat the new child table from the parent
createTable := ‘CREATE TABLE ‘ || dateTable || ‘( ‘ ||
‘CHECK ( transaction_date >= DATE ”’ || to_char(date_trunc(‘month’, NEW.transaction_date ), ‘YYYY-MM-DD’) ||
”’ AND transaction_date < DATE ”’ || to_char(date_trunc(‘month’, NEW.transaction_date + interval ‘1 month’), ‘YYYY-MM-DD’) || ”’)) INHERITS (foo)'; — create the index on the child table createIdxTUID := ‘CREATE INDEX IDX_’ || dateTable || ‘_ID ON ‘ || dateTable || ‘(id)'; EXECUTE createTable; EXECUTE createIdxTUID; — and rerun the command now that the child table exists EXECUTE cmd; RETURN NULL; END; END; $$ CREATE TRIGGER dw_foo_trigger BEFORE INSERT ON foo FOR EACH ROW EXECUTE PROCEDURE dw_foo_insert_trigger(); References: Hibernate In Action Is a pretty good reference for Hibernate and PostgreSQL Developers Library is one I want to get for postgres at some point in the future.