pg_trompe: https://llg.cubic.org/pg_trompe

Asynchronous multi-master replication for PostgreSQL

Dirk Jagdmann <doj@cubic.org>

http://dirk.jagdmann.de

about me

Foreword on replication

Replication types

SynchronousAsynchronous
Master/Slave - Shared Disk Failover (DRDB, NFS, ...) - Slony-I
- Log Shipping
- Command Prompt Mammoth Replicator
Multi Master - PGCluster
- Postgres-R
- Slony-II (dead)
- pg_trompe
- bucardo
- PgReplicator (dead)

Solutions based on middleware

My motivation to write pg_trompe

What I need

General problems with AMM

My view of the situation

We have layers of responsibilities:

1 user
2 application 
2.5 replication
3 database
 

I don't care for theory

My solution for the theory

1) inserts into unique columns
→ unique columns should be avoided at database level. Alternatively application is responsible to provide unique values.
2) two updates on the same row
→ chance of winning update problem is 50%. In practice this is not different from two users working on single-master DB, because their submit is not synchronized.
3) two deletes on the same row
→ one will fail, but end result is the same.
4) update on deleted row
→ will fail, but end result is the same as if order of events would have been reversed.

Constraints of my approach

Implementation

sample log table

idnodetablenameactionnrlo
13 2 user INSERT 4
14 2 user UPDATE 3
15 2 user UPDATE 77
16 2 product INSERT 5 56346685
17 2 user DELETE 8
18 4 alter table... DDL
19 2 alter table... DDL

perl program

2nd web application

Support for large objects

Other features

Demonstration

TODO

  • a look at the current TODO file
  • still thinking how to handle the globally unique invoice number in EZBill

License

why zlib/libpng license?

  • decision was made many years ago and I usually use it, because I now have an Emacs macro
  • I can read it in less than 1 minute and fully understand it

Comparison with bucardo

  • bucardo uses one central daemon to manage replication to all databases
  • can use default and custom conflict handlers
  • is a push replication model, pg_trompe is pull
  • Does not replicate DDL
  • Cannot handle more than two master nodes at a time
  • Uses an independent database for its config, pg_trompe uses schema in replicated db
  • can not handle large objects

Any questions will be answered