"Auditability of your data can be vital. Certainly in the financial sector."

History of objects

We make, amongst other things, transactional systems. We do this mostly for insurance companies. The number of transactions per hour in these systems is at most a few thousand which is relatively small, of course. However, there is the need of an audit trail of just about everything both for internal reporting purposes and regulatory requirements. In order to accommodate these requirements we build our own management information systems for these systems. Typical questions to be answered are :

  • Which employee has done what in this claim at what moment in time?
  • How many different employees have made updates to this claim?
  • Which employee has initiated this claim and how long did it take for the claim to be handled from start to finish?
  • We have just payed amount X; the payment was modified 5 times before it was authorised, who were the employees who changed it, what were the changes and why?
In these systems people (both customers and employees) enter data. They create (C) new records, they inspect records (read, R) and they update (U) records. They sometimes delete (D) records, more on that later. The astute reader has recognized the infamous “CRUD” acronym. In general we are not interested in the “R” part: every screen shown to customers and employees is littered with the data, keeping a record who has seen what field, when, and what the value of the field was at the time would create a deluge of history which is (as far as I can see) useless.



So what is it exactly we want? For each record ever produced in the database (in Django terms: for each object of a model ever created, updated or modified) we want to know for each transaction:

  • Was it a C, U or D?
  • Who done it?
  • When?
Let’s dig into an example and iterate through a number of solutions.

We define 4 employees as a choice list and we have an oversimplified payment table (e.g. missing in this table is to whom we make the payment). The models.py looks like:

from django.db import models


class TransactionDT(models.Model):
    """ To be added to (nearly) every model. Last_modified_user to be filled in the admin and in the views."""
    modified_dt = models.DateTimeField(auto_now=True, null=True)  # filled by Django
    last_modified_user = models.ForeignKey('auth.User', verbose_name='Was changes by', null=True, blank=True)

    class Meta:
        abstract = True


class Payment(TransactionDT):
    EMPLOYEE_CHOICES = (('A', 'Ada'), ('B', 'Ben'), ('C', 'Carl'), ('D', 'Daisy'))
    employee = models.CharField(max_length=1, choices=EMPLOYEE_CHOICES, null=False)
    amount = models.DecimalField(max_digits=12, decimal_places=2, null=False, default=0)
    payment_dt = models.DateTimeField(help_text='when the payment will be executed', null=False)

    class Meta:
        unique_together = ('employee', 'payment_dt')

    def __str__(self):
        return '{0}, ${1}, {2}'.format(self.employee, self.amount, self.payment_dt)

and the corresponding admin.py:

from django.contrib import admin
from yourapp.models import Payment


class LastUser:
    """
    mixin for all Admin classes to fill the last_modified_user of the mixin TransactionDT from the models.
    """
    def save_model(self, request, object, form, change):
        object.last_modified_user = request.user
        object.save()


class PaymentAdmin(LastUser, admin.ModelAdmin):
    list_display = ('employee', 'payment_dt')
    list_display_links = list_display
    ordering = list_display
    list_filter = ('employee',)
    search_fields = list_display
    fieldsets = (('Key', {'classes': ('wide', 'extrapretty'),
                          'fields': ('employee',)
                          }),
                 ('Value', {'classes': ('wide', 'extrapretty'),
                            'fields': ('amount', 'payment_dt')
                            }),
                 )
admin.site.register(Payment, PaymentAdmin)

Run makemigrations and migrate. Now we need some users to be able to log into the admin. I like to create my test data with scripts; see the bottom of this blog for the complete script.

The admin solution
Start Runserver and log into the admin with one of the created users. You can now add, edit and delete payments. Not only that, but at the top right there is this intriguing button “history”.

Pressing this button may give something like:

Upon inspecting the database one sees a table called “django_admin_log”; I leave it to the reader to have a look at the contents of this table. The name of the table says it all: the changes shown are the changes made in the admin, not changes made via user screens, via an API or via raw SQL (your DBA running some update / repair script outside of Django). In summary: nice, however for our goals by far not extensive enough.

The packaged options
Searching the internet we come across the very useful page:

    https://djangopackages.org/grids/g/model-audit

Obviously, we are not the first ones to hit upon this requirement. It seems most, if not all, of the packages listed use either middleware or signals or both. I had a look at Reversion, which seems well maintained and the most used. Changes made via Admin screens, via user screens and even via APIs will be registered, however, changes made by external SQL scripts will not end up in the history tables.

The database option
So, what are we left with? To ensure all CUD are always logged we have to look outside of Django, we have to ensure that at database level the changes are registered. Triggers (https://en.wikipedia.org/wiki/Database_trigger) are the way to go. We are going to add a table which is going to be updated outside of Django itself using these triggers. However, we do want to make use of the wonderful migrations feature of Django; when a model changes we want the history table to change as well without us even thinking about it... the best of boths worlds... the free lunch...

For this example we now modify our models to:

from django.db import models


class Mutation(models.Model):
    """ To be added to models where you want complete history.
    If a record is CUD by Django (admin or frontend) then mut_dt = modified_dt (give or take a few
    thousands of a second). However, if record is CUD outside of Django modified_dt will be NULL whereas
    mut_dt is still filled by trigger on database.

    Making mut_dt the primary key prevents the creation of "id" with a database sequence in the audit tables.
    The field named "id" is the copy of the id of the payment table."""
    MUTATION_CHOICES = (('C', 'Create'), ('U', 'Update'), ('D', 'Delete'))
    mut_type = models.CharField(max_length=1, choices=MUTATION_CHOICES, null=False)
    mut_dt = models.DateTimeField(primary_key=True, null=False)
    id = models.IntegerField(null=False)

    class Meta:
        abstract = True


class TransactionDT(models.Model):
    """ To be added to (nearly) every model. Last_modified_user to be filled in the admin and in the views."""
    modified_dt = models.DateTimeField(auto_now=True, null=True)  # filled by Django
    last_modified_user = models.ForeignKey('auth.User', verbose_name='Was changes by', null=True, blank=True)

    class Meta:
        abstract = True


class PaymentAbstract(TransactionDT):
    EMPLOYEE_CHOICES = (('A', 'Ada'), ('B', 'Ben'), ('C', 'Carl'), ('D', 'Daisy'))
    employee = models.CharField(max_length=1, choices=EMPLOYEE_CHOICES, null=False)
    amount = models.DecimalField(max_digits=12, decimal_places=2, null=False, default=0)
    payment_dt = models.DateTimeField(help_text='when the payment will be executed', null=False)

    class Meta:
        abstract = True
        unique_together = ('employee', 'payment_dt')


class Payment(PaymentAbstract):
    def __str__(self):
        return '{0}, ${1}, {2}'.format(self.employee, self.amount, self.payment_dt)


class AuditPayment(Mutation, PaymentAbstract):
    """ The order of the fields in the database table is important for the insert statement of the stored procedure: 
           mut_type, mut_dt followed by exactly the same variables as in Payment.
    Therefore the mixin Mutation has mut_type, mut_dt, id in it followed by the variables of PaymentAbstract."""
    def __str__(self):
        return '{0}, {1}, {2}, ${3}, {4}'.format(self.mut_dt, self.mut_type, self.employee, self.amount, self.payment_dt)


The admin.py is OK as it is; no need to change anything here.

On the database (PostgreSQL in our case) we create the trigger on the payment table in order to fill the auditpayment table:

CREATE OR REPLACE FUNCTION log_history() RETURNS trigger AS
$BODY$
    BEGIN
      IF (TG_OP = 'INSERT') THEN
            EXECUTE 'INSERT INTO app2hist_auditpayment SELECT ''C'', now(), ($1).*' 
            USING NEW;
      ELSIF (TG_OP = 'UPDATE') THEN
            RAISE NOTICE 'Tabel %', tg_relname || OLD.* ;
            EXECUTE 'INSERT INTO app2hist_auditpayment SELECT ''U'', now(), ($1).*' 
            USING NEW;
      ELSIF (TG_OP = 'DELETE') THEN
            EXECUTE 'INSERT INTO app2hist_auditpayment SELECT ''D'', now(), ($1).*' 
            USING OLD;
      END IF;
      RETURN NULL; -- result is ignored since this is an AFTER trigger
    END;
$BODY$
  LANGUAGE plpgsql VOLATILE
  COST 100;
ALTER FUNCTION log_history()
  OWNER TO postgres;  



CREATE TRIGGER app2hist_payment
  AFTER INSERT OR UPDATE OR DELETE
  ON app2hist_payment
  FOR EACH ROW
  EXECUTE PROCEDURE log_history();


So what is stored in the auditpayment table if we create a payment, do two updates on it and delete it? Something like:


Keeping model and audit table of model in sync
If we now make a model change, run makemigrations and migrate, will that work? The answer is: yes, it will. Let's say we add a receiver to the payment model:

class Payment2Abstract(TransactionDT):
    EMPLOYEE_CHOICES = (('A', 'Ada'), ('B', 'Ben'), ('C', 'Carl'), ('D', 'Daisy'))
    RECEIVER_CHOICES = (('X', 'Alaxandra'), ('Y', 'Young'), ('Z', 'Zaanen'))
    employee = models.CharField(max_length=1, choices=EMPLOYEE_CHOICES, null=False)
    amount = models.DecimalField(max_digits=12, decimal_places=2, null=False, default=0)
    payment_dt = models.DateTimeField(help_text='when the payment will be executed', null=False)
    receiver = models.CharField(max_length=1, choices=RECEIVER_CHOICES, null=True)

    class Meta:
        abstract = True
        unique_together = ('employee', 'payment_dt')


and makemigrations / migrate this change. The result is that our table and our update table will be updated in such a way that the trigger will keep on doing it's job just as intended.


Et voila (as the French can say so beautifully), in a nutshell, that is it. All changes on your objects, done via Django or outside of Django, will be reflected in the audit table.


Improvements
There are a number of improvements one can think of when one wants to run this on production scale:
  • automatically create triggers and stored procedures based on the metadata of the database
  • set the audit tables aside in a separate schema. This is the subject of another post, there does seem to be a Django / PostgreSQL problem with that.
Both of these may be the subject of new blogs.


Thank You
Colleague Marco Oskam for his intimate knowledge of databases. Writing the trigger / stored procedure is his thing. Also automating the creation of these on production scale.

Alexander Kojevnikov for “hilite.me”. I used it for formatting the Python / SQL code to nice looking HTML. I used Style Colorful.



Fill initial data
import django
from django.contrib.auth.models import Group, Permission, User

django.setup()  # in Pycharm, needs to be done, your miles may vary !!


def ins_group(nm):
    k = Group()
    k.name = nm
    k.save()
    k.permissions.add(Permission.objects.get(codename='add_payment'))
    k.permissions.add(Permission.objects.get(codename='change_payment'))
    k.permissions.add(Permission.objects.get(codename='delete_payment'))
    return


def ins_user(us_nm, us_email, us_passw, fn, ln, gr_nm, staff, superuser):
    us1 = User.objects.create_user(us_nm, us_email, us_passw)
    us1.first_name = fn
    us1.last_name = ln
    us1.is_staff = staff
    us1.is_superuser = superuser
    us1.save()
    g = Group.objects.get(name=gr_nm)
    g.user_set.add(us1)
    return

User.objects.all().delete()
Group.objects.all().delete()

ins_group('employee')

# run only on development machine !! these passwords en emails are not strong
ins_user('Super_1', 'super@a.a', 'aabbcc', 'Super', 'Maarten', 'employee', True, True)
ins_user('Emp_1', 'emp@a.a', 'aabbcc', 'Emp', 'Zaanen', 'employee', True, False)
print('OK Created users')

https://www.amvtek.com/blog/posts/2014/Jun/13/accessing-multiple-postgres-schemas-from-django/