Azure Postgres & DML statement tracing with Log Analytics

Introduction

There are situations where you want to trace certain SQL statements hitting your Azure Postgres database.

This might be the case when you have to comply with government, financial, or ISO certifications and need to provide data to auditors. In other cases, you only want temporary tracing to debug application problems that are otherwise difficult to tackle.

Unfortunately, Postgres doesn't provide built-in features like Change Data Capture (CDC) or Change Tracking, as MSSQL does. That's why I am going to demonstrate the options available for enabling DML statement tracing on an Azure Postgres server instance.

When tracing is enabled, a lot of log data is generated on heavily loaded databases, which has performance implications. To mitigate this performance hit, we will disable the generation of local log files and send the data to a log analytics workspace for further analysis.

The options that we are going to look at are:

  • Audit logging with the pgaudit extension
  • Statement logging
  • Custom trigger & trigger functions
☝🏼 In case you know of any other options that work with Azure Postgres, please let me know!

I assume you already have a log analytics workspace in place. If that's not the case, here is an Azure CLI snippet to create one.

az monitor log-analytics workspace create \
    --resource-group rg-postgres-logging \
    --workspace-name log-postgres-traces \
    --location westeurope \
    --subscription <your_subscription>

Okay, let's get some work done!

Audit logging with the pgaudit extension

First of all, for Azure Postgres Single Server instances, pgAudit is still in preview and can only be enabled on the General Purpose and Memory Optimized pricing tiers.

Second, we are also limited to the General Purpose and Memory Optimized compute tiers when dealing with Azure Postgres Flexible Server instances. Admittedly, the pgAudit extension can also be used with the Burstable compute tier, but since we'd like to feed data to a log analytics workspace, we are stuck with the higher tiers.

Microsoft already provides some good documentation on that, so I am only going to distill the high-level steps and important pieces here.

  1. From within the server parameters, enable PGAUDIT on the shared_preload_libraries parameter (scripted with the Azure CLI after this list).
  2. Set logging_collector to OFF so that no more log files are generated. Writing log files comes with a performance hit, which is why Microsoft recommends disabling it when feeding logs to Azure Monitor or other services.
  3. Set log_line_prefix from %t-%c- to %m-%c- so we get millisecond resolution in the logs.
  4. Set pgaudit.log to WRITE, which will log INSERT, UPDATE, DELETE, TRUNCATE, and COPY. Alternative groups of statements can be configured; find more in the documentation.
  5. Save your changes! WAIT! You are not done yet. Don't forget to restart the server.
  6. Next, go to Monitoring > Diagnostic settings, enable logging for the category PostgreSQLLogs, and point it to an existing log analytics workspace.
  7. Log into your server and run create extension pgaudit;
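
If you'd rather script these parameter changes than click through the portal, here is a minimal Azure CLI sketch. It assumes a Flexible Server and reuses the names from this article (rg-postgres-logging, psql-demo); for Single Server, the equivalent command is az postgres server configuration set.

az postgres flexible-server parameter set \
    --resource-group rg-postgres-logging --server-name psql-demo \
    --name shared_preload_libraries --value pgaudit

az postgres flexible-server parameter set \
    --resource-group rg-postgres-logging --server-name psql-demo \
    --name logging_collector --value off

az postgres flexible-server parameter set \
    --resource-group rg-postgres-logging --server-name psql-demo \
    --name log_line_prefix --value '%m-%c-'

az postgres flexible-server parameter set \
    --resource-group rg-postgres-logging --server-name psql-demo \
    --name pgaudit.log --value WRITE

# shared_preload_libraries is a static parameter, hence the restart
az postgres flexible-server restart \
    --resource-group rg-postgres-logging --name psql-demo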

Audit logs are now getting fed to the log analytics workspace. However, you'll need to give it some time until you see the first messages arriving (10 to 15 minutes in my experience). Later on, you can run a query like the one below.

AzureDiagnostics
| where LogicalServerName_s == "psql-demo"
| where Category == "PostgreSQLLogs"
| where TimeGenerated > ago(1d) 
| where Message has "AUDIT:"
| project TimeGenerated, Message
| order by TimeGenerated desc
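
Each pgAudit record is a CSV payload embedded in the Message column, so you can go a step further and split it into columns. Here is a rough sketch assuming the default session audit log format documented in the pgaudit README; a statement that itself contains commas will spill into the last column, which is fine for skimming:

AzureDiagnostics
| where LogicalServerName_s == "psql-demo"
| where Category == "PostgreSQLLogs" and Message has "AUDIT:"
| parse Message with * "AUDIT: " AuditType "," StatementId "," SubstatementId "," AuditClass "," Command "," Detail
| project TimeGenerated, AuditType, AuditClass, Command, Detail
| order by TimeGenerated desc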

Unfortunately, pgAudit settings are specified globally and can't be scoped to individual databases or roles.

Also, it's unfortunate that an expensive string comparison is required when searching for audit messages. It would be welcome if we could search with something like where errorLevel_s == "AUDIT" instead.

Trying to work around this limitation, e.g. by setting pgaudit.log_level = DEBUG5 to get a more distinctive log level, won't work, as this setting only takes effect when pgaudit.log_client is enabled. According to Microsoft, that setting should generally be left disabled.

Statement logging

Like with the previous pgAudit approach, feeding Postgres logs to a log analytics workspace via diagnostic settings requires the General Purpose or Memory Optimized SKUs (source).

  1. From within the server parameters, set logging_collector = OFF to avoid wasting storage space and to reduce the performance impact.
  2. Set log_statement = MOD and log_line_prefix = %m-%c-.

In case you have different requirements for which statements to log, you can pick from the keywords in the table below.

Keyword   Statements
off       -
ddl       create, alter, drop
mod       create, alter, drop, insert, update, delete, truncate, copy from
all       *

When you save the mentioned server parameters, Azure sends a SIGHUP signal to the running process; these changes won't require a server restart.

☝🏼 When changing Postgres Server Parameters, pay close attention to the keywords Dynamic and Static. Changing a dynamic parameter doesn't require a server restart, however a static parameter does!
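
You can verify this from within Postgres itself: the pg_settings catalog view tells you whether a parameter is dynamic or static. A quick check for the parameters used here; context = 'postmaster' means a restart is required, and pending_restart flags values that were changed but not yet applied:

select name, setting, context, pending_restart
from pg_settings
where name in ('log_statement', 'log_line_prefix', 'logging_collector');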

All that is left to do is enable Postgres diagnostic logging and point it to your log analytics workspace. You can do so by navigating to Monitoring > Diagnostic settings and adding a new setting: enable the PostgreSQLLogs category and select the log analytics workspace created earlier. Alternatively, script it with the Azure CLI as shown below.
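
Here is a rough CLI sketch that wires the server up to the workspace created earlier. The resource ID format shown is for a Flexible Server, and the diagnostic setting name send-postgres-logs is arbitrary; adjust these and the placeholder subscription to your environment:

WORKSPACE_ID=$(az monitor log-analytics workspace show \
    --resource-group rg-postgres-logging \
    --workspace-name log-postgres-traces \
    --query id --output tsv)

az monitor diagnostic-settings create \
    --name send-postgres-logs \
    --resource "/subscriptions/<your_subscription>/resourceGroups/rg-postgres-logging/providers/Microsoft.DBforPostgreSQL/flexibleServers/psql-demo" \
    --workspace "$WORKSPACE_ID" \
    --logs '[{"category": "PostgreSQLLogs", "enabled": true}]'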

Later on, you can query for your statements as shown below. Again, it's unfortunate that we need to do a string comparison to find the relevant logs.

AzureDiagnostics 
| where Message has "statement:"
| project TimeGenerated, Message
| order by TimeGenerated desc

Custom trigger & trigger functions

The described approaches are as simple as it gets. Unfortunately, they don't let us choose which specific tables we want to collect logs from.

If you are only interested in one particular table and are willing to take a bigger performance hit, you might be better off with triggers & trigger functions. Here is a basic example that will fire on INSERT, UPDATE, and DELETE.

-- create a schema for the audit objects and a trail table recording each operation
create schema if not exists debug;

-- the triggers below assume an employees table along these lines
create table if not exists employees (
    id serial PRIMARY KEY,
    first_name text,
    last_name text
);

create table debug.employee_trail (
    id serial PRIMARY KEY,
    operation varchar(6) not null,
    employee_id integer not null,
    timestamp timestamp not null
);

-- the operation name is passed in as a trigger argument (tg_argv[0])
create or replace function debug.f_employee_changed() returns trigger as
$$
begin
    if tg_argv[0] = 'delete' then
        -- on delete only the old row is available
        insert into debug.employee_trail (operation, employee_id, timestamp) values (tg_argv[0], old.id, now());
    else
        insert into debug.employee_trail (operation, employee_id, timestamp) values (tg_argv[0], new.id, now());
    end if;

    -- the return value is ignored for AFTER row-level triggers
    return new;
end;
$$ language plpgsql;

create trigger t_on_insert after insert
    on employees for each row
    execute procedure debug.f_employee_changed('insert');

create trigger t_on_update after update
    on employees for each row
    execute procedure debug.f_employee_changed('update');

create trigger t_on_delete after delete
    on employees for each row
    execute procedure debug.f_employee_changed('delete');

insert into employees (first_name, last_name) values ('klaus', 'bommelmaier');
update employees set last_name = 'Maier' where last_name = 'bommelmaier';
delete from employees where last_name = 'Maier';
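
After running the three statements above, the trail table should contain one row per operation. A quick check:

select * from debug.employee_trail order by id;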

Final remarks

Replacing parameter placeholders

You might be tempted to set pg_qs.replace_parameter_placeholders = ON. However, I discourage you from doing so on a production system, as it might crash your server. Read more on this here.

Only successful DML executions are shown

Also, please note that only successful DML statement executions get logged. Statements that contain simple syntax errors won't show up, even when setting log_statement = all.

Further reading

Known issues and limitations for Azure Database for PostgreSQL - Single Server and Flexible Server
Lists the known issues that customers should be aware of.
Audit logging in Azure Database for PostgreSQL - Single Server
Concepts for pgAudit audit logging in Azure Database for PostgreSQL - Single Server.
pgaudit/README.md at master · pgaudit/pgaudit
The PostgreSQL Audit Extension's source repository and full README.
19.8. Error Reporting and Logging
The PostgreSQL documentation chapter covering where, when, and what to log.