“How we use PostgreSQL at Trustly” at PGConf.EU

Slides from my talk, How we use PostgreSQL at Trustly, are now available.

psql \watch 1400000000 epoch time countdown counter

\pset fieldsep ' '
    (('epoch'::timestamptz + 14*10^8 * '1 s'::interval)-now())::interval(0),
    (14*10^8-extract(epoch from now()))::int,
    extract(epoch from now())::int
\watch 1

09:18:28 33508 1399966492
09:18:27 33507 1399966493
09:18:26 33506 1399966494
09:18:25 33505 1399966495
09:18:24 33504 1399966496
09:18:23 33503 1399966497

Garbage Collection of Unused PostgreSQL Tables and Columns

Over the last five years, our database at Trustly have kept growing in number of tables, columns and functions, where some of the tables and columns aren’t being used by any database functions any longer. Getting rid of them is important, as otherwise people working with the database will be confused and annoyed. Database developers should always be able to rely on the data model being relevant and up to date.

In our system, no applications access the database tables directly, instead everything goes through stored procedures.

This means, if a table or column name is not present anywhere in any function’s source code, it’s very likely the table/column is not being used by anything. The only exception is if you have dynamically crafted queries executed using EXECUTE, where the table/column names are constructed from different parts. In our system, we thankfully only have a few such cases.

FROM pg_catalog.pg_class
INNER JOIN pg_catalog.pg_namespace ON pg_catalog.pg_namespace.oid = pg_catalog.pg_class.relnamespace
WHERE pg_catalog.pg_class.relkind = 'r'
AND pg_catalog.pg_namespace.nspname NOT IN ('information_schema','pg_catalog')
) Tables(TableName)
SELECT 1 FROM pg_catalog.pg_proc
WHERE pg_catalog.pg_proc.prosrc ~* Tables.TableName
ORDER BY Tables.TableName

This query returned quite a lot of table names with about half of them being false positives,
but still a managable list to go through manually.

50 minutes of manual work later:

92 files changed, 1114 deletions(-)
False positives: 14

Then I moved on to the task of finding unused table columns.
The query below excludes any false positives found in the previous query.

    FROM pg_catalog.pg_class
    INNER JOIN pg_catalog.pg_namespace ON pg_catalog.pg_namespace.oid = pg_catalog.pg_class.relnamespace
    INNER JOIN pg_catalog.pg_attribute ON pg_catalog.pg_attribute.attrelid = pg_catalog.pg_class.oid
    WHERE pg_catalog.pg_class.relkind = 'r'
    AND pg_catalog.pg_attribute.attnum > 0
    AND NOT pg_catalog.pg_attribute.attisdropped
    AND pg_catalog.pg_namespace.nspname NOT IN ('information_schema','pg_catalog')
    -- Exclude columns in tables we know are unused by the stored procedures, but we want to keep around anyway:
    AND pg_catalog.pg_class.relname !~* '^(alexacountrie|bankersworldonline|bindgroup|clearinghousecurrencie|dailystat|geoiporganization|hourlystat|usercommitment|polishbank|polishbanknumber|swedishpostalcode|testbasedata|testperlmodule|useraccesslogarchive)'
) Columns(ColumnName)
    SELECT 1 FROM pg_catalog.pg_proc
    WHERE pg_catalog.pg_proc.prosrc ~* Columns.ColumnName
ORDER BY Columns.ColumnName

It took me two hours to go through all code. It was an interesting journey in time with lots of memories.

50 files changed, 5396 insertions(+), 5917 deletions(-)
False positives: 87

The reason why there were so many insertions and deletions, was because the dropped columns affected some of the base tables with reference data, which had to be regenerated, thus affecting all lines in those files.

In summary, the ROI on those three hours of time invested is enormous. Developers can now feel confident all tables and columns fulfill a purpose in the system. This exercise will of course need to be repeated in the future though.

SET search_path for all SECURITY DEFINER functions

As pointed out by Peter Eisentraut in a blog post named
Schema Search Paths Considered Pain in the Butt, you need to make sure the search_path is explicitly set for all SECURITY DEFINER functions in PostgreSQL.

Fixing this manually for, in my case, 2106 functions, is, indeed a “pain in the butt”, so I crafted a little query to automate the job:

\pset format unaligned
\o /tmp/fix_search_path_for_security_definer_functions.sql
            -- inject SET search_path in-between LANGUAGE and SECURITY DEFINER in the declaration
                E'(LANGUAGE [a-z]+)\\s+(SECURITY DEFINER)',
                E'\\1\n SET search_path TO public, pg_temp\n \\2'
from pg_proc
where prosecdef is true -- SECURITY DEFINER functions
-- don't include functions for which we have already specified a search_path
and not (coalesce(array_to_string(proconfig,''),'') like '%search_path%')
-- public schema
and pronamespace = 2200
\i /tmp/fix_search_path_for_security_definer_functions.sql
-- If all goes well you should see a lot of CREATE FUNCTION being spammed on the screen

Fun with pg_catalog.pg_depend

Learning PostgreSQL and SQL in general probably begins with the concept of TABLES. Then probably VIEWS, INDEXES and maybe TRIGGERS.
Some users might not ever go any further, which is sad, because there is so much more to explore!

I thought it would be cool to automatically generate a graph showing the dependencies between objects and their types.
This shows the order in which the different types of objects can be created,
perhaps mentally useful to think in terms of “after we have created a TABLE we can create an INDEX”.

Something like this would be nice to include in the PostgreSQL documentation online.
I think it would be helpful when learning about PostgreSQL different object types.

The graph below was produced using GraphViz dot command and live-data from pg_catalog.pg_depend:


As we can see, before anything else, we need a SCHEMA, which is the root node.
Once we have a SCHEMA, we can create TABLES, TYPES, VIEWS, SEQUENCES and FUNCTIONS.
Some users might not even know about SCHEMAs, as the schema “public” is pre-installed.
To create an INDEX, we first need a TABLE.
Etc, etc, etc…

You might be surprised FUNCTION and LANGUAGE have arrows pointing in both directions.
Turns out you need some functions before you can create a language like plperl, such as plperl_call_handler.
The self-referencing arrow from/to FUNCTION is less surprising as some functions can of course call other functions.

(Not all object types are included in this graph as I’m not using them all in my system.)

::xml madness

Nobody likes XML, except masochists and Microsoft consultants who charge by the hour.
Sometimes you are forced to deal with XML anyway, such as parsing the response from external APIs.
Parsing XML is per se a nasty business to begin with, but in this particular example, the ugliness set new records.

The only “solution” I came up with is too ugly for production use, but the alternatives were even uglier, so I had no option.

I hope there is someone out there reading this who can present a proper solution to the problem.

Let’s say you have this XML:

<testxml xmlns:ns1="http://www.example.com" xmlns:ns2="http://www.example2.com">

Using xpath() you extract the content of testxml:

SELECT xpath(
    '<testxml xmlns:ns1="http://www.example.com" xmlns:ns2="http://www.example2.com"><ns1:foo><ns2:bar>baz</ns2:bar></ns1:foo></testxml>'::xml,
-- Result: <ns1:foo><ns2:bar>baz</ns2:bar></ns1:foo>

The returned XML is not valid since its missing the xmlns definitions,
but the PostgreSQL XML data type doesn’t complain, which is OK I guess,
a bit of quirks mode perhaps?

Because of the missing xmlns, it’s impossible to make use of this XML fragment returned.
You cannot extract any subsequent sub-elements in it using XPath.

For instance, this won’t work:

SELECT xpath(
        '<testxml xmlns:ns1="http://www.example.com" xmlns:ns2="http://www.example2.com"><ns1:foo><ns2:bar>baz</ns2:bar></ns1:foo></testxml>'::xml,

-- Error:
-- ERROR:  could not create XPath object
-- DETAIL:  namespace error : Namespace prefix ns1 on foo is not defined
-- <ns1:foo>
--         ^
-- namespace error : Namespace prefix ns2 on bar is not defined
--   <ns2:bar>baz</ns2:bar>
--           ^
-- Undefined namespace prefix
-- xmlXPathCompiledEval: evaluation failed

Even if you pass it the NSArray in the outer xpath() call, you don’t get ‘baz’ but nothing at all.

SELECT xpath(
        '<testxml xmlns:ns1="http://www.example.com" xmlns:ns2="http://www.example2.com"><ns1:foo><ns2:bar>baz</ns2:bar></ns1:foo></testxml>'::xml,

-- Returns:
--  xpath 
-- -------
--  {}
-- (1 row)
-- And NOT 'baz' which is what we want.

Therefore, given you have an XML “fragment” where the tags have ns:___ but without the xmlns:___=”URI” part,
is there any way to extract sub-elements, given you know the namespaces?

For instance, by wrapping the XML “fragment” inside an outer XML tag,
and specifying bogous xmlns elements for the namespaces,
I managed to hack together a work-around, probably with hundreds of flaws,
and this cannot possibly be the best way to approach this problem.

SELECT xpath_fragment('/ns1:foo/ns2:bar/text()','<ns1:foo><ns2:bar>baz</ns2:bar></ns1:foo>',ARRAY['ns1','ns2']);

(1 row)

Source code of insanely ugly “solution” to the “problem”:

CREATE OR REPLACE FUNCTION xpath_fragment(_XPath text, _XML xml, _NSNames text[]) RETURNS XML[] AS $BODY$
_ text;
_WrappedXML xml;
_NSArray text[][] := ARRAY[]::text[][];

SELECT ('<xml ' || array_to_string(array_agg('xmlns:' || unnest || '=" "'),' ') || '>' || _XML::text || '</xml>')::text INTO _WrappedXML FROM unnest(_NSNames);

FOR _ IN SELECT unnest(_NSNames)
    _NSArray := _NSArray || ARRAY[[_, ' ']];

RETURN xpath('/xml' || _XPath, _WrappedXML, _NSArray);

Ideas, anyone?

PL/pgSQL #1 annoyance

I cannot count the number of times I’ve written the same stupid code,
just to do something as simple as making sure the UPDATE
updates exactly one row.

Last time I complained about this in public was at our Christmas party at the office.
Normally nobody cares to listen to me when I’ve had a few beers and start talking about SQL stuff.
Next day I noticed a colleague had spent the hang-over-morning hacking on a patch.
What an awesome Christmas party I thought!

That was quite some time ago and my code still looks ugly, so let’s complain some more and hope it helps.

Number of ugly statements in our codebase:

$ grep -r -E "(IF NOT FOUND THEN)|(RETURNING (1|TRUE) INTO STRICT)" . | wc -l

In a typical OLTP application, each UPDATE typically affects exactly one row.
If more than one row or no row at all would match, you want to throw an exception, return false,
or what ever the proper behaviour might be in the situation.

This sounds like a trivial task, but it turns out there is no nice way of doing this in plpgsql.

Here are three different work-arounds to ensure the update went through:


Even if FOUND could mean more than one row was affected,
it’s safe to assume exactly one row was updated if you are updating
using a primary key or any other unique column.

UPDATE Orders SET Processed = 1 WHERE OrderID = _OrderID AND Processed = 0;
    RAISE no_data_found;


If you are a bit more paranoid, you can check the exact row count,
and throw different exceptions based on if no rows were found
or if more than one was found.

UPDATE Orders SET Processed = 1 WHERE OrderID = _OrderID AND Processed = 0;
IF _RowCount = 0 THEN
    RAISE no_data_found;
ELSIF _RowCount > 1 THEN
    RAISE too_many_rows;


This is the one I prefer. It guarantees only one row was affected,
and is at the same time a one-liner, or two lines if you could
the DECLARE of the boolean variable.

_OK boolean;
UPDATE Orders SET Processed = 1 WHERE OrderID = _OrderID AND Processed = 0 RETURNING TRUE INTO STRICT _OK;

However, none of them are pretty. Something as simple and frequent should be in the syntax of a language.

The proposed syntax in johto’s patch was:

UPDATE STRICT Orders SET Processed = 1 WHERE OrderID = _OrderID AND Processed = 0;

Tom Lane suggested to put the STRICT keyword in the beginning instead,
avoiding conflicts with table names, which sounds like a good idea to me.

STRICT UPDATE Orders SET Processed = 1 WHERE OrderID = _OrderID AND Processed = 0;

I really hope the community will find a way to fix this syntax bug,
without causing to much pain in terms of backwards compatibility issues for users.

I remember when we upgraded from 8.4 to 9.1, I struggled changing all code where
IN/OUT parameters were in conflict with column names,
as the behaviour changed to raise an error.

Users who didn’t want to invest time fixing their code could simply set the setting plpgsql.variable_conflict in postgresql.conf to get back the old behaviour.

I think this was a very user-friendly and nice way to deal with the compatibility issue.

If the STRICT syntax isn’t possible because of hypothetical compatibility issues,
perhaps it could be up to the user to turn it on or off, using a similar setting?

I would even like to go a bit further and suggest making the update of exactly one row the default in PL/pgSQL.
In the typically fewer cases where you want to update more than one row, could explicitly specify it with some other keyword,
such as ALL UPDATE or whatever. This would make the code much cleaner, as you wouldn’t need to type STRICT
all over your code.


Get every new post delivered to your Inbox.

Join 25 other followers