Two people might be modifying the same database function at the same time. To protect against this scenario, I created a little tool that stages the source code of all functions in pg_proc and the definitions of all views. You then deploy your changes, and afterwards you call fdiff() to show a diff of what changed, by comparing the new contents of pg_proc and the view definitions against the staged copy. If the changes are what you expected, you go ahead and commit.
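The staging idea itself can be sketched in a few lines of plain SQL. This is only an illustration of the approach, not fdiff’s actual implementation; staged_functions is a name I made up here:

-- Snapshot the raw source of every user-defined function.
-- (fdiff stages views in a similar way, via their definitions.)
CREATE TEMP TABLE staged_functions ON COMMIT DROP AS
SELECT p.oid,
       p.proname,
       pg_get_function_arguments(p.oid) AS arguments,
       p.prosrc
FROM pg_proc p
JOIN pg_namespace n ON n.oid = p.pronamespace
WHERE n.nspname NOT IN ('pg_catalog', 'information_schema');

-- ...deploy your changes here...

-- After deploying: list every function whose source changed.
-- (Newly created or dropped functions would need an anti-join,
-- omitted here for brevity.)
SELECT s.proname, s.arguments
FROM staged_functions s
JOIN pg_proc p ON p.oid = s.oid
WHERE p.prosrc IS DISTINCT FROM s.prosrc;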
We’ve been using this methodology every day at both my current company and my previous one, and it has prevented us from making mistakes.
Source code and installation instructions: https://github.com/joelonsql/fdiff
This is how to use it (a sketch of the full round trip follows the list):
- Start a new transaction
- Temp store the state of all views/functions
- Deploy your changes to the schema, e.g.:
  CREATE OR REPLACE FUNCTION ...
  CREATE OR REPLACE VIEW ...
- Show a diff of the changes made
- If the changes are expected, go ahead and commit
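Put together, the round trip looks roughly like this. Only fdiff() is named above; stage_state() is a placeholder for the staging call the tool installs, and is_shipped/Orders are hypothetical names chosen to match the example output below:

BEGIN;

-- Temp store the state of all views/functions.
-- (stage_state() is a placeholder, not necessarily fdiff’s real API.)
SELECT stage_state();

-- Deploy a change (is_shipped and the Orders table are made up
-- for illustration):
CREATE OR REPLACE FUNCTION is_shipped(_orderid bigint)
RETURNS boolean LANGUAGE sql STABLE AS $$
    SELECT EXISTS (
        SELECT 1 FROM Orders
        WHERE OrderID = _orderid
          AND ShippingDate > now() - interval '2 month'
    );
$$;

-- Compare the new contents of pg_proc and the view definitions
-- against the staged copy:
SELECT fdiff();

COMMIT;   -- or ROLLBACK; if the diff is not what you expected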
Example of the output from fdiff():
+-------------------------------+
| Updated or replaced functions |
+-------------------------------+
Argument data types...: _orderid bigint
Result data type......: boolean
20 c OR Orders.ShippingDate IS NOT NULL
20 c OR Orders.ShippingDate > now() - interval '2 month'
Hope you’ll like it!
This is interesting, but it seems like a crutch for a higher-level problem. How about putting in place a release process with a version control system such as git, perforce, etc., and avoiding special machinery in the database?
This is an extra layer of protection in addition to a VCS.
The VCS will never be in real-time sync with the production database schema, but you can come quite close.
We are using an in-house tool called schema_diff, which automatically dumps the entire schema every minute using the proposed pg_dump --split patch, saving each db object to a separate file. It then diffs the dump against the master branch of the schema git repo. If they are not identical, an email alert is sent out to all the DBAs, and it keeps nagging every five minutes until you’ve made sure the git repo is identical to the production schema.
It won’t bug you about cAsE differences, whitespace, etc. This is achieved by dumping, restoring, and then dumping again, which normalizes the dumps before diffing them.
So, every second of my work day, I can feel confident the master branch of the git repo is _identical_ to what’s in the production database. But even in this utopia, I still want to be sure, _within the same database transaction_, that the state of the schema right before my deploy is what I expected it to be. Someone might be working on the exact same db object at the exact same time, or there might be a bug in schema_diff, or something else might fail. Redundant security layers are key.
Can you describe the situations where this method has helped you avoid mistakes? If production deployment is the responsibility of one person (or a few people who coordinate with each other), which is usually the case for production databases, then I don’t see a need for such a mechanism.
With this approach you don’t need to coordinate; people can work and deploy simultaneously in the same database. We do it every day, and it works because you are always in sync with the git repo before you deploy, and when deploying you immediately notice if anything differs from the git repo.