postgresql salt state

2013-09-10

This post will not just be about how I’ve decided to manage our PostgreSQL servers, but my journey into Jinja templating.

Tuning PostgreSQL is pretty much a neccessity (especially on FreeBSD), because out of the box its tuned for a small server with <1GB of physical memory.

The default configuration (postgresl.conf) does not assume it will be a dedicated SQL server, which is fine with me.

The convention I came across is to simply use pgtune.

However, I did not want to create a salt state to in the end, just run pgtune (which also creates a needless pkg dependency) and output that to postgresql.conf.

What I decided to do instead was to take a look at what pgtune does, and create a jinja template to make the same optimizations.

Salt State Layout

First, lets take a look at my state:

pgsql/
├── init.sls
├── nagios_pgsql.cfg.jinja
├── postgres.conf.jinja
├── postgresql.conf.jinja
├── rc
│   └── postgresql.jinja
└── replication
    └── init.sls

    2 directories, 6 files

I have a separate state for Replication, and we use pg-pool-iii for that, which is not included in every instance we run here.

pgsql init

postgresql:
   pkg:
      - installed
      - name: {{ pillar.postgresql }}
   service:
      - running
      - require:
         - pkg: postgresql
         {% if grains.os == "FreeBSD" %}
         - file: postgresql.rcng
         {% endif %}

{{ "{% set pg_db_dir = salt['pillar.get']('db:data_dir') + '/db/postgresql'"" }}%}

{% if grains.os == "FreeBSD" %}
postgresql.rcng:
   file.managed:
      - name: /etc/rc.conf.d/postgresql
      - source: salt://pgsql/rc/postgresql.jinja
      - template: jinja
      - context:
         db_dir: {{ pg_db_dir }}
{% endif %}

{{ "{{ pg_db_dir  "}}}}:
   file.directory:
      - user: pgsql
      - group: pgsql
      - makedirs: True
      - recurse:
         - user
         - group

{{ "{{ pg_db_dir "}}}}/postgresql.conf:
    file.managed:
      - source: salt://pgsql/postgresql.conf.jinja
      - template: jinja
      - context:
         pg_db_dir: {{ pg_db_dir }}
         conn: 300


{{ "{{ pillar.db.backup_dir "}}}}/postgresql:
   file.directory:
      - user: pgsql
      - group: pgsql
      - makedirs: True
      - recurse:
         - user
         - group

# munin
p5-DBD-Pg:
   pkg.installed

munin_pgsql.conf:
   file.managed:
      - name: {{ pillar.etc_prefix }}/munin/plugin-conf.d/postgres.conf
      - source: salt://pgsql/postgres.conf.jinja
      - template: jinja
      - require:
         - pkg: munin-node

# nagios
nagios_pgsql.conf:
   file.managed:
      - name: {{ pillar.etc_prefix }}/nrpe.cfg.d/check_pgsql.cfg
      - source: salt://pgsql/nagios_pgsql.cfg.jinja
      - template: jinja

Nothing special here, I have a few “if"s in for FreeBSD because I like to use the “RCng” style, and so far there is no direct support in Salt for that.

Salt will use the built in “service” command, but that has a bug in it still, where it only looks at /etc/rc.conf. So, I cannot enable and then set custom options in /etc/rc.conf.d/postgresql.

Also, I created a variable called pg_db_dir, which is part of a top level pillar value I use across all system which I call the data_dir.

On most systems, I create a volume (Hardware or ZFS ZVOL) called /data.

I /data, I establish a heirarchy for the service(s) the system will provide.

For example, databases go in /data/db/${type}, backups of said databases go in /data/db/backups/${type}. Web content usually goes in /data/www/${fqdn}.

It is simple for me, and it creates essentially one volume that needs to be backed up.

Anyway, back to our salt-state

Here is the postgresql.conf template that is rendered:

postgresql.conf Jinja2 template

data_directory = '{{ "{{ pg_db_dir "}}}}'
hba_file = '{{ "{{ pg_db_dir "}}}}/pg_hba.conf'    # host-based authentication file
listen_addresses = '*'      # what IP address(es) to listen on;
ssl = on
ssl_ciphers = 'ALL:!ADH:!LOW:!EXP:!MD5:@STRENGTH'   # allowed SSL ciphers
ssl_renegotiation_limit = 512MB
ssl_cert_file = '{{ "{{ grains.fqdn "}}}}.pem'
ssl_key_file = '{{ "{{ grains.fqdn "}}}}.key'
ssl_ca_file = 'ca.pem'
password_encryption = on
db_user_namespace = off
temp_buffers = 16MB
max_stack_depth = 2MB
wal_level = hot_standby
fsync = off
wal_sync_method = open_sync
archive_mode = on
archive_command = 'cp -i %p {{ "{{ pg_db_dir "}}}}/archives/%f'
max_wal_senders = 5
hot_standby = on
log_destination = 'syslog'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # log file name pattern,
log_statement = 'all'           # none, ddl, mod, all
log_timezone = 'US/Pacific'
update_process_title = off
datestyle = 'iso, mdy'
timezone = 'US/Pacific'
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'
default_text_search_config = 'pg_catalog.english'
default_statistics_target = 50 
maintenance_work_mem = {{ "{{ ((grains.mem_total|int) / 16)|round|int "}}}}MB
constraint_exclusion = off
checkpoint_completion_target = 0.9
effective_cache_size = {{ "{{ (((grains.mem_total|int) * 3) / 4)|round|int "}}}}MB
work_mem = {{ "{{ ((grains.mem_total|int) / conn) |round|int "}}}}MB
wal_buffers = 8MB
checkpoint_segments = 16
shared_buffers = {{ "{{ ((grains.mem_total|int) / 4)|round|int "}}}}MB
max_connections = {{ "{{ conn " }}}} 

Lets take a look at some of these, because it was not intuitive by any means.

Detour : Jinja Templates

Salt can use different template engines, and I’ve always used Jinaj2

When you write a template, you can call grains, pillar data, and pass through other values as “context” (see the db_dir value I delcare in init).

To access those values in the Jinja2 template, you call it as:

{{ "{{ pg_db_dir "}}}}

Which will get rendered as /data/db/postgresql

That is easy, and works all of the time.

Where things got complicated was when I was attempting to perform some basic math.

Here are two examples.

{{"{{ grains.num_cups * 2 "}}}}

This will spit out the number of CPU’s your system has, times 2. So if I had a server/vm with 2 cpu’s, this would output 4

{{"{{ grains.mem_total / 4 "}}}}

This results in a wonderful error:

----------
    State: - file
    Name:      /data/db/postgresql/postgresql.conf
    Function:  managed
        Result:    False
        Comment:   Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/salt/utils/templates.py", line 63, in render_tmpl
    output = render_str(tmplstr, context, tmplpath)
  File "/usr/local/lib/python2.7/site-packages/salt/utils/templates.py", line 116, in render_jinja_tmpl
    output = jinja_env.from_string(tmplstr).render(**context)
  File "/usr/local/lib/python2.7/site-packages/Jinja2-2.7.1-py2.7.egg/jinja2/environment.py", line 969, in render
    return self.environment.handle_exception(exc_info, True)
  File "/usr/local/lib/python2.7/site-packages/Jinja2-2.7.1-py2.7.egg/jinja2/environment.py", line 742, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "<template>", line 34, in top-level template code
TypeError: unsupported operand type(s) for /: 'str' and 'int'

It is pretty clear, mem_total is set as a string datatype, and we’re trying to perform a division operation on it.

By why does multiplication work?

Because, python has a str.mul method for but not a str.div by default.

So the first thing I did was to make sure that mem_total was being returned as a integet and not a string.

It wasn’t, so I submitted a pull request and got that taken care of.

However, that did not fix the issue. Even when the grain was returned as a integer, it seems Jinja still treated it as a string.

Jinaj2 has Filters, which took a while for me to find and figure out. I discovered the int filter, which casts a string as an integer.

So, I changed:

                                                      
{{"{{ grains.mem_total / 4 "}}}}                                                 

to

                                                      
{{"{{ grains.mem_total|int / 4 "}}}}                                                 

However, that will rendered:

1019.0

A float…

Even worse, in the case of “grains.mem_total * 3 / 4” it will render

13.5866666667

Either cases are not good, as postgresql will fail trying to parse those values.

So the next filter to apply is round:

                                                      
{{"{{ (grains.mem_total|int * 3 / 4)|round "}}}}                                                 

Which will display ‘‘‘14.0’’’

Again, that wont work.

So, what else to do? Run it through the int filter again:

                                                      
{{"{{ (grains.mem_total|int * 3 / 4)|round|int "}}}}                                                 

Now, our final rendered postgresql.conf has all of the proper values:

data_directory = '/data/db/postgresql'
hba_file = '/data/db/postgresql/pg_hba.conf'    # host-based authentication file
listen_addresses = '*'      # what IP address(es) to listen on;
ssl = on
ssl_ciphers = 'ALL:!ADH:!LOW:!EXP:!MD5:@STRENGTH'   # allowed SSL ciphers
ssl_renegotiation_limit = 512MB
ssl_cert_file = 'pg-4.discdrive.bayphoto.com.pem'
ssl_key_file = 'pg-4.discdrive.bayphoto.com.key'
ssl_ca_file = 'ca.pem'
password_encryption = on
db_user_namespace = off
temp_buffers = 16MB
max_stack_depth = 2MB
wal_level = hot_standby
fsync = off
wal_sync_method = open_sync
archive_mode = on
archive_command = 'cp -i %p /data/db/postgresql/archives/%f'
max_wal_senders = 5
hot_standby = on
log_destination = 'syslog'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # log file name pattern,
log_statement = 'all'           # none, ddl, mod, all
log_timezone = 'US/Pacific'
update_process_title = off
datestyle = 'iso, mdy'
timezone = 'US/Pacific'
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'
default_text_search_config = 'pg_catalog.english'
default_statistics_target = 50 
maintenance_work_mem = 255MB
constraint_exclusion = off
checkpoint_completion_target = 0.9
effective_cache_size = 3057MB
work_mem = 14MB
wal_buffers = 8MB
checkpoint_segments = 16MB
shared_buffers = 1019MB
max_connections = 300

The rest of the state is MUCH simpler :)

Munin and Icinga checks

We use Icinga (nagios) for process checks, and I like to use Munin to display pretty graphs and metrics.

Both require their respective packages to be installed, but we require those two at the top level, so the only relevent packages to install here are p5-DBD-Pg.

The Munin plugin-conf.d/postgres.conf template is quick and easy, its:

[postgres_*]
user pgsql
env.PGUSER pgsql
env.PGPORT 5432

and the Nagios check is just as simple:

command[check_postgres]=/usr/local/libexec/nagios/check_procs -c 1: -C postgres

Bonus: Replication

If I want to add a new PostgreSQL server to our replication environment, I’ve added this new and simple pgsql.replication state:

replication:
   postgres_user.present:
      - runas: pgsql
      - name: replication
      - encrypted: True
      - password: {{ pillar.pgsql.replication_passwd }}
      - replication: True

{{ pillar.db.data_dir }}/db/postgresql/archives:
   file.directory:
      - user: pgsql
      - group: pgsql
      - makedirs: True
      - recurse:
         - user
         - group

{{ pillar.db.data_dir }}/db/postgresql/main:
   file.directory:
      - user: pgsql
      - group: pgsql
      - makedirs: True
      - recurse:
         - user
         - group

AAAAYour0000SSH000Key000Here0000:
  ssh_auth:
      - present
      - user: pgsql
      - enc: ssh-rsa

This creates the required pgsql role ‘‘‘replication’’’ using the pillar data:

pgsql:
   - replication_passwd: yourpassword

The ssh-key for the pgsql user (the system account that runs PostgreSQL, which would be postgres on linux) is required for pg-pool.

What is missing from this setup is how to roll it into your pg-pool instance

Well, that and a few more os cases, but to be honest, I’ll stick with FreeBSD as a PostgreSQL server unless there is a strong technical reason.