Coastal's Blog: September 2012

Thursday 20 September 2012

postgresql replication, load balancing and SPOFs

here's article for setting up two pgpools - so no single point of failure (thanks to Nozomi Anazai)
http://www.pgpool.net/pgpool-web/contrib_docs/watchdog/en.html

here's an article how to set up pgpool to handle replication and load balancing of a postgresql cluster (thanks to Tatsuo Ishii)
http://pgpool.projects.postgresql.org/contrib_docs/simple_sr_setting/index.html

And here's a great presentation from Tom Fry @ OrdnanceSurvey on their setup (spatial databases this time) using pgPool. No replication duties in this case I think as it's only handling select duties. Here.

Wednesday 19 September 2012

Show databases and tables in postgresql

show databases:
select datname from pg_database;
show tables:
\dt

postgres ssh-copy-id: ERROR: No identities found

I'm working on hot_standby with postgresql on centos. One thing I got stuck on being a noob at linux was how to create ssh keys for the postgres account. I was following the 3dtin blog on postgresql hot_standby and got stuck at the following point;
su - postgres
ssh-copy-id
ssh-copy-id: ERROR: No identities found

Ok, remove that pesky selinux. I believe that's at fault. Set SELINUX in /etc/sysconfig/selinux to disabled.
su - postgres; pwd should be /var/lib/pgsql
mkdir .ssh
ssh-keygen (hit defaults)
ssh-copy-id -i ~./ssh/id_rsa.pub
check with ssh So if you do that on the pg_master and drop the key on the pg_slave, you should then repeat the other way (as the 3dtin blog says).

Friday 14 September 2012

Cluster your PostGIS

I've been asked to look at clustering work's PostGIS needs. We've got some OSM which is only used for Mapnik to read and render from so I'm going to ignore that for now. I'm more interested in setting up a PostGIS cluster for user-created content.

I've found a great PDF presentation from Steve Bennett at the Ordnance Survery. Also a great, simple note on clustering Postgresql from Satoshi Nagayasu. I'm in the process of knocking up a simple test cluster so I'll report back.

Eggie5's run through of streaming replication looks a brief but solid guide to replication.

Jyro's tutorial here on 3DTin is wonderful. It pushed this fat-arsed noob all the way through so I've got a master/standby setup running. Something to play with and attempt to break - that's the point right? Tutorial here.

Monday 10 September 2012

Loading PostGIS with ESRI Shapefiles

Note to self:
Use shp2pgsql to create CREATE/INSERT sql;
shp2pgsql -c -D -s 4326 -I <shp file> <table name> > <file output name>.sql
and then follow it up with the lovely;
psql -d <database name> -U <user name> -f <file output name>.sql
This way I've got all that lovely SQL to store somewhere if I need it.

Thursday 6 September 2012

How to get an up-to-date OSM in PostGIS

How to build an OSM database that is up to date.

Rather than doing a quarterly manual job of downloading a 14Gb+ file, decompressing it (250Gb+) and inserting it into a database it will be a lot easier to download daily (50Mb) OSM change files (.osc) and apply the changes to the existing database.

The instructions below is a modified version of Martin van Exel's tutorial here - https://docs.google.com/document/pub?id=1paaYsOakgJEYP380R70s4SGYq8ME3ASl-mweVi1DlQ4

# get osmosis
su cd /opt
wget http://dev.openstreetmap.org/~bretth/osmosis-build/osmosis-latest.tgz
tar zxvf osmosis-latest.tgz
rm -rf osmosis-latest.tgz
# in my case osmosis-0.41

# now pop into PostgreSQL and build a suitable database & user (could be tidier)
adduser osm
passwd osm
# and some badly chosen password: osm, osm
su - postgres psql
createdb osm
createlang plpgsql osm
CREATE USER osm WITH PASSWORD 'osm';
~~#CREATE DATABASE osm;~~
GRANT ALL PRIVILEGES ON DATABASE osm to osm;
GRANT ALL PRIVILEGES ON DATABASE osm to postgres;
\q
psql -d osm -U osm -f /usr/pgsql-9.1/share/contrib/postgis-2.0/postgis.sql
psql -d osm -U osm f /usr/pgsql-9.1/share/contrib/postgis-2.0/spatial_ref_sys.sql
psql -U osm -d osm
CREATE EXTENSION hstore;
\q
psql -U osm -d osm -f /home/<user name>/osm/osmosis-0.41/script/pgsimple_schema_0.6.sql

# make some directories and get some data
su mkdir /tmp/osm
cd tmp/osm
mkdir planet
mkdir planet/replication

# now get some OSM data - either get the whole planet.. or get a sample from http://download.geofabrik.de/osm/
cd /tmp/osm/planet/
wget http://planet.openstreetmap.org/planet-latest.osm.bz2

# and thump into the database (using the VERY IMPORTANT --slim option, -C is how much memory we're setting aside for this: 4Gb in this case) /opt/osm2pgsql/osm2pgsql -S default.style --slim -d osm -C 4000 /tmp/osm/planet/planet-latest.osm.bz2

# create the configuration.txt and download.lock
osmosis --rrii workingDirectory=/tmp/osm/planet/replication

# create your state.txt file, visit http://toolserver.org/~mazder/replicate-sequences/ and enter details. To check the datetime required examine the contents of the bz2 file.

bunzip2 -c planet-latest.osm.bz2 | head

Now at this point I've been running osmosis to grab the change file and osm2pgsql to throw it at the database but you can apparently run osmosis on it's own;

# Osmosis on it's own version
/opt/osmosis-0.41/bin/osmosis --rri workingDirectory=/tmp/osm/planet/replication --sc --wpc user="osm" database="osm" password="osm"

# Or the two command route - grab a change file from the queue of necessary changes to perform
/opt/osmosis-0.41/bin/osmosis --rri workingDirectory=/tmp/osm/planet/replication --simplify-change --write-xml-change /tmp/osm/planet/replication/changes.osc.gz

# And append the changes to the database
/opt/osm2pgsql/osm2pgsql --append -S /opt/osm2pgsql/default.style -d osm -U postgres --slim /tmp/osm/planet/replication/changes.osc.gz

# You can put either set of commands into a script and cron it up, here's the script for the osm2pgsql route

#!/bin/sh
n=`ps -ef | grep -v grep | grep /opt/osmosis-0.41/ | wc -l`
m=`ps -ef | grep -v grep | grep /opt/osm2pgsql/ | wc -l`
let i=n+m
if [ $i -gt 0 ]; then
echo osmosis or osm2pgsql running
else
echo not running
/opt/osmosis-0.41/bin/osmosis --rri workingDirectory=/tmp/osm/planet/replication --simplify-change --write-xml-change /tmp/osm/planet/replication/changes.osc.gz
/opt/osm2pgsql/osm2pgsql --append -S /opt/osm2pgsql/default.style -d osm -U postgres --slim /tmp/osm/planet/replication/changes.osc.gz
fi

# Now you're probably running mod_tile and mapnik to render and serve up OSM tiles so we need a method to tell mod_tile to re-render all tiles that have updates. We can amend the osm2pgsql call with -e options to produce a list of tiles that need to be expired and thus re-rendered. [See OSM Tile Expire Methods].
# change the osm2pgsql command to read
/opt/osm2pgsql/osm2pgsql --append -S /opt/osm2pgsql/default.style -d osm -U postgres --slim /tmp/osm/planet/replication/changes.osc.gz -e15 -o expire.list

# We now expire all tiles listed in the expire.list. I haven't included the install and setup of Mapnik or mod_tile so i'm going to use some dummy locations for them. I'm expiring tiles from zoom level 6 and downwards
cat expire.list | /opt/mapnik/mod_tile/render_expired --map=osm --min-zoom=6 --touch-from=7 >/dev/null

# The OSM Tile Expire Methods guide also recommends a daily re-render of the lower zoom levels.