Search code examples
regexpostgresqlsubstringtrim

Trim UK Postcodes in PostgreSQL


I know a similar question exists but that solution doesn't work in PostgreSQL.

What i'm trying to do; create new columns with copy of the full postcode and then trim this down first to sector, then trim that to district, and finally to area. ie. Copy postcode to postcode_sector trim postcode_sector.

TA15 1PL becomes:

  • TA15 1 for sector

  • TA15 for district

  • TA for area.

What I've tried:

Create new columns in table for each, then;

SELECT postcode_sector FROM postcodes
RTRIM (Left([postcode_sector],(Len([postcode_sector])-2)) + " " + 
Right([postcode_sector],3));

Throws syntax error;

Select
Postcode,
RTRIM(LEFT(Postcode, PATINDEX('%[0-9]%', Postcode) - 1)) As AreaTest
From postcodes

Doesn't work as no PATINDEX function in PostgresSQL. From here I have looked at an alternate approach using SUBSTRING function elped by the excellent tutorial here . Using;

SELECT 
substring (postcode FROM 1 FOR 6) AS postcode_sector
FROM postcodes;

Gets me part way I now have a column with TA15 1 but due to the way the system works i also have T15 1A. Is there a way in PostgresSQL to count the number of characters in a cell and delete one? Out of wider interest is it faster to use TRIM than SUBSTRING I'm executing across the full postcode file which is ~27million rows


Solution

  • I'm not that familiar with UK post codes, but according to Wikipedia's format, this should handle all cases:

    select postcode,
           m[1] || m[2] || ' ' || m[3] sector,
           m[1] || m[2] district,
           m[1] area
    from   src,
           regexp_matches(postcode, '^([A-Z]{1,2})([0-9A-Z]{1,2}) ([0-9])([A-Z]{2})') m
    

    http://rextester.com/KREPX19406