There are some problems with inserting text in postgres that is too long. when I have a simple table with text, I can insert text as long as I like (I tested up to 40K chars). However, when I add a unique
constraint I begin to encounter a strange btree
problem, see the minimal working example (MWE) below
MWE:
#!/bin/bash
N=4096 # Aiming for a URL of length 4,096 characters
DB_NAME='foo'
# Generate random N character alphanumeric string of lenght 4,096
URL="http://www.$(cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w $(($N-15)) | head -n 1).com"
# Case 1 we have a table ('website') which has a single column with no
# constraints
TABLE_NAME='website'
sudo -u postgres psql -c "drop database if exists $DB_NAME;"
sudo -u postgres psql -c "create database $DB_NAME;"
sudo -u postgres psql -d $DB_NAME -c "drop table if exists $TABLE_NAME;"
sudo -u postgres psql -d $DB_NAME -c "create table $TABLE_NAME (url text);"
sudo -u postgres psql -d $DB_NAME -c "insert into $TABLE_NAME (url) values ('$URL');"
# Case 2 we have a table ('website2') which has a single column which must be
# unique
TABLE_NAME='website2'
sudo -u postgres psql -c "drop database if exists $DB_NAME;"
sudo -u postgres psql -c "create database $DB_NAME;"
sudo -u postgres psql -d $DB_NAME -c "drop table if exists $TABLE_NAME;"
sudo -u postgres psql -d $DB_NAME -c "create table $TABLE_NAME (url text unique);"
sudo -u postgres psql -d $DB_NAME -c "insert into $TABLE_NAME (url) values ('$URL');"
Output:
$ ./test.sh
DROP DATABASE
CREATE DATABASE
NOTICE: table "website" does not exist, skipping
DROP TABLE
CREATE TABLE
INSERT 0 1
DROP DATABASE
CREATE DATABASE
NOTICE: table "website2" does not exist, skipping
DROP TABLE
CREATE TABLE
ERROR: index row size 4112 exceeds btree version 4 maximum 2704 for index "website2_url_key"
DETAIL: Index row references tuple (0,1) in relation "website2".
HINT: Values larger than 1/3 of a buffer page cannot be indexed.
Consider a function index of an MD5 hash of the value, or use full text indexing.
Question: What should I do if I have extremely long text (>3K chars)?
unique
constraint and check at the application level only?Better yet, cast the md5 to type uuid
:
CREATE UNIQUE INDEX unique_url_index ON $TABLE_NAME ((md5(url)::uuid)); -- parens required
Makes the index smaller and faster. See: