Search code examples
sqljsonpostgresqlpostgresql-json

Postgresql: replacing whole strings inside json, based on delimiters


Is there a good way to do a string replace in postgresql that takes this into account?:

  • The strings to replace are similar to https://my.oldserver.com/api/v1/images/929009-ee-cda-6-4227-83-e-4-80-fc-954730-b-6.jpeg?id=MQkvMjAyMi8wMS8xOC85MjkwMDllZS1jZGE2LTQyMjctODNlNC04MGZjOTU0NzMwYjYuanBlZwk4OTY=
  • I want to select and replace the whole string based on the fact that it starts with https://my.oldserver.com/api/v1/images/929009-ee-cda-6-4227-83-e-4-80-fc-954730-b-6.jpeg , it has a ?id= param/separator and/or ?id=<BASE64> and ends with a " or the following ,
  • In the end, I should end up with a value like: https://my.newserver.com/2022/01/18/929009ee-cda6-4227-83e4-80fc954730b6.jpeg, replacing the value from the first point
  • I'm using a replace instruction like this simplified example:
update document_revisions dr
set data = replace(
    (dr.data)::text,
    '"url": "https://my.oldserver.com/api/v1/images',
    '"url": "https://my.newserver.com'
  )::jsonb;

Seems pretty simple to replace a value by another directly like this example, but selecting a whole string based on the set of conditions I showed above is not so trivial?

I'm treating the whole json in the data column as text and doing attempting the replacement on structures (among others) like this:

        {
            "identifier": "p:814:794.image",
            "id": "doc-1f73vuahm0",
            "content": {
                "img": {
                    "originalUrl": "https://my.oldserver.com/api/v1/images/covid-19-m-rna-vaccines.png?id=MQkvMjAyMS82LzEvZWI2YzQwZjItMjYzMS00MTY4LWFhZGQtNjZjMmVkOTg5ZDQ2LnBuZwk4MTQ=",
                    "url": "https://my.oldserver.com/api/v1/images/covid-19-m-rna-vaccines.png?id=MQkvMjAyMS82LzEvZWI2YzQwZjItMjYzMS00MTY4LWFhZGQtNjZjMmVkOTg5ZDQ2LnBuZwk4MTQ%3D&auto=format",
                    "mediaId": "3nO-shKnBV18",
                    "width": 1390,
                    "height": 341,
                    "mimeType": "image/png"
                },
                "caption": "COVID-19 mRNA vaccines",
                "source": "Source: Example1"
            }
        },

How would I go about replacing these entries properly? In the end, since I have multiple entries like these, I'm thinking about storing the value on a variable and iterate, image by image, all the replacements that need to be made, but I just can't get select the whole string so far, only directly replacing parts of it.

Thanks


Solution

  • You can use regexp_replace:

    update document_revisions set data = regexp_replace(data::text, 
     '(https\://my\.)oldserver(\.com/api/v1/images/[\w\-]+\.[a-z]+\?id\=(?:[^"]+)*)', 
     '\1newserver\2', 'g')::jsonb
    

    See fiddle