postgresql amazon-web-services amazon-rds pg-cron

Postgres 13 pg_cron error on RDS that's hard to track down


pg_cron jobs all fail with "role of....doesn't provide permission to schedule a job"

I'm working on getting pg_partman and pg_cron set up on RDS, but when my pg_cron jobs run, they return this error:

ERROR: The protected role of rds_super doesn't provide permission to schedule a job.

From the error text, it seems like I'm missing a simple permission on something in the directory or resources holding pg_cron, but I can't find the source of the problem. It may also be something else entirely. A lot of Googling, hunting through sources, and trial and error haven't led me to any answers, and I'm hoping for help.

For background, this is Postgres 13.4 with pg_cron 1.3. These are the latest versions now available on RDS. My goal is to have pg_cron running jobs in various databases in this cluster, but I've reduced the problem example to the cron schema in postgres.

RDS defines a role named rds_superuser that doesn't have a login, which you can then grant to other users. We're using a custom role named rds_super, and have for years.

When you run CREATE EXTENSION pg_cron on RDS, the extension is installed into the postgres database by default, and it creates a new schema named cron. That's all fine. As a "hello world" version of the problem, here's a simple table and a job that inserts the current time into a text field every minute.

DROP TABLE IF EXISTS cron.foo;

CREATE TABLE IF NOT EXISTS cron.foo (
    bar text
);

GRANT INSERT ON TABLE cron.foo TO rds_super;

INSERT INTO cron.foo VALUES (now()::text);

select * from cron.foo;

-- Run every minute
SELECT cron.schedule('postgres.populate.foo','*/1 * * * *',
$$INSERT INTO cron.foo VALUES (now()::text) $$);

The bare statement INSERT INTO cron.foo VALUES (now()::text) works fine when I'm connected directly as the rds_super user. But when it's executed through the cron job defined above, cron.job_run_details shows the right command and the expected user, but a failed status with this error:

ERROR: The protected role of rds_super doesn't provide permission to schedule a job.
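
A query along these lines lists the failed runs next to the role each job was scheduled under. It is only a sketch: it assumes the stock pg_cron 1.3 column names in cron.job and cron.job_run_details, so adjust it if your version differs.

```sql
-- Most recent runs first, with the scheduling role and the error text.
select d.jobid,
       j.jobname,
       d.username,
       d.database,
       d.status,
       d.return_message,
       d.start_time
  from cron.job_run_details d
  left join cron.job j on j.jobid = d.jobid
 order by d.start_time desc nulls last
 limit 20;
```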

Does this ring a bell for anyone? I've deleted, reinstalled, set permissions explicitly. No improvement.

Public

This may be off, but I ran into a couple of things where it looked like I needed to provide access to public. I started in PG 9.4 or 9.5, couldn't get my head around securing public...and stripped all of the rights off it everywhere. Putting some back in may be needed here?

Permissions checks

Here are the permissions checks that I could think of.

select grantor,
       grantee,
       table_schema,
       table_name,
       string_agg (privilege_type, ',' order by privilege_type) as grants

  from information_schema.role_table_grants
where table_catalog = 'postgres'
  and table_schema  = 'cron'
  and grantee       = 'rds_super'

group by 1,2,3,4

order by 1,2,3,4;

I gave the user all privileges on all of the tables, just to see if that cleared things up. No joy.

grantor          grantee     table_schema  table_name             grants
rds_super        rds_super   cron          foo                    DELETE,INSERT,REFERENCES,SELECT,TRIGGER,TRUNCATE,UPDATE
rds_super        rds_super   cron          job_run_details_plus   DELETE,INSERT,REFERENCES,SELECT,TRIGGER,TRUNCATE,UPDATE
rds_superuser    rds_super   cron          job                    DELETE,INSERT,REFERENCES,SELECT,TRIGGER,TRUNCATE,UPDATE
rds_superuser    rds_super   cron          job_run_details        DELETE,INSERT,REFERENCES,SELECT,TRIGGER,TRUNCATE,UPDATE

Nothing is obviously wrong with the schema rights:

select pg_catalog.has_schema_privilege('rds_super', 'cron', 'CREATE') AS create,
       pg_catalog.has_schema_privilege('rds_super', 'cron', 'USAGE')  AS usage;

create  usage
t       t

Likewise, nothing pops out when I check function execution rights:

   select proname, proargnames
     from pg_proc
    where has_function_privilege('rds_super',oid,'execute')
      and pronamespace::regnamespace::text = 'cron'
 order by 1,2;


proname                  proargnames
job_cache_invalidate
schedule                 {job_name,schedule,command}
schedule                 {schedule,command}
unschedule               {job_id}
unschedule               {job_name}

Answer

I do not know if this is the answer, but it seems to have fixed my problem. Spoiler: Log in as the user the pg_cron scheduler background worker runs as.

I burned everything down and restarted, and then found that my jobs simply would not run. No error, no results. I checked the status of the background workers like this:

 select application_name,
        usename,
        backend_type,
        query,
        state,
        wait_event_type,
        age(now(),backend_start) as backend_start_age,
        age(now(),query_start)  as query_start_age,
        age(now(),state_change) state_change_age

  from pg_stat_activity
 where backend_type != 'client backend';

I noticed that the background worker had been running for over a day (it's loaded as a shared library), and seemed to be stuck. I rebooted the server, and redid everything logged in as dbadmin, instead of my custom user. That's the user name that the pg_cron scheduler process is running as, in this case. I don't remember if dbadmin is part of the package with RDS Postgres, or if it's something I added years back. There's nothing in the RDS pg_cron instructions about this, so maybe it's just me. I needed to set up its search_path and permissions a bit to get everything working the way I needed, but that's normal.
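
To compare the role the scheduler runs as with the role recorded on each job, something like the following may help. This is a sketch, not taken from the RDS instructions: matching on '%pg_cron%' in both backend_type and application_name is an assumption, since the exact label can vary by pg_cron version.

```sql
-- Which role does the pg_cron background worker run as?
-- The '%pg_cron%' pattern is an assumption about how the worker is labelled.
select pid,
       usename,
       application_name,
       backend_type,
       backend_start
  from pg_stat_activity
 where backend_type ilike '%pg_cron%'
    or application_name ilike '%pg_cron%';

-- Which role is each job scheduled under?
select jobid,
       jobname,
       username,
       database,
       active
  from cron.job
 order by jobid;
```

If usename from the first query and username from the second differ, that matches the situation described here.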

https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/PostgreSQL_pg_cron.html


Solution

  • At least in my case, the answer is to schedule the jobs as the same user that the pg_cron background worker runs as. I've posted more details at the end of the original question.
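
Applied to the "hello world" job from the question, the fix might look like the sketch below. It assumes you are connected as the scheduler's user (dbadmin in my case) and that this user has INSERT on cron.foo. Whether that user is allowed to unschedule a job created by another role is also an assumption; if unschedule refuses, run that one statement as the role that created the job.

```sql
-- Run while connected as the role the pg_cron worker runs as
-- (dbadmin in the case described above).

-- Remove the job that was scheduled under the other role...
select cron.unschedule('postgres.populate.foo');

-- ...and schedule it again, so that cron.job.username records the current role.
select cron.schedule('postgres.populate.foo', '*/1 * * * *',
$$INSERT INTO cron.foo VALUES (now()::text)$$);
```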