
Ceph: too many PGs per OSD


I configured Ceph with the recommended values (using a formula from the docs). I have 3 OSDs, and my config (which I've put on the monitor node and all 3 OSDs) includes this:

osd pool default size = 2
osd pool default min size = 1
osd pool default pg num = 150
osd pool default pgp num = 150

When I run ceph status I get:

 health HEALTH_WARN
        too many PGs per OSD (1042 > max 300)

This is confusing for two reasons. First, the recommended formula did not satisfy Ceph. Second, and most puzzling, it reports 1042 PGs per OSD when my configuration says 150.

What am I doing wrong?


Solution

  • Before setting the PG count you need to know three things.

    1. Number of OSDs

    ceph osd ls
    
    Sample Output:
     0
     1
     2
     
     Here the total number of OSDs is three.
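
     If you only need the count rather than the IDs, one convenient variation (not part of the original steps, just a shell one-liner) is:

     ceph osd ls | wc -l

     which should print 3 for the cluster above.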
    

    2. Number of Pools

    ceph osd pool ls or rados lspools

    Sample Output:
      rbd
      images
      vms
      volumes
      backups
         
    Here the total number of pools is five.
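
    Similarly, to get just the pool count you could pipe either listing through wc:

    ceph osd pool ls | wc -l

    which should print 5 for the pools shown above.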
    

    3. Replication Count

    ceph osd dump | grep repli
    
    Sample Output:
     pool 0 'rbd' replicated size 2 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 38 flags hashpspool stripe_width 0
     pool 1 'images' replicated size 2 min_size 2 crush_ruleset 1 object_hash rjenkins pg_num 30 pgp_num 30 last_change 40 flags hashpspool stripe_width 0
     pool 2 'vms' replicated size 2 min_size 2 crush_ruleset 1 object_hash rjenkins pg_num 30 pgp_num 30 last_change 42 flags hashpspool stripe_width 0
     pool 3 'volumes' replicated size 2 min_size 2 crush_ruleset 1 object_hash rjenkins pg_num 30 pgp_num 30 last_change 36 flags hashpspool stripe_width 0
     pool 4 'backups' replicated size 2 min_size 2 crush_ruleset 1 object_hash rjenkins pg_num 30 pgp_num 30 last_change 44 flags hashpspool stripe_width 0
    
    You can see that each pool has a replication count of two.
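
    If grepping the full dump is too noisy, you can also ask a single pool for its replica count directly (the pool name 'rbd' below is taken from the sample output above):

    ceph osd pool get rbd size

    Sample Output:
     size: 2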
    

    Now let's get into the calculations.

    Calculations:

    Total PGs Calculation:

    Total PGs = (Total_number_of_OSD * 100) / max_replication_count
    
    This result must be rounded up to the nearest power of 2.
    

    Example:

    Number of OSDs: 3
    Replication count: 2

    Total PGs = (3 * 100) / 2 = 150. The nearest power of 2 above 150 is 256.

    So the maximum recommended total PG count is 256.
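
    As a quick sanity check, the same arithmetic can be scripted in the shell. This is only a rough sketch: it assumes a replication count of 2 (taken from the dump above) and rounds up to a power of 2 with a simple loop.

    osds=$(ceph osd ls | wc -l)           # 3 in this example
    replicas=2                            # highest 'replicated size' in the dump
    total=$(( osds * 100 / replicas ))    # 150
    pg=1
    while [ "$pg" -lt "$total" ]; do pg=$(( pg * 2 )); done
    echo "$pg"                            # prints 256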

    You can also set the PG count per pool.

    Total PGs per pool Calculation:

    Total PGs = ((Total_number_of_OSD * 100) / max_replication_count) / pool_count
    
    This result must be rounded up to the nearest power of 2.
    

    Example:

    Number of OSDs: 3
    Replication count: 2
    Number of pools: 5

    Total PGs = ((3 * 100) / 2) / 5 = 150 / 5 = 30. The nearest power of 2 above 30 is 32.

    So the total number of PGs per pool is 32.
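
    The per-pool figure can be checked the same way (again just a sketch; the pool count of 5 comes from the listing above):

    osds=$(ceph osd ls | wc -l)                     # 3
    replicas=2                                      # from the dump above
    pools=$(ceph osd pool ls | wc -l)               # 5
    per_pool=$(( osds * 100 / replicas / pools ))   # 30
    pg=1
    while [ "$pg" -lt "$per_pool" ]; do pg=$(( pg * 2 )); done
    echo "$pg"                                      # prints 32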

    Power of 2 Table:

    2^0     1
    2^1     2
    2^2     4
    2^3     8
    2^4     16
    2^5     32
    2^6     64
    2^7     128
    2^8     256
    2^9     512
    2^10    1024
    

    Useful Commands

    ceph osd pool create <pool-name> <pg-number> <pgp-number> - To create a new pool

    ceph osd pool get <pool-name> pg_num - To get the number of PGs in a pool

    ceph osd pool get <pool-name> pgp_num - To get the number of PGPs in a pool

    ceph osd pool set <pool-name> pg_num <number> - To increase the number of PGs in a pool

    ceph osd pool set <pool-name> pgp_num <number> - To increase the number of PGPs in a pool
    
    * Usually pg_num and pgp_num are the same.
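
    For example, to apply the per-pool value of 32 calculated above (the pool 'volumes' comes from the listing earlier; 'newpool' is just a placeholder name):

    ceph osd pool set volumes pg_num 32
    ceph osd pool set volumes pgp_num 32
    ceph osd pool create newpool 32 32

    Keep in mind that on older Ceph releases the pg_num of an existing pool can only be increased, never decreased, so check the current value with ceph osd pool get before changing it.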