prometheus grafana prometheus-node-exporter

node_exporter unable to export specific volume/mount point metrics - err="permission denied"


We have 3 disks on the server

  1. /dev/nvme2n1p1 for root
  2. /dev/nvme0n1 for /data
  3. /dev/nvme1n1 for /data/postgresql/12/main/pg_wal

node_exporter exports all metrics to the Prometheus server for the first two mount points, but for the third one (nvme1n1) some metrics are missing.

It still exports the following metrics for the third disk:

curl -s "http://localhost:9100/metrics" | grep nvme1n1
node_disk_discard_time_seconds_total{device="nvme1n1"} 0
node_disk_discarded_sectors_total{device="nvme1n1"} 0
node_disk_discards_completed_total{device="nvme1n1"} 0
node_disk_discards_merged_total{device="nvme1n1"} 0
node_disk_io_now{device="nvme1n1"} 0
node_disk_io_time_seconds_total{device="nvme1n1"} 21338.728
node_disk_io_time_weighted_seconds_total{device="nvme1n1"} 56694.8
node_disk_read_bytes_total{device="nvme1n1"} 8.1892795904e+11
node_disk_read_time_seconds_total{device="nvme1n1"} 4943.992
node_disk_reads_completed_total{device="nvme1n1"} 3.130765e+06
node_disk_reads_merged_total{device="nvme1n1"} 1948
node_disk_write_time_seconds_total{device="nvme1n1"} 83561.291
node_disk_writes_completed_total{device="nvme1n1"} 1.5033066e+07
node_disk_writes_merged_total{device="nvme1n1"} 2.85686e+06
node_disk_written_bytes_total{device="nvme1n1"} 3.1148191744e+12
node_filesystem_device_error{device="/dev/nvme1n1",fstype="ext4",mountpoint="/data/postgresql/12/main/pg_wal"} 1

But it does not export the following metrics for that mount point:

node_filesystem_size_bytes
node_filesystem_avail_bytes
node_filesystem_free_bytes

This is the error from the debug logs:

Nov 12 14:39:10 host1 node_exporter[20020]: level=debug ts=2020-11-12T09:09:10.701Z caller=filesystem_linux.go:94 collector=filesystem msg="Error on statfs() system call" rootfs=/data/postgresql/12/main/pg_wal err="permission denied"
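
The failing call can be reproduced outside the exporter. The following is a minimal sketch, assuming GNU coreutils, where `stat -f` reports on the filesystem (a statfs-style query) rather than on the file itself. The helper name `can_statfs` is made up for illustration.

```shell
# Return 0 if the current user can query filesystem stats for the path.
# GNU `stat -f` asks about the filesystem holding the path, which needs
# search (x) permission on every directory leading to it.
can_statfs() {
    stat -f -- "$1" >/dev/null 2>&1
}

# Usage (paths taken from the log line above):
#   can_statfs /data/postgresql/12/main/pg_wal && echo ok || echo denied
```

Running the same check as the exporter's own account, for example `sudo -u node_exporter stat -f /data/postgresql/12/main/pg_wal` (the user name is an assumption), should show the same "permission denied" if directory permissions are the cause.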

Kindly advise what's wrong here. Thanks


Solution

  • It may help somebody!

    The suggestion was to give +rx on all directories in that path (/pg-data, /pg-data/postgresql, /pg-data/postgresql/12 and so on), either to everyone (a+rx) or to the group that owns them, and then add the node exporter user to that group. This is not a security issue, because it only lets everyone (or that group) ls -l the directories; it does not give access to the files within them unless those files' own permissions are far too wide.
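
Before changing anything, it can help to see which directory in the chain actually blocks the exporter. The following is a rough sketch, assuming GNU `stat` and `readlink`. The function name `list_path_perms` is hypothetical. It prints mode, owner and group for each directory from / down to the target.

```shell
# Print "mode owner group path" for every component from / to the target.
# The body runs in a subshell so IFS and `set -f` do not leak to the caller.
list_path_perms() (
    path=$(readlink -f -- "$1") || exit 1
    set -f          # no glob expansion while splitting the path
    IFS=/
    current=""
    stat -c '%A %U %G %n' /
    for part in $path; do
        [ -z "$part" ] && continue
        current="$current/$part"
        stat -c '%A %U %G %n' -- "$current"
    done
)

# Usage:
#   list_path_perms /data/postgresql/12/main/pg_wal
```

Any directory whose mode lacks x for "other" (and for the exporter's group) stops the traversal at that point. Note that PostgreSQL checks the permissions of its own data directory at startup, which is probably why widening them on the main folder prevents the service from starting.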

    I have tried the following options:

    • If I change the permissions (a+rx) of the /pg-data/postgresql/12/main folder, the postgres service won't come up.
    • I tried adding the node_exporter user to the postgres group, but it had no effect.

    Since neither of these steps helped, running the node_exporter service as root (set in its systemd unit) did the trick.

    [Unit]
    Description=Node Exporter
    After=network.target
    
    [Service]
    User=root
    Group=root
    Type=simple
    ExecStart=/usr/local/bin/node_exporter
    
    [Install]
    WantedBy=multi-user.target
                  
    

    Thanks to MATT

    The metrics are visible in Grafana now.
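
The fix can also be confirmed from the command line. In the exporter's output, node_filesystem_device_error is 1 for a filesystem whose statistics could not be read. The following is a small sketch. The function name `failed_mounts` is made up here. It lists the mount points that are still in that state.

```shell
# Read Prometheus text-format metrics on stdin and print the mountpoint
# label of every node_filesystem_device_error sample whose value is 1.
failed_mounts() {
    awk '/^node_filesystem_device_error[{]/ && $NF == 1 {
        if (match($0, /mountpoint="[^"]*"/)) {
            # Skip the 12 characters of mountpoint=" and drop the closing quote.
            print substr($0, RSTART + 12, RLENGTH - 13)
        }
    }'
}

# Usage:
#   curl -s http://localhost:9100/metrics | failed_mounts
```

An empty result means no filesystem is reporting a device error, so node_filesystem_size_bytes and the related metrics should be present for every mount point.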