Search code examples
linuxxenstat

cstime error in /proc/pid/stat file


stime or cstime in /proc/pid/stat file is so huge that doesn't make any sense. But just some processes have this wrong cstime on occasion. Just as following:

# ps -eo pid,ppid,stime,etime,time,%cpu,%mem,command |grep nsc
4815     1 Jan08  1-01:20:02 213503-23:34:33 20226149  0.1 /usr/sbin/nscd
#
# cat /proc/4815/stat
4815 (nscd) S 1 4815 4815 0 -1 4202560 2904 0 0 0 21 1844674407359 0 0 20 0 9 0 4021 241668096 326 18446744073709551615 139782748139520 139782748261700 140737353849984 140737353844496 139782734487251 0 0 3674112 16390 18446744073709551615 0 0 17 1 0 0 0 0 0

You can see the stime of proc 4815, nscd, is 1844674407359, equal to 213503-23:34:33, but has just been running for 1-01:20:02.

Another problem process has wrong cstime is following:

a bash fork a sh, which fork a sleep.

8155 (bash) S 3124 8155 8155 0 -1 4202752 1277 6738 0 0 3 0 4 1844674407368 20 0 1 0 1738175 13258752 451 18446744073709551615 4194304 4757932 140736528897536 140736528896544 47722675403157 0 65536 4100 65538 18446744071562341410 0 0 17 5 0 0 0 0 0

8184 (sh) S 8155 8155 8155 0 -1 4202496 475 0 0 0 0 0 0 0 20 0 1 0 1738185 11698176 357 18446744073709551615 4194304 4757932 140733266239472 140733266238480 47964680542613 0 65536 4100 65538 18446744071562341410 0 0 17 6 0 0 0 0 0

8185 (sleep) S 8184 8155 8155 0 -1 4202496 261 0 0 0 0 0 0 0 20 0 1 0 1738186 8577024 177 18446744073709551615 4194304 4212204 140734101195248 140734101194776 48002231427168 0 0 4096 0 0 0 0 17 7 0 0 0 0 0

So you can see that cstime in proc bash is 1844674407368, which is much larger than total cpu time of its children.

My server has one Intel(R) Xeon(R) CPU E5620 @ 2.40GHz, which is 4 cores and 8 threads. Operating system is Suse Linux Enterprise Server SP1 x86_64, as following

# lsb_release  -a
LSB Version:    core-2.0-noarch:core-3.2-noarch:core-4.0-noarch:core-2.0-x86_64:core-3.2-x86_64:core-4.0-x86_64:desktop-4.0-amd64:desktop-4.0-noarch:graphics-2.0-amd64:graphics-2.0-noarch:graphics-3.2-amd64:graphics-3.2-noarch:graphics-4.0-amd64:graphics-4.0-noarch
Distributor ID: SUSE LINUX
Description:    SUSE Linux Enterprise Server 11 (x86_64)
Release:    11
Codename:   n/a
#
# uname -a
Linux node2 2.6.32.12-0.7-xen #1 SMP 2010-05-20 11:14:20 +0200 x86_64 x86_64 x86_64 GNU/Linux

So is it the kernel's problem? Can anybody help fix it?


Solution

  • I suspect that you might be simply seeing a kernel bug. Update to the latest offered update kernel for SLES (which is something like 2.6.32.42 or so) and see if it still occurs. Btw, it's stime, not cstime, that is unusually high—in fact, looking close you will notice it is a value that is like a string truncation of 18446744073709551615 (2^64-1) ±a few clocks offset.

    pid_nr: 4815
    tcomm: (nscd)
    state: S
    ppid: 1
    pgid: 4815
    sid: 4815
    tty_nr: 0
    tty_pgrp: -1
    task_flags: 4202560 / 0x402040
    min_flt: 2904
    cmin_flt: 0
    max_flt: 0
    cmax_flt: 0
    utime: 21 clocks (= 21 clocks) (= 0.210000 s)
    stime: 1844674407359 clocks (= 1844674407359 clocks) (= 18446744073.590000 s)
    cutime: 0 clocks (= 0 clocks) (= 0.000000 s)
    cstime: 0 clocks (= 0 clocks) (= 0.000000 s)
    priority: 20
    nice: 0
    num_threads: 9
    always-zero: 0
    start_time: 4021
    vsize: 241668096
    get_mm_rss: 326
    rsslim: 18446744073709551615 / 0xffffffffffffffff
    mm_start_code: 139782748139520 / 0x7f21b50c7000
    mm_end_code: 139782748261700 / 0x7f21b50e4d44
    mm_start_stack: 140737353849984 / 0x7ffff7fb9c80
    esp: 140737353844496 / 0x7ffff7fb8710
    eip: 139782734487251 / 0x7f21b43c1ed3
    obsolete-pending-signals: 0 / 0x0
    obsolete-blocked-signals: 0 / 0x0
    obsolete-sigign: 3674112 / 0x381000
    obsolete-sigcatch: 16390 / 0x4006
    wchan: 18446744073709551615 / 0xffffffffffffffff
    always-zero: 0
    always-zero: 0
    task_exit_signal: 17
    task_cpu: 1
    task_rt_priority: 0
    task_policy: 0
    delayacct_blkio_ticks: 0
    gtime: 0 clocks (= 0 clocks) (= 0.000000 s)
    cgtime: 0 clocks (= 0 clocks) (= 0.000000 s)