Linux Process States

R, S, D, Z — what the letters in ps and top actually mean.

At a Glance

The Model

What a Linux kernel actually stores, and why tooling sometimes disagrees.

struct task_struct {
    ...
    unsigned int            __state;       /* TASK_RUNNING, TASK_INTERRUPTIBLE, ... */
    int                     exit_state;    /* EXIT_ZOMBIE, EXIT_DEAD */
    ...
};

/* include/linux/sched.h */
#define TASK_RUNNING            0x00000000
#define TASK_INTERRUPTIBLE      0x00000001
#define TASK_UNINTERRUPTIBLE    0x00000002
#define __TASK_STOPPED          0x00000004
#define __TASK_TRACED           0x00000008
#define TASK_DEAD               0x00000080
#define TASK_WAKEKILL           0x00000100  /* combined with D to form "killable" */
#define TASK_NOLOAD             0x00000400  /* combined with D to form TASK_IDLE   */

#define TASK_KILLABLE   (TASK_WAKEKILL | TASK_UNINTERRUPTIBLE)
#define TASK_IDLE       (TASK_NOLOAD   | TASK_UNINTERRUPTIBLE)

#define EXIT_ZOMBIE             0x00000020
#define EXIT_DEAD               0x00000010
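
To make the bit layout concrete, here is a minimal Python sketch of how a state word could be turned into the single letter that ps prints. It is loosely modelled on the kernel's reporting logic as I understand it, not copied from it: the names REPORT_MASK, REPORT_IDLE, LETTERS, and state_letter are hypothetical, the "P" slot (a parked kernel thread) is not covered elsewhere in this text, and the exact mask differs between kernel versions.

```python
# Constants copied from the header excerpt above.
TASK_RUNNING = 0x000
TASK_INTERRUPTIBLE = 0x001
TASK_UNINTERRUPTIBLE = 0x002
TASK_STOPPED = 0x004
TASK_TRACED = 0x008
EXIT_DEAD = 0x010
EXIT_ZOMBIE = 0x020
TASK_WAKEKILL = 0x100
TASK_NOLOAD = 0x400

TASK_KILLABLE = TASK_WAKEKILL | TASK_UNINTERRUPTIBLE
TASK_IDLE = TASK_NOLOAD | TASK_UNINTERRUPTIBLE

# Assumptions for this sketch: only the low bits are reported to userspace,
# and idle gets its own slot after them.
REPORT_MASK = 0x7F
REPORT_IDLE = 0x80
LETTERS = "RSDTtXZPI"


def state_letter(state, exit_state=0):
    """Map (state, exit_state) bitmasks to a one-letter process state."""
    bits = (state | exit_state) & REPORT_MASK
    if (state & TASK_IDLE) == TASK_IDLE:
        bits = REPORT_IDLE
    # The position of the highest set bit selects the letter; 0 means "R".
    return LETTERS[bits.bit_length()]
```

Note how the modifier bits fall outside the reported mask: TASK_KILLABLE collapses to D because TASK_WAKEKILL is masked off, which is why userspace cannot distinguish the two.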

The States

Every letter you'll see, its kernel-side name, and what it means for scheduling, signals, and load.

| Letter | Kernel constant | Meaning | Wakes on | Load avg? |
| --- | --- | --- | --- | --- |
| R | TASK_RUNNING | On a CPU or on a runqueue waiting for one. | already runnable | Yes |
| S | TASK_INTERRUPTIBLE | Sleeping; most idle processes live here (read on a socket, epoll_wait, futex). | event or any unblocked signal | No |
| D | TASK_UNINTERRUPTIBLE | Blocked on a specific completion, typically disk I/O or a driver. Signals stay pending until it wakes. | only the awaited event | Yes |
| D (K) | TASK_KILLABLE | D plus TASK_WAKEKILL: SIGKILL can wake it. Used by NFS, FUSE, and other code paths that used to trap users in D forever. | awaited event or SIGKILL | Yes |
| I | TASK_IDLE | D plus TASK_NOLOAD: excluded from the load-average calculation. Used by kernel worker threads. | awaited event | No |
| T | TASK_STOPPED | Paused by SIGSTOP, SIGTSTP, SIGTTIN, or SIGTTOU. Resumes on SIGCONT. | SIGCONT | No |
| t | TASK_TRACED | Stopped by a tracer (ptrace, gdb, strace) at a syscall, signal, or breakpoint. | tracer PTRACE_CONT | No |
| Z | EXIT_ZOMBIE | Exited; task_struct kept around so the parent can read the exit status. | parent wait() | No |
| X | EXIT_DEAD / TASK_DEAD | Being reaped; the task_struct is on its way out. Rarely seen by tooling. | n/a | No |

State Diagram

The common transitions a userspace task goes through.

stateDiagram-v2
    [*] --> R: fork / clone
    R --> S: blocking syscall (interruptible)
    R --> D: blocking I/O (uninterruptible)
    S --> R: event or signal
    D --> R: I/O completes
    R --> T: SIGSTOP / SIGTSTP
    T --> R: SIGCONT
    R --> t: ptrace stop
    t --> R: PTRACE_CONT
    R --> Z: exit / killed
    S --> Z: killed
    Z --> [*]: parent wait()

R — Running or Runnable

R means "the scheduler would pick this task if given the chance," not "this task is executing now."

S vs D — The Two Sleeps

Both are off-CPU and blocked. The difference is what it takes to wake them.

|  | S (interruptible) | D (uninterruptible) |
| --- | --- | --- |
| Kernel helper | wait_event_interruptible() | wait_event() / io_schedule() |
| Wakes on signal? | Yes; the syscall returns -EINTR | No; the signal stays pending |
| Typical callers | sockets, pipes, futex, epoll, sleep(), wait() | block-layer I/O, page fault on disk, NFS, direct disk read |
| Counts toward load | No | Yes |
| Can kill -9? | Yes | No, unless the wait is TASK_KILLABLE (see below) |
| Why it exists | Most things. Default for well-written drivers. | The task holds a kernel-allocated resource (buffer, lock, reference) that a signal handler cannot safely release mid-flight. |
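
The "wakes on signal" row can be observed from userspace. The sketch below blocks in read() on an empty pipe (an interruptible sleep) and arms a timer signal. The handler raises an exception on purpose, because Python would otherwise retry the interrupted syscall transparently. The names Interrupted, _on_alarm, and blocked_read_is_interruptible are hypothetical, and the code assumes Linux and that it runs in the main thread.

```python
import os
import signal


class Interrupted(Exception):
    """Raised from the signal handler to abandon the blocked read."""


def _on_alarm(signum, frame):
    raise Interrupted


def blocked_read_is_interruptible(timeout=0.2):
    """Return True if a signal wakes a read() that is sleeping in state S."""
    read_fd, write_fd = os.pipe()
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, timeout)
    try:
        os.read(read_fd, 1)  # the pipe is empty, so this sleeps
        return False
    except Interrupted:
        return True
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # disarm the timer
        signal.signal(signal.SIGALRM, old_handler)
        os.close(read_fd)
        os.close(write_fd)
```

There is no equally simple userspace demonstration for D, because whether a wait is uninterruptible is decided by the kernel code path, not by the caller.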

TASK_KILLABLE — The Fix for Stuck D

Added in 2.6.25 specifically to escape "D forever on NFS."

The problem: an NFS server goes away while a client task is blocked on read(2). Inside the kernel, the task is in wait_event() with TASK_UNINTERRUPTIBLE. Neither SIGTERM nor SIGKILL can wake it, because an uninterruptible wait does not respond to pending signals. The task stays stuck until the server answers, which may be never: its PID stays allocated, and the mount it holds open cannot be cleanly unmounted.

The fix: wait_event_killable(), which sleeps in TASK_UNINTERRUPTIBLE | TASK_WAKEKILL. SIGKILL (and only SIGKILL) wakes it; every other signal stays pending, exactly as in plain D. Callers keep the "don't return -EINTR from a random syscall" guarantee while still letting the admin kill a truly stuck process.

ps reports D for both plain TASK_UNINTERRUPTIBLE and TASK_KILLABLE; you can't tell them apart from userspace without reading /proc/PID/stack and recognising the wait function.

TASK_IDLE — Why Your Load Average Doesn't Spike

A late addition (4.2) to stop kernel threads from inflating load.

Before 4.2, kernel worker threads like nfsd, loop*, and various XFS workers used TASK_UNINTERRUPTIBLE while waiting for work. That meant an idle file server with 16 NFS threads reported a load average of 16. "Load average" in Linux is runnable + uninterruptibly-sleeping, a heritage from when D was a rare, short-lived state.

TASK_IDLE = TASK_UNINTERRUPTIBLE | TASK_NOLOAD. The NOLOAD flag excludes the task from the load-average tick. Signal behaviour is unchanged: still uninterruptible, still not killable. New code should use wait_event_idle() / schedule_timeout_idle() for "I'm a kernel thread waiting patiently for work."

Z — Zombie

The task is dead. The PID is not yet freed.
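
A zombie is easy to produce on purpose. This sketch forks a child that exits at once, waits for the exit without reaping it, reads the child's state letter, and only then reaps. The helper names read_state and observe_zombie are hypothetical, and the code assumes Linux with /proc mounted.

```python
import os


def read_state(pid):
    """Return the state letter from /proc/PID/stat."""
    with open(f"/proc/{pid}/stat", errors="replace") as f:
        data = f.read()
    # The command name sits in parentheses and may itself contain ")",
    # so the state letter is located relative to the last ")".
    return data[data.rindex(")") + 2]


def observe_zombie():
    """Return (state before reaping, whether /proc/PID survives reaping)."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately
    # Block until the child has exited, but leave it unreaped (WNOWAIT).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    before = read_state(pid)
    os.waitpid(pid, 0)  # reap: the kernel can now free the PID
    still_there = os.path.exists(f"/proc/{pid}")
    return before, still_there
```

Between the exit and the final waitpid() the child shows up as Z; a parent that never calls wait() leaves it in that state for as long as the parent lives.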

T and t — Stopped and Traced

Two related states with different causes.

|  | T (TASK_STOPPED) | t (TASK_TRACED) |
| --- | --- | --- |
| Cause | Job-control signal: SIGSTOP, SIGTSTP (^Z), SIGTTIN, SIGTTOU | Tracer attached (ptrace): stopped at a syscall boundary, signal, or breakpoint |
| Resume | SIGCONT | Tracer issues PTRACE_CONT, PTRACE_SYSCALL, etc. |
| Who can resume | anyone who can signal it | only the tracer |
| Typical tools | shell job control (fg, bg) | gdb, strace, ltrace |
| ptrace interaction | a T task can still be attached by a tracer | signals sent with kill are held until the tracer resumes it; only SIGKILL gets through |
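
The T column can be reproduced with a forked child and two signals. This is a sketch under the assumption of a Linux system with /proc mounted; read_state and observe_stop_and_continue are hypothetical helper names. The t column is not covered, since it needs a real tracer.

```python
import os
import signal
import time


def read_state(pid):
    """Return the state letter from /proc/PID/stat."""
    with open(f"/proc/{pid}/stat", errors="replace") as f:
        data = f.read()
    return data[data.rindex(")") + 2]


def observe_stop_and_continue():
    """Return the child's state letter while stopped and after SIGCONT."""
    pid = os.fork()
    if pid == 0:
        time.sleep(30)  # child: sit in an interruptible sleep
        os._exit(0)
    try:
        os.kill(pid, signal.SIGSTOP)
        os.waitpid(pid, os.WUNTRACED)   # returns once the child has stopped
        stopped = read_state(pid)
        os.kill(pid, signal.SIGCONT)
        os.waitpid(pid, os.WCONTINUED)  # returns once the child was resumed
        resumed = read_state(pid)
    finally:
        os.kill(pid, signal.SIGKILL)    # clean up the child
        os.waitpid(pid, 0)
    return stopped, resumed
```

After SIGCONT the child normally goes back to S, because it resumes its sleep; R is also possible if it happens to be sampled while runnable.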

Load Average — What Actually Counts

The number is not "CPU busy %." It's nr_running + nr_uninterruptible, averaged with three exponential decays (1, 5, 15 minutes).
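
The averaging can be sketched as three exponentially decayed running values that are fed the same sample. The version below uses floating point and assumes a 5-second sampling interval, which is my recollection of the kernel's behaviour; the kernel itself uses fixed-point arithmetic. SAMPLE_INTERVAL, WINDOWS, step, and simulate are hypothetical names.

```python
import math

SAMPLE_INTERVAL = 5.0            # seconds between samples (assumption)
WINDOWS = (60.0, 300.0, 900.0)   # the 1, 5, and 15 minute windows


def step(loads, active):
    """Fold one sample of nr_running + nr_uninterruptible into each average."""
    updated = []
    for load, window in zip(loads, WINDOWS):
        decay = math.exp(-SAMPLE_INTERVAL / window)
        updated.append(load * decay + active * (1.0 - decay))
    return tuple(updated)


def simulate(active, seconds, start=(0.0, 0.0, 0.0)):
    """Hold a constant number of active tasks for the given duration."""
    loads = start
    for _ in range(int(seconds // SAMPLE_INTERVAL)):
        loads = step(loads, active)
    return loads
```

With four tasks held in R or D for one minute on a previously idle machine, simulate(4, 60) puts the 1-minute figure at roughly 2.53 rather than 4: after one time constant an exponential average has covered only about 63% of the gap.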

How to Inspect

Every answer ultimately comes from /proc/PID.

Source: /proc/PID/status
Gives you: Human-readable; State: S (sleeping)
Example: grep State /proc/1234/status

Source: /proc/PID/stat
Gives you: One-line, tool-parseable; state letter is field 3
Example: awk '{print $3}' /proc/1234/stat (field splitting breaks if the command name contains spaces)

Source: /proc/PID/wchan
Gives you: Kernel function the task is sleeping in
Example: cat /proc/1234/wchan (prints, for example, futex_wait_queue)

Source: /proc/PID/stack
Gives you: Full in-kernel stack (needs CONFIG_STACKTRACE and root)
Example: cat /proc/1234/stack

Source: ps -eo pid,stat,wchan,cmd
Gives you: State + flags + sleep point in one line
Example: see all D tasks: ps -eo stat,pid,cmd | awk '$1 ~ /^D/'

Source: top / htop
Gives you: Live view; column S is the state letter
Example: press t in htop to toggle task tree

Source: bpftrace / perf sched
Gives you: Tracks transitions (sched_switch, sched_wakeup) with nanosecond timestamps
Example: bpftrace -e 'tracepoint:sched:sched_switch { @[args->prev_state] = count(); }'
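
Reading /proc/PID/stat by whitespace-splitting is fragile, because the command name in parentheses may contain spaces or even parentheses. The sketch below anchors on the last ")" instead and then lists every task in a given state together with its wchan. The names parse_stat and tasks_in_state are hypothetical; the code assumes Linux with /proc mounted and silently skips tasks that vanish or cannot be read.

```python
import os


def parse_stat(line):
    """Split the start of a /proc/PID/stat line into (pid, comm, state)."""
    left = line.index("(")
    right = line.rindex(")")   # comm may itself contain ")"
    pid = int(line[:left])
    comm = line[left + 1:right]
    state = line[right + 2]    # one space after ")", then the state letter
    return pid, comm, state


def tasks_in_state(letter="D"):
    """Return [(pid, comm, wchan)] for every process currently in `letter`."""
    found = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/stat", errors="replace") as f:
                pid, comm, state = parse_stat(f.read())
            with open(f"/proc/{entry}/wchan", errors="replace") as f:
                wchan = f.read().strip()
        except OSError:
            continue  # process exited, or access was denied
        if state == letter:
            found.append((pid, comm, wchan))
    return found
```

Calling tasks_in_state("D") a few times in a row helps separate tasks that pass through D briefly from ones that are genuinely stuck.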

ps State Suffix Flags

The state letter is often followed by one or more flag characters.

| Suffix | Meaning |
| --- | --- |
| < | High-priority (negative nice) |
| N | Low-priority (positive nice) |
| L | Has pages locked into memory (mlock) |
| s | Session leader |
| l | Multi-threaded (uses CLONE_THREAD) |
| + | In the foreground process group of its tty |

So Ssl+ = interruptible sleep, session leader, multi-threaded, foreground. A very typical shell-launched server.
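
Decoding such a string mechanically is a matter of two lookups. The sketch below covers only the letters and suffixes listed in this text; STATE_NAMES, FLAG_NAMES, and decode_stat are hypothetical names, and anything outside the tables is reported as unknown rather than guessed.

```python
STATE_NAMES = {
    "R": "running or runnable",
    "S": "interruptible sleep",
    "D": "uninterruptible sleep",
    "I": "idle kernel thread",
    "T": "stopped",
    "t": "traced",
    "Z": "zombie",
    "X": "dead",
}

FLAG_NAMES = {
    "<": "high priority",
    "N": "low priority",
    "L": "pages locked in memory",
    "s": "session leader",
    "l": "multi-threaded",
    "+": "foreground process group",
}


def decode_stat(stat):
    """Translate a ps STAT string such as 'Ssl+' into a list of phrases."""
    if not stat:
        raise ValueError("empty STAT string")
    state, flags = stat[0], stat[1:]
    parts = [STATE_NAMES.get(state, f"unknown state {state!r}")]
    parts += [FLAG_NAMES.get(flag, f"unknown flag {flag!r}") for flag in flags]
    return parts
```

For example, decode_stat("Ssl+") yields the four phrases for interruptible sleep, session leader, multi-threaded, and foreground process group, in that order.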

Gotchas

References