How Linux Executes a Binary

From ./prog to the first instruction of main().

At a Glance

The Full Sequence

The end-to-end path from a shell command to the first line of main() for a dynamically linked ELF executable.

sequenceDiagram
    participant Sh as Shell
    participant K as Kernel
    participant Ld as ld.so (interpreter)
    participant L as libc
    participant P as Program
    Sh->>K: fork()
    Note over Sh,K: child is a COW copy of the shell
    Sh->>K: execve("./prog", argv, envp) from the child
    Note over K: read magic, pick binfmt handler (ELF / script / binfmt_misc)
    K->>K: tear down old address space
    K->>K: mmap each PT_LOAD segment
    K->>K: mmap the PT_INTERP interpreter
    K->>K: build initial stack (argv, envp, auxv)
    K->>Ld: jump to ld.so's entry point (auxv holds AT_BASE for ld.so, AT_ENTRY for the program)
    Ld->>Ld: resolve DT_NEEDED, mmap each shared object
    Ld->>Ld: apply relocations (GOT now, PLT lazily)
    Ld->>Ld: run DT_INIT / .init_array of libs
    Ld->>P: jump to the program's _start (the AT_ENTRY address)
    P->>L: __libc_start_main(main, argc, argv, ...)
    L->>L: init TLS, run program's .init_array
    L->>P: call main(argc, argv, envp)
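The shell's side of this sequence can be sketched in a few lines of C. This is a minimal illustration, not how any particular shell is implemented; the helper name spawn_and_wait is hypothetical, and error handling is reduced to the essentials.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run `path` with `argv` in a child process and return
 * its exit status, or -1 if the child could not be created or did not exit
 * normally. */
int spawn_and_wait(const char *path, char *const argv[])
{
    pid_t pid = fork();                /* child starts as a COW copy of the caller */
    if (pid < 0)
        return -1;

    if (pid == 0) {
        execve(path, argv, environ);   /* returns only if the exec failed */
        perror("execve");
        _exit(127);                    /* shell convention for "could not exec" */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling spawn_and_wait("/bin/sh", argv) with argv set to {"sh", "-c", "exit 7", NULL} should return 7, assuming /bin/sh exists on the system.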

execve(2)

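The syscall that replaces the current process image (execveat(2) is the same operation with the file named relative to a directory FD). It does not return on success; on failure it returns -1 and sets errno.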

int execve(const char *pathname,
           char *const argv[],
           char *const envp[]);

What happens during a successful execve:

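| Category | Preserved | Reset / replaced |
|---|---|---|
| Address space | | All mappings torn down; a fresh one built from the binary. |
| Process identity | PID, PPID, PGID, SID, controlling terminal | |
| File descriptors | Open FDs without FD_CLOEXEC | FDs with FD_CLOEXEC set (for example, opened with O_CLOEXEC) are closed. |
| Signals | Signal mask, pending signals | Caught signals reset to SIG_DFL; signals set to SIG_IGN stay ignored. |
| Credentials | Real UID/GID | Effective UID/GID replaced by the file's owner/group when its setuid/setgid bit is set; otherwise unchanged. |
| Memory locks, POSIX timers (timer_create) | | Cleared. Timers set with alarm() or setitimer() survive. |
| argv, envp, auxv | | Rebuilt on the new stack for the new program. |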

Binary Format Handlers

The kernel tries each registered struct linux_binfmt handler in turn; a handler claims the file if its first bytes match the format it understands. The handler that claims the file owns the job of setting up the new process image.

| Handler | Matches | Role |
|---|---|---|
| binfmt_elf | Magic \x7fELF | Parse ELF header + program headers, mmap PT_LOAD, load PT_INTERP, build stack, jump to entry. |
| binfmt_script | First two bytes #! | Parse the shebang line, rewrite argv, recursively execve the interpreter. |
| binfmt_misc | User-registered magic / extension | Runs a configured interpreter for the file (e.g. java for .class, qemu-user for foreign-arch binaries). |
| binfmt_flat | FLAT header | Legacy format for MMU-less systems (uClinux). |
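The magic-byte checks for the first two rows are simple enough to mimic in user space. The following is an illustrative sketch only: the enum and the function classify_magic are hypothetical names, and the kernel's real handlers validate far more than these leading bytes.

```c
#include <stddef.h>
#include <string.h>

enum exec_kind { EXEC_UNKNOWN, EXEC_ELF, EXEC_SCRIPT };

/* Classify a file by its leading bytes, mirroring the first test that
 * binfmt_elf and binfmt_script each apply to the start of the file. */
enum exec_kind classify_magic(const unsigned char *buf, size_t len)
{
    /* The literal is split so that 'E' is not parsed as part of the
     * \x7f hex escape. */
    if (len >= 4 && memcmp(buf, "\x7f" "ELF", 4) == 0)
        return EXEC_ELF;
    if (len >= 2 && buf[0] == '#' && buf[1] == '!')
        return EXEC_SCRIPT;
    return EXEC_UNKNOWN;
}
```

Feeding it the first few bytes read from a file is enough to tell an ELF binary from a script; anything else would fall through to binfmt_misc rules or be rejected with ENOEXEC.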

ELF Layout

Two parallel views of an ELF file: section headers describe what the linker uses to assemble the binary; program headers describe what the kernel should put in memory. Both are valid at the same time.

$ readelf -l /bin/ls | head -20

Elf file type is DYN (Position-Independent Executable file)
Entry point 0x67d0
There are 13 program headers, starting at offset 64

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x00000040 0x00000040 0x0002d8 0x0002d8 R   0x8
  INTERP         0x000318 0x00000318 0x00000318 0x00001c 0x00001c R   0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x000000 0x00000000 0x00000000 0x003558 0x003558 R   0x1000
  LOAD           0x004000 0x00004000 0x00004000 0x013e95 0x013e95 R E 0x1000
  LOAD           0x018000 0x00018000 0x00018000 0x008190 0x008190 R   0x1000
  LOAD           0x021010 0x00022010 0x00022010 0x001248 0x0025c8 RW  0x1000
  DYNAMIC        0x021a58 0x00022a58 0x00022a58 0x000200 0x000200 RW  0x8
  ...

| Program header | Purpose |
|---|---|
| PT_LOAD | A segment to mmap into the process image. One per RWX permission combination (often 4: R, RX, R, RW). |
| PT_INTERP | Path to the dynamic linker/loader (usually /lib64/ld-linux-x86-64.so.2). |
| PT_DYNAMIC | Points to the DT_* table used by ld.so: DT_NEEDED, DT_SYMTAB, DT_RELA, etc. |
| PT_NOTE | Build-ID, ABI info, auxiliary metadata. Used by gdb, perf, debuginfod. |
| PT_GNU_STACK | Flags the stack as non-executable (standard since the mid-2000s). |
| PT_GNU_RELRO | After relocations finish, ld.so mprotects this range read-only, which hardens the GOT. |
| PT_TLS | Thread-local storage initialization image. |
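The program header table can also be walked directly with the structures from <elf.h>. The sketch below counts headers of one type; count_phdrs is a hypothetical helper, and it assumes a 64-bit ELF file whose byte order matches the host.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Count program headers of the given p_type in a 64-bit ELF file.
 * Assumes the file's byte order matches the host.
 * Returns -1 if the file cannot be read or is not ELF64. */
int count_phdrs(const char *path, unsigned int type)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;

    Elf64_Ehdr eh;
    int count = -1;
    if (fread(&eh, sizeof eh, 1, f) == 1 &&
        memcmp(eh.e_ident, ELFMAG, SELFMAG) == 0 &&
        eh.e_ident[EI_CLASS] == ELFCLASS64 &&
        eh.e_phentsize == sizeof(Elf64_Phdr)) {
        count = 0;
        for (unsigned int i = 0; i < eh.e_phnum; i++) {
            Elf64_Phdr ph;
            long off = (long)(eh.e_phoff + (Elf64_Off)i * eh.e_phentsize);
            if (fseek(f, off, SEEK_SET) != 0 ||
                fread(&ph, sizeof ph, 1, f) != 1) {
                count = -1;            /* truncated or unreadable table */
                break;
            }
            if (ph.p_type == type)
                count++;
        }
    }
    fclose(f);
    return count;
}
```

For the /bin/ls shown above, count_phdrs("/bin/ls", PT_LOAD) would be expected to match the number of LOAD lines that readelf -l prints.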

Process Address Space after exec

Post-exec layout of a typical x86-64 Linux process. Arrows show growth direction; all base offsets are randomized by ASLR.

High address  (0x00007fff...)
  ┌──────────────────────────────┐
  │   kernel space (not mapped)  │
  ├──────────────────────────────┤
  │             stack            │   grows ↓   argv, envp, auxv at top
  │               ↓              │
  ├──────────────────────────────┤
  │         mmap region          │   grows ↓   ld.so, shared libs, mmap() calls
  │               ↓              │
  │                              │
  │           ... gap ...        │
  │                              │
  │               ↑              │
  │             heap             │   grows ↑   brk() / glibc's main arena
  ├──────────────────────────────┤
  │             .bss             │   zero-initialized globals
  ├──────────────────────────────┤
  │             .data            │   initialized globals (writable)
  ├──────────────────────────────┤
  │            .rodata           │   string literals, const data
  ├──────────────────────────────┤
  │             .text            │   executable code (read-only)
  └──────────────────────────────┘
Low address   (randomized PIE base, e.g. 0x55...)

Inspect a live process at /proc/<pid>/maps.
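A process can also look itself up in /proc/self/maps to see which region an address falls in. This is a minimal sketch; mapping_name is a hypothetical helper, and the parsing assumes the usual "start-end perms offset dev inode [pathname]" line format with lines shorter than the buffer.

```c
#include <stdio.h>
#include <string.h>

/* Find the mapping in /proc/self/maps that contains addr and copy its
 * name (for example "[stack]", "[heap]", or a file path; empty for
 * anonymous mappings) into name.
 * Returns 0 on success, -1 if no mapping contains the address. */
int mapping_name(const void *addr, char *name, size_t size)
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (!f)
        return -1;

    char line[512];
    int found = -1;
    while (fgets(line, sizeof line, f)) {
        unsigned long lo, hi;
        char path[256] = "";
        /* perms, offset, dev and inode are skipped; the path is optional */
        if (sscanf(line, "%lx-%lx %*s %*s %*s %*s %255[^\n]",
                   &lo, &hi, path) < 2)
            continue;
        if ((unsigned long)addr >= lo && (unsigned long)addr < hi) {
            snprintf(name, size, "%s", path);
            found = 0;
            break;
        }
    }
    fclose(f);
    return found;
}
```

Passing the address of a local variable from the main thread would be expected to yield "[stack]", and a pointer returned by a small malloc() typically yields "[heap]" under glibc.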

Dynamic Linking (ld.so)

For a dynamic binary, control actually starts in ld-linux.so. It loads the libraries the program needs, patches up addresses, then transfers control to the program's entry point.
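The kernel tells ld.so where things are through the auxiliary vector, and the same values remain readable later via getauxval(3) (available in glibc 2.16 and later). The struct and function names below are hypothetical; the sketch only collects a few entries relevant to the handoff.

```c
#include <elf.h>
#include <sys/auxv.h>

struct exec_auxv {
    unsigned long interp_base;  /* AT_BASE: where ld.so was mapped, 0 if none */
    unsigned long entry;        /* AT_ENTRY: the program's own entry point */
    unsigned long phdr;         /* AT_PHDR: program headers in memory */
    unsigned long phnum;        /* AT_PHNUM: number of program headers */
};

/* Collect the auxv entries that ld.so relies on to find the program. */
struct exec_auxv read_exec_auxv(void)
{
    struct exec_auxv a;
    a.interp_base = getauxval(AT_BASE);
    a.entry       = getauxval(AT_ENTRY);
    a.phdr        = getauxval(AT_PHDR);
    a.phnum       = getauxval(AT_PHNUM);
    return a;
}
```

In a dynamically linked process interp_base is normally non-zero, while entry is the address ld.so jumps to once loading and relocation are done.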

_start → main()

Between the kernel's jump and your main() there are several pieces of libc glue. Simplified x86-64 version:

/* crt1.o (glibc before 2.34; later versions pass 0 for init and fini) */
_start:
    xor   %ebp, %ebp                  /* zero the frame pointer (end of call chain) */
    mov   %rdx, %r9                   /* arg 6: rtld_fini, left in %rdx by ld.so */
    pop   %rsi                        /* arg 2: argc, from the kernel-built stack */
    mov   %rsp, %rdx                  /* arg 3: argv (envp follows argv's NULL) */
    and   $-16, %rsp                  /* 16-byte align */
    push  %rax                        /* padding so the next push keeps alignment */
    push  %rsp                        /* arg 7: stack_end */
    lea   __libc_csu_fini(%rip), %r8  /* arg 5: finalizer for __libc_start_main */
    lea   __libc_csu_init(%rip), %rcx /* arg 4: initializer */
    lea   main(%rip), %rdi            /* arg 1: pointer to user's main() */
    call  __libc_start_main           /* never returns */
    hlt                               /* unreachable */

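__libc_start_main then initializes thread-local storage, runs the program's .init_array constructors, calls main(argc, argv, envp), and passes main's return value to exit().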
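One observable consequence of this startup glue is that .init_array entries run before main(). The sketch below relies on the GCC/Clang constructor attribute, which is a compiler extension rather than standard C; the function names are hypothetical.

```c
static int ctor_ran;   /* set by the constructor, read later */

/* GCC/Clang extension: a pointer to this function is placed in
 * .init_array, so the startup code calls it before main(). */
__attribute__((constructor))
static void before_main(void)
{
    ctor_ran = 1;
}

/* Returns 1 if the constructor has already run at the time of the call. */
int constructor_ran(void)
{
    return ctor_ran;
}
```

Calling constructor_ran() as the first statement of main() should already return 1, because the startup code has walked .init_array by then.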

Shebang (#!) Resolution

Scripts aren't magic: the kernel treats #! as just another binary format and rewrites the execve in flight.

sequenceDiagram
    participant U as User process
    participant K as Kernel
    U->>K: execve("./deploy.py", ["./deploy.py", "prod"], envp)
    Note over K: read first bytes → "#!"
    Note over K: binfmt_script reads line 1: #!/usr/bin/env python3
    Note over K: rewrite argv to: ["/usr/bin/env", "python3", "./deploy.py", "prod"]
    K->>K: recursive execve on /usr/bin/env
    Note over K: /usr/bin/env is ELF → binfmt_elf takes over
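The line-splitting step can be approximated in user space. This is a sketch, not the kernel's code: parse_shebang is a hypothetical helper, and it follows the Linux behavior of treating everything after the interpreter path as a single argument, with surrounding blanks trimmed.

```c
#include <stddef.h>
#include <string.h>

/* Split a shebang line into the interpreter path and ONE optional
 * argument (the rest of the line, blanks trimmed), as Linux does.
 * Returns 0 on success, -1 if the line is not a shebang, names no
 * interpreter, or a buffer is too small. */
int parse_shebang(const char *line, char *interp, size_t isz,
                  char *arg, size_t asz)
{
    if (line[0] != '#' || line[1] != '!')
        return -1;

    const char *p = line + 2;
    p += strspn(p, " \t");                 /* blanks after "#!" are allowed */
    size_t ilen = strcspn(p, " \t\n");     /* interpreter ends at blank or EOL */
    if (ilen == 0 || ilen >= isz)
        return -1;
    memcpy(interp, p, ilen);
    interp[ilen] = '\0';

    p += ilen;
    p += strspn(p, " \t");
    size_t alen = strcspn(p, "\n");        /* the remainder is one argument */
    while (alen > 0 && (p[alen - 1] == ' ' || p[alen - 1] == '\t'))
        alen--;                            /* trim trailing blanks */
    if (alen >= asz)
        return -1;
    memcpy(arg, p, alen);
    arg[alen] = '\0';
    return 0;
}
```

The new argv is then the interpreter, the argument if it is non-empty, the script path, and the caller's original argv[1] onward, which matches the rewrite shown in the diagram.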

Quirks worth knowing:

Security & Hardening

Exec is also the point where privilege changes (setuid/setgid bits, file capabilities) take effect and the new image's memory protections are established. The kernel does most of the enforcement here.

| Feature | What it does |
|---|---|
| setuid / setgid bit | Effective UID/GID becomes the file's owner/group. Ignored silently if the filesystem is mounted nosuid or the process has no_new_privs set. |
| File capabilities | setcap cap_net_bind_service+ep lets a non-root binary bind to ports below 1024 (e.g. 80) without full root. Replaces setuid for finer control. |
| no_new_privs | Once set (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)), no subsequent exec can gain privileges: setuid bits and file capabilities are ignored. Required for seccomp-bpf in unprivileged processes. |
| ASLR | Randomizes PIE base, mmap region, heap start, and stack. Controlled by /proc/sys/kernel/randomize_va_space (0 = off, 1 = stack, mmap and vDSO, 2 = also the brk heap). |
| NX (W^X) | The PT_GNU_STACK segment marks the stack non-executable. Data pages aren't executable; .text isn't writable. |
| RELRO | After relocations, .got / .init_array / .dynamic become read-only. "Full RELRO" (-z relro -z now) resolves all PLT entries up-front so the whole GOT can be made read-only before the program runs. |
| Stack canaries | Compiler inserts a random guard value between locals and the return address; the function epilogue checks it before ret and calls __stack_chk_fail, which aborts, if it changed. |
| Close-on-exec | FDs with FD_CLOEXEC set (for example, opened with O_CLOEXEC) are closed during exec, so a new program can't inherit sensitive handles. |
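Setting no_new_privs takes a single prctl(2) call. The sketch below assumes Linux 3.5 or later; the wrapper names are hypothetical. The flag cannot be cleared once set and is inherited across fork and exec.

```c
#include <sys/prctl.h>

/* Set no_new_privs for the calling thread. Irreversible.
 * Returns 0 on success, -1 on error. */
int enable_no_new_privs(void)
{
    return prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
}

/* Returns 1 if no_new_privs is set, 0 if not, -1 on error. */
int has_no_new_privs(void)
{
    return prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
}
```

A sandboxing launcher would typically call enable_no_new_privs() just before installing a seccomp filter and exec'ing the untrusted program.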

References