TCP Handshake
The three-way handshake that establishes a reliable, ordered byte stream.
At a Glance
- Purpose: Synchronize initial sequence numbers (ISNs) and negotiate connection options before either side sends data. Both sides must agree on where the byte stream starts.
- Three segments: SYN (client → server), SYN-ACK (server → client), ACK (client → server). The ACK can piggyback the first data bytes.
- Cost: 1 RTT of latency before the client can send application data. On a LAN that's <1 ms; over the internet often 50–300 ms.
- State machine: Each side walks a well-defined FSM: CLOSED, LISTEN, SYN-SENT, SYN-RECEIVED, ESTABLISHED, and the teardown states.
- What's negotiated: ISN, MSS, window scale, SACK-permitted, timestamps. The ISN travels in the sequence number field; the rest are carried in TCP options on the SYN and SYN-ACK.
- 4-tuple: A connection is identified by (src IP, src port, dst IP, dst port). Everything from connection multiplexing to ephemeral-port exhaustion follows from this.
- Termination is separate: Closing is a four-way FIN/ACK dance because each direction of the stream closes independently (half-close is legal).
- Edge cases: simultaneous open, half-open connections, SYN floods, TCP Fast Open.
Sequence Diagram
Client picks ISN x; server picks ISN y. Each side ACKs the other's sequence + 1 (SYN consumes one sequence number).
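The sequence and acknowledgment numbers in the exchange can be sketched in a few lines. This is a minimal illustration, not a protocol implementation: the `Segment` class and `three_way_handshake` function are hypothetical names introduced here, and the arithmetic wraps at 2^32 because the sequence number field is 32 bits wide.

```python
from dataclasses import dataclass
from typing import List, Optional

SEQ_MOD = 2 ** 32  # sequence numbers are 32-bit and wrap around


@dataclass
class Segment:
    """Hypothetical record holding only the fields relevant to the handshake."""
    flags: str
    seq: int
    ack: Optional[int] = None


def three_way_handshake(x: int, y: int) -> List[Segment]:
    """Return the three handshake segments for client ISN x and server ISN y."""
    syn = Segment("SYN", seq=x)
    # The SYN consumes one sequence number, so the server acknowledges x + 1.
    syn_ack = Segment("SYN-ACK", seq=y, ack=(x + 1) % SEQ_MOD)
    # The client's next sequence number is x + 1; it acknowledges y + 1.
    ack = Segment("ACK", seq=(x + 1) % SEQ_MOD, ack=(y + 1) % SEQ_MOD)
    return [syn, syn_ack, ack]


if __name__ == "__main__":
    for segment in three_way_handshake(x=1000, y=5000):
        print(segment)
```

With `x=1000` and `y=5000`, the SYN-ACK acknowledges 1001 and the final ACK carries sequence 1001 and acknowledges 5001.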
Connection Identity: the 4-tuple
A TCP connection isn't named by a single "handle" — it's identified by the four values in the IP and TCP headers that distinguish it from every other connection on the host.
The 4-tuple is:
(source IP, source port, destination IP, destination port)
The kernel's TCP table is keyed on exactly this quadruple. Implications that flow from it:
- Connection multiplexing: many clients can talk to one server on the same (dst IP, dst port=443); each connection is still unique because the source side differs.
- Ephemeral ports: when a client connects out, the OS picks a free source port (on Linux, from the range in /proc/sys/net/ipv4/ip_local_port_range). That port plus the other three values keeps the 4-tuple unique on the client host.
- Outbound connection limit: with the default Linux range, a client can only have ~28K connections to the same (dst IP, dst port) from one source IP, because there are only ~28K ephemeral ports. Widening the port range or binding a pool of source IPs is the usual fix.
- TIME-WAIT collision: reusing a 4-tuple that is still in TIME-WAIT on either side is normally refused. This is why a client that opens and closes many short connections to one destination can hit EADDRNOTAVAIL.
- NAT: a middlebox that rewrites source IP/port must remember the translation keyed by 4-tuple; most stateful firewalls index the 5-tuple, which adds the protocol number.
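To see the 4-tuple on a live socket, the sketch below opens two loopback connections to the same listener and reads each connection's addresses back from the kernel. The helper names `four_tuple` and `demo` are hypothetical; `getsockname()` and `getpeername()` are the standard socket calls. It assumes loopback networking is available.

```python
import socket


def four_tuple(sock):
    """Return (src IP, src port, dst IP, dst port) for a connected TCP socket."""
    src_ip, src_port = sock.getsockname()[:2]
    dst_ip, dst_port = sock.getpeername()[:2]
    return (src_ip, src_port, dst_ip, dst_port)


def demo():
    """Open two client connections to one listener and return their 4-tuples."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
    listener.listen(8)
    server_addr = listener.getsockname()

    # Each connect() makes the OS pick a different ephemeral source port.
    clients = [socket.create_connection(server_addr) for _ in range(2)]
    accepted = [listener.accept()[0] for _ in clients]

    tuples = [four_tuple(c) for c in clients]
    for s in clients + accepted + [listener]:
        s.close()
    return tuples


if __name__ == "__main__":
    for t in demo():
        print(t)
```

Both tuples share the destination half and differ only in the source port, which is what keeps the two connections distinct.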
Segment Flags
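The six classic control bits in the TCP header that shape how a segment is interpreted. (Two further bits, ECE and CWR, were added later for ECN and are not covered here.)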
| Flag | Meaning | When set |
|---|---|---|
| SYN | Synchronize sequence numbers | Only on the first segment from each side during handshake. |
| ACK | Acknowledgment field is valid | On every segment after the initial SYN. |
| FIN | No more data from sender | Graceful half-close; peer may still send. |
| RST | Abort: tear down immediately | On unexpected segment, closed port, or explicit reset. |
| PSH | Deliver buffered data to app now | Sender hints the receiver not to wait for more. |
| URG | Urgent pointer is valid | Rarely used in practice; many stacks ignore it. |
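As a rough illustration of where these bits live, the sketch below packs a 20-byte TCP header with no options and decodes the flag byte at offset 13. `build_header` and `decode_flags` are hypothetical helpers written for this example; the bit values follow the standard header layout.

```python
import struct

# Bit positions of the six classic flags within byte 13 of the TCP header.
FLAG_BITS = {
    "FIN": 0x01,
    "SYN": 0x02,
    "RST": 0x04,
    "PSH": 0x08,
    "ACK": 0x10,
    "URG": 0x20,
}


def build_header(src_port, dst_port, seq, ack, flags):
    """Pack a minimal 20-byte TCP header (no options, checksum left at 0)."""
    offset_reserved = 5 << 4  # data offset = 5 32-bit words, reserved bits = 0
    window, checksum, urgent = 65535, 0, 0
    return struct.pack(
        "!HHIIBBHHH",
        src_port, dst_port, seq, ack,
        offset_reserved, flags, window, checksum, urgent,
    )


def decode_flags(tcp_header):
    """Return the set of flag names that are set in a raw TCP header."""
    flags_byte = tcp_header[13]
    return {name for name, bit in FLAG_BITS.items() if flags_byte & bit}


if __name__ == "__main__":
    syn_ack = build_header(443, 50000, seq=5000, ack=1001,
                           flags=FLAG_BITS["SYN"] | FLAG_BITS["ACK"])
    print(sorted(decode_flags(syn_ack)))
```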
State Transitions
The handshake is the first half of the TCP state machine. Each row is one transition.
| Side | From | To | Trigger |
|---|---|---|---|
| Server | CLOSED | LISTEN | Passive open — listen() called. |
| Client | CLOSED | SYN-SENT | Active open — connect() sends SYN. |
| Server | LISTEN | SYN-RECEIVED | Received SYN; replied with SYN-ACK. |
| Client | SYN-SENT | ESTABLISHED | Received SYN-ACK; sent final ACK. |
| Server | SYN-RECEIVED | ESTABLISHED | Received client's ACK. |
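The table above maps directly onto a lookup keyed by (state, event). The following is a minimal sketch covering only the handshake rows; the event names (`passive_open`, `recv_syn`, and so on) are labels invented for this example, not identifiers from any RFC or kernel.

```python
# (current state, event) -> next state, one entry per row of the table.
TRANSITIONS = {
    ("CLOSED", "passive_open"): "LISTEN",        # server: listen()
    ("CLOSED", "active_open"): "SYN-SENT",       # client: connect() sends SYN
    ("LISTEN", "recv_syn"): "SYN-RECEIVED",      # server: replies with SYN-ACK
    ("SYN-SENT", "recv_syn_ack"): "ESTABLISHED",  # client: sends final ACK
    ("SYN-RECEIVED", "recv_ack"): "ESTABLISHED",  # server: handshake complete
}


def run(events, state="CLOSED"):
    """Apply events in order and return the final state."""
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"no transition from {state} on {event}")
        state = TRANSITIONS[key]
    return state


if __name__ == "__main__":
    print("client:", run(["active_open", "recv_syn_ack"]))
    print("server:", run(["passive_open", "recv_syn", "recv_ack"]))
```

Both sides end in ESTABLISHED; an event that does not match the current state raises an error rather than being silently ignored.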
What's Negotiated
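Values exchanged on the SYN and SYN-ACK: the ISN in the sequence number field, everything else in TCP options. Each side announces what it supports. An optional feature (window scale, SACK, timestamps) is enabled only if both sides offer it, while per-side values such as the MSS and the shift count tell the peer what the announcing side will accept.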
| Option | Typical value | Purpose |
|---|---|---|
| Initial sequence number (ISN) | Random 32-bit | Randomized per RFC 6528 to prevent spoofing and confusion with prior connection instances. |
| Maximum Segment Size (MSS) | 1460 bytes (Ethernet) | Largest TCP payload each side will accept. Usually MTU - 40 (IPv4+TCP headers). |
| Window Scale | 7–14 (shift count) | Left-shifts the 16-bit window field so receive windows >64KB are possible. Required for high-BDP links. |
| SACK Permitted | On by default | Enables Selective ACKs so the receiver can acknowledge non-contiguous ranges, avoiding full-window retransmits. |
| Timestamps | On by default | Better RTT estimates (smoothed RTT for retransmit timer) and PAWS: protect against wrapped sequence numbers on fast links. |
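The MSS and window-scale rows reduce to simple arithmetic, sketched below. The function names are hypothetical, the header sizes assume IPv4 and TCP headers without options, and the bandwidth-delay product (BDP) calculation is a back-of-the-envelope estimate rather than what any particular stack computes.

```python
def mss_for_mtu(mtu, ip_header=20, tcp_header=20):
    """MSS = MTU minus the IPv4 and TCP header sizes (no options assumed)."""
    return mtu - ip_header - tcp_header


def max_window(shift):
    """Largest receive window expressible with a given window-scale shift."""
    if not 0 <= shift <= 14:
        raise ValueError("window scale shift must be in 0..14")
    return 65535 << shift  # the 16-bit window field, left-shifted


def min_shift_for_bdp(bandwidth_bps, rtt_s):
    """Smallest shift whose maximum window covers the bandwidth-delay product."""
    bdp_bytes = bandwidth_bps / 8 * rtt_s
    for shift in range(15):
        if max_window(shift) >= bdp_bytes:
            return shift
    return None  # even the maximum shift of 14 is not enough


if __name__ == "__main__":
    print(mss_for_mtu(1500))                      # Ethernet MTU
    print(max_window(0), max_window(7))
    print(min_shift_for_bdp(1_000_000_000, 0.1))  # 1 Gbit/s, 100 ms RTT
```

For a 1 Gbit/s path with a 100 ms RTT the BDP is about 12.5 MB, far beyond the 64 KB an unscaled window allows, which is why window scaling matters on such links.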
Connection Termination
Each direction of the stream closes independently, hence four segments: a FIN and an ACK per direction. The count drops to three when the peer combines its ACK and its own FIN in one segment.
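Independent closing of each direction is what makes half-close possible. The sketch below, which assumes loopback networking and uses a hypothetical `half_close_demo` helper, has the client call `shutdown(SHUT_WR)` to send its FIN and then keep reading the server's reply on the still-open return direction.

```python
import socket
import threading


def half_close_demo():
    """Client half-closes its sending side, then still receives a reply."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    def server():
        conn, _ = listener.accept()
        with conn:
            received = b""
            # recv() returns b"" once the client's FIN has arrived.
            while chunk := conn.recv(4096):
                received += chunk
            # The server-to-client direction is still open, so this succeeds.
            conn.sendall(b"got " + received)

    worker = threading.Thread(target=server)
    worker.start()

    client = socket.create_connection(listener.getsockname())
    client.sendall(b"hello")
    client.shutdown(socket.SHUT_WR)  # send FIN: "no more data from me"

    reply = b""
    while chunk := client.recv(4096):  # read until the server's FIN
        reply += chunk

    client.close()
    worker.join()
    listener.close()
    return reply


if __name__ == "__main__":
    print(half_close_demo())
```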
Why TIME-WAIT exists
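The side that initiates the close sits in TIME-WAIT for 2×MSL. MSL is the Maximum Segment Lifetime: RFC 9293 suggests 2 minutes, while implementations often use shorter values (Linux fixes the whole TIME-WAIT period at 60 s). Two reasons: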
- Reliable last ACK — if the final ACK is lost, the peer retransmits its FIN and needs us to still be around to re-ACK it.
- Old segment flush — prevents delayed segments from a prior connection instance (same 4-tuple) from being mistaken for data in a new connection.
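Consequence: a server that initiates many closes accumulates TIME-WAIT sockets. Production tuning usually involves SO_REUSEADDR (so a restarted listener can bind its port again), the Linux sysctl net.ipv4.tcp_tw_reuse (which affects outgoing connections only), or making the client close first.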
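Of the options just mentioned, SO_REUSEADDR is the one set in application code. A minimal sketch, with `make_listener` as a hypothetical helper name: the option has to be set before `bind()` for it to affect that bind.

```python
import socket


def make_listener(host="127.0.0.1", port=0, backlog=128):
    """Create a listening socket that can rebind while old sockets linger."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Must be set before bind(); lets a restarted server reuse its port even
    # if connections from the previous run are still in TIME-WAIT.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(backlog)
    return sock


if __name__ == "__main__":
    listener = make_listener()
    print("listening on", listener.getsockname())
    listener.close()
```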
Edge Cases
Simultaneous open
Both sides connect() to each other at the same time. Each receives a SYN in state SYN-SENT; both transition through SYN-RECEIVED and eventually ESTABLISHED. Rare in practice but the state machine handles it.
Half-open connections
One side crashed or had its route torn down; the other still thinks the connection is ESTABLISHED. The live side finds out only when it tries to send: a rebooted peer answers with RST, while a vanished peer answers with nothing, so the failure surfaces only through retransmission timeouts or, on an idle connection, keepalive probes.
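Keepalive probing is off by default and is enabled per socket. The sketch below uses a hypothetical `enable_keepalive` helper with illustrative timing values; `SO_KEEPALIVE` is portable, while the three tuning options are platform-dependent, so the code sets each one only if Python's `socket` module exposes it.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on keepalive probes so a dead peer is detected on an idle socket.

    idle:     seconds of silence before the first probe
    interval: seconds between unanswered probes
    count:    unanswered probes before the connection is dropped
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These constants are not defined on every platform, so look them up
    # defensively and skip any that are missing.
    tunables = (
        ("TCP_KEEPIDLE", idle),
        ("TCP_KEEPINTVL", interval),
        ("TCP_KEEPCNT", count),
    )
    for name, value in tunables:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))
    s.close()
```

With these example values, a silent peer would be declared dead roughly `idle + interval * count` seconds after the last traffic.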
SYN flood / SYN cookies
An attacker sends many SYNs with spoofed source addresses; the server allocates state for each half-open connection until its SYN queue fills and legitimate clients are denied service. Mitigation: SYN cookies. The server encodes the connection state it would otherwise store into its ISN using a keyed hash (MAC), so it allocates no memory until the client completes the handshake with an ACK that carries a valid cookie.
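The idea behind SYN cookies can be sketched as a toy encoding. Everything here is an assumption made for illustration: the bit layout (5-bit time counter, 3-bit MSS index, 24-bit MAC), the use of HMAC-SHA256, and the function names. Real stacks use their own layouts and hash functions.

```python
import hashlib
import hmac
import os

SECRET = os.urandom(16)  # per-server secret; illustrative only


def _mac24(four_tuple, client_isn, counter, mss_index, secret):
    """24-bit keyed hash over the values the server refuses to store."""
    msg = repr((four_tuple, client_isn, counter, mss_index)).encode()
    digest = hmac.new(secret, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:3], "big")


def make_cookie(four_tuple, client_isn, counter, mss_index, secret=SECRET):
    """Build a 32-bit server ISN: 5-bit counter | 3-bit MSS index | 24-bit MAC."""
    counter &= 0x1F
    mss_index &= 0x7
    mac = _mac24(four_tuple, client_isn, counter, mss_index, secret)
    return (counter << 27) | (mss_index << 24) | mac


def check_cookie(cookie, four_tuple, client_isn, now_counter, secret=SECRET):
    """Validate a cookie echoed in the final ACK (ack number minus 1).

    Returns the MSS index on success, or None if the cookie is stale or forged.
    """
    counter = cookie >> 27
    mss_index = (cookie >> 24) & 0x7
    if (now_counter - counter) & 0x1F > 1:  # accept current or previous tick
        return None
    expected = _mac24(four_tuple, client_isn, counter, mss_index, secret)
    got = cookie & 0xFFFFFF
    if not hmac.compare_digest(expected.to_bytes(3, "big"), got.to_bytes(3, "big")):
        return None
    return mss_index


if __name__ == "__main__":
    conn = ("198.51.100.7", 40000, "203.0.113.1", 443)
    cookie = make_cookie(conn, client_isn=1000, counter=3, mss_index=2)
    print(check_cookie(cookie, conn, client_isn=1000, now_counter=3))
```

The server keeps nothing between sending the SYN-ACK and receiving the ACK; everything it needs is either in the returning packet or recomputable from the secret.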
TCP Fast Open (TFO)
Skips the handshake cost for repeat peers. First connection: the server mints a TFO cookie and hands it to the client during the handshake. Subsequent connections: the client sends SYN + cookie + data in one segment; the server validates the cookie and accepts the data before the three-way handshake completes. This saves 1 RTT, but it is only safe when the request is idempotent, because a retransmitted or duplicated SYN carrying data can be delivered to the application more than once.
References
- RFC 9293 — Transmission Control Protocol (2022, the consolidated and current spec).
- RFC 793 — the original TCP specification (1981).
- RFC 6528 — Defending against sequence-number attacks (ISN randomization algorithm).
- RFC 7323 — TCP Extensions for High Performance: window scaling, timestamps, PAWS.
- RFC 2018 — TCP Selective Acknowledgment (SACK).
- RFC 7413 — TCP Fast Open.
- RFC 4987 — TCP SYN Flooding Attacks and Common Mitigations.
- RFC 6298 — Computing TCP's Retransmission Timer.
- tcp(7) — Linux TCP implementation man page: sockopts, tunables, Linux-specific extensions.
- Wikipedia: TCP — high-level overview and history.
- Wikipedia: TCP connection termination — four-way close and TIME-WAIT.