Concepts Core ~180 min requires: Operating Systems

Networking

This assumes you know only that "computers talk over a network," and takes you to where you can trace a packet through every layer and debug any connectivity problem out loud. Every idea is built from the one before it. Grounded in Stevens' TCP/IP Illustrated and the Google SRE book. Read it top to bottom; the labs and the Interview Gauntlet at the end will then feel like review.

Now build it by hand

Q1 · See L2 and L3 on one interface (warm-up)

An interface has a hardware address (L2, the MAC) and an IP address (L3). Packets are addressed by IP but delivered on the local wire by MAC — two layers, one interface.

Under the hood

ip addr shows both: link/ether is the interface's MAC, used by L2/ARP on the local wire (burned in by default, though it can be overridden in software); inet is the assigned IP used by L3. The /24 after the IP is the prefix length, the subnet mask in CIDR form, which tells the kernel which destinations are "local" (deliver directly) vs "remote" (send to the gateway).

Task

List your interfaces and identify the MAC and the IP, and note the subnet.

Verify it yourself

$ ip addr show

link/ether = L2 MAC; inet …/24 = L3 IP + subnet. Two addresses, two layers. ip -br addr is a compact view.

Solution

$ ip addr show
$ ip -br addr
$ ip neigh   # the ARP cache: local IP -> MAC mappings
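
The local-versus-remote decision encoded by the /24 can be sketched with Python's standard ipaddress module. This is a minimal illustration, not what the kernel literally runs; the interface address 192.168.1.10/24 and both destinations are assumed values, so substitute the inet value from your own ip addr output.

```python
import ipaddress

# Hypothetical interface address; replace with the inet value from `ip addr`.
iface = ipaddress.ip_interface("192.168.1.10/24")

def is_local(dest: str) -> bool:
    """True if dest is inside the interface's subnet (deliver directly via ARP)."""
    return ipaddress.ip_address(dest) in iface.network

print(iface.network)              # 192.168.1.0/24
print(is_local("192.168.1.77"))   # True: same subnet, deliver on the local wire
print(is_local("1.1.1.1"))        # False: different subnet, send to the gateway
```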

Q2 · Read the routing table and ask the kernel (core)

The kernel decides where each packet goes by longest-prefix match over the routing table, falling back to the default route.

Under the hood

ip route lists your routes; the default via <gateway> line is 0.0.0.0/0. ip route get <ip> runs the actual forwarding decision and prints which interface and next-hop the kernel would use — the single most useful routing command, because it shows reality, not your assumption.

Task

Show your routes, then ask the kernel how it would reach a public IP and a local one.

Verify it yourself

$ ip route get 1.1.1.1

For a public IP you'll see it routed via your gateway; for a local IP, delivered directly on your interface (no gateway). That difference is the local-vs-remote decision.

Solution

$ ip route
$ ip route get 1.1.1.1        # remote -> via gateway
$ ip route get 192.168.1.1    # local -> direct (adjust to your subnet)
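
Longest-prefix match itself is small enough to sketch. The toy table below is an assumption for illustration (a default route, one connected subnet, one hypothetical static route); the real kernel uses a far more efficient lookup structure, but the selection rule is the same: among all routes that contain the destination, the one with the longest prefix wins.

```python
import ipaddress

# Toy routing table (assumed values): (prefix, next hop or None if directly connected)
ROUTES = [
    ("0.0.0.0/0", "192.168.1.1"),      # default route
    ("192.168.1.0/24", None),          # directly connected subnet
    ("10.0.0.0/8", "192.168.1.254"),   # hypothetical static route
]

def lookup(dest: str):
    """Return (prefix, next_hop) of the most specific route containing dest."""
    addr = ipaddress.ip_address(dest)
    matches = [
        (ipaddress.ip_network(prefix), hop)
        for prefix, hop in ROUTES
        if addr in ipaddress.ip_network(prefix)
    ]
    net, hop = max(matches, key=lambda m: m[0].prefixlen)
    return str(net), hop

print(lookup("1.1.1.1"))        # ('0.0.0.0/0', '192.168.1.1')
print(lookup("192.168.1.50"))   # ('192.168.1.0/24', None)
```

Because 0.0.0.0/0 contains every address, the lookup always finds at least one match, which is exactly the "falling back to the default route" behaviour.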

Q3 · Capture the 3-way handshake (core)

A TCP connection opens with SYN → SYN-ACK → ACK before any data. You can watch it on the wire.

Under the hood

tcpdump shows the flags: [S] is SYN, [S.] is SYN-ACK, [.] is ACK. Seeing the handshake complete proves L3+L4 connectivity to the port. Seeing repeated [S] with no [S.] reply is the signature of a firewall/black hole — your SYNs leave but nothing answers.

Task

Sniff TCP on port 443, then make an HTTPS request in another terminal.

Verify it yourself

$ sudo tcpdump -i any -n -c 6 'tcp port 443'

You'll see Flags [S], then [S.], then [.] — the handshake — followed by encrypted data. The exact sequence from the stepper, live on your machine.

Solution

$ sudo tcpdump -i any -n -c 10 'tcp port 443'    # terminal 1
$ curl -sI https://example.com >/dev/null         # terminal 2
# watch the [S] / [S.] / [.] flags in the capture
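
If you want a handshake you fully control, a loopback client and server in one script is enough. This is a sketch that assumes a POSIX-like system with a loopback interface; the port is chosen by the OS, and the variable names are illustrative. The blocking connect() call returns only once SYN, SYN-ACK, and ACK have been exchanged.

```python
import socket

# Listen on an OS-chosen loopback port so the example needs no external network.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))   # returns after SYN / SYN-ACK / ACK completes
conn, peer = server.accept()          # hands over the already-established connection

client.sendall(b"hello")              # data flows only after the handshake
received = conn.recv(5)
print(received)

for s in (conn, client, server):
    s.close()
```

Capturing on the loopback interface while this runs should show the same [S], [S.], [.] flags as the HTTPS capture.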

Q4 · Sockets are file descriptors: inspect them with ss (core)

Every connection is a socket, which is just an fd. ss shows what's listening and what's established, with the owning process and TCP state.

Under the hood

ss -tlnp answers "what is listening and who owns it?" — the first check for "connection refused" (is anything on that port?). ss -tnp shows established connections and states (ESTABLISHED, TIME_WAIT, CLOSE_WAIT). A pile of CLOSE_WAIT means your app isn't closing sockets — an fd leak wearing a networking costume.

Task

List listening sockets with their processes, then established connections.

Verify it yourself

$ ss -tlnp

You see each LISTEN socket, its port, and the process holding it. ss -tnp then shows live connections and their TCP states from the state-machine diagram.

Solution

$ ss -tlnp        # listening sockets + process
$ ss -tnp         # established connections + state
$ ss -s           # summary counts by state
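
The claim that a socket is just an fd can be checked from user space. The sketch below assumes Linux or another POSIX system (os.fstat on a socket descriptor is not portable to every platform) and shows that a fresh socket owns an ordinary descriptor number that close() releases.

```python
import os
import socket
import stat

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
fd = sock.fileno()                  # a small non-negative integer, like any open file
mode = os.fstat(fd).st_mode
print(fd, stat.S_ISSOCK(mode))      # the descriptor number, then True

sock.close()                        # releases the descriptor
print(sock.fileno())                # -1 once the socket is closed
```

An application that never reaches close() keeps that descriptor allocated, which is how a pile of CLOSE_WAIT sockets turns into fd exhaustion.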

Q5 · Resolve a name and watch the hierarchy (core)

DNS turns a name into an IP by walking resolver → root → TLD → authoritative, caching by TTL.

Under the hood

dig NAME shows the answer, the TTL, and which server replied. dig +trace NAME performs the walk itself from the root down, so you can see exactly which level fails when resolution breaks. Separating DNS from connectivity is the fastest way to cut an incident in half.

Task

Resolve a name, read its TTL, then trace the full hierarchy.

Verify it yourself

$ dig +noall +answer google.com

You get the A record and its TTL. dig +trace google.com then shows root → .com → authoritative — the stepper, live.

Solution

$ dig +noall +answer google.com
$ dig +trace google.com | tail -20
$ cat /etc/resolv.conf   # which resolver you are using
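
"Caching by TTL" is easy to picture as a dictionary whose entries expire. The class below is a toy sketch, not how any particular resolver is implemented; TtlCache, its method names, and the injected fake clock are all hypothetical, and 192.0.2.10 is a documentation address rather than a real answer.

```python
import time

class TtlCache:
    """Toy resolver cache: an answer is reusable until its TTL expires."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}                     # name -> (address, expiry time)

    def put(self, name, address, ttl):
        self._entries[name] = (address, self._clock() + ttl)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None or self._clock() >= entry[1]:
            self._entries.pop(name, None)      # expired or missing: must re-resolve
            return None
        return entry[0]

# A fake clock keeps the example deterministic.
now = [0.0]
cache = TtlCache(clock=lambda: now[0])
cache.put("example.com", "192.0.2.10", ttl=300)
print(cache.get("example.com"))   # 192.0.2.10
now[0] = 301.0
print(cache.get("example.com"))   # None
```

Until the TTL runs out, a client keeps using the cached address even if the record has changed upstream, which is the source of "works for me, not for you" incidents.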

Q6 · Break-and-fix: connection refused vs timeout (debug)

"Refused" and "timeout" are different failures with different causes — mixing them up sends you debugging the wrong thing.

Under the hood

Refused: your SYN reached the host and got a TCP RST back — the host is up but nothing is listening (wrong port, service down). Timeout: your SYN got no reply at all — silently dropped by a firewall, or the host is unreachable. nc -vz distinguishes them instantly, and tcpdump confirms (RST returned vs no reply).

Task

Trigger a "connection refused" against a closed local port, and reason about what a timeout would look like instead.

Verify it yourself

$ nc -vz 127.0.0.1 9   # port 9 has nothing listening

You get Connection refused immediately — the host (localhost) is up and sent a RST because nothing listens on port 9. A timeout (hanging, no answer) would instead mean a firewall silently dropped the packet — a completely different root cause.

Solution

Refused = reached host, nothing listening (check the service / port). Timeout = never reached / silently dropped (check firewall, routing, MTU).

$ nc -vz 127.0.0.1 9             # refused: nothing listening
$ nc -vz -w 3 8.8.8.8 12345      # likely timeout: dropped by firewall (gives up after 3 s)
$ ss -tlnp 'sport = :9'          # header only: nothing is listening on port 9
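
The same distinction is visible from code, because the two failures surface as different exceptions. The probe function below is a hypothetical helper, sketched with the standard socket module; it handles only the three outcomes discussed here and lets other errors (for example "network unreachable") propagate.

```python
import socket

def probe(host: str, port: int, timeout: float = 2.0) -> str:
    """Classify a TCP connect attempt as 'open', 'refused', or 'timeout'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"      # a RST came back: host reachable, nothing listening
    except socket.timeout:
        return "timeout"      # no reply at all: dropped, filtered, or unreachable

# Find a loopback port that is closed: bind to port 0, note the number, close it.
tmp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tmp.bind(("127.0.0.1", 0))
closed_port = tmp.getsockname()[1]
tmp.close()

print(probe("127.0.0.1", closed_port))   # refused
```

Note how "refused" returns immediately while "timeout" costs the full timeout value; that difference in latency is itself a diagnostic signal.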

What you just built

You can now trace a packet through every layer: L2 frames and MAC/ARP on the local wire, L3 IP and routing across networks, L4 TCP's handshake/reliability/state-machine and UDP's speed, sockets as file descriptors, DNS resolving names, and NAT/MTU shaping the real-world path. "What happens when you type a URL" is a story you can tell end to end — and every debugging problem is just walking the layers until one breaks. This is the exact machinery you'll build by hand in Linux and watch Kubernetes orchestrate.

The interview gauntlet

The questions actually asked for SRE, network, and systems roles — conceptual, debugging, and the famous synthesis prompts. Each expands to show what the interviewer is really probing for, a model answer, and the follow-up traps. Answer all of these out loud and you have mastered this module.

Q1 · What happens when you type a URL and press Enter?

What they're really probing

The synthesis question — whether you can narrate every layer in order, from DNS to render.

Model answer

DNS resolves the name to an IP (checking browser/OS caches, then resolver → root → TLD → authoritative). The kernel makes a routing decision — not on my subnet, so send to the default gateway; it ARPs for the gateway MAC and frames the packet. A TCP 3-way handshake (SYN/SYN-ACK/ACK) opens a connection to port 443. For HTTPS, a TLS handshake negotiates encryption and verifies the certificate. Then HTTP: GET / → 200 OK + HTML; the browser parses it, fetches sub-resources over reused connections, and renders. Every layer of the stack fires in sequence.

Follow-up traps

"Where would you look if it's slow?" — time each stage: DNS (dig), connect+TLS (curl -w), server response.
"What if DNS is the problem?" — nothing downstream works; always check DNS first.

Q2 · TCP vs UDP: differences, and when do you choose each?

What they're really probing

Understanding reliability vs speed trade-offs and real use cases.

Model answer

TCP is connection-based, reliable, and ordered: a handshake, then acknowledgements and retransmission guarantee every byte arrives once, in order — at the cost of setup latency and overhead. UDP is connectionless with no guarantees: fire a packet, no handshake, no retransmit, no ordering — minimal latency. Choose TCP when correctness matters (web, SSH, databases, file transfer). Choose UDP when timeliness beats perfection (DNS — one small query; live video/voice/gaming — a late packet is worse than a lost one). Modern protocols like QUIC/HTTP3 build reliability on top of UDP to get both.
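
The "once, in order" guarantee comes from sequence numbers. The function below is a deliberately simplified sketch of receiver-side reassembly: the name reassemble and the sample segments are illustrative, duplicates are assumed to be exact retransmissions, and real TCP additionally handles partial overlaps, windows, and sequence wrap-around.

```python
def reassemble(segments, initial_seq):
    """Rebuild the byte stream from (seq, payload) pairs that arrived in any order."""
    stream = bytearray()
    expected = initial_seq
    for seq, payload in sorted(segments):
        if seq < expected:
            continue              # duplicate of data already delivered: ignore it
        if seq > expected:
            break                 # gap: hold back until the missing bytes are retransmitted
        stream += payload
        expected += len(payload)
    return bytes(stream)

arrived = [(1006, b"world"), (1000, b"hello ")]   # second segment arrived first
print(reassemble(arrived, 1000))                  # b'hello world'
```

UDP has no equivalent step: each datagram is handed to the application as it arrives, or not at all.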

Follow-up traps

"Why does DNS use UDP?" — one small request/response; a handshake would double the latency.
"How does TCP guarantee order?" — sequence numbers let the receiver reassemble regardless of arrival order.

Q3 · Explain the 3-way handshake. Why three messages?

What they're really probing

Whether you understand that both directions must be established, not just one.

Model answer

SYN → SYN-ACK → ACK. The client's SYN opens its direction and proposes a starting sequence number. The server's SYN-ACK both acknowledges the client's SYN and opens the server's own direction with its sequence number. The client's ACK acknowledges the server's SYN. Three messages because a TCP connection is bidirectional — each direction needs a SYN and a matching ACK, and the server can piggyback its SYN onto the ACK, collapsing four messages into three. After this, both sides have confirmed send and receive work, and data flows.
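
The sequence-number bookkeeping can be written out explicitly. In this sketch the initial sequence numbers 1000 and 5000 are arbitrary assumptions (real stacks pick them unpredictably), and handshake is a hypothetical helper that only computes the numbers; it sends nothing. A SYN consumes one sequence number, which is why each acknowledgement is the peer's initial number plus one.

```python
SEQ_SPACE = 2 ** 32   # sequence numbers are 32-bit and wrap around

def handshake(client_isn: int, server_isn: int):
    """Return the three segments as (flags, seq, ack); ack is None when not set."""
    syn = ("SYN", client_isn, None)
    syn_ack = ("SYN-ACK", server_isn, (client_isn + 1) % SEQ_SPACE)     # acks the client's SYN
    ack = ("ACK", (client_isn + 1) % SEQ_SPACE, (server_isn + 1) % SEQ_SPACE)  # acks the server's SYN
    return [syn, syn_ack, ack]

for segment in handshake(client_isn=1000, server_isn=5000):
    print(segment)
# ('SYN', 1000, None)
# ('SYN-ACK', 5000, 1001)
# ('ACK', 1001, 5001)
```

The middle tuple is the piggyback in action: one segment carries both the server's own SYN and the ACK of the client's.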

Follow-up traps

"Cost?" — one full round-trip before any data; why keep-alive/connection reuse matter.
"What's a SYN flood?" — many SYNs, no final ACK, exhausting the server's half-open connection table (DoS).

Q4 · What is TIME_WAIT, and is a lot of it a problem?

What they're really probing

A favourite trap — understanding why TIME_WAIT exists and distinguishing it from CLOSE_WAIT.

Model answer

After the side that initiates the close finishes the teardown, its socket lingers ~60s in TIME_WAIT. Two reasons: to absorb any straggling/retransmitted packets from the old connection, and to prevent a brand-new connection reusing the same four-tuple from receiving stale data. It's normal and usually harmless. It only becomes a problem on a busy client making huge numbers of short-lived outbound connections, where accumulated TIME_WAITs can exhaust ephemeral ports — fixed by connection reuse/keep-alive, not by blindly tuning it away.
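
A rough back-of-the-envelope calculation shows when ephemeral ports actually run out. The port range below is an assumption (32768 to 60999 is a common Linux default, so check your own system's setting), and the result is an upper bound for one client IP talking to one destination IP and port, not a measured limit.

```python
# Assumed values: a common Linux default ephemeral range and the ~60 s TIME_WAIT above.
PORT_LOW, PORT_HIGH = 32768, 60999
TIME_WAIT_SECONDS = 60

ephemeral_ports = PORT_HIGH - PORT_LOW + 1
max_rate = ephemeral_ports / TIME_WAIT_SECONDS   # new connections per second, sustained

print(ephemeral_ports)     # 28232
print(round(max_rate))     # 471
```

Under these assumptions a client that opens and closes more than a few hundred connections per second to the same endpoint will eventually find every local port parked in TIME_WAIT, which is why connection reuse is the real fix.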

Follow-up traps

"TIME_WAIT vs CLOSE_WAIT?" — TIME_WAIT: you closed, waiting safely (normal). CLOSE_WAIT: the peer closed and your app hasn't called close() — an application socket leak.
"Which side gets TIME_WAIT?" — whoever closes first.

Q5 · "Connection refused" vs "connection timed out": what does each tell you?

What they're really probing

Whether you can diagnose from the failure mode instead of guessing.

Model answer

Connection refused: your SYN reached the host and it replied with a TCP RST — the host is up and reachable, but nothing is listening on that port (service down, wrong port). Fix on the service side. Connection timed out: your SYN got no reply at all — silently dropped, almost always by a firewall or a routing/MTU black hole, or the host is down/unreachable. Fix on the network/firewall side. The failure mode tells you which half of the stack to investigate — refused points at the app, timeout points at the network.

Follow-up traps

"How confirm quickly?" — nc -vz host port; tcpdump to see RST returned vs no reply.
"Refused but the service is running?" — bound to the wrong interface/localhost only, or wrong port.

Q6 · A connection works for small requests but hangs on large transfers. What's going on?

What they're really probing

The MTU black hole — a deep, distinguishing question.

Model answer

Classic MTU black hole. Small packets (the handshake, tiny requests) fit within every link's MTU and pass fine, so the connection looks healthy. But a large transfer produces packets bigger than some link's MTU. Because TCP normally sets the Don't Fragment bit, the router in front of that link drops them and sends back an ICMP "fragmentation needed" message. If a firewall blocks that ICMP, the sender never learns to shrink its packets, so large payloads are silently dropped and the transfer hangs. Common with VPNs and overlay networks (VXLAN in Kubernetes takes ~50 bytes of MTU). The fix is MSS clamping (advertise a smaller segment size) or correcting the overlay MTU.
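
The arithmetic behind MSS clamping and ping-based probing is worth writing down once. This sketch assumes IPv4 and TCP headers without options (20 bytes each), a 1500-byte Ethernet MTU, and the roughly 50-byte VXLAN overhead quoted above; the function names are illustrative.

```python
IP_HEADER = 20        # IPv4 header without options
TCP_HEADER = 20       # TCP header without options
ICMP_HEADER = 8       # ICMP echo header, relevant for ping -s
VXLAN_OVERHEAD = 50   # approximate encapsulation cost of the overlay

def mss(mtu: int) -> int:
    """Largest TCP payload that fits in one unfragmented packet."""
    return mtu - IP_HEADER - TCP_HEADER

def ping_payload(mtu: int) -> int:
    """Largest ping -s value that fits in one unfragmented packet."""
    return mtu - IP_HEADER - ICMP_HEADER

physical = 1500                         # assumed Ethernet MTU
overlay = physical - VXLAN_OVERHEAD     # 1450 inside the overlay

print(mss(physical), mss(overlay))                     # 1460 1410
print(ping_payload(physical), ping_payload(overlay))   # 1472 1422
```

So on an overlay like this, a ping with the Don't Fragment bit set and a 1422-byte payload should pass while 1423 should not, and a sender still advertising an MSS of 1460 is the one that will hang.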

Follow-up traps

"How confirm?" — ping -M do -s <size> to find where large packets stop; check for blocked ICMP.
"Why does it bite Kubernetes?" — encapsulation overhead lowers the real MTU between nodes.

Q7 · Walk me through how a private machine (or container) reaches the internet.

What they're really probing

Understanding NAT, private vs public addressing, and statefulness.

Model answer

The machine has a private, non-routable IP (10.0.0.0/8, 172.16.0.0/12, or 192.168.0.0/16). Its router/host holds a public IP and performs NAT: on the way out it rewrites the source from the private IP to the public IP (SNAT/masquerade) and records the mapping in a connection-tracking table; on the way back it reverses the rewrite, delivering the reply to the right private machine. Many devices thus share one public IP, disambiguated by port. It's stateful: the mapping is created on the first packet and reused for the flow, which is why only the outbound direction needs a rule and why a full conntrack table drops new connections. Docker and Kubernetes commonly do exactly this with iptables.
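
The stateful rewrite can be modelled as two dictionaries. This toy sketch is not how conntrack is implemented (the real table keys on the full protocol and address tuple and expires idle entries); the class name Nat, the starting port 40000, and the documentation addresses are all assumptions for illustration.

```python
import itertools

class Nat:
    """Toy source NAT: many private (ip, port) pairs share one public IP."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self._ports = itertools.count(first_port)
        self._out = {}     # (private_ip, private_port) -> public_port
        self._back = {}    # public_port -> (private_ip, private_port)

    def outbound(self, private_ip, private_port):
        key = (private_ip, private_port)
        if key not in self._out:               # first packet of the flow creates state
            public_port = next(self._ports)
            self._out[key] = public_port
            self._back[public_port] = key
        return self.public_ip, self._out[key]

    def inbound(self, public_port):
        return self._back.get(public_port)     # None: no mapping, so the packet is dropped

nat = Nat("203.0.113.5")
print(nat.outbound("10.0.0.2", 51000))   # ('203.0.113.5', 40000)
print(nat.outbound("10.0.0.3", 51000))   # ('203.0.113.5', 40001)
print(nat.inbound(40001))                # ('10.0.0.3', 51000)
print(nat.inbound(40002))                # None
```

The last line is why unsolicited inbound traffic fails without an explicit port-forward: there is no state to reverse.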

Follow-up traps

"How does inbound port-forwarding work?" — DNAT: rewrite a public destination to an internal one (docker -p).
"What breaks at scale?" — conntrack table exhaustion; ephemeral port limits.

Q8 · Design: distribute traffic across many backend servers behind one address. How?

What they're really probing

Load-balancing fundamentals — L4 vs L7, health checks, and the trade-offs.

Model answer

Put the backends behind a load balancer that owns one virtual IP/DNS name. Two levels: an L4 load balancer forwards by IP/port (fast, protocol-agnostic, just picks a backend per connection, e.g. via DNAT, exactly like a Kubernetes Service). An L7 load balancer understands HTTP, so it can route by path/host, terminate TLS, and retry; it is richer but heavier. Distribute with a policy (round-robin, least-connections, or hashing for stickiness), and run health checks so unhealthy backends are removed automatically. Key concerns: session stickiness if needed, graceful backend drain, and avoiding the LB itself becoming a single point of failure (run several, fronted by DNS or anycast).
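
Round-robin with health checks fits in a few lines. The sketch below is a toy model of the selection policy only; the class name RoundRobin and the backend names are hypothetical, and a real load balancer would drive mark() from periodic health probes rather than manual calls.

```python
import itertools

class RoundRobin:
    """Toy round-robin picker that skips backends marked unhealthy."""

    def __init__(self, backends):
        self.healthy = {b: True for b in backends}
        self._cycle = itertools.cycle(backends)

    def mark(self, backend, healthy):
        self.healthy[backend] = healthy          # result of a health check

    def pick(self):
        for _ in range(len(self.healthy)):       # at most one full lap
            backend = next(self._cycle)
            if self.healthy[backend]:
                return backend
        raise RuntimeError("no healthy backends")

lb = RoundRobin(["a", "b", "c"])
print([lb.pick() for _ in range(3)])   # ['a', 'b', 'c']
lb.mark("b", False)                    # b fails its health check
print([lb.pick() for _ in range(4)])   # ['a', 'c', 'a', 'c']
```

Swapping the policy for least-connections or a hash of the client address changes only pick(); the health bookkeeping stays the same.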

Follow-up traps

"L4 vs L7 — when each?" — L4 for raw speed/any protocol; L7 for HTTP routing, TLS termination, retries.
"How does this map to Kubernetes?" — a Service is an L4 LB (kube-proxy/iptables); an Ingress is L7.
"How do clients find the LB?" — DNS, and often anycast for the DNS/LB itself.

Q9 · Why is DNS behind so many outages, and how do you debug it?

What they're really probing

Operational maturity — TTL/caching pitfalls and separating DNS from connectivity.

Model answer

DNS sits before every connection, so when it fails, everything downstream "breaks" with confusing symptoms. Common causes: caching/TTL (a changed record hasn't propagated, so some clients hit the old IP — "works for me, not for you"), a slow/unreachable resolver adding latency to every request, or a misconfigured record. Debug by separating resolution from connectivity: dig NAME to check the answer and TTL, dig +trace to see which level of the hierarchy fails, and compare against a known-good resolver (dig @1.1.1.1 NAME). Only once DNS returns the right IP do you move on to ping/ss/tcpdump.

Follow-up traps

"Why long vs short TTL?" — long = fast + less load but slow to change; short = quick propagation but more queries.
"In Kubernetes?" — CoreDNS resolves Service names; a slow/failing CoreDNS makes the whole cluster "slow."

Q10 · What actually is a subnet, and why does it matter?

What they're really probing

Whether the CIDR/local-vs-remote concept is solid — it underpins routing and cloud networking.

Model answer

A subnet is a contiguous range of IPs sharing a prefix, written in CIDR like 10.0.1.0/24 — the /24 means the first 24 bits are the network, the last 8 identify hosts (256 addresses). It matters because it defines local vs remote: a machine delivers directly (ARP + frame) to destinations within its subnet, and sends everything else to the gateway. It's the basis of routing (longest-prefix match over subnets), of cloud network design (VPCs are carved into subnets), and of Kubernetes (each pod/node gets addresses from planned CIDR ranges — overlaps cause silent, painful bugs).
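
The address counts, and the effect of two hosts disagreeing about the mask, can be checked with the standard ipaddress module. The addresses below are assumed examples; the second half shows one way a subnet-mask mismatch plays out between two hosts.

```python
import ipaddress

for cidr in ("10.0.1.0/24", "10.0.0.0/16"):
    net = ipaddress.ip_network(cidr)
    usable = net.num_addresses - 2    # minus the network and broadcast addresses
    print(cidr, net.num_addresses, usable)
# 10.0.1.0/24 256 254
# 10.0.0.0/16 65536 65534

# Mask mismatch (hypothetical hosts): A is configured as /24, B as /16.
a = ipaddress.ip_interface("10.0.1.10/24")
b = ipaddress.ip_interface("10.0.2.20/16")
print(b.ip in a.network)   # False: A treats B as remote and sends to its gateway
print(a.ip in b.network)   # True: B treats A as local and ARPs for it directly
```

The two hosts take different paths toward each other, which is why this kind of misconfiguration produces confusing, one-directional symptoms.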

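Follow-up traps

"How many hosts in a /24? a /16?": 254 usable in a /24, 65,534 (~65k) in a /16; the network and broadcast addresses are reserved.
"Two machines same subnet, can't talk?": check they truly share the subnet mask; a mismatch can make one host treat the other as remote and send its traffic to the gateway instead of delivering it directly.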

Networking

1 · What a network actually is

2 · The layered model and encapsulation

3 · L2: the local wire, MAC addresses, and ARP

ARP: the IP → MAC bridge

4 · L3: IP addresses, subnets, and routing

Subnets and CIDR — the one piece of math you must know

The routing decision

5 · L4: TCP, UDP, ports, and the handshake

The TCP 3-way handshake

The TCP state machine (and TIME_WAIT)

6 · Sockets: networking is just file descriptors

7 · DNS: turning names into addresses

8 · The whole journey: what happens when you type a URL

9 · MTU, fragmentation, and the black hole

10 · NAT: how private networks reach the internet

11 · A debugging method that always works
