Tailscale Authors - How NAT Traversal Works (Highlights)

# Tailscale Authors - How NAT Traversal Works (Highlights)

![rw-book-cover|256](https://tailscale.com/blog/how-nat-traversal-works/social.png)

## Metadata
**Review**:: [readwise.io](https://readwise.io/bookreview/26687469)
**Source**:: #from/readwise
**Zettel**:: #zettel/fleeting
**Status**:: #x
**Authors**:: [[Tailscale Authors]]
**Full Title**:: How NAT Traversal Works
**Category**:: #articles #readwise/articles
**Category Icon**:: 📰
**Document Tags**:: #networking
**URL**:: [tailscale.com](https://tailscale.com/blog/how-nat-traversal-works/)
**Host**:: [[tailscale.com]]
**Highlighted**:: [[2023-04-21]]
**Created**:: [[2023-04-22]]

## Highlights
- First, the protocol should be based on UDP. You *can* do NAT traversal with TCP, but it adds another layer of complexity to an already quite complex problem, and may even require kernel customizations depending on how deep you want to go. ([View Highlight](https://read.readwise.io/read/01gyhwyw88p6cwcmhk835sbtx8)) ^512619221
- If you’re reaching for TCP because you want a stream-oriented connection when the NAT traversal is done, consider using QUIC instead. It builds on top of UDP, so we can focus on UDP for NAT traversal and still have a nice stream protocol at the end. ([View Highlight](https://read.readwise.io/read/01gyhwzhwwkjgdc4vdcp0141v2)) ^512619262
- Second, you need direct control over the network socket that’s sending and receiving network packets. As a rule, you can’t take an existing network library and make it traverse NATs, because you have to send and receive extra packets that aren’t part of the “main” protocol you’re trying to speak. ([View Highlight](https://read.readwise.io/read/01gyhx0knbywm7d67935cw9751)) ^512619337

### Figuring out firewalls

#### Firewall face-off
- Our only constraint is that the machine that’s *behind* the firewall must be the one initiating all connections. Nothing can talk to it, unless it talks first.
  ([View Highlight](https://read.readwise.io/read/01gyhx7t1fc34b77cy3r4teg07)) ^512620033
- In the VPN world, this leads to a hub-and-spoke topology: the hub has no firewalls blocking access to it and the firewalled spokes connect to the hub. ([View Highlight](https://read.readwise.io/read/01gyhx8x3qq3jejr2gknx6xz81)) ^512620088

#### Finessing finicky firewalls
- As long as *some* packet flowed outwards with the right source and destination, any packet that *looks like* a response will be allowed back in, even if the other side never received your packet! ([View Highlight](https://read.readwise.io/read/01gyhxg34wa9jpc55c3qpvw0r1)) ^512621002
- So, to traverse these multiple stateful firewalls, we need to share some information to get underway: the peers have to know in advance the `ip:port` their counterpart is using. One approach is to statically configure each peer by hand, but this approach doesn’t scale very far. To move beyond that, we built a [coordination server](https://tailscale.com/blog/how-tailscale-works/#the-control-plane-key-exchange-and-coordination) to keep the `ip:port` information synchronized in a flexible, secure manner. ([View Highlight](https://read.readwise.io/read/01gyhxbxj8e71dm1q7426nw9fk)) ^512620602

#### Creative connectivity caveats
- Both endpoints must attempt communication at roughly the same time, so that all the intermediate firewalls open up while both peers are still around. ([View Highlight](https://read.readwise.io/read/01gyhxng1jmsz0nc7t6baxv3vt)) ^512621571
  > In Tailscale, our coordination server and fleet of DERP (Detour Encrypted Routing Protocol) servers act as our side channel.
- Stateful firewalls have limited memory, meaning that we need periodic communication to keep connections alive. If no packets are seen for a while (a common value for UDP is 30 seconds), the firewall forgets about the session, and we have to start over.
  ([View Highlight](https://read.readwise.io/read/01gyhxmqdm7a11bq9fnb722qb0)) ^512621487

### The nature of NATs

#### A study in STUN
- STUN is both a set of studies of the detailed behavior of NAT devices, and a protocol that aids in NAT traversal. ([View Highlight](https://read.readwise.io/read/01gyhy0xb035kvfddn5vrdqzdz)) ^512623514
- STUN relies on a simple observation: when you talk to a server on the internet from a NATed client, the server sees the public `ip:port` that your NAT device created for you, not your LAN `ip:port`. ([View Highlight](https://read.readwise.io/read/01gyhy1mr8fash9m125nk684xn)) ^512623797
- Incidentally, this is why we said in the introduction that, if you want to implement this yourself, the NAT traversal logic and your main protocol have to share a network socket. ([View Highlight](https://read.readwise.io/read/01gyhy5pry8247j9qe9ge7axkr)) ^512623936

#### How this helps
- The problem is an assumption we made earlier: when the STUN server told us that we’re `2.2.2.2:4242` from its perspective, we assumed that meant that we’re `2.2.2.2:4242` from the entire internet’s perspective, and that therefore anyone can reach us by talking to `2.2.2.2:4242`. ([View Highlight](https://read.readwise.io/read/01gyhy7tvb6aygd06xxtanaaq5)) ^512624306
- Other NAT devices are more difficult, and create a completely different NAT mapping for every different destination that you talk to. On such a device, if we use the same socket to send to `5.5.5.5:1234` and `7.7.7.7:2345`, we’ll end up with two different ports on 2.2.2.2, one for each destination. If you use the wrong port to talk back, you don’t get through. ([View Highlight](https://read.readwise.io/read/01gyhya0r7741fnw1xsgf1k2j6)) ^512624431

#### Naming our NATs
- [RFC 4787](https://tools.ietf.org/html/rfc4787) calls the easy variant “Endpoint-Independent Mapping” (EIM for short), and the hard variant “Endpoint-Dependent Mapping” (EDM for short).
  ([View Highlight](https://read.readwise.io/read/01gyhyex9rx7rnpp27kdrrhf1w)) ^512624769

#### NAT Cone Types

#### Have you considered giving up?
- We could use a relay that both sides can talk to unimpeded, and have it shuffle packets back and forth. But wait, isn’t that terrible? ([View Highlight](https://read.readwise.io/read/01gyhz33mxwaqrkxb887yfsr56)) ^512627038
  > That’s still much better than no connection at all, which is where we were heading.
- You could implement relays in a variety of ways. The classic way is a protocol called TURN (Traversal Using Relays around NAT). ([View Highlight](https://read.readwise.io/read/01gyhz4nkryfpcba47gne3e3fr)) ^512627136
- Instead, we created [DERP (Detoured Encrypted Routing Protocol)](https://tailscale.com/blog/how-tailscale-works/#encrypted-tcp-relays-derp), which is a general purpose packet relaying protocol. It runs over HTTP, which is handy on networks with strict outbound rules, and relays encrypted payloads based on the destination’s public key. ([View Highlight](https://read.readwise.io/read/01gyhz6aryzgm9gcrsrr55rtz4)) ^512627195
- As we briefly touched on earlier, we use this communication path both as a data relay when NAT traversal fails (in the same role as TURN in other systems) and as the side channel to help with NAT traversal. DERP is both our fallback of last resort to get connectivity, and our helper to upgrade to a peer-to-peer connection, when that’s possible. ([View Highlight](https://read.readwise.io/read/01gyhz7266v68002zf6r8nvam9)) ^512627351

### NAT notes for nerds

#### The benefits of birthdays
- We can do much better than that, with the help of the [birthday paradox](https://en.wikipedia.org/wiki/Birthday_problem).
  Rather than open 1 port on the hard side and have the easy side try 65,535 possibilities, let’s open, say, 256 ports on the hard side (by having 256 sockets sending to the easy side’s `ip:port`), and have the easy side probe target ports at random ([View Highlight](https://read.readwise.io/read/01gyhzffe4jka5394nvjeq0xw8)) ^512628184

#### Partially manipulating port maps
- So, to help our connectivity further, we can look for UPnP IGD, NAT-PMP and PCP on our local default gateway. If one of the protocols elicits a response, we request a public port mapping. ([View Highlight](https://read.readwise.io/read/01gyhztt3ma97t1y1k0h5z2q6e)) ^512629258

#### Negotiating numerous NATs
- The big thing that breaks is our port mapping protocols. They act upon the layer of NAT closest to the client, whereas the one we need to influence is the one furthest away. ([View Highlight](https://read.readwise.io/read/01gyhzwxx16p0bj44sd94hqm1a)) ^512629654
- To work around this, ISPs apply SNAT recursively: your home router SNATs your devices to an “intermediate” IP address, and further out in the ISP’s network a second layer of NAT devices map those intermediate IPs onto a smaller number of public IPs. This is “carrier-grade NAT”, or CGNAT for short. ([View Highlight](https://read.readwise.io/read/01gyj00mymexbjm1ya0gv8chtg)) ^512630128

#### Concerning CGNATs
- We do have to overcome a new challenge, however: how do we connect two peers who are behind the same CGNAT, but different home NATs within? ([View Highlight](https://read.readwise.io/read/01gyj02fv0x974cw3prbws53pt)) ^512630353
  > The problem here is that STUN doesn’t work the way we’d like.
- If you’re thinking that port mapping protocols can help us here, you’re right! If either peer’s home NAT supports one of the port mapping protocols, we’re happy, because we have an `ip:port` that behaves like an un-NATed server, and connecting is trivial.
  ([View Highlight](https://read.readwise.io/read/01gyj04yzbaz82ky11q6cst3cm)) ^512630583
- This behavior of NATs is called hairpinning, and with all this dramatic buildup you won’t be surprised to learn that hairpinning works on some NATs and not others. ([View Highlight](https://read.readwise.io/read/01gyj07xpfh9bttbg6mgdwxc02)) ^512630892

#### Ideally IPv6, NAT64 notwithstanding
- Detecting NAT64+DNS64 is easy: send a DNS request to `ipv4only.arpa.` That name resolves to known, constant IPv4 addresses, and only IPv4 addresses. If you get IPv6 addresses back, you know that a DNS64 did some translation to steer you to a NAT64. That lets you figure out what the NAT64 prefix is. ([View Highlight](https://read.readwise.io/read/01gyj0e2t0bx9mn91td6g2mh21)) ^512632077

### Integrating it all with ICE
- The algorithm is: try everything at once, and pick the best thing that works. That’s it. Isn’t that amazing? ([View Highlight](https://read.readwise.io/read/01gyj0h60cxsqf7sztjshr2ya7)) ^512632382

### Concluding our connectivity chat
- **Here’s a parting “TL;DR” recap:** For robust NAT traversal, you need the following ingredients:
  - A UDP-based protocol to augment
  - Direct access to a socket in your program
  - A communication side channel with your peers
  - A couple of STUN servers
  - A network of fallback relays (optional, but highly recommended)

  Then, you need to:
  - Enumerate all the `ip:ports` for your socket on your directly connected interfaces
  - Query STUN servers to discover WAN `ip:ports` and the “difficulty” of your NAT, if any
  - Try using the port mapping protocols to find more WAN `ip:ports`
  - Check for NAT64 and discover a WAN `ip:port` through that as well, if applicable
  - Exchange all those `ip:ports` with your peer through your side channel, along with some cryptographic keys to secure everything.
  - Begin communicating with your peer through fallback relays (optional, for quick connection establishment)
  - Probe all of your peer’s `ip:ports` for connectivity and if necessary/desired, also execute birthday attacks to get through harder NATs
  - As you discover connectivity paths that are better than the one you’re currently using, transparently upgrade away from the previous paths.
  - If the active path stops working, downgrade as needed to maintain connectivity.
  - Make sure everything is encrypted and authenticated end-to-end.

  ([View Highlight](https://read.readwise.io/read/01gyj0qahnkfvfh37cggsc9vng)) ∎ ^512633383 #tldr
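
The STUN observation highlighted under "A study in STUN" (the server sees the public `ip:port`, not the LAN one) can be tried offline. What follows is a minimal sketch, not Tailscale's implementation. It assumes the RFC 5389 wire format (Binding Request type `0x0001`, magic cookie `0x2112A442`, XOR-MAPPED-ADDRESS attribute `0x0020`) and handles IPv4 only. All function names are hypothetical, and `fake_binding_response` exists only so the parser can be exercised without a network.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442      # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(txn_id=None):
    """Return (packet, txn_id) for a STUN Binding Request with no attributes."""
    txn_id = txn_id or os.urandom(12)
    header = struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE)
    return header + txn_id, txn_id


def parse_binding_response(packet, txn_id):
    """Return (ip, port) from XOR-MAPPED-ADDRESS, or None if it cannot be found."""
    if len(packet) < 20:
        return None
    msg_type, length, cookie = struct.unpack("!HHI", packet[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    if packet[8:20] != txn_id:
        return None  # reply does not belong to our request
    body = packet[20:20 + length]
    offset = 0
    while offset + 4 <= len(body):
        attr_type, attr_len = struct.unpack("!HH", body[offset:offset + 4])
        value = body[offset + 4:offset + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            xport, xaddr = struct.unpack("!HI", value[2:8])
            port = xport ^ (MAGIC_COOKIE >> 16)
            addr = struct.pack("!I", xaddr ^ MAGIC_COOKIE)
            return socket.inet_ntoa(addr), port
        offset += 4 + attr_len + (-attr_len % 4)  # attributes are padded to 4 bytes
    return None


def fake_binding_response(txn_id, ip, port):
    """Build the success response a server would send (offline demo helper)."""
    xport = port ^ (MAGIC_COOKIE >> 16)
    xaddr = struct.unpack("!I", socket.inet_aton(ip))[0] ^ MAGIC_COOKIE
    value = struct.pack("!BBHI", 0, 0x01, xport, xaddr)
    attr = struct.pack("!HH", XOR_MAPPED_ADDRESS, len(value)) + value
    header = struct.pack("!HHI", BINDING_SUCCESS, len(attr), MAGIC_COOKIE)
    return header + txn_id + attr
```

Against a real STUN server, the request would be sent from the same UDP socket the main protocol uses, in line with the highlight about sharing a socket, and the reply passed to `parse_binding_response`.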
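
The birthday-paradox highlight can be made concrete with a little arithmetic. This is a rough sketch under stated assumptions: 65,535 candidate ports and 256 open ports on the hard side, as in the highlight, with each probe from the easy side picking a port independently and uniformly at random, so repeats are possible. The function names are illustrative only.

```python
import math

TOTAL_PORTS = 65_535   # search space quoted in the highlight
OPEN_PORTS = 256       # sockets opened on the hard side


def hit_probability(probes, open_ports=OPEN_PORTS, total_ports=TOTAL_PORTS):
    """Chance that at least one of `probes` random probes lands on an open port."""
    miss_one = 1 - open_ports / total_ports
    return 1 - miss_one ** probes


def probes_needed(target, open_ports=OPEN_PORTS, total_ports=TOTAL_PORTS):
    """Smallest number of probes whose hit probability reaches `target`."""
    miss_one = 1 - open_ports / total_ports
    return math.ceil(math.log(1 - target) / math.log(miss_one))


if __name__ == "__main__":
    for n in (128, 256, 1024, 2048):
        print(f"{n:5d} probes -> {hit_probability(n):.1%}")
    print("probes for a 50% chance:", probes_needed(0.5))
```

Under these assumptions a few hundred probes already give better-than-even odds, whereas with a single open port each probe succeeds with probability of only about 1 in 65,535.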
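
The NAT64 detection trick from the "Ideally IPv6" highlight can also be sketched without touching the network. This sketch assumes the well-known IPv4 addresses for `ipv4only.arpa` are 192.0.0.170 and 192.0.0.171 (per RFC 7050) and that the NAT64 uses a /96 prefix with the IPv4 address embedded in the last 32 bits. Other prefix lengths exist and are not handled here. `nat64_prefix` and `synthesize` are hypothetical helper names.

```python
import ipaddress

# Assumed well-known answers for ipv4only.arpa (RFC 7050).
WELL_KNOWN_V4 = {
    int(ipaddress.IPv4Address("192.0.0.170")),
    int(ipaddress.IPv4Address("192.0.0.171")),
}


def nat64_prefix(aaaa_answers):
    """Given AAAA answers for ipv4only.arpa, return the /96 NAT64 prefix or None."""
    for answer in aaaa_answers:
        addr = ipaddress.IPv6Address(answer)
        low32 = int(addr) & 0xFFFFFFFF
        if low32 in WELL_KNOWN_V4:
            base = int(addr) - low32  # zero out the embedded IPv4 address
            return ipaddress.IPv6Network(f"{ipaddress.IPv6Address(base)}/96")
    return None


def synthesize(prefix, ipv4):
    """Map an IPv4 address into the discovered NAT64 prefix."""
    v4 = int(ipaddress.IPv4Address(ipv4))
    return ipaddress.IPv6Address(int(prefix.network_address) | v4)
```

If the resolver returns no AAAA records for the name, `nat64_prefix([])` returns `None`, which under these assumptions means no DNS64 was detected.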