Wednesday, February 4, 2009

NAT Primer

If you're reading this text, there's a good chance your computer is sitting behind a home gateway or router: a NAT device. NAT devices come in many names and flavors. Despite all that, even the minimally network-literate can configure one.

So why do we use them? How do they work? What does a software developer need to know?

Home gateways provide a great deal of functionality for such a small package:
  1. Switching
  2. Routing
  3. Wireless (802.11) Access
  4. NAT
Each of these features deserves an article in its own right, but today we'll limit our attention to NAT. It may not be the most interesting, but it certainly poses the biggest problem for your average developer.

NAT stands for Network Address Translation. NAT was conceived as a temporary solution to the rapid depletion of IPv4 addresses (see RFC 1631 from 1994).

So what is it? In short, NAT allows multiple network devices to share a single public IP address by providing each device with its own private IP address. This approach easily beat out other time-sharing approaches of the decade. To this day, NAT continues to stave off address depletion (just barely).

How does it work? Let's say you have a Linksys wireless router and at least half a brain. The router probably has one Ethernet port marked for the WAN (wide area network, AKA the internet). Having at least half a brain, you plug in the Ethernet cable from your modem (cable, DSL, fiber optic, whichever). The router is automatically assigned an IP address by your ISP (internet service provider), 64.1.2.3 for example. To the rest of the world, 64.1.2.3 is the IP address of every device that connects to your home network.

On the LAN (local area network), the picture is a bit different. The router is responsible for serving up unique IP addresses to all its clients and even to itself. Each device has a distinct private address. IANA has reserved three ranges for just this purpose (see RFC 1918); we'll choose the 192.168.0.0/16 block: 192.168.0.0 - 192.168.255.255. You may recognize this address range from your own escapades into home networking, and you may also recognize that a Linksys router will normally assign itself 192.168.1.1. Every other device is assigned a distinct address within that range. We'll assume your computer has connected to the router and been given the IP address 192.168.1.100.
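If you'd like to check an address against the three reserved ranges yourself, here is a minimal sketch using Python's standard `ipaddress` module. The helper name `is_rfc1918` is my own invention for illustration; the sample addresses are the ones used in this post.

```python
import ipaddress

# The three private IPv4 ranges reserved by RFC 1918.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address: str) -> bool:
    """Return True if the address lies in one of the RFC 1918 ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in RFC1918_BLOCKS)


if __name__ == "__main__":
    # The gateway's private address, your computer, and the gateway's
    # public address from the example in this post.
    for addr in ("192.168.1.1", "192.168.1.100", "64.1.2.3"):
        print(addr, "private" if is_rfc1918(addr) else "public")
```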

Great. Everyone has an IP address; the gateway (router) even has two to call its own. We'll refer to 192.168.1.1 as the gateway's private address and 64.1.2.3 as its public address. Let's talk about NAT.

Obviously, your 192.168.1.100 address can't be used on the public internet, since it's reserved as a private address. NAT takes care of this problem transparently. Suppose, just for the example's sake, that you're interested in participating in a multiplayer game. Your computer sends out a UDP request to join the game. The outbound packet uses source address 192.168.1.100 and port number 12000. As the packet passes through the NAT (on your home router), the source address and port are translated on the fly. The NAT chooses an external port number: 54000. The packet's source, as it's sent onto the public internet, is rewritten to 64.1.2.3:54000 instead of 192.168.1.100:12000. This translation process is NAT.
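To make the outbound step concrete, here is a toy model of it in Python. Treat it as a sketch under assumptions, not how any particular router does it: the class name `ToyNat` is made up, and handing out external ports sequentially starting at 54000 is just a convenient way to reproduce the numbers above. Real devices pick ports in their own ways.

```python
from typing import Dict, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


class ToyNat:
    """Toy model of outbound source translation (hypothetical, simplified)."""

    def __init__(self, public_ip: str, first_port: int = 54000) -> None:
        self.public_ip = public_ip
        # Assumption: external ports are handed out sequentially.
        self.next_port = first_port
        # private (ip, port) -> public (ip, port)
        self.bindings: Dict[Endpoint, Endpoint] = {}

    def translate_outbound(self, private_src: Endpoint) -> Endpoint:
        """Return the public source for a packet, creating a binding if needed."""
        if private_src not in self.bindings:
            self.bindings[private_src] = (self.public_ip, self.next_port)
            self.next_port += 1
        return self.bindings[private_src]


if __name__ == "__main__":
    nat = ToyNat("64.1.2.3")
    print(nat.translate_outbound(("192.168.1.100", 12000)))  # ('64.1.2.3', 54000)
```

Sending a second packet from the same private address and port reuses the existing binding rather than allocating a new external port.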

What happens if the server sends back a response? There's a bit of magic going on behind the scenes in your home router. When that first UDP packet was translated, some state was saved: a "NAT binding" was created dynamically. This binding is essentially a mapping from 192.168.1.100:12000 to 64.1.2.3:54000. If left unused, it typically survives for 30 seconds or more (the exact timeout varies from device to device) before it's discarded. As long as traffic keeps flowing across the binding, it can remain open indefinitely. Let's say the game server sends a response to 64.1.2.3:54000. When the response reaches the home router, the router looks up port 54000 in its internal bindings table and finds the mapping to 192.168.1.100:12000. With a mapping in place, the destination address and port are translated from 64.1.2.3:54000 to 192.168.1.100:12000 and the packet is sent on its merry way.
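The inbound lookup and the idle timeout can be sketched in a few lines of Python. Everything here is illustrative: `BindingTable` is a made-up name, the 30 second timeout is just the figure from this post, and I've assumed that inbound traffic refreshes the binding, which not every NAT does. The clock is passed in explicitly so the behavior is easy to follow.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)

IDLE_TIMEOUT = 30.0  # seconds; assumed value, real devices vary


class BindingTable:
    """Toy bindings table keyed by public port (hypothetical, simplified)."""

    def __init__(self, timeout: float = IDLE_TIMEOUT) -> None:
        self.timeout = timeout
        # public port -> (private endpoint, time of last traffic)
        self.by_public_port: Dict[int, Tuple[Endpoint, float]] = {}

    def add(self, public_port: int, private: Endpoint, now: float) -> None:
        """Record a binding created by an outbound packet at time `now`."""
        self.by_public_port[public_port] = (private, now)

    def translate_inbound(self, public_port: int, now: float) -> Optional[Endpoint]:
        """Return the private destination for an inbound packet, or None to drop it."""
        entry = self.by_public_port.get(public_port)
        if entry is None:
            return None  # no binding: nothing to forward to
        private, last_used = entry
        if now - last_used > self.timeout:
            del self.by_public_port[public_port]  # binding expired while idle
            return None
        # Assumption: inbound traffic also keeps the binding alive.
        self.by_public_port[public_port] = (private, now)
        return private


if __name__ == "__main__":
    table = BindingTable()
    table.add(54000, ("192.168.1.100", 12000), now=0.0)
    print(table.translate_inbound(54000, now=10.0))   # ('192.168.1.100', 12000)
    print(table.translate_inbound(54000, now=100.0))  # None (idle for 90 seconds)
```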

I should also note, for completeness, that some NATs filter inbound traffic by source address and port. So although public port 54000 is mapped to 192.168.1.100:12000, inbound traffic coming from the wrong place won't be forwarded. For example, with the strictest form of filtering (the kind you'll find on a symmetric NAT), the router only lets an inbound packet through if its source address and port match the destination of an earlier outbound packet. In our example above, only packets from the game server's address and port would be allowed. Other types of NAT are less restrictive, but as a software developer, you should plan for the worst.
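Here is a rough Python sketch of that strictest filter. The class name `FilteringNat` and the game server's address (203.0.113.10:27015, taken from a range reserved for documentation) are made up for the example; the only rule modeled is "drop inbound packets unless we've already sent something to that exact address and port."

```python
from typing import Dict, Optional, Set, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


class FilteringNat:
    """Toy NAT with address-and-port filtering (hypothetical, simplified)."""

    def __init__(self) -> None:
        # public port -> private endpoint
        self.bindings: Dict[int, Endpoint] = {}
        # public port -> remote endpoints we have sent packets to
        self.contacted: Dict[int, Set[Endpoint]] = {}

    def record_outbound(self, public_port: int, private_src: Endpoint,
                        remote_dst: Endpoint) -> None:
        """Note an outbound packet: it creates the binding and opens the filter."""
        self.bindings[public_port] = private_src
        self.contacted.setdefault(public_port, set()).add(remote_dst)

    def accept_inbound(self, public_port: int,
                       remote_src: Endpoint) -> Optional[Endpoint]:
        """Return the private destination, or None if the packet is filtered."""
        if remote_src not in self.contacted.get(public_port, set()):
            return None  # binding may exist, but this sender was never contacted
        return self.bindings[public_port]


if __name__ == "__main__":
    game_server = ("203.0.113.10", 27015)  # made-up address for the example
    nat = FilteringNat()
    nat.record_outbound(54000, ("192.168.1.100", 12000), game_server)
    print(nat.accept_inbound(54000, game_server))             # forwarded
    print(nat.accept_inbound(54000, ("203.0.113.10", 9999)))  # None: wrong port
    print(nat.accept_inbound(54000, ("198.51.100.7", 27015))) # None: wrong host
```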

Wow, quite a bit of explanation for such a common device. Translation happens for every single packet that crosses between the private network and the public network, in either direction. As you might imagine, packet processing is often hardware accelerated for maximum performance.

So what's the issue? When there are just two agents involved (a client behind a NAT and a server on the public internet) this model works just fine. Big issues arise for two reasons:
  1. Referrals
  2. Peer-to-Peer
Referrals take place whenever a protocol conveys address/port information inside its message body. For example, when a video-on-demand session is set up, the client might tell the server that it wants to receive video traffic at address 192.168.1.100 on port 12000. However, the negotiation itself takes place on some other port, so there's no binding in place yet for 192.168.1.100:12000. On top of that, the server has no way to reach 192.168.1.100, since it's a private address.
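A small Python sketch shows why the referral goes wrong. The JSON message format and both function names are invented for illustration (no real video-on-demand protocol looks exactly like this); the point is only that the NAT rewrites packet headers, not the address the client wrote into the message body.

```python
import ipaddress
import json


def build_setup_request(media_ip: str, media_port: int) -> bytes:
    """Client side: put the address we want media sent to in the message body."""
    return json.dumps({"media_ip": media_ip, "media_port": media_port}).encode()


def server_can_reach(request: bytes) -> bool:
    """Server side: is the advertised media address usable from the public internet?"""
    body = json.loads(request)
    # A private address means nothing outside the client's own LAN.
    return not ipaddress.ip_address(body["media_ip"]).is_private


if __name__ == "__main__":
    # The client only knows its private address, so that's what it advertises.
    request = build_setup_request("192.168.1.100", 12000)
    print(server_can_reach(request))  # False

    # Had the client somehow known its public mapping, the referral would work.
    request = build_setup_request("64.1.2.3", 54000)
    print(server_can_reach(request))  # True
```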

Peer-to-peer exchanges are extremely common on the modern internet. Take Skype, a VoIP service, for example. Skype does not route all calls through its own servers because that simply isn't scalable. Instead, each call participant announces its IP address and port for receiving audio packets. If one or more of these participants is behind a NAT, we run into trouble. (Well, Skype doesn't, since its developers have already solved this issue, but if you're writing your own Skype knock-off, you will.)

Fixing these issues is a little beyond the scope of a "NAT Primer" and this post is already getting lengthy. Over the next few weeks I'll explain the different methods of dealing with NAT limitations. If you'd like to start brainstorming, consider that bindings are only created by outbound traffic, and that a binding needs to be in place before a private address can receive inbound traffic. If you're chomping at the bit to learn the solution, search for STUN, TURN, and ICE on your favorite engine (unless your favorite is Cuil, then suck it up and go to Google).

2 comments:

  1. Thx for writing this article .. I'm looking forward, to read your articles about STUN, TURN, ICE, etc.

    Henning

  2. Henning,

    Thanks for your support! The articles on TURN and ICE are underway.

    John
