Saturday, January 29, 2011

Do NATs verify the source port of a SYN received after a SYN has been sent?

I'm unsure of whether this question is more appropriate for stackoverflow or serverfault. If you think it's more suited for stackoverflow, let me know and I'll delete this and move it over.

I have a STUNT implementation. If you don't know what STUNT is, it's a protocol used to make a direct TCP connection between two peers behind separate NATs. I'll give a brief overview, though my question doesn't directly relate to the protocol.

It's done by using a third party to predict which port each of the peers' NATs will map their next outgoing connection to. The third party then tells each peer the other's predicted port, and they both attempt to connect to each other on the predicted ports. When they fire off their SYN packets to each other, each SYN opens a 'hole' in the sender's NAT at that port, which allows the other peer's SYN packet through and the handshake to take place.

One of my colleagues suggested that the peer initiating the STUNT connection fire off SYN packets to the predicted port as well as the next four ports, in case the predicted port is taken by another application (or even our own application) after the predictions have been made but before our connection is attempted.
An example of this would be predicting that the other peer is going to connect to us from port 80, but then another application ends up using port 80, so the connection actually comes from port 81; by firing off SYN packets to five different ports, we would (in theory) succeed if it comes from 80, 81, 82, 83, or 84.
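
To make the idea concrete, here is a minimal sketch of the 'scatter shot' port selection. The function names and the spread of five are only for illustration (they aren't part of any STUNT library), and it only models which ports get tried, not the sockets themselves.

```python
def candidate_ports(predicted: int, spread: int = 5) -> list[int]:
    """Return the predicted port followed by the next (spread - 1) ports."""
    # Ports above 65535 do not exist, so they are left out.
    return [p for p in range(predicted, predicted + spread) if p <= 65535]


def would_hit(predicted: int, actual: int, spread: int = 5) -> bool:
    """True if the peer's actual port is among the ports we send SYNs to."""
    return actual in candidate_ports(predicted, spread)


print(candidate_ports(80))   # [80, 81, 82, 83, 84]
print(would_hit(80, 81))     # True: port 80 was taken, the peer used 81
print(would_hit(80, 85))     # False: outside the five ports we try
```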

However, this isn't the case; when tested, only the first SYN packet has a chance of succeeding. Even if the first SYN packet is sent to the wrong port but one of the next four is sent to the correct port, they're all silently dropped; there's no response, and the connection attempt just times out.

A quick example:
Peer A is initiating a STUNT connection to Peer B.
Peer A is predicted to connect to Peer B from port 1000.
Peer B is predicted to connect to Peer A from port 2000.
The server sends 1000 to Peer B and 2000 to Peer A.
Another application on Peer B's computer makes a new connection, taking over port 2000.
Another application on Peer A's computer makes a new connection, taking over port 1000.
Peer A simultaneously attempts to connect to Peer B on ports 2000, 2001, 2002, 2003, and 2004; the connection attempts come from ports 1001, 1002, 1003, 1004, and 1005.
Peer B attempts to connect to Peer A on port 1000; the connection attempt comes from port 2001.
A hole is created in Peer A and Peer B's NATs, at ports 1001 and 2001 respectively; it should allow SYN packets through.

What I believe is happening is that Peer B's NAT isn't allowing Peer A's SYN packet through port 2001, because it was expecting the connection to come from 1000 but it came from 1002. This suggests to me that the NAT is verifying the source of the SYN packet.
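
The example above can be replayed in a few lines, assuming a NAT that only admits an inbound SYN when both its destination port and its source port match a hole opened by an earlier outbound SYN. That filtering rule is my guess at what the routers do, not something I've confirmed, and IP addresses are left out because each side has only one.

```python
def admitted(syn, holes):
    """A SYN (src_port, dst_port) passes only if a hole (local, remote) matches it."""
    src_port, dst_port = syn
    return (dst_port, src_port) in holes


# Holes as (local external port, remote port the SYN was sent to).
holes_on_a = {(1001 + i, 2000 + i) for i in range(5)}  # A's five outbound SYNs
holes_on_b = {(2001, 1000)}                            # B's single outbound SYN

# SYNs in flight as (source port, destination port).
syns_from_a = [(1001 + i, 2000 + i) for i in range(5)]
syn_from_b = (2001, 1000)

a_results = [admitted(s, holes_on_b) for s in syns_from_a]
b_result = admitted(syn_from_b, holes_on_a)
near_misses = [s for s in syns_from_a if s[1] == 2001]

print(a_results)    # [False, False, False, False, False]
print(b_result)     # False
print(near_misses)  # [(1002, 2001)]: right destination port, wrong source port
```

Under that assumption, the SYN from 1002 reaches the right port on Peer B's NAT but is dropped because the hole there was opened toward port 1000, which matches what I'm seeing.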

However, from what I've read, one flaw of STUNT is that when creating the connection there's a window of vulnerability where the connection can be hijacked by another source. If it's standard behavior for the NAT to verify the source of the incoming connection, then I don't see how this window of vulnerability can exist.

Note that my implementation is not flawed. If either of the predicted ports is correct, the connection will succeed; the problem happens when both of the predicted ports are wrong and it resorts to attempting multiple ports at once.

As a side note to anyone familiar with the protocol, I'm using the method that involves sending a SYN that I expect to be silently dropped before attempting the actual connection. I'm not using the low-TTL implementation.

Is my NAT rejecting the SYN packets because they aren't coming from the expected port, or is it something else that's rejecting them? If it is my NAT, is this the expected/standard behavior? Note that by rejected I mean 'silently dropping,' as no RST or any response at all is being returned, the connections just time out.

EDIT: The routers in question are both the kind you'd buy at Walmart for home-use, not the kind large businesses would use.

  • Short answer: Yes-- your NAT is rejecting packets because they're not exactly what it expects. This is going to vary from NAT implementation to NAT implementation.

    Editorial aside: I hadn't heard of STUNT before... (STUN, yes-- not STUNT). Oh, ick... what a hack. It makes me weep that the Internet has turned into this NAT trailer park and that we're resorting to hacks like this.

    After reading the paper I get what they're doing. It's basically STUN with the added bit of packet sniffing on each "endpoint" to sniff the initial sequence numbers and trade them up via a third-party on the 'net who can accept arbitrary incoming TCP connections. It's a cute trick, actually.

    You're never going to get 100% reliable communication like this. I see that the researchers who initially implemented this have some test results that look fairly good at port prediction, though... That's actually a very interesting, albeit dense, little chart. The paper goes into all the prediction issues in more depth, and gives some success percentages. It's shocking to me that they did so well, actually.

    Any sane NAT implementation is going to use, at minimum, source IP, destination IP, protocol, protocol source port, and protocol destination port to identify "connections". (Hopefully they're tracking sequence / acknowledgement numbers, too.)

    If your SYN doesn't exactly match the combination of these attributes stored in the device's NAT table the NAT implementation really ought to silently drop it. That would be the behaviour that I'd expect.
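
    As a rough sketch of how much this can vary between implementations, the check below compares an inbound packet against one NAT table entry under three filtering policies. The policy names follow the terminology in the IETF NAT behavior RFCs; the `Mapping` and `Packet` types, the `accepts` function, and the addresses are made up for illustration and don't describe any particular router.

```python
from enum import Enum
from typing import NamedTuple


class Filtering(Enum):
    ENDPOINT_INDEPENDENT = 1        # any remote host may use the hole
    ADDRESS_DEPENDENT = 2           # remote IP must match
    ADDRESS_AND_PORT_DEPENDENT = 3  # remote IP and port must match


class Mapping(NamedTuple):          # one NAT table entry
    proto: str
    ext_ip: str
    ext_port: int
    remote_ip: str
    remote_port: int


class Packet(NamedTuple):           # an inbound packet's 5-tuple
    proto: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def accepts(m: Mapping, pkt: Packet, mode: Filtering) -> bool:
    """Return True if the NAT would forward pkt through mapping m."""
    if (pkt.proto, pkt.dst_ip, pkt.dst_port) != (m.proto, m.ext_ip, m.ext_port):
        return False
    if mode is Filtering.ENDPOINT_INDEPENDENT:
        return True
    if pkt.src_ip != m.remote_ip:
        return False
    if mode is Filtering.ADDRESS_DEPENDENT:
        return True
    return pkt.src_port == m.remote_port


# Hole opened toward remote port 1000, but the SYN arrives from port 1002.
hole = Mapping("tcp", "203.0.113.5", 2001, "198.51.100.7", 1000)
syn = Packet("tcp", "198.51.100.7", 1002, "203.0.113.5", 2001)
results = {mode.name: accepts(hole, syn, mode) for mode in Filtering}
print(results)
```

    With these inputs only the strictest policy drops the SYN, which is the behaviour described in the question; the more permissive policies would let it through.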

    squillman : Nice. Even though I live in a trailer park. (jk)
    dauphic : Thank you, I'll take it. I just wanted a second opinion before deciding whether it would make sense to leave this 'scatter shot' implementation in, or just completely remove it because it wouldn't do much to increase the chance of success.
    Evan Anderson : So many of us live in Internet trailer parks. I'm horrified that we're not going to see IPv6 adoption but, rather, the rise of "carrier grade NAT" implementations. I can hear the dripping as Time Warner salivates over putting their Customers behind gigantic NAT implementations.
    Farseeker : It disappoints me that even in 2010 ISPs "Do not believe that running out of IPv4 addresses is anything to be concerned about". Only two or three Australian ISPs support IPv6, and NONE of the big ISPs do. They don't even offer IPv6 tunnels.
    dauphic : Luckily, UPnP is supported by most routers these days. It's a bit annoying, but I believe UPnP gives us a not-completely-terrible-like-STUNT solution to most of our peer-to-peer communication difficulties.
    joeqwerty : It doesn't make sense to me that a firewall would track the SYNs and ACKs in its NAT, state, or session table, as it would introduce additional memory and CPU overhead. Also, UDP is connectionless and doesn't use the three-way handshake, so the tracking would only be effective for TCP connections, which would seem like a half-baked way of building a firewall. AFAIK, the NAT table tracks the 4-tuple that comprises the connection: source port, source address, destination port, destination address.
    Evan Anderson : The NAT implementor can be really slipshod in their tracking of TCP state, or they can be really precise. It's up to them. (Obviously, one can use different tuples for tracking different protocols. TCP and UDP are just two protocols that might need session-level tracking. There could be lots more, and you could be tracking application-layer tuples, too. Ideally, you design your NAT engine to be extensible on a per-protocol basis.) Personally, I'd want a NAT implementation that tracked the TCP state very closely because it would be much more difficult for spoofed packets to "leak" by.
    Evan Anderson : TCP sequence number tracking and initial sequence number (ISN) randomization might be nice if I were NAT'ing a box stuck running an operating system that had easily guessable ISNs, too. Mangling the sequence number, source port (on outbound connections), and possibly the TOS would do a great job of obscuring the operating systems and number of hosts behind the NAT device, too.
    Teddy : You *read the paper* **and** wrote this answer in just 40 minutes? FGITW indeed.
    Evan Anderson : Oh, no-- I skipped around in the paper and didn't read it word-for-word. I got the idea they were trying to communicate fairly quickly, since it's basically another STUN implementation, just with some added wrinkles.
