Ideas or algorithms when programming an NAT

I'm working on a Python tunneling project using TUNTAP. The data received on a TUNTAP interface contains the original IP packet including all headers. I can do one of two things.

On the incoming side I am listening with Twisted. On the outgoing side I will have a raw socket which dumps the IP packet. Before dumping the packet, the program replaces the source address with that of the server, recomputes the TCP and UDP checksums, and rewrites the source port using one of the following methods. This information is tracked in the NAT table.

1) Use a single port per user such as

 US.ER.01.IP:10000 ----> SE.RV.ER.IP:3000 ----> facebook.com:80
 US.ER.01.IP:10001 ----> SE.RV.ER.IP:3000 ----> facebook.com:80
 US.ER.02.IP:3000 ----> SE.RV.ER.IP:3001 ----> facebook.com:80

Could this cause issues when user 1 makes a second, simultaneous request to facebook? How would the system know where to route facebook's reply? It arrives on port 3000, so it belongs to user 1, but does it get mapped back to 10000 or 10001?

2) Use a unique port for each connection such as

 US.ER.01.IP:10000 ----> SE.RV.ER.IP:3000 ----> facebook.com:80
 US.ER.01.IP:10001 ----> SE.RV.ER.IP:3001 ----> facebook.com:80
 US.ER.02.IP:3000 ----> SE.RV.ER.IP:3002 ----> remoteHost.com:22

How would I know when to remove entries from the NAT table? I could see the NAT table filling up very quickly using this method. The options I see are:

  I could wait for FIN packets from the server.  This will not work with UDP though.
  I could age the NAT entry on each hit and run garbage collection
     every N seconds.  The issue I see is that if garbage collection deletes
     an entry, a server's delayed response can no longer be routed to the
     proper host.

There is also the issue of reading from a raw socket. I know how to send on one, but would it be possible to receive individual IP packets? Would each sock.recv(65535) call on the raw socket return exactly one IP packet, or could it return more than one?

Which implementation is best? Any other tips or things I should be watching out for?

EDITS:

OK, so I have N clients. In case I was unclear, the entire /30 is used between the client and itself. It is just an abstraction to make the tunnel possible. I also didn't think it mattered, but the websocket actually goes through a "proxy" on the LAN (the IP data is simply repackaged into a new websocket; the mappings are unique, however). I did not want to make the explanation too confusing. I do not see how this changes anything.

      Client PC     CLIENT PC              Client PC----->LAN                               INTERNET    
 Client 1: 10.1.1.2 ----> 10.1.1.1 ----> Websocket(IPdata) ----> Browser ---> newWebSocket(IPData) ----> SE.RV.ER.IP
 Client 2: 10.1.1.4 ----> 10.1.1.3 ----> Websocket(IPdata) ----> Browser ---> newWebSocket(IPData) ----> SE.RV.ER.IP
 Client 3: 10.1.1.6 ----> 10.1.1.5 ----> Websocket(IPdata) ----> Browser ---> newWebSocket(IPData) ----> SE.RV.ER.IP

Each client sets its default route to be the tunnel endpoint (10.1.1.1 for example). The client gets the IP datagram, puts it into a websocket, and sends the websocket to a browser on the LAN, which then sends it to the server (or perhaps another proxy). The inside of the websocket contains the original IP datagram (with the source of 10.1.1.2 or some other internal IP).

It is important to note that the server receives a websocket message from the internet CONTAINING the goods (with the private source address). How would the Python server use this? Create a new tunnel with itself, then dump the packet raw into the tunnel and route appropriately?

Or perhaps I could use a mapping?

How would I be able to "map" a tunnel abstraction over this chain of websockets? The client does not have a route to the internet but can reach the "Browser" which can get to the internet. This seems to be the same case with VPN tunnels. The abstraction would be as follows:

 Client 1: 10.1.1.2 ----> 10.1.1.1 ----> Websocket(IPdata) ----> Browser ---> newWebSocket(IPData) ----> SE.RV.ER.IP -> Internet
           10.1.2.2------------------------------------------------------------------------------------> 10.1.2.1 ----> Internet

If you know any resources to get me on the right track that would be great!

Solution

Implementing NAT

You must use a unique port for each connection, not a single port per user, for exactly the reason that you outline in your question: if you don't then you can (and will!) end up with multiple connections using the same 5-tuple (protocol,local-address,local-port,remote-address,remote-port) and you won't be able to disambiguate them.

Moreover, if you want to play nice with some protocols that do NAT traversal then you should try to not remap the original source port if possible, that is, only remap it (to a new random port) if it conflicts with an existing connection which you are tracking.
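A minimal sketch of that port-allocation rule, assuming a hypothetical `NatTable` class (the class, its method names, and the port range are illustrative and not part of any library). As a simplification it treats a public port as taken as soon as any tracked connection of the same protocol uses it, which is stricter than comparing full 5-tuples:

```python
import random


class NatTable:
    """Hypothetical table mapping private endpoints to public ports."""

    def __init__(self, port_range=(1024, 65535), rng=None):
        self.port_range = port_range
        self.rng = rng or random.Random()
        self.outbound = {}  # (proto, private_ip, private_port) -> public_port
        self.inbound = {}   # (proto, public_port) -> (private_ip, private_port)

    def translate_out(self, proto, private_ip, private_port):
        """Return the public source port to use for an outgoing packet."""
        key = (proto, private_ip, private_port)
        if key in self.outbound:
            return self.outbound[key]
        # Prefer the original source port; remap only on a conflict.
        public_port = private_port
        if (proto, public_port) in self.inbound:
            public_port = self._pick_free_port(proto)
        self.outbound[key] = public_port
        self.inbound[(proto, public_port)] = (private_ip, private_port)
        return public_port

    def translate_in(self, proto, public_port):
        """Return (private_ip, private_port) for a reply, or None if unknown."""
        return self.inbound.get((proto, public_port))

    def _pick_free_port(self, proto):
        low, high = self.port_range
        for _ in range(1000):
            candidate = self.rng.randint(low, high)
            if (proto, candidate) not in self.inbound:
                return candidate
        raise RuntimeError("no free port found")
```

With this layout, two clients that both use source port 10000 get distinct public ports, and a reply is routed by looking up the port it arrived on with `translate_in`.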

To implement a NAT correctly, you must track the state of each connection.

For TCP this means watching the flags, setting up new state when you see a SYN, and tearing down the state when you see FINs from both sides. The state you track must contain at least the original source port and the remapped source port (which might be the same, see above). If you want to support FTP then you will also have to sniff the contents of FTP TCP control connections and rewrite IP addresses contained therein (and this means you will need to track a lot more state, because you may sometimes need to enlarge a TCP segment, which means you need to start remapping sequence numbers). You should also have a timeout associated with each tracked connection so that you get rid of it in case the endpoints disappear without closing the connection properly.
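The flag-watching part could be sketched roughly as follows. `TcpFlow`, `track_segment`, and `reap` are hypothetical names, the flow key is whatever tuple you index connections by, and the one-hour idle timeout is an assumption. Treating RST as a teardown signal goes beyond what is described above, but a reset also ends a connection. Finished flows are removed by a periodic `reap` call rather than immediately, so the final ACK after the second FIN still finds its entry:

```python
FIN, SYN, RST = 0x01, 0x02, 0x04  # TCP flag bits


class TcpFlow:
    """Hypothetical per-connection state for the NAT."""

    def __init__(self, now, idle_timeout=3600.0):
        self.last_seen = now
        self.idle_timeout = idle_timeout
        self.fin_from_client = False
        self.fin_from_server = False
        self.reset = False

    def observe(self, flags, from_client, now):
        """Update the state from the flags of one segment."""
        self.last_seen = now
        if flags & RST:
            self.reset = True
        if flags & FIN:
            if from_client:
                self.fin_from_client = True
            else:
                self.fin_from_server = True

    def is_finished(self, now):
        closed = self.fin_from_client and self.fin_from_server
        idle = now - self.last_seen > self.idle_timeout
        return closed or self.reset or idle


def track_segment(flows, key, flags, from_client, now):
    """Return the flow for this segment, or None if it should be dropped."""
    flow = flows.get(key)
    if flow is None:
        # Only a SYN from the client side may create new state.
        if not (from_client and flags & SYN):
            return None
        flow = flows[key] = TcpFlow(now)
    flow.observe(flags, from_client, now)
    return flow


def reap(flows, now):
    """Delete finished or idle flows; return how many were removed."""
    dead = [k for k, f in flows.items() if f.is_finished(now)]
    for k in dead:
        del flows[k]
    return len(dead)
```

Calling `reap` from a periodic timer (for example Twisted's `LoopingCall`) keeps the table bounded even when endpoints vanish without sending a FIN.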

For UDP this means watching the combinations of local and remote port numbers and creating state for each unique combination (the 4-tuple of addresses and ports) that you see. Because UDP is connectionless you have to expire this state information based on a timeout. This timeout will be much shorter than the one you use for TCP (on the order of minutes instead of hours) in order to prevent your state table from getting too large.
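A rough sketch of timeout-based expiry for UDP, assuming a hypothetical `UdpFlowTable` and a two-minute idle timeout (both are illustrative choices, not fixed values). Every packet in either direction refreshes the entry, so a flow that is still exchanging traffic is never collected; only flows that have been silent for longer than the timeout are dropped:

```python
class UdpFlowTable:
    """Hypothetical UDP state table keyed by the address/port 4-tuple."""

    def __init__(self, idle_timeout=120.0):
        self.idle_timeout = idle_timeout
        self.last_seen = {}  # (src_ip, src_port, dst_ip, dst_port) -> timestamp

    def touch(self, src_ip, src_port, dst_ip, dst_port, now):
        """Create the entry or refresh its timestamp on every packet."""
        self.last_seen[(src_ip, src_port, dst_ip, dst_port)] = now

    def expire(self, now):
        """Remove and return the flows that have been idle too long."""
        stale = [key for key, seen in self.last_seen.items()
                 if now - seen > self.idle_timeout]
        for key in stale:
            del self.last_seen[key]
        return stale
```

A reply that arrives after its entry has expired can no longer be translated and has to be dropped, which is the trade-off of any timeout-based scheme.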

For ICMP echo requests you should proceed in a manner similar to UDP, with the icmp_id playing the role of the port number.

For other types of ICMP like destination unreachable you must inspect the ICMP packet to see if it is part of a TCP or UDP connection you are tracking and attempt to translate it back to the original source.

In order to prevent routing loops you should also be decrementing the IP TTL as you forward translated packets.
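A small sketch of the TTL step on a raw IPv4 packet held as `bytes`. The helper names are hypothetical. Changing the TTL invalidates the IPv4 header checksum, so the sketch recomputes it, and it returns `None` when the TTL would reach zero, at which point a router discards the packet:

```python
import struct


def ip_checksum(header):
    """One's-complement sum of 16-bit words, as used by the IPv4 header."""
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def decrement_ttl(packet):
    """Return the packet with TTL - 1 and a fresh header checksum.

    Returns None if the TTL would reach zero (packet must be dropped).
    """
    ihl = (packet[0] & 0x0F) * 4      # header length in bytes
    header = bytearray(packet[:ihl])
    if header[8] <= 1:                # byte 8 is the TTL field
        return None
    header[8] -= 1
    header[10:12] = b"\x00\x00"       # zero the checksum before summing
    struct.pack_into("!H", header, 10, ip_checksum(bytes(header)))
    return bytes(header) + packet[ihl:]
```

The same `ip_checksum` routine is the building block for the TCP and UDP checksums, although those are computed over a pseudo-header plus the segment rather than over the IP header alone.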

There are probably some more important bits which I'm forgetting. In short, implementing NAT is a lot like implementing an IP stack for a router! That's why NAT is virtually always bolted on to an IP stack in the kernel, not implemented in userspace.

Sending and receiving packets

So the architecture as I understand it is this:

Client originates a packet which goes into the TUNTAP interface
Your software gets this packet, encapsulates it in a Websocket message, and sends it off
Your Twisted server gets it and does its magic
The translated packet goes out from the server through a raw socket

The return path:

The reply comes back to your server somehow (perhaps libpcap)
Your code does the reverse magic
Your server transmits the result back to the client over Websocket
Client sees the resulting packet come back through the TUNTAP interface

I think the easiest way to handle the last step in the forward path and the first step in the return path is a second TUNTAP device: a tun interface on the server.
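A minimal sketch of opening such a tun interface from Python on Linux. The helper names and the interface name `tun0` are assumptions, the constants are the ones from `linux/if_tun.h`, and the process needs sufficient privileges (typically root or CAP_NET_ADMIN) for the `ioctl` to succeed:

```python
import fcntl
import os
import struct

# Constants from linux/if_tun.h
TUNSETIFF = 0x400454CA
IFF_TUN = 0x0001     # layer-3 device: frames are raw IP packets
IFF_NO_PI = 0x1000   # do not prepend the 4-byte packet-info header


def build_ifreq(name):
    """Pack the interface name and flags the way TUNSETIFF expects them."""
    return struct.pack("16sH", name.encode("ascii"), IFF_TUN | IFF_NO_PI)


def open_tun(name="tun0"):
    """Open /dev/net/tun and attach to (or create) the named interface."""
    fd = os.open("/dev/net/tun", os.O_RDWR)
    fcntl.ioctl(fd, TUNSETIFF, build_ifreq(name))
    return fd
```

Each `os.read(fd, 65535)` on the returned descriptor yields one IP packet that the kernel routed to the interface, and each `os.write(fd, packet)` hands one packet to the kernel's IP stack as if it had arrived on that interface.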