This tutorial describes how to set up IP Masquerading with iptables in Linux. We cover both the theory and principles behind Masquerading and the Linux iptables commands to run it, mainly for the home user. IP Masquerading is a form of Network Address Translation (NAT) and provides a way of modifying the addresses and port numbers in IP packets. This allows one Linux machine to be a Gateway between a local network and your Internet Service Provider (ISP) so that all machines on the local network can access the Internet through a single IP address.
I've seen dedicated boxes for routing, masquerade and firewalls in computer stores for $100-300 (US) and this is a reasonable solution. I've never owned one, so I can't offer any advice, except that you probably don't need the VPN (Virtual Private Network) features unless you or your company has a similar VPN box at the remote end. These notes are about using iptables in Linux to build such a machine from scratch. Even a low end Pentium machine with a couple of ISA network cards is plenty fast enough for this purpose.
You certainly should read the iptables(8) man page to see what the options are and what they do. There are official Howtos on IP Masquerading, Firewalls and IP Chains (but not yet IP Tables) in the Networking Section of the Linux Documentation Project. SyrLUG keeps a local copy of these and other Howtos. The NAT, Packet Filtering, Netfilter Hacking and Netfilter Extensions Howtos are available at the IP Tables home page which also contains links to other tutorials. I also have a separate tutorial on designing your own firewall rules in ipchains.
For a good introduction to networking in Unix, I recommend TCP/IP Network Administration (O'Reilly) by Craig Hunt. The definitive book on packet filtering firewalls is Building Internet Firewalls (O'Reilly) by Chapman and Zwicky. It doesn't cover masquerading, but one chapter gives an excellent service by service analysis of the ports and packets used by each of the major protocols. Both of these books are fairly advanced, but they tell you the real story.
Let's suppose that your home network is set up as in the following diagram. You have several machines running various operating systems and connected via ethernet. One of these machines, the Gateway, runs Linux and connects to your ISP over a modem. You are sitting at your Work Station and wish to connect through your Gateway to a remote Web Server.
   Work                       eth0    Linux     ppp0                         Web
  Station ---------- hub ----------- Gateway ---------- ISP ----- ... ----- Server
192.168.1.2          /\   192.168.1.1  2126.96.36.199                   2188.8.131.52
                    /  \
                   /    \
               Other Machines
In this example, we use private IP addresses in the 192.168.1.* range for the local network and a telephone modem (ppp0 interface) to connect to the ISP; your numbers may be different. It doesn't matter whether your Internet connection is via telephone or cable modem, or whether the Gateway uses a static or dynamic IP address. Also in this example, we use 2126.96.36.199 for the external address of the Linux Gateway and 2188.8.131.52 for the address of the Web Server. Of course, these numbers are not valid since the first field of each is larger than 255. I chose bogus numbers just so they won't conflict with real addresses, but they will suffice for this example.
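To see why such made-up numbers can never collide with a real address, here is a small sketch of a validity check. The function name valid_ipv4 is my own invention for illustration and not part of any standard tool. It accepts an address only if it has four fields and every field is between 0 and 255.

```shell
#!/bin/sh
# valid_ipv4 ADDR: succeed only if ADDR is four dot-separated decimal
# fields, each in the range 0-255. (Hypothetical helper for illustration.)
valid_ipv4() {
    # Reject empty input, stray characters, and empty fields.
    case "$1" in
        '' | *[!0-9.]* | .* | *. | *..*) return 1 ;;
    esac
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address on dots
    IFS=$old_ifs
    [ "$#" -eq 4 ] || return 1
    for field in "$@"; do
        # Check the length first so the numeric test cannot overflow.
        [ "${#field}" -le 3 ] && [ "$field" -le 255 ] || return 1
    done
    return 0
}

# Example:
#   valid_ipv4 192.168.1.2     succeeds
#   valid_ipv4 2126.96.36.199  fails, the first field is too large
```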
Note that the Gateway sits between the ethernet hub and the ISP and has separate network interfaces for them. The Gateway contains the firewall and masquerading rules, so this configuration forces packets between the internal network and the Internet to go through the Gateway. If instead you used a cable modem and plugged it directly into your ethernet hub (not recommended), then the Internet could reach your internal network and bypass the firewall.
Also note that the Gateway has two IP numbers. Most machines have only one network interface, so it's common to blur the line between machines and addresses. But technically, IP numbers belong to interfaces, not machines. The Gateway has two interfaces, so it has two IP addresses, one for the local network and one for the link to the ISP. And actually, that's what gateway means: it's a machine that connects two networks and routes packets between them.
Now suppose that you want to access the remote Web Server. The web browser on your Work Station (eg, Netscape) asks the kernel for an available port and let's suppose that you get port 2000. The web server on the remote machine (eg, Apache) uses port 80. So, your network connection in this example is between address 192.168.1.2 port 2000 and address 2188.8.131.52 port 80. But this connection will not work. 192.168.1.2 is a private IP address and is not known outside your local network. The Web Server would have to return packets to 192.168.1.2 and it doesn't know where to send them. Your machine could actually send packets to the Web Server, but it would not receive an answer.
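The private ranges are set aside by RFC 1918, and 192.168.1.* is only one of them. As a rough sketch, this is how you might test whether an address is one that the outside world cannot route back to. The helper is_private_ipv4 is hypothetical, and it assumes the address is already a well-formed dotted quad.

```shell
#!/bin/sh
# is_private_ipv4 ADDR: succeed if ADDR lies in one of the RFC 1918
# private ranges: 10.0.0.0/8, 172.16.0.0/12 or 192.168.0.0/16.
# (Hypothetical helper; does not validate the address itself.)
is_private_ipv4() {
    case "$1" in
        10.*)                                      return 0 ;;
        192.168.*)                                 return 0 ;;
        172.1[6-9].* | 172.2[0-9].* | 172.3[01].*) return 0 ;;
        *)                                         return 1 ;;
    esac
}

# Example:
#   is_private_ipv4 192.168.1.2 && echo "needs masquerading to reach the Internet"
```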
But if your Work Station had a valid, externally visible IP address, then the Web Server could return packets to you and the connection would succeed. In this case, everything would work right out of the box (with one minor setting) and there would be nothing to do. I'm sure your ISP could give you some externally visible addresses, but they would also charge you for them. It would also work if you made the connection from the Gateway since the external interface (ppp0) on the Gateway does have a valid IP address. One of the main things you pay your ISP for is the right to use a real address.
Masquerading on the Gateway works around this problem by intercepting packets from the Work Station and replacing the Work Station's IP address with its own. But the Work Station's port (2000) may already be in use on the Gateway, so suppose it changes the port to 3000. The Gateway modifies outgoing packets from the Work Station to make it appear as if they came from the Gateway and makes a note that incoming packets destined for address 2126.96.36.199 port 3000 should actually go to address 192.168.1.2 port 2000. The Work Station thinks the connection is between 192.168.1.2 port 2000 and 2188.8.131.52 port 80, the Web Server thinks the connection is between 2188.8.131.52 port 80 and 2126.96.36.199 port 3000, and only the Gateway knows the truth.
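The note that the Gateway keeps can be pictured as a small lookup table. What follows is only a toy model in shell, not how the kernel actually stores its connections. All of the names (GW_ADDR, TABLE, masq_out, demasq_in) are made up for this sketch, and the starting port of 3000 just follows the example above.

```shell
#!/bin/sh
# Toy model of the Gateway's translation table. The real table lives in
# the kernel; every name here is hypothetical.
GW_ADDR="gateway-ext"   # stands in for the Gateway's external address
NEXT_PORT=3000          # next free port on the Gateway (assumed)
TABLE=""                # one "gw_port inner_addr inner_port" entry per line

# masq_out ADDR PORT: record a new outgoing connection and set RESULT to
# the source address and port that the remote end will see.
masq_out() {
    TABLE="${TABLE}${NEXT_PORT} $1 $2
"
    RESULT="${GW_ADDR}:${NEXT_PORT}"
    NEXT_PORT=$((NEXT_PORT + 1))
}

# demasq_in GW_PORT: find the connection that a return packet belongs to
# and set RESULT to the inner address and port. Fails if no entry exists.
demasq_in() {
    RESULT=$(printf '%s' "$TABLE" |
        awk -v p="$1" '$1 == p { print $2 ":" $3; exit }')
    [ -n "$RESULT" ]
}

masq_out 192.168.1.2 2000 && echo "outgoing packet appears as $RESULT"
demasq_in 3000 && echo "reply to port 3000 goes to $RESULT"
demasq_in 4000 || echo "no entry for port 4000"
```

The last line shows the weak spot: a packet that arrives for a port with no entry has nowhere to go.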
Basic masquerading relies on two assumptions. The first is that connections are initiated from inside the local network. It's easy to masquerade outgoing packets; the hard part is demasquerading the return packets. As long as the connection begins with an outgoing packet, that packet sets the rule for translating the return packets and masquerading will succeed. But if the first packet is incoming, then the Gateway won't know what to do with it. This is usually not a problem for the home user since most connections will start on the Work Station. The second is that basic masquerading only modifies the headers of packets and doesn't look inside the packet's data. If a protocol puts addresses and port numbers inside its data, then the remote end will get the wrong values and there will be problems. Most protocols don't do this, although some do.
Netfilter is the packet filtering part of the networking code in the Linux 2.4.x kernels, and iptables is the program to inspect and modify the filtering rules. Netfilter does both pure filtering (ACCEPT or DROP packets) and address translation (modify source or destination address and port). IP packets flow through netfilter as in the following diagram.
                       --------------> FORWARD ---------------
                      /                                       \
--> PREROUTING ---> o                                           o ---> POSTROUTING -->
                      \                 local                 /
                       ---> INPUT ---> process ---> OUTPUT ---
The middle part of this diagram contains the INPUT, OUTPUT and FORWARD chains that make up the filter table. This part is a pure packet filter, its only purpose is to ACCEPT or DROP a packet. The PRE- and POSTROUTING chains on the outside make up the nat table. The PREROUTING chain and DNAT target modify the destination address and port (if needed), and the POSTROUTING chain and SNAT target modify the source address and port. Masquerading is done on the POSTROUTING chain.
Note that forwarded packets do not traverse the INPUT or OUTPUT chains. Incoming packets destined for the local machine traverse the INPUT chain, outgoing packets originating on the local machine traverse the OUTPUT chain, and forwarded packets (which are both incoming and outgoing) traverse the FORWARD chain. So, on the Gateway, you would use the FORWARD chain for the firewall rules for the local network and the INPUT and OUTPUT chains for the firewall rules for the Gateway itself. This was different in ipchains where forwarded packets would pass through all three chains. The iptables way is more flexible, but you will have to write some rules twice.
Another important point is that on the INPUT, OUTPUT and FORWARD chains, packets have unmasqueraded addresses. That is, packets from the local network traverse the Gateway's FORWARD chain before they are masqueraded (in POSTROUTING), and the return packets are demasqueraded (implicitly in PREROUTING) before they traverse the FORWARD chain. So, in the previous Work Station to Web Server example, the FORWARD chain would see the addresses the same way the Work Station sees them.
Actually, this diagram is somewhat misleading about exactly where the address translation happens. The PRE- and POSTROUTING chains are really only used for the first packet of a connection to decide if and how that connection is translated. After the kernel decides how to translate the first packet, the remaining packets are implicitly translated in the same way without consulting these chains. But you may as well think that the translation happens on the PRE- and POSTROUTING chains because the rest of the diagram is consistent with that view. Just remember that it's really implicit, and if you count the number of packets with ``iptables -t nat -L POSTROUTING -v'', then the count only includes the first packet of each connection.
Now we give the rules for basic masquerading. Most of the work is done on the FORWARD and POSTROUTING chains on the Gateway, but there are just a few prerequisites to check first.
The only prerequisite on the Work Station is to set its default gateway. This varies by operating system, but assuming the Work Station is also running Linux and the network addresses are as in the above diagram, this can be set as follows.
route add default gw 192.168.1.1
This command tells the Work Station to send any packets not destined for the local network to the Gateway (192.168.1.1) and the Gateway will know where to send them. You would use the same command if the Work Station had an externally visible address and the Gateway was not masquerading its packets.
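On more recent Linux systems the route command is often replaced by ip from the iproute2 package, where the same default route would be set with ``ip route add default via 192.168.1.1''. As a small sketch, assuming the usual ``ip route show'' output format, the following helper (the name default_gw is my own) pulls the default gateway out of a routing table so you can check that the setting took effect.

```shell
#!/bin/sh
# default_gw: read a routing table in "ip route show" format on stdin and
# print the address of the default gateway, if there is one.
# (Hypothetical helper for illustration.)
default_gw() {
    awk '$1 == "default" {
             for (i = 2; i < NF; i++)
                 if ($i == "via") { print $(i + 1); exit }
         }'
}

# Typical use on the Work Station (assumes the iproute2 "ip" command):
#   ip route show | default_gw
```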
There are also a few prerequisites to check on the Gateway. First, networking must be running, for both the internal (eth0) and external (ppp0) interfaces. But this is beyond the scope of these notes, so we just assume that you have this running.
Second, you must enable forwarding on the Gateway. Forwarding tells the Gateway to accept incoming packets not destined for itself and pass them on to another machine. This is normally turned off by default, but it is easily turned on.
echo 1 >/proc/sys/net/ipv4/ip_forward
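This setting does not survive a reboot, so it is worth checking before you blame the firewall rules. Here is a minimal sketch of such a check. The function name forwarding_enabled is hypothetical, and the optional file argument exists only so that the check can be tried on an ordinary file. On systems with the sysctl command, ``sysctl -w net.ipv4.ip_forward=1'' should have the same effect as the echo above.

```shell
#!/bin/sh
# forwarding_enabled [FILE]: succeed if the IPv4 forwarding switch reads 1.
# FILE defaults to the /proc entry used in the text.
forwarding_enabled() {
    [ "$(cat "${1:-/proc/sys/net/ipv4/ip_forward}" 2>/dev/null)" = "1" ]
}

if forwarding_enabled; then
    echo "forwarding is on"
else
    echo "forwarding is off (or /proc is not available)"
fi
```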
And third, the ip_tables and related modules must be loaded into the kernel. Normally, this happens automatically when you first run iptables. But in case it doesn't, you can load them manually with modprobe.
Masquerading needs at least the ip_tables, iptable_filter, ip_conntrack, iptable_nat, ipt_state and ipt_MASQUERADE modules and possibly more depending on what rules you use. See /lib/modules/2.4.x/kernel/net/ipv4/netfilter for a list of available netfilter modules. But be aware that the ipchains and ip_tables modules cannot both be in the kernel at the same time. You must choose between chains and tables.
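If you do need to load the modules by hand, a loop saves some typing. This is just a sketch: the module names are the ones listed above for the 2.4.x kernels (later kernels may name them differently), and the MODPROBE variable is a convenience of mine so that the script prints the commands by default instead of running them.

```shell
#!/bin/sh
# Module names as listed in the text for 2.4.x kernels.
MASQ_MODULES="ip_tables iptable_filter ip_conntrack iptable_nat ipt_state ipt_MASQUERADE"

# MODPROBE defaults to "echo modprobe", so the commands are only printed.
# Set MODPROBE=modprobe and run as root to really load the modules.
MODPROBE="${MODPROBE:-echo modprobe}"

# load_masq_modules: load (or print) each module, stopping at the first failure.
load_masq_modules() {
    for module in $MASQ_MODULES; do
        $MODPROBE "$module" || return 1
    done
}

load_masq_modules
```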
Recall from Section 3 that the Gateway uses its FORWARD chain for the firewall rules for the local network, it uses the INPUT and OUTPUT chains for its own firewall, and the masquerading rules go on the POSTROUTING chain. It would suffice for masquerading if the FORWARD chain just accepted everything, but you probably don't want to do that. Crafting good firewall rules is somewhat involved, so we present a simple design here and save the real work for a separate tutorial (still written in ipchains).
iptables -A FORWARD -i ppp0 -s 192.168.0.0/16 -j DROP
iptables -A FORWARD -i ppp0 -o ppp0 -j DROP
iptables -A FORWARD -i ppp0 -o eth0 -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A FORWARD -i eth0 -o ppp0 -j ACCEPT
iptables -A FORWARD -j DROP
The first rule blocks incoming packets from the Internet that claim to be from the local network. Blocking this address spoofing must be done on the Gateway because only the Gateway knows that the packet came in from the external interface. The second rule blocks external machines from bouncing packets off your Gateway that are destined for another outside machine. This rule is not so important if you're not doing masquerading, but it becomes very important when you are.
The other rules only allow connections that were initiated from inside the local network. They allow all outgoing packets and incoming packets only if they are part of (or related to) a connection that was started from inside the local network. (The options for --state are somewhat clumsy, it should really know how to load this module automatically.) And remember that this is just a minimal set of rules. You probably also want to block access to certain ports (1-1023), restrict access to servers, etc. And remember that you also need rules on the INPUT and OUTPUT chains for the Gateway's firewall.
And finally, the POSTROUTING rules to turn on masquerading.
iptables -t nat -A POSTROUTING -o ppp0 -j MASQUERADE
iptables -t nat -A POSTROUTING -j ACCEPT
The first rule masquerades packets as they leave the Gateway over its external interface (ppp0). It applies to packets both from inside the local network and from the Gateway itself. The return packets are demasqueraded automatically, you don't need (and shouldn't use) a separate PREROUTING rule. The ACCEPT target on the PRE- and POSTROUTING chains means to pass the packet through unmodified. So, the second rule turns off any further masquerading.
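For convenience, the FORWARD and POSTROUTING rules can be collected into one small script. The following is a sketch under the assumptions of this example (eth0 inside, ppp0 outside). The variable names and the masq_rules function are my own, and by default the script only prints the iptables commands so you can read them before committing to anything.

```shell
#!/bin/sh
# Interface names are taken from the example in the text; adjust them
# for your own network.
INT_IF="eth0"
EXT_IF="ppp0"

# IPT defaults to "echo iptables", so the rules are only printed.
# Set IPT=iptables and run as root to install them.
IPT="${IPT:-echo iptables}"

masq_rules() {
    # Firewall for the local network (FORWARD chain).
    $IPT -A FORWARD -i "$EXT_IF" -s 192.168.0.0/16 -j DROP
    $IPT -A FORWARD -i "$EXT_IF" -o "$EXT_IF" -j DROP
    $IPT -A FORWARD -i "$EXT_IF" -o "$INT_IF" \
         -m state --state ESTABLISHED,RELATED -j ACCEPT
    $IPT -A FORWARD -i "$INT_IF" -o "$EXT_IF" -j ACCEPT
    $IPT -A FORWARD -j DROP

    # Masquerading (POSTROUTING chain of the nat table).
    $IPT -t nat -A POSTROUTING -o "$EXT_IF" -j MASQUERADE
    $IPT -t nat -A POSTROUTING -j ACCEPT
}

masq_rules
```

Keep in mind that -A appends, so running the script twice adds every rule twice. Flushing the chains first with ``iptables -F FORWARD'' and ``iptables -t nat -F POSTROUTING'' avoids that, at the cost of removing any other rules you had on them.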
There are two very important pitfalls to avoid here. First, you don't want to masquerade packets leaving the Gateway on its internal interface. That would make the entire Internet appear as the Gateway to the local network. This is the reason for ``-o ppp0'' on the first POSTROUTING rule, to restrict masquerading to the Gateway's external interface.
Second, you don't want the Gateway to forward packets that both enter and leave on its external interface. The problem is that a hacker could manipulate his routing tables to bounce packets through your Gateway and have them masqueraded on the way out. Then he would be free to attack another machine and it would appear as though the attack came from you. This is the reason for the second FORWARD rule. Technically, the remaining rules would also block such packets, but it's a good idea to use an explicit rule. Alternatively, you could restrict masquerading to packets with a source address of your internal network or the Gateway's external address.
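The alternative mentioned last might look like the following sketch, which only masquerades packets whose source address is inside the local network. The function name masq_from_net is hypothetical, and as a precaution the command is only printed unless you set IPT=iptables. Packets that start on the Gateway itself already carry its external address, so they should not need translating.

```shell
#!/bin/sh
# IPT defaults to "echo iptables" (print only); set IPT=iptables and run
# as root to apply the rule.
IPT="${IPT:-echo iptables}"

# masq_from_net EXT_IF NET: masquerade only packets that leave on EXT_IF
# and carry a source address inside NET.
masq_from_net() {
    $IPT -t nat -A POSTROUTING -o "$1" -s "$2" -j MASQUERADE
}

masq_from_net ppp0 192.168.1.0/24
```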
The networking protocols were not designed with masquerading in mind, so it's something of a minor miracle that it works at all. But most protocols work just fine and without the client ever knowing that its connections are being translated. Most protocols use one or more TCP connections opened from the client side. This is ideal for masquerading because the connection starts on the Work Station and the return packets use the same ports. Simple query response UDP protocols such as Domain Name Service (DNS) also work well. And it doesn't really matter what the operating system on the Work Station is, as long as it runs TCP/IP. What matters is the protocol and how that protocol uses ports. I have tested HTTP, passive FTP, Telnet, Secure Shell (SSH), SMTP (sendmail), POP Mail, nslookup, traceroute and ping, and they all work fine.
The most common problem for masquerading is a protocol that initiates a connection from outside the local network. When the Gateway doesn't see the outgoing packets first, it doesn't know what to do with the incoming packets. This is usually not a problem for the home user because most connections originate on the Work Station, but it is an issue for FTP in active mode. Even though the session is started on the Work Station, standard FTP opens a second, data connection from the server back to the Work Station, and this is bad for both masquerading and firewalls.
Masquerading also has trouble with protocols that send packets out on one port but expect return packets on a different port, and with protocols that write address and port information inside the packet's contents (not just the header). Again, the problem is that the Gateway doesn't know how to demasquerade the incoming packets. An example is talk, roughly the Unix analogue to Instant Messenger (and much older than IM). The talk protocol sends UDP packets to the server on one port and then expects a return TCP connection on a different port. Again, this is bad for both masquerading and firewalls, but fortunately, most protocols don't do this.
All of these problems stem from the need to modify packets. If the Gateway could simply forward packets without modifying them, then these problems would go away. But your Work Station would need a real IP address for that to work. So, if masquerading has trouble with some protocol, try running it from the Gateway.
The problem with standard FTP is that it opens the data connection from the server. This is bad for masquerading because the Gateway doesn't know that the connection should go to the Work Station. The best solution is to use passive mode whenever possible. Passive mode opens the data connection from the Work Station, and this is better for both masquerading and firewalls.
But iptables comes with two kernel modules that allow masquerading to work with FTP in active mode. The modules are not loaded automatically, you have to load them manually with modprobe.
modprobe ip_conntrack_ftp
modprobe ip_nat_ftp
These modules go beyond the bounds of basic masquerading. They understand the underlying protocol and look inside the packet's data. The first module examines FTP packets to identify related connections, and the second module modifies the packets. Note that the FORWARD rules must allow incoming RELATED packets for these modules to work correctly.
At the time of this writing, iptables comes with modules for FTP and Internet Relay Chat (IRC). There are many other modules for various protocols, but you have to download and compile them separately with the Patch-O-Matic feature of netfilter. Start by reading the Netfilter Extensions Howto from the IP Tables home page.
These notes are written mainly for the home user. By that, we mean that your network connections begin from inside the local network and that you're not running any servers. But now suppose that you want to run a web server and access it from the Internet. One solution is to put the web server on the Gateway, but suppose that you want to run it on the Work Station. In that case, you use the Gateway's PREROUTING chain to redirect incoming connections for its HTTP port (80) to the Work Station.
iptables -t nat -A PREROUTING -i ppp0 -p tcp --dport 80 -j DNAT --to-dest 192.168.1.2:80
iptables -t nat -A PREROUTING -j ACCEPT
iptables -A FORWARD -i ppp0 -o eth0 -p tcp --dport 80 -j ACCEPT
The first rule modifies the destination address of incoming TCP connections for port 80 on the Gateway to port 80 on the Work Station (192.168.1.2), and the second rule turns off any further port forwarding. As with the MASQUERADE target on the POSTROUTING chain, the DNAT target implicitly translates the return (outgoing) packets. The last rule allows those connections through the FORWARD filter. Recall that the previous FORWARD rules block such incoming connections, so put this rule somewhere before the final DROP rule.
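If you forward more than one port, it may help to wrap the two per-port rules in a function. This is a sketch only: forward_port and its argument order are my own invention, the long option --to-destination is the full spelling of --to-dest, and the commands are printed rather than run unless you set IPT=iptables.

```shell
#!/bin/sh
# IPT defaults to "echo iptables" (print only); set IPT=iptables and run
# as root to apply the rules.
IPT="${IPT:-echo iptables}"

# forward_port EXT_IF INT_IF PORT DEST_ADDR: send TCP connections that
# arrive on EXT_IF for PORT to the same port on DEST_ADDR, and allow
# them through the FORWARD chain.
# Note: the FORWARD rule is appended, so call this before the final
# "-A FORWARD -j DROP" rule is added.
forward_port() {
    $IPT -t nat -A PREROUTING -i "$1" -p tcp --dport "$3" \
         -j DNAT --to-destination "$4:$3"
    $IPT -A FORWARD -i "$1" -o "$2" -p tcp --dport "$3" -j ACCEPT
}

forward_port ppp0 eth0 80 192.168.1.2
```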
The Linux 2.2.x kernels use ipchains instead of iptables, but the 2.4.x kernels can use either ipchains or iptables (but not both at the same time). I recommend using iptables if your kernel supports it, but ipchains does almost everything that iptables does. The main difference, aside from a few changes in syntax and a completely redesigned implementation, is that iptables supports stateful rules and ipchains does not. But the basic principles of what masquerading does remain the same.
In ipchains, forwarded packets traverse the input, forward and output chains, the -i option refers to either the incoming or outgoing interface depending on the chain, and masquerading happens on the forward chain.
ipchains -A input -i ppp0 -s 192.168.0.0/16 -j DENY
ipchains -A input -i ppp0 -p tcp --syn -j DENY
ipchains -A input -j ACCEPT
ipchains -A output -j ACCEPT
ipchains -A forward -i ppp0 -s 192.168.1.0/24 -j MASQ
ipchains -A forward -j DENY
The first rule blocks spoofs of the local network addresses, the second rule blocks incoming TCP connections, and the last two rules turn on masquerading. Again, these are just a minimal set of rules, and I recommend writing a more complete firewall.
Linux supports IPv6 and netfilter supports filtering (ACCEPT or DROP) of IPv6 packets through ip6tables(8). The filtering options for ip6tables are nearly identical to those of iptables, except for the syntax of IPv6 addresses. But at the time of this writing, ip6tables does not yet support address translation or IP masquerading.
Timeouts can be a problem for masquerading for long-lived connections with long idle times. The problem is that the Gateway does not receive any direct notification when a connection ends, only the two ends do. So, if a connection is idle for a long time, the Gateway will assume that it has finished and reclaim its port. Later, when the connection sends another packet, the Gateway will think it belongs to a new connection, and the connection will break. In ipchains, the timeout values are somewhat low, but you can increase them as follows.
ipchains -M -S 7200 10 60
The first number is the TCP connection timeout (in seconds), the third number is the UDP timeout, and the middle number is the TCP timeout after a close connection (FIN) packet. Of the three, you really only need to increase the first number, but the command requires that you write all three. But don't set this value to infinity because then the Gateway will eventually run out of ports.
Iptables does not have a command to reset the timeouts, but the default values are somewhat higher. Maybe future releases will provide such a command. As a workaround, you could open the connection from the Gateway.
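Kernels newer than the ones covered in these notes expose the connection tracking timeouts as sysctl settings. The sketch below reads the idle timeout for established TCP connections. It assumes the nf_conntrack path shown, which is not expected to exist on a 2.4.x kernel, and the function name tcp_idle_timeout is hypothetical. Where the setting exists, something like ``sysctl -w net.netfilter.nf_conntrack_tcp_timeout_established=7200'' should change it.

```shell
#!/bin/sh
# Path on later kernels with nf_conntrack loaded (an assumption; it is
# not expected to exist on the 2.4.x kernels described in the text).
CT_TIMEOUT_FILE="/proc/sys/net/netfilter/nf_conntrack_tcp_timeout_established"

# tcp_idle_timeout [FILE]: print the idle timeout, in seconds, for
# established TCP connections, or fail if the setting is not available.
tcp_idle_timeout() {
    cat "${1:-$CT_TIMEOUT_FILE}" 2>/dev/null
}

tcp_idle_timeout || echo "conntrack timeout setting not available here"
```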
An essential requirement of masquerading is that all of the return packets come back through the Gateway. This would not be an issue if the packets were not being modified. But since the Gateway masquerades the outgoing packet's addresses, only the Gateway knows how to demasquerade them. This is rarely a problem for the home user since the Gateway is probably your only connection to your ISP. But if your network topology is more complicated, then you must make sure that the return packets are sent through the same machine.
A similar issue happens with fragments. The problem is that when a packet is broken into pieces, only the first piece contains its port numbers. This means that the Gateway doesn't know what to do with the second and remaining fragments until they are reassembled, so fragments must also come back through the Gateway. But again, this is rarely a problem for the home user.
Another use for address translation is automatically redirecting connections, called Transparent Proxying. For example, you could have the Gateway intercept outgoing HTTP connections (port 80) and redirect them to your own web server. Your server could then examine the request, send back a reply, the Gateway would translate the reply, and the client would think it was talking to the original server.
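As a sketch of the idea, the REDIRECT target of the nat table sends matching connections to a port on the Gateway itself. Here I assume that your own server or proxy runs on the Gateway and listens on port 8080; that port, like the function name redirect_http, is made up for the example. The command is only printed unless you set IPT=iptables.

```shell
#!/bin/sh
# IPT defaults to "echo iptables" (print only); set IPT=iptables and run
# as root to apply the rule.
IPT="${IPT:-echo iptables}"

# redirect_http INT_IF PROXY_PORT: intercept outgoing HTTP connections
# (port 80) that arrive from the local network on INT_IF and hand them
# to a server listening on PROXY_PORT on the Gateway.
redirect_http() {
    $IPT -t nat -A PREROUTING -i "$1" -p tcp --dport 80 \
         -j REDIRECT --to-ports "$2"
}

redirect_http eth0 8080
```

Since the redirected packets are now destined for the Gateway itself, they traverse the INPUT chain, so the Gateway's own firewall would also have to accept connections on that port from the internal interface.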
$Id: ipmasq.html,v 1.1 2002/11/13 06:47:40 krentel Exp $