network-programming routes ip ubuntu-server iproute

Can't establish connection over second NIC (two hops)

We are having trouble with network routing configuration in Ubuntu Xenial.

We have many servers, some running Debian 8.4 (Jessie) and some running Ubuntu 16.04.2 (Xenial), all with exactly the same networking setup (or at least as far as we can see).

They all have two NICs attached to two VLANs (say "A" and "B"), both reachable from other VLANs, for example from VLAN "C".

Both /etc/network/interfaces files are of the form:

NOTE: I faked names and IPs for the sake of better readability.

# VLAN A
auto eth0
iface eth0 inet static
address 192.168.111.xxx
netmask 255.255.255.0
broadcast 192.168.111.255
network 192.168.111.0
gateway 192.168.111.254
dns-nameservers 192.168.111.25 192.168.111.26

# VLAN B
auto eth1
iface eth1 inet static
address 192.168.222.xxx
netmask 255.255.255.0
broadcast 192.168.222.255
network 192.168.222.0
gateway 192.168.222.254 # <-- (Commented out in Ubuntu machine)
dns-nameservers 192.168.111.25 192.168.111.26

...say xxx is 100 for the Debian machine and 200 for the Ubuntu machine, and I'm trying to ping from 192.168.1.10 in VLAN "C" to the following addresses:

192.168.111.100: Works fine.
192.168.222.100: Works fine.
192.168.111.200: Works fine.
192.168.222.200: NO Answer!!

The "B" vlan is used mostly for backup and other "background" traffic to avoid saturation problems in vlan "A".

I know that having two network paths to the same machine is not a usual setup, and being able to connect through only one of them from other networks is not a big problem right now. But what puzzles me is: why can I reach the Debian machines and not the Ubuntu ones?

On the other hand, if it worked on both platforms, we could consider closing some services (such as ssh and backend interfaces) on NIC "A" to improve security (our firewall only allows access to VLAN "B" from our IT staff VLAN).

Of course, as noted in the interfaces snippet above, the gateway line is commented out on the Ubuntu machines, but that is because network initialization fails on those machines otherwise. That is, in fact, what we are trying to solve.

But both machines' routing tables are almost identical. The only difference I could see was the onlink flag on the Ubuntu machine:

myUser@debianMachine:~$ sudo ip route
default via 192.168.111.254 dev eth0
192.168.111.0/24 dev eth0  proto kernel  scope link  src 192.168.111.100
192.168.222.0/24 dev eth1  proto kernel  scope link  src 192.168.222.100


myUser@ubuntuMachine:~$ sudo ip route
default via 192.168.111.254 dev eth0 onlink
192.168.111.0/24 dev eth0  proto kernel  scope link  src 192.168.111.200
192.168.222.0/24 dev eth1  proto kernel  scope link  src 192.168.222.200

...but I was able to remove it with the following command:

myUser@ubuntuMachine:~$ sudo ip route replace default via 192.168.111.254 dev eth0
myUser@ubuntuMachine:~$ sudo ip route
default via 192.168.111.254 dev eth0
192.168.111.0/24 dev eth0  proto kernel  scope link  src 192.168.111.200
192.168.222.0/24 dev eth1  proto kernel  scope link  src 192.168.222.200

And it didn't fix the problem.

After that, I also tried uncommenting the gateway line of 'VLAN B' (which, as I said, was commented out in the /etc/network/interfaces file) and restarting networking, but this is what happened:

myUser@ubuntuMachine:~$ sudo /etc/init.d/networking restart
[....] Restarting networking (via systemctl): networking.serviceJob for networking.service failed because the control process exited with error code. See "systemctl status networking.service" and "journalctl -xe" for details.
failed!

...and the onlink flag came back again.

As a note, after commenting out that line again and issuing a new /etc/init.d/networking restart, the output stays the same until the machine is rebooted (even though networking, apart from the VLAN B default gateway issue, keeps working as usual).

Here is the output of the suggested commands:

myUser@ubuntuMachine:~$ sudo systemctl status networking.service
● networking.service - Raise network interfaces
   Loaded: loaded (/lib/systemd/system/networking.service; enabled; vendor preset: enabled)
  Drop-In: /run/systemd/generator/networking.service.d
           └─50-insserv.conf-$network.conf
   Active: failed (Result: exit-code) since jue 2017-12-21 14:55:29 CET; 42s ago
     Docs: man:interfaces(5)
  Process: 8552 ExecStop=/sbin/ifdown -a --read-environment --exclude=lo (code=exited, status=0/SUCCESS)
  Process: 8940 ExecStart=/sbin/ifup -a --read-environment (code=exited, status=1/FAILURE)
  Process: 8934 ExecStartPre=/bin/sh -c [ "$CONFIGURE_INTERFACES" != "no" ] && [ -n "$(ifquery --read-envi
 Main PID: 8940 (code=exited, status=1/FAILURE)

dic 21 14:55:29 ubuntuMachine systemd[1]: Stopped Raise network interfaces.
dic 21 14:55:29 ubuntuMachine systemd[1]: Starting Raise network interfaces...
dic 21 14:55:29 ubuntuMachine ifup[8940]: RTNETLINK answers: File exists
dic 21 14:55:29 ubuntuMachine ifup[8940]: Failed to bring up eth1.
dic 21 14:55:29 ubuntuMachine systemd[1]: networking.service: Main process exited, code=exited, status=1/FAILUR
dic 21 14:55:29 ubuntuMachine systemd[1]: Failed to start Raise network interfaces.
dic 21 14:55:29 ubuntuMachine systemd[1]: networking.service: Unit entered failed state.
dic 21 14:55:29 ubuntuMachine systemd[1]: networking.service: Failed with result 'exit-code'.

...and the meaningful part of sudo journalctl -xe:

dic 21 14:55:29 ubuntuMachine sudo[8922]:   myUser : TTY=pts/0 ; PWD=/home/myUser ; USER=root ; COMMAND=/etc/init.d/networking restart
dic 21 14:55:29 ubuntuMachine sudo[8922]: pam_unix(sudo:session): session opened for user root by myUser(uid=0)
dic 21 14:55:29 ubuntuMachine systemd[1]: Stopped Raise network interfaces.
-- Subject: Unit networking.service has finished shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit networking.service has finished shutting down.
dic 21 14:55:29 ubuntuMachine systemd[1]: Starting Raise network interfaces...
-- Subject: Unit networking.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit networking.service has begun starting up.
dic 21 14:55:29 ubuntuMachine ifup[8940]: RTNETLINK answers: File exists
dic 21 14:55:29 ubuntuMachine ifup[8940]: Failed to bring up eth1.
dic 21 14:55:29 ubuntuMachine systemd[1]: networking.service: Main process exited, code=exited, status=1/FAILURE
dic 21 14:55:29 ubuntuMachine systemd[1]: Failed to start Raise network interfaces.
-- Subject: Unit networking.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit networking.service has failed.
--
-- The result is failed.
dic 21 14:55:29 ubuntuMachine systemd[1]: networking.service: Unit entered failed state.
dic 21 14:55:29 ubuntuMachine systemd[1]: networking.service: Failed with result 'exit-code'.
dic 21 14:55:29 ubuntuMachine sudo[8922]: pam_unix(sudo:session): session closed for user root

I searched a lot and found some related information, but nothing that fully answers my question:

An explanation of the "onlink" flag, which seemed to point to the possibility that the flag was responsible for "wrong back routing", in the sense that it «tells the kernel that it does not have to check if the gateway is reachable directly by the current machine». So (I figured) the kernel might think it could (or should) route the answers to incoming connections from VLAN C to the default gateway instead of through the same NIC the connection came in on.
- But, as I said, removing the "onlink" flag didn't seem to change anything.
This unix StackExchange answer seems to solve the problem (I haven't tested it yet) by using multiple routing tables and rules (to tell the kernel which table to use). But it doesn't explain why the Debian machines are working well. I checked the /etc/iproute2/rt_tables file on both machines and they are identical too:

myUser@bothMachines:~$ sudo cat /etc/iproute2/rt_tables
#
# reserved values
#
255     local
254     main
253     default
0       unspec
#
# local
#
#1      inr.ruhep

So my final hypothesis is that this could just be an implementation difference between kernel versions and, given that the Ubuntu kernel is much more recent, this could be the correct behaviour, so that on modern kernels I need to use two different routing tables (but I'm not sure, and I don't know why...).

myUser@debianMachine:~$ sudo uname -a
Linux debianMachine 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt25-2 (2016-04-08) x86_64 GNU/Linux

myUser@ubuntuMachine:~$ sudo uname -a
Linux ubuntuMachine 4.4.0-87-generic #110-Ubuntu SMP Tue Jul 18 12:55:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

And, hence, the question is:

Are we doing something wrong on the Ubuntu machines (or is there some bug in them)? Or, conversely, is this the correct behaviour, and are we forced to set up a more complex routing scheme (either per-VLAN routes or two routing tables) to make two default gateways work again?

EDIT:

I have now tried adding a static route to fix the problem:

myUser@ubuntuMachine:~$ sudo ip route add 192.168.1.0/24 via 192.168.222.254 dev eth1

...but that froze my ssh connection (through NIC A), even though I could then connect through NIC B (at 192.168.222.200).

Having both routes at the same time does not seem to be possible:

myUser@ubuntuMachine:~$ sudo ip route add 192.168.1/24 via 192.168.111.254 dev eth0
myUser@ubuntuMachine:~$ sudo ip route add 192.168.1/24 via 192.168.222.254 dev eth1
RTNETLINK answers: File exists

EDIT 2:

I finally found the Linux Advanced Routing & Traffic Control HOWTO, which seems to be more accurate than all the other documentation I found. Specifically, in its Chapter 4, "Rules - routing policy database", I see the following text:

If you want to use this feature, make sure that your kernel is compiled with the "IP: advanced router" and "IP: policy routing" features

...so I think everything points to my previous hypothesis of a kernel implementation difference being right, and that the difference is, concretely, whether those two features are compiled in.
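
That hypothesis can be tested on each machine rather than assumed. A minimal sketch: `has_policy_routing` is a hypothetical helper name, `CONFIG_IP_ADVANCED_ROUTER` and `CONFIG_IP_MULTIPLE_TABLES` are the Kconfig symbols behind "IP: advanced router" and "IP: policy routing", and the `/boot/config-*` location is the usual Debian/Ubuntu convention, assumed here:

```shell
#!/bin/sh
# Sketch: report whether a kernel config file enables the two options
# the HOWTO mentions.

# has_policy_routing CONFIG_FILE: succeed only if both options are built in.
has_policy_routing() {
    grep -q '^CONFIG_IP_ADVANCED_ROUTER=y' "$1" &&
    grep -q '^CONFIG_IP_MULTIPLE_TABLES=y' "$1"
}

# Usage:
#   has_policy_routing "/boot/config-$(uname -r)" && echo "policy routing built in"
```

If both machines report the options as enabled, the difference in behaviour has to come from somewhere else.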

Solution

Not an authoritative answer, but my first working attempt (applying what I managed to understand):

sudo ip route add 192.168.1.0/24 via 192.168.222.254 from 192.168.222.200 dev eth1 table 253 
sudo ip rule add from 192.168.222.200 table 253

Update: the from and dev arguments in the ip route command aren't required (it works perfectly well without them).

...after issuing the first command I still couldn't connect, but after issuing the second one I could.

The logic behind that comes from this text I found in this document:

Linux-2.x can pack routes into several routing tables identified by a number in the range from 1 to 255 or by name from the file /etc/iproute2/rt_tables. By default all normal routes are inserted into the main table (ID 254) and the kernel only uses this table when calculating routes.

Actually, one other table always exists, which is invisible but even more important. It is the local table (ID 255). This table consists of routes for local and broadcast addresses. The kernel maintains this table automatically and the administrator usually need not modify it or even look at it.

In fact, I ended up using another routing table, identified by its id (253) rather than by what I now understand is just an alias (defined in the /etc/iproute2/rt_tables file).

...and checking that file again, I now see that an alias ("default") was already defined for that routing table, next to the "main" one, which is indeed 254 as the text fragment I pasted above says.
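
The alias-to-id mapping can be looked up and the table contents inspected with a couple of small helpers. A minimal sketch: `table_id` and `show_table` are hypothetical names, and the default file path is the one shown in the question:

```shell
#!/bin/sh
# Sketch: resolve a routing table alias and list a table's routes.

# table_id NAME [FILE]: print the numeric id that FILE maps NAME to.
# Lines whose first field starts with '#' are treated as comments.
table_id() {
    awk -v name="$1" '$1 !~ /^#/ && $2 == name { print $1; exit }' \
        "${2:-/etc/iproute2/rt_tables}"
}

# show_table NAME_OR_ID: print the routes stored in one routing table.
show_table() {
    ip route show table "$1"
}

# Usage:
#   table_id default     # id behind the "default" alias
#   show_table 253       # routes currently stored in that table
```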

What I don't know yet is the logic behind this naming ("default" for routing table 253, I mean) and whether, for any reason, it is better to use low-numbered routing tables (1, 2, 3...) as this solution (already mentioned in the question) does.

But, for the sake of simplicity, if we aren't going to build complex routing policies and just want to fix this connectivity issue, I guess it could be a good solution to use something like (not yet tested):

gateway 192.168.222.254 table 253
post-up ip rule add from 192.168.222.200 table 253

I still need to test this and check whether I need an additional via 192.168.222.254 in the gateway line, or whether it won't work at all and I need to add the route with another post-up command instead.

I will update this answer with the results.

Edit 1: The same works with default routes:

sudo ip route add default from 192.168.222.200 via 192.168.222.254 table 253
sudo ip rule add from 192.168.222.200 table 253

Edit 2: First (now fully¹) working approach

After playing for a while with a test machine, I think the best solution is to add the following lines to the second NIC's configuration in the /etc/network/interfaces file:

gateway 192.168.222.254 table 1
post-up ip rule add from 192.168.222.200 table 1
pre-down ip rule del from 192.168.222.200 table 1
post-up ip route add 192.168.222.0/24 dev eth1 src 192.168.222.200 table 1

Comments:

Adding table 1 to the gateway keyword worked well, so an additional (less readable) post-up command to add that default route was not necessary.
- ...in fact, using a specific table (other than main) for the first NIC, together with a rule similar to the one we used for the second NIC, would be a bad idea: that rule would only apply when 192.168.111.200 is used as the source address, so there would be no "default default gateway". Leaving the first NIC's configuration in the main routing table makes all ("locally generated") outgoing connections to remote LANs go through our first default gateway by default.
The first post-up command adds a rule saying that packets with that NIC's source address should be routed using table 1 (otherwise our new default gateway wouldn't be used).
The pre-down command removes that rule. It is not mandatory but, without it, every restart of the networking service adds another copy of the rule.
I also tried to use dev eth1 instead of from 192.168.222.200 (to avoid duplicating the network address) but it didn't work. I guess the NIC to use for "response" packets was "not yet decided" at that point.
I used table 1 for eth1 (our second NIC) and I could use table 2 for an eventual third one, and so on. There was no need to specify any table/rule for the first NIC because it goes into the main table (not "default": see the note below).
Finally(¹), the second post-up command makes everything work because (as I now realize) only one routing table (the first one with a matching route) is used, so the connected-network route (created automatically when the interface is brought up) doesn't apply, because it was created in table main.
- I still don't know if there is a way to force it to be created directly in table 1.
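
For testing outside /etc/network/interfaces, the same setup can be written as plain ip commands. A minimal sketch that only prints the commands so they can be reviewed first: `policy_cmds` is a hypothetical helper name, and the explicit default route stands in for the `gateway ... table 1` line above:

```shell
#!/bin/sh
# Sketch: print the commands that give one secondary NIC its own routing table.

# policy_cmds IFACE ADDRESS NETWORK GATEWAY TABLE
policy_cmds() {
    iface=$1 addr=$2 net=$3 gw=$4 table=$5
    # Connected network, so this table can reach the local VLAN and the gateway.
    echo "ip route add $net dev $iface src $addr table $table"
    # Default route that exists in this table only.
    echo "ip route add default via $gw dev $iface table $table"
    # Select this table for packets sourced from the NIC's own address.
    echo "ip rule add from $addr table $table"
}

# Second NIC from this post, using table 1:
policy_cmds eth1 192.168.222.200 192.168.222.0/24 192.168.222.254 1
```

Once the printed commands look right they can be piped to `sudo sh`. An eventual third NIC would get the same call with its own addresses and table 2.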

NOTE: With sudo ip rule list we can see the current routing rules:
0:      from all lookup local 
32765:  from 192.168.222.200 lookup 1 
32766:  from all lookup main 
32767:  from all lookup default
As far as I understand, rules are added in decreasing order from 32767 towards 0 and tried in increasing order until one matches. The last two rules and rule "0" were already defined by default. The last two follow from the logic I cited above from this document, but that document says that rules start from "1", so I guess "0" is also some predefined "default starting point".
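
Since repeated restarts can duplicate the rule, one option is to add it only when the listing does not already contain it. A minimal sketch that parses the `ip rule list` format shown above; `rule_exists` and `ensure_rule` are hypothetical helper names:

```shell
#!/bin/sh
# Sketch: add a source-based rule only if it is not already listed.

# rule_exists SRC TABLE: succeed if "from SRC lookup TABLE" is already listed.
# TABLE must be given as ip prints it (a name if rt_tables defines an alias).
rule_exists() {
    ip rule list | awk -v src="$1" -v tbl="$2" '
        $2 == "from" && $3 == src && $4 == "lookup" && $5 == tbl { found = 1 }
        END { exit !found }'
}

# ensure_rule SRC TABLE: add the rule unless an identical one exists.
ensure_rule() {
    rule_exists "$1" "$2" || ip rule add from "$1" table "$2"
}

# Usage (as root):
#   ensure_rule 192.168.222.200 1
```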

Edit 3:

As I said in Edit 2 (of the question), I found the Linux Advanced Routing & Traffic Control HOWTO, which helped me a lot in clarifying things.

Specifically, the "Routing for multiple uplinks/providers" chapter was very useful for understanding setups that have "network loops" (even though in our case we aren't acting as a router to the Internet).