Bonded ethernet network
If one bonds a number of NICs, is it a requirement that they be connected to the network via a switch? Say that you connect two computers to each other, with 4 NICs on one computer directly connected to 4 NICs on the other. On each, these are bonded (called a Team on Windows). I know that if you do use a switch, it must support Static Link Aggregation. But is a switch required to make this work at all? -- Roger Oberholtzer
No. It may even be better in that a switch's support for balancing may be [usually is] limited. e.g. if it supports only L2/MAC or L3 hashing, you couldn't saturate the combined links. Even L3+L4 hashing would require multiple TCP streams to saturate the links, and that's not guaranteed. Most switches don't like to do round robin balancing as it requires state to be maintained. YMMV of course depending on your gear. Sent from myPhone.
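To make the multi-stream point concrete, here is a rough test one could run between the two ends, assuming iperf3 is available on both (the peer address is a placeholder):

  # on the receiving machine
  iperf3 -s

  # single TCP stream: with L2 or L3 hashing it can only ever use one member link
  iperf3 -c <peer-address> -t 30

  # several parallel streams: with L3+L4 hashing these may spread across
  # the member links, but nothing guarantees an even spread
  iperf3 -c <peer-address> -t 30 -P 4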
On Sep 9, 2024, at 4:35 AM, Roger Oberholtzer <roger.oberholtzer@gmail.com> wrote:
If one bonds a number of NICs, is it a requirement that they be connected to the network via a switch? Say that you connect two computers to each other, with 4 NICs on one computer directly connected to 4 NICs on the other. On each, these are bonded (called a Team on Windows).
I know that if you do use a switch, it must support Static Link Aggregation. But is a switch required to make this work at all?
-- Roger Oberholtzer
On Mon, Sep 9, 2024 at 10:43 AM <tabris@tabris.net> wrote:
No. It may even be better in that a switch's support for balancing may be [usually is] limited. e.g. if it supports only L2/MAC or L3 hashing, you couldn't saturate the combined links. Even L3+L4 hashing would require multiple TCP streams to saturate the links, and that's not guaranteed. Most switches don't like to do round robin balancing as it requires state to be maintained. YMMV of course depending on your gear.
I have set up such a directly connected system. One side is openSUSE 15.6, and the other is Windows 10. My aim is to increase throughput for data transfer between the two systems.

The situation is that the Windows computer can access the Linux computer via this network (via ssh). The Linux system, otoh, cannot access the Windows computer. The route information is:

  Kernel IP routing table
  Destination     Gateway         Genmask         Flags MSS Window irtt Iface
  0.0.0.0         10.2.192.65     0.0.0.0         UG    0   0      0    eth1
  10.1.3.0        0.0.0.0         255.255.255.0   U     0   0      0    eth0
  10.1.4.0        0.0.0.0         255.255.255.0   U     0   0      0    eth0
  10.1.8.0        0.0.0.0         255.255.255.0   U     0   0      0    eth3
  10.1.9.0        0.0.0.0         255.255.255.0   U     0   0      0    eth4
  10.1.10.0       0.0.0.0         255.255.255.0   U     0   0      0    bond0
  10.2.192.64     0.0.0.0         255.255.255.224 U     0   0      0    eth1
  192.168.1.0     0.0.0.0         255.255.255.0   U     0   0      0    eth2
  192.168.100.0   0.0.0.0         255.255.255.0   U     0   0      0    eth5
  195.0.0.0       0.0.0.0         255.255.255.0   U     0   0      0    eth0

Where network 10.1.10 is the bonded network. There is no gateway on that network, as there are only these two machines on the 10.1.10 network. The gateway to unknown networks is eth1. But the bonded network is known.

The network is configured (via yast2) as:

  IPADDR='10.1.10.1/24'
  BOOTPROTO='static'
  STARTMODE='auto'
  BONDING_MASTER='yes'
  BONDING_SLAVE0='eth11'
  BONDING_SLAVE1='eth9'
  BONDING_SLAVE2='eth12'
  BONDING_SLAVE3='eth10'
  BONDING_MODULE_OPTS='mode=active-backup miimon=100'

and each slave is configured as:

  IPADDR='0.0.0.0'
  MTU='0'
  BOOTPROTO='none'
  STARTMODE='auto'

The mode (active-backup) is curious. I have tried others (e.g. balance-alb) to no effect.

Of course the first thing was to see if Windows was blocking something. We do not run any firewall. These are systems in road measurement vehicles. When the network is one standard NIC, all works. Doing this bonding does not. It is only this network that is acting this way. On a different network connection to this computer (10.1.4), all things work.

I usually point the finger at the Windows system when these situations arise. I really do not feel that is the case now. But I'm not totally sure about that.

The system is a high-speed data collection system. It has many network-based transducers. Thus all the networks.
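For what it's worth, checking the bonding driver's own view of the bond and its members looks like this (interface names as configured above):

  # mode, MII status, currently active slave, and per-slave link state
  cat /proc/net/bonding/bond0

  # member interfaces and addresses as the kernel sees them
  ip link show master bond0
  ip -br addr show dev bond0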
Sent from myPhone.
On Sep 9, 2024, at 4:35 AM, Roger Oberholtzer <roger.oberholtzer@gmail.com> wrote:
If one bonds a number of NICs, is it a requirement that they be connected to the network via a switch? Say that you connect two computers to each other, with 4 NICs on one computer directly connected to 4 NICs on the other. On each, these are bonded (called a Team on Windows).
I know that if you do use a switch, it must support Static Link Aggregation. But is a switch required to make this work at all? -- Roger Oberholtzer
-- Roger Oberholtzer
On Mon, Sep 9, 2024 at 12:39 PM Roger Oberholtzer <roger.oberholtzer@gmail.com> wrote:
The situation is that the Windows computer can access the Linux computer via this network (via ssh). The Linux system, otoh, cannot access the Windows computer.
What does "access the Windows" mean? How exactly are you accessing Windows? Are bond modes the same on Windows and Linux? ...
The mode (active-backup) is curious. I have tried others (e.g. balance-alb) to no effect.
Active-backup means that only one link is active. If Windows sends packets to a different link, they are lost. Where Windows sends packets is determined by Windows. It is entirely possible that Windows selects different links for outgoing ssh packets and replies to incoming requests. You did not describe how the Windows side is configured. I once spent several days chasing connectivity issues (and it took quite some time to realize there *were* connectivity issues, and not some higher level problems). It turned out two sides of a static aggregate did not agree on which links were active.
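A minimal way to check this on the Linux side (slave names taken from the configuration quoted earlier; only a sketch):

  # which member link the active-backup bond is actually transmitting on
  grep "Currently Active Slave" /proc/net/bonding/bond0

  # then watch each physical member (one terminal per interface) while
  # pinging from the Windows side; replies sent on a link the other end
  # considers inactive simply disappear
  tcpdump -ni eth9 icmp
  tcpdump -ni eth10 icmp
  tcpdump -ni eth11 icmp
  tcpdump -ni eth12 icmp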
On Mon, Sep 9, 2024 at 11:58 AM Andrei Borzenkov <arvidjaar@gmail.com> wrote:
On Mon, Sep 9, 2024 at 12:39 PM Roger Oberholtzer <roger.oberholtzer@gmail.com> wrote:
The situation is that the Windows computer can access the Linux computer via this network (via ssh). The Linux system, otoh, cannot access the Windows computer.
What does "access the Windows" mean? How exactly are you accessing.
Ping, as a start. And ssh, as it will be rsync via ssh that this network will primarily be doing.
Are bond modes the same on Windows and Linux?
That's always the fun part. Windows and Linux use different names for these things. What I am after is maximizing throughput. On Linux, it seems like balance-alb is what I would like. So what is the compatible Windows NIC Team setting to match this? I've not figured that out yet. -- Roger Oberholtzer
On Mon, Sep 9, 2024 at 2:25 PM Roger Oberholtzer <roger.oberholtzer@gmail.com> wrote:
On Mon, Sep 9, 2024 at 11:58 AM Andrei Borzenkov <arvidjaar@gmail.com> wrote:
On Mon, Sep 9, 2024 at 12:39 PM Roger Oberholtzer <roger.oberholtzer@gmail.com> wrote:
The situation is that the Windows computer can access the Linux computer via this network (via ssh). The Linux system, otoh, cannot access the Windows computer.
What does "access the Windows" mean? How exactly are you accessing.
Ping, as a start. And ssh, as it will be rsync via ssh that this network will primarily be doing.
What makes you think the bond will give you better throughput? rsync over ssh is single-threaded. Will there be a lot of concurrent streams?
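If the workload really is one big rsync-over-ssh job, the usual workaround is to split it into several rsync processes, for example one per subdirectory (the paths and host name below are made up):

  # hypothetical layout: /data/run1 ... /data/run4 to be copied to the peer
  for d in run1 run2 run3 run4; do
      rsync -a "/data/$d/" "user@peer:/backup/$d/" &
  done
  wait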
Are bond modes the same on Windows and Linux?
That's always the fun part. Windows and Linux use different names for these things.
What's in a name? You are expected to understand what they *do*, not how they are named.
What I am after is maximizing throughput. On Linux, it seems like balance-alb is what I would like. So what is the compatible Windows NIC Team setting to match this? I've not figured that out yet.
There are none. Any switch independent mode relies on the *switch* to send packets via the correct port based on the learned MAC. But the Linux bonding driver is not a switch. It does not have a mode that always sends replies to the same link via which requests have been received. Start with tcpdump/wireshark/tshark/dumpcap - whatever you are familiar with - and check what happens on the Linux side. Capture traffic on each *physical* interface when attempting to ping Windows. The only two bond modes that will definitely work point to point are static etherchannel (balance-xor) and LACP (802.3ad). They correspond to the Windows Static Teaming and LACP. LACP is preferred in all cases when it is available. And they are completely independent from load balancing mode. For a point-to-point link, the only mode that makes sense is the one involving TCP/UDP port numbers, because both MAC and IP do not change. That is xmit_hash_policy=layer3+4 for Linux and Address Hash for Windows. You cannot start discussing "how to increase bandwidth" without understanding your workload first.
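In the same sysconfig style as the configuration posted earlier, an LACP bond with an L3+L4 transmit hash would look roughly like this (an untested sketch; slave names reused from that mail):

  IPADDR='10.1.10.1/24'
  BOOTPROTO='static'
  STARTMODE='auto'
  BONDING_MASTER='yes'
  BONDING_SLAVE0='eth9'
  BONDING_SLAVE1='eth10'
  BONDING_SLAVE2='eth11'
  BONDING_SLAVE3='eth12'
  BONDING_MODULE_OPTS='mode=802.3ad miimon=100 xmit_hash_policy=layer3+4'

The Windows team would then have to be set to LACP mode as well, as described above.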
On Mon, Sep 9, 2024 at 3:28 PM Andrei Borzenkov <arvidjaar@gmail.com> wrote:
The only two bond modes that will definitely work point to point are static etherchannel (balance-xor) and LACP (802.3ad). They correspond to the Windows Static Teaming and LACP. LACP is preferred in all cases when it is available.
And they are completely independent from load balancing mode.
Correction: balance-rr actually combines aggregation and load balancing and is a static etherchannel. This still applies only to the packets from Linux to Windows - distribution in the other direction is entirely up to Windows, and it does not have round robin (at least not the standard teaming driver; it is quite possible that additional software for your NIC has it).
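In the same config format as before, the round-robin variant would simply be (untested):

  BONDING_MODULE_OPTS='mode=balance-rr miimon=100'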
Why ALB? Assuming that outgoing throughput is a concern, balance-rr is better. The upside to stuff like ALB and other L3+L4 or L2 hashing methods is to avoid packet reordering issues. But I did the math a few years back, and iirc at 10Gbit there has to be a difference of ~50ft to reorder a stream of 64byte packets. Checking with WolframAlpha: 64bytes/(10Gbit/sec) = ~51nsec or ~50ft. For most purposes and something running on a standard CPU, balance-rr is simpler. This doesn't apply to ASICs and other network silicon though... Sent from myPhone.
On Sep 9, 2024, at 7:25 AM, Roger Oberholtzer <roger.oberholtzer@gmail.com> wrote:
On Mon, Sep 9, 2024 at 11:58 AM Andrei Borzenkov <arvidjaar@gmail.com> wrote:
On Mon, Sep 9, 2024 at 12:39 PM Roger Oberholtzer <roger.oberholtzer@gmail.com> wrote:
The situation is that the Windows computer can access the Linux computer via this network (via ssh). The Linux system, otoh, cannot access the Windows computer.
What does "access the Windows" mean? How exactly are you accessing.
Ping, as a start. And ssh, as it will be rsync via ssh that this network will primarily be doing.
Are bond modes the same on Windows and Linux?
That's always the fun part. Windows and Linux use different names for these things.
What I am after is maximizing throughput. On Linux, it seems like balance-alb is what I would like. So what is the compatible Windows NIC Team setting to match this? I've not figured that out yet.
-- Roger Oberholtzer
On Mon, Sep 9, 2024 at 6:43 PM <tabris@tabris.net> wrote:
Why ALB? Assuming that outgoing throughput is a concern, balance-rr is better. The upside to stuff like ALB and other L3+L4 or L2 hashing methods is to avoid packet reordering issues. But I did the math a few years back, and iirc at 10Gbit there has to be a difference of ~50ft to reorder a stream of 64byte packets. Checking with WolframAlpha: 64bytes/(10Gbit/sec) = ~51nsec or ~50ft. For most purposes and something running on a standard CPU, balance-rr is simpler. This doesn't apply to ASICs and other network silicon though...
I will try balance-rr. I still think that the issue is elsewhere. But I really do not know. The mode must only affect the packet sender logic. The reading logic must at all times be accepting data on any of the bonded connections. The reader has no way of knowing what decision the sender made about which NIC to send the data on. So it must always be reading on all bonded NICs. -- Roger Oberholtzer
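As a side note, switching the mode on the openSUSE side should only need an edit of the ifcfg file and a restart of that one interface (a sketch, assuming the bond is defined in /etc/sysconfig/network/ifcfg-bond0):

  # change BONDING_MODULE_OPTS to 'mode=balance-rr miimon=100', then:
  ifdown bond0
  ifup bond0

  # confirm the driver picked the new mode up
  grep "Bonding Mode" /proc/net/bonding/bond0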
On 2024-09-10 11:00, Roger Oberholtzer wrote:
On Mon, Sep 9, 2024 at 6:43 PM <tabris@tabris.net> wrote:
Why ALB? Assuming that outgoing throughput is a concern, balance-rr is better. The upside to stuff like ALB and other L3+L4 or L2 hashing methods is to avoid packet reordering issues. But I did the math a few years back, and iirc at 10Gbit there has to be a difference of ~50ft to reorder a stream of 64byte packets. Checking with WolframAlpha: 64bytes/(10Gbit/sec) = ~51nsec or ~50ft. For most purposes and something running on a standard CPU, balance-rr is simpler. This doesn't apply to ASICs and other network silicon though...
I will try balance-rr. I still think that the issue is elsewhere. But I really do not know.
The mode must only affect the packet sender logic. The reading logic must at all times be accepting data on any of the bonded connections. The reader has no way of knowing what decision the sender made about which NIC to send the data on. So it must always be reading on all bonded NICs.
IIRC on the receiving end there are issues like how to handle packets out of order (trigger an error, or wait a bit). Or, is it possible that the next packet arrives on a different cable? -- Cheers / Saludos, Carlos E. R. (from 15.5 x86_64 at Telcontar)
On Tue, Sep 10, 2024 at 11:52 AM Carlos E. R. <robin.listas@telefonica.net> wrote:
On 2024-09-10 11:00, Roger Oberholtzer wrote:
On Mon, Sep 9, 2024 at 6:43 PM <tabris@tabris.net> wrote:
Why ALB? Assuming that outgoing throughput is a concern, balance-rr is better. The upside to stuff like ALB and other L3+L4 or L2 hashing methods is to avoid packet reordering issues. But I did the math a few years back, and iirc at 10Gbit there has to be a difference of ~50ft to reorder a stream of 64byte packets. Checking with WolframAlpha: 64bytes/(10Gbit/sec) = ~51nsec or ~50ft. For most purposes and something running on a standard CPU, balance-rr is simpler. This doesn't apply to ASICs and other network silicon though...
I will try balance-rr. I still think that the issue is elsewhere. But I really do not know.
The mode must only affect the packet sender logic. The reading logic must at all times be accepting data on any of the bonded connections. The reader has no way of knowing what decision the sender made about which NIC to send the data on. So it must always be reading on all bonded NICs.
IIRC on the receiving end there are issues like how to handle packets out of order (trigger an error, or wait a bit). Or, is it possible that the next packet arrives on a different cable?
I cannot see that it is a problem so much as how bonded interfaces in general must work. By their very design, packets move between the NICs. There must be a standard definition for how this works, or bonded interfaces could not be a thing. In 'standard' TCP, packets can come out of order. If a packet seems not to have arrived at the destination, a request for that missing packet is made. When this happens, the packets will have arrived out of order. The sender can never forget a packet until it knows the other side has received it. This happened much more often with the old coax cables. I think it is much less common with RJ45-type connections. But that is what TCP provides. UDP is more the wild west. -- Roger Oberholtzer
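If reordering or retransmission on the bond ever becomes a concern, the kernel keeps counters for both; a quick look would be something like this (exact counter names vary a bit between kernel versions):

  # retransmitted and reordered segments since boot
  netstat -s | grep -Ei 'retrans|reorder'

  # or, with iproute2
  nstat -az | grep -Ei 'retrans|reorder'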
On Mon, Sep 9, 2024 at 11:34 AM Roger Oberholtzer <roger.oberholtzer@gmail.com> wrote:
If one bonds a number of NICs, is it a requirement that they be connected to the network via a switch?
In general, no, but see below.
Say that you connect two computers to each other, with 4 NICs on one computer directly connected to 4 NICs on the other. On each, these are bonded (called a Team on Windows).
I know that if you do use a switch, it must support Static Link Aggregation.
No. It depends on the bond type.
But is a switch required to make this work at all?
Define "this". Some solutions (including Linux bond driver) may use network probing to determine individual link health. This requires switch to forward traffic to all links. So, it depends on the exact configuration and the exact implementation.
participants (4):
- Andrei Borzenkov
- Carlos E. R.
- Roger Oberholtzer
- tabris@tabris.net