^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1) .. SPDX-License-Identifier: GPL-2.0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3) ============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 4) NET_FAILOVER
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 5) ============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 6)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 7) Overview
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 8) ========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 9)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10) The net_failover driver provides an automated failover mechanism via APIs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11) to create and destroy a failover master netdev, and manages the primary and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12) standby slave netdevs that get registered via the generic failover
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13) infrastructure.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15) The failover netdev acts as a master device and controls two slave devices. The
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16) original paravirtual interface is registered as 'standby' slave netdev and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17) a passthru/vf device with the same MAC gets registered as 'primary' slave
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18) netdev. Both 'standby' and 'failover' netdevs are associated with the same
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19) 'pci' device. The user accesses the network interface via 'failover' netdev.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20) The 'failover' netdev chooses 'primary' netdev as default for transmits when
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21) it is available with link up and running.
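
The transmit path selection described above can be modelled with a small
shell function. This is only an illustrative sketch, not the driver's code:
the function name ``select_xmit_dev`` and the state strings are hypothetical,
and the ``none`` result for the case where neither slave is usable is an
assumption, since that case is not described here.

```shell
#!/bin/sh
# Illustrative model only; the real selection happens in the kernel driver.
# $1: state of the 'primary' slave, $2: state of the 'standby' slave.
# A state of "up" means the netdev is present with link up and running;
# any other value (for example "down" or "absent") means it is not usable.
select_xmit_dev() {
    if [ "$1" = "up" ]; then
        # The primary (VF) datapath is preferred whenever it is usable.
        echo primary
    elif [ "$2" = "up" ]; then
        # Otherwise fall back to the paravirtual datapath.
        echo standby
    else
        # Assumption: no usable slave, so nothing can be transmitted.
        echo none
    fi
}
```

For example, ``select_xmit_dev absent up`` prints ``standby``, which
corresponds to the situation after the VF has been unplugged.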
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23) This can be used by paravirtual drivers to enable an alternate low latency
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24) datapath. It also enables hypervisor controlled live migration of a VM with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25) direct attached VF by failing over to the paravirtual datapath when the VF
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26) is unplugged.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28) virtio-net accelerated datapath: STANDBY mode
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29) =============================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31) net_failover enables hypervisor controlled accelerated datapath to virtio-net
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32) enabled VMs in a transparent manner with no/minimal guest userspace changes.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34) To support this, the hypervisor needs to enable VIRTIO_NET_F_STANDBY
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35) feature on the virtio-net interface and assign the same MAC address to both
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36) virtio-net and VF interfaces.
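
Whether the feature was actually negotiated can be checked from inside the
guest. The sketch below assumes that the virtio ``features`` sysfs attribute
is a string with one ``0`` or ``1`` character per feature bit, starting at
bit 0, and that VIRTIO_NET_F_STANDBY is feature bit 62. The function name
``has_standby_feature`` is hypothetical.

```shell
#!/bin/sh
# Assumption: the features string has one '0' or '1' character per feature
# bit, with bit 0 first. VIRTIO_NET_F_STANDBY is taken to be bit 62, which
# is the 63rd character of the string.
has_standby_feature() {
    bit=$(printf '%s' "$1" | cut -c63)
    [ "$bit" = "1" ]
}
```

A possible use in the guest, where ``virtio0`` is a placeholder for the
actual virtio-net device, is
``has_standby_feature "$(cat /sys/bus/virtio/devices/virtio0/features)"``.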
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38) Here is an example XML snippet that shows such a configuration.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41)   <interface type='network'>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42)     <mac address='52:54:00:00:12:53'/>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43)     <source network='enp66s0f0_br'/>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44)     <target dev='tap01'/>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 45)     <model type='virtio'/>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 46)     <driver name='vhost' queues='4'/>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 47)     <link state='down'/>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 48)     <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 49)   </interface>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 50)   <interface type='hostdev' managed='yes'>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 51)     <mac address='52:54:00:00:12:53'/>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 52)     <source>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 53)       <address type='pci' domain='0x0000' bus='0x42' slot='0x02' function='0x5'/>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 54)     </source>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 55)     <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 56)   </interface>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 57)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 58) Booting a VM with the above configuration results in the following three
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 59) netdevs being created in the VM.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 60) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 61)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 62)   4: ens10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 63)       link/ether 52:54:00:00:12:53 brd ff:ff:ff:ff:ff:ff
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 64)       inet 192.168.12.53/24 brd 192.168.12.255 scope global dynamic ens10
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 65)          valid_lft 42482sec preferred_lft 42482sec
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 66)       inet6 fe80::97d8:db2:8c10:b6d6/64 scope link
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 67)          valid_lft forever preferred_lft forever
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 68)   5: ens10nsby: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master ens10 state UP group default qlen 1000
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 69)       link/ether 52:54:00:00:12:53 brd ff:ff:ff:ff:ff:ff
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 70)   7: ens11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ens10 state UP group default qlen 1000
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 71)       link/ether 52:54:00:00:12:53 brd ff:ff:ff:ff:ff:ff
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 72)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 73) ens10 is the 'failover' master netdev, ens10nsby and ens11 are the slave
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 74) 'standby' and 'primary' netdevs respectively.
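
The relationship between the three netdevs can be read back from the
``master`` field of such a listing. The following is a minimal sketch that
filters text in the format shown above; the function name
``failover_slaves`` is hypothetical and the parsing assumes exactly that
output layout.

```shell
#!/bin/sh
# Hypothetical helper: read a link listing on stdin and print the name of
# every netdev whose "master" field equals the given failover netdev ($1).
failover_slaves() {
    awk -v m="$1" '
        # Only the header line of each netdev starts with "<index>: ".
        /^[0-9]+: / {
            name = $2
            sub(/:$/, "", name)
            for (i = 3; i < NF; i++)
                if ($i == "master" && $(i + 1) == m)
                    print name
        }'
}
```

Feeding it the listing above with ``ens10`` as the argument prints
``ens10nsby`` and ``ens11``, one per line.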
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 75)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 76) Live Migration of a VM with SR-IOV VF & virtio-net in STANDBY mode
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 77) ==================================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 78)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 79) net_failover also enables hypervisor controlled live migration of VMs that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 80) have direct attached SR-IOV VF devices, by automatically failing over to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 81) the paravirtual datapath when the VF is unplugged.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 82)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 83) Here are sample scripts that show the steps to initiate live migration on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 84) the source hypervisor and to restore the VF on the destination hypervisor.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 85) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 86)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 87)   # cat vf_xml
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 88)   <interface type='hostdev' managed='yes'>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 89)     <mac address='52:54:00:00:12:53'/>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 90)     <source>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 91)       <address type='pci' domain='0x0000' bus='0x42' slot='0x02' function='0x5'/>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 92)     </source>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 93)     <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 94)   </interface>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 95)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 96)   # Source Hypervisor
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 97)   #!/bin/bash
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 98)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 99)   DOMAIN=fedora27-tap01
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100)   PF=enp66s0f0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101)   VF_NUM=5
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102)   TAP_IF=tap01
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103)   VF_XML=
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105)   MAC=52:54:00:00:12:53
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106)   ZERO_MAC=00:00:00:00:00:00
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108)   virsh domif-setlink $DOMAIN $TAP_IF up
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109)   bridge fdb del $MAC dev $PF master
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110)   virsh detach-device $DOMAIN $VF_XML
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111)   ip link set $PF vf $VF_NUM mac $ZERO_MAC
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113)   virsh migrate --live $DOMAIN qemu+ssh://$REMOTE_HOST/system
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115)   # Destination Hypervisor
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116)   #!/bin/bash
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118)   virsh attach-device $DOMAIN $VF_XML
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119)   virsh domif-setlink $DOMAIN $TAP_IF down
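
The sample scripts leave ``VF_XML`` empty and use ``REMOTE_HOST`` without
setting it, so both have to be filled in for the local setup before running
them. A guard along the following lines could catch a missing value before
any ``virsh`` or ``ip`` command runs. It is a hypothetical helper, not part
of the sample scripts, and the name ``require_vars`` is made up for
illustration.

```shell
#!/bin/sh
# Hypothetical helper, not part of the sample scripts: report the first
# variable among the given names that is unset or empty, and fail.
require_vars() {
    for name in "$@"; do
        # Look up the value of the variable whose name is in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            return 1
        fi
    done
}
```

On the source hypervisor it could be called right after the variable
assignments, for example
``require_vars DOMAIN PF VF_NUM TAP_IF VF_XML MAC REMOTE_HOST || exit 1``.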