^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1) .. SPDX-License-Identifier: GPL-2.0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3) ===================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 4) IPVLAN Driver HOWTO
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 5) ===================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 6)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 7) Initial Release:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 8) Mahesh Bandewar <maheshb AT google.com>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 9)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10) 1. Introduction:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11) ================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12) This is conceptually very similar to the macvlan driver with one major
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13) exception of using L3 for mux-ing / demux-ing among slaves. This property makes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14) the master device share the L2 with its slave devices. I have developed this
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15) driver in conjunction with network namespaces and am not sure if there is a use
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16) case outside of them.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19) 2. Building and Installation:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20) =============================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22) In order to build the driver, please select the config item CONFIG_IPVLAN.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23) The driver can be built into the kernel (CONFIG_IPVLAN=y) or as a module
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24) (CONFIG_IPVLAN=m).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25)
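As a rough sketch, the setting can be checked in a kernel config file. The
helper name ipvlan_config_state and the config file path shown in the usage
comment are illustrative assumptions, not part of the driver or of any tool.

```shell
#!/bin/sh
# Hypothetical helper: report how CONFIG_IPVLAN is set in a kernel config file.
# Prints "builtin", "module" or "disabled".
ipvlan_config_state() {
    cfg="$1"
    if grep -q '^CONFIG_IPVLAN=y$' "$cfg"; then
        echo builtin
    elif grep -q '^CONFIG_IPVLAN=m$' "$cfg"; then
        echo module
    else
        echo disabled
    fi
}

# Example call (the path is distribution dependent and only an assumption):
# ipvlan_config_state "/boot/config-$(uname -r)"
```

The patterns are anchored on both sides so that related options with a longer
name are not mistaken for CONFIG_IPVLAN itself.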
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27) 3. Configuration:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28) =================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30) There are no module parameters for this driver and it can be configured
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31) using the iproute2 ``ip`` utility.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34) ip link add link <master> name <slave> type ipvlan [ mode MODE ] [ FLAGS ]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35) where
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36) MODE: l3 (default) | l3s | l2
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37) FLAGS: bridge (default) | private | vepa
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39) e.g.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41) (a) The following command will create an IPvlan link with eth0 as master in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42) L3 bridge mode::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44) bash# ip link add link eth0 name ipvl0 type ipvlan
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 45) (b) This command will create an IPvlan link in L2 bridge mode::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 46)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 47) bash# ip link add link eth0 name ipvl0 type ipvlan mode l2 bridge
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 48)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 49) (c) This command will create an IPvlan device in L2 private mode::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 50)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 51) bash# ip link add link eth0 name ipvlan type ipvlan mode l2 private
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 52)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 53) (d) This command will create an IPvlan device in L2 vepa mode::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 54)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 55) bash# ip link add link eth0 name ipvlan type ipvlan mode l2 vepa
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 56)
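The syntax above can be sketched as a small dry-run helper that validates MODE
and FLAGS and prints the resulting command instead of running it. The function
name ipvlan_add_cmd is hypothetical; the defaults mirror the ones listed above
(l3 and bridge).

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the "ip link add" command for an
# IPvlan slave. usage: ipvlan_add_cmd MASTER SLAVE [MODE] [FLAG]
ipvlan_add_cmd() {
    master="$1"; slave="$2"; mode="${3:-l3}"; flag="${4:-bridge}"
    # Accept only the modes listed in the syntax description.
    case "$mode" in
        l2|l3|l3s) ;;
        *) echo "invalid mode: $mode" >&2; return 1 ;;
    esac
    # Accept only the flags listed in the syntax description.
    case "$flag" in
        bridge|private|vepa) ;;
        *) echo "invalid flag: $flag" >&2; return 1 ;;
    esac
    echo "ip link add link $master name $slave type ipvlan mode $mode $flag"
}

# Example: the equivalent of (c) above, with a different device name.
ipvlan_add_cmd eth0 ipvl0 l2 private
```

Printing the command first makes it easy to review before running it as root.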
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 57)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 58) 4. Operating modes:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 59) ===================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 60)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 61) IPvlan has three modes of operation - L2, L3 and L3S. For a given master device,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 62) you can select one of these modes and all slaves on that master will
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 63) operate in the same (selected) mode. The RX mode is almost identical except
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 64) that in L3 mode the slaves won't receive any multicast / broadcast traffic.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 65) L3 mode is more restrictive since routing is controlled from the other (mostly)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 66) default namespace.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 67)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 68) 4.1 L2 mode:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 69) ------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 70)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 71) In this mode TX processing happens on the stack instance attached to the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 72) slave device and packets are switched and queued to the master device to send
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 73) out. In this mode the slaves will RX/TX multicast and broadcast (if applicable)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 74) as well.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 75)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 76) 4.2 L3 mode:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 77) ------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 78)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 79) In this mode TX processing up to L3 happens on the stack instance attached
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 80) to the slave device and packets are switched to the stack instance of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 81) master device for the L2 processing and routing from that instance will be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 82) used before packets are queued on the outbound device. In this mode the slaves
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 83) can neither receive nor send multicast / broadcast traffic.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 84)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 85) 4.3 L3S mode:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 86) -------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 87)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 88) This is very similar to the L3 mode except that iptables (conn-tracking)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 89) works in this mode and hence it is L3-symmetric (L3s). This will have slightly lower
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 90) performance but that shouldn't matter since you are choosing this mode over plain-L3
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 91) mode to make conn-tracking work.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 92)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 93) 5. Mode flags:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 94) ==============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 95)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 96) At this time the following mode flags are available:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 97)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 98) 5.1 bridge:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 99) -----------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100) This is the default option. To configure the IPvlan port in this mode, the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) user can either add this option on the command line or not specify
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102) anything. This is the traditional mode where slaves can cross-talk among
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103) themselves apart from talking through the master device.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) 5.2 private:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) ------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) If this option is added to the command line, the port is set in private
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) mode, i.e. the port won't allow cross communication between slaves.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) 5.3 vepa:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111) ---------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112) If this is added to the command line, the port is set in VEPA mode,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113) i.e. the port will offload switching functionality to the external entity as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114) described in 802.1Qbg.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115) Note: VEPA mode in IPvlan has limitations. IPvlan uses the MAC address of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) master device, so the packets which are emitted in this mode for the adjacent
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117) neighbor will have the same source and destination MAC. This will make the switch /
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118) router send the redirect message.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120) 6. What to choose (macvlan vs. ipvlan)?
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121) =======================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123) These two devices are very similar in many regards and the specific use
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124) case could very well define which device to choose. If one of the following
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125) situations defines your use case then you can choose to use ipvlan:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128) (a) The Linux host that is connected to the external switch / router has
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129) a policy configured that allows only one MAC per port.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) (b) The number of virtual devices created on a master exceeds the MAC capacity,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131) which puts the NIC in promiscuous mode, and degraded performance is a concern.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132) (c) The slave device is to be put into a hostile / untrusted network
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133) namespace where L2 on the slave could be changed / misused.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136) 7. Example configuration:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137) =========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141) +=============================================================+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142) | Host: host1 |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143) | |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144) | +----------------------+ +----------------------+ |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145) | | NS:ns0 | | NS:ns1 | |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146) | | | | | |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147) | | | | | |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148) | | ipvl0 | | ipvl1 | |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149) | +----------#-----------+ +-----------#----------+ |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150) | # # |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151) | ################################ |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152) | # eth0 |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153) +==============================#==============================+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156) (a) Create two network namespaces - ns0, ns1::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158) ip netns add ns0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159) ip netns add ns1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161) (b) Create two ipvlan slaves on eth0 (master device)::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163) ip link add link eth0 ipvl0 type ipvlan mode l2
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164) ip link add link eth0 ipvl1 type ipvlan mode l2
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 166) (c) Assign slaves to the respective network namespaces::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 167)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 168) ip link set dev ipvl0 netns ns0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 169) ip link set dev ipvl1 netns ns1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 170)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 171) (d) Now switch to the namespace (ns0 or ns1) to configure the slave devices
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 172)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 173) - For ns0::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 174)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 175) (1) ip netns exec ns0 bash
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 176) (2) ip link set dev ipvl0 up
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 177) (3) ip link set dev lo up
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 178) (4) ip -4 addr add 127.0.0.1 dev lo
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 179) (5) ip -4 addr add $IPADDR dev ipvl0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 180) (6) ip -4 route add default via $ROUTER dev ipvl0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 181)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 182) - For ns1::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 183)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 184) (1) ip netns exec ns1 bash
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 185) (2) ip link set dev ipvl1 up
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 186) (3) ip link set dev lo up
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 187) (4) ip -4 addr add 127.0.0.1 dev lo
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 188) (5) ip -4 addr add $IPADDR dev ipvl1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 189) (6) ip -4 route add default via $ROUTER dev ipvl1
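
The steps above can also be sketched without an interactive shell in each
namespace, using "ip -n NS" to run a command inside namespace NS. The names
run and setup_ipvlan_ns, the DRY_RUN variable, and the addresses in the usage
line (taken from the 192.0.2.0/24 documentation range) are assumptions made
for illustration only. With DRY_RUN=1, the default here, the commands are only
printed.

```shell
#!/bin/sh
# Sketch: the example configuration as a flat, non-interactive command list.
DRY_RUN="${DRY_RUN:-1}"

# Print the command when DRY_RUN=1, otherwise execute it (needs root).
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# usage: setup_ipvlan_ns NS SLAVE MASTER IPADDR ROUTER
setup_ipvlan_ns() {
    ns="$1"; slave="$2"; master="$3"; ipaddr="$4"; router="$5"
    run ip netns add "$ns"
    run ip link add link "$master" name "$slave" type ipvlan mode l2
    run ip link set dev "$slave" netns "$ns"
    # From here on, every command runs inside the namespace.
    run ip -n "$ns" link set dev "$slave" up
    run ip -n "$ns" link set dev lo up
    run ip -n "$ns" -4 addr add 127.0.0.1 dev lo
    run ip -n "$ns" -4 addr add "$ipaddr" dev "$slave"
    run ip -n "$ns" -4 route add default via "$router" dev "$slave"
}

# Placeholder values; substitute the real $IPADDR and $ROUTER.
setup_ipvlan_ns ns0 ipvl0 eth0 192.0.2.10/24 192.0.2.1
```

Calling the function a second time with ns1 and ipvl1 covers the other
namespace from the diagram.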