.. SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)

==================
Kernel TLS offload
==================

Kernel TLS operation
====================

The Linux kernel provides TLS connection offload infrastructure. Once a TCP
connection is in the ``ESTABLISHED`` state, user space can enable the TLS Upper
Layer Protocol (ULP) and install the cryptographic connection state.
For details regarding the user-facing interface refer to the TLS
documentation in :ref:`Documentation/networking/tls.rst <kernel_tls>`.
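
As a rough illustration of the user-space side, the sketch below enables the
ULP and installs TX state for TLS 1.2 with AES-GCM-128 through ``setsockopt``.
The helper name ``ktls_enable_tx`` is made up for this example, the key
material is assumed to come from a completed handshake, and the fallback
``#define`` values are only used when the C library headers lack them.

.. code-block:: c

  #include <string.h>
  #include <sys/socket.h>
  #include <netinet/in.h>
  #include <netinet/tcp.h>
  #include <linux/tls.h>

  #ifndef TCP_ULP
  #define TCP_ULP 31
  #endif
  #ifndef SOL_TLS
  #define SOL_TLS 282
  #endif

  /* Hypothetical helper: returns 0 on success, -1 on error (errno is set). */
  static int ktls_enable_tx(int fd, const unsigned char *key,
                            const unsigned char *iv,
                            const unsigned char *salt,
                            const unsigned char *rec_seq)
  {
      struct tls12_crypto_info_aes_gcm_128 ci;

      /* Attach the TLS ULP to the established TCP socket. */
      if (setsockopt(fd, IPPROTO_TCP, TCP_ULP, "tls", sizeof("tls")) < 0)
          return -1;

      /* Describe the TX crypto state negotiated by the handshake. */
      memset(&ci, 0, sizeof(ci));
      ci.info.version = TLS_1_2_VERSION;
      ci.info.cipher_type = TLS_CIPHER_AES_GCM_128;
      memcpy(ci.key, key, TLS_CIPHER_AES_GCM_128_KEY_SIZE);
      memcpy(ci.iv, iv, TLS_CIPHER_AES_GCM_128_IV_SIZE);
      memcpy(ci.salt, salt, TLS_CIPHER_AES_GCM_128_SALT_SIZE);
      memcpy(ci.rec_seq, rec_seq, TLS_CIPHER_AES_GCM_128_REC_SEQ_SIZE);

      /* Install the state for the transmit direction only. */
      return setsockopt(fd, SOL_TLS, TLS_TX, &ci, sizeof(ci)) < 0 ? -1 : 0;
  }

The receive direction would be installed separately with ``TLS_RX`` and its
own key material.
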

``ktls`` can operate in three modes:

 * Software crypto mode (``TLS_SW``) - the CPU handles the cryptography.
   In most basic cases only crypto operations synchronous with the CPU
   can be used, but depending on the calling context the CPU may utilize
   asynchronous crypto accelerators. The use of accelerators introduces extra
   latency on socket reads (decryption only starts when a read syscall
   is made) and additional I/O load on the system.
 * Packet-based NIC offload mode (``TLS_HW``) - the NIC handles crypto
   on a packet-by-packet basis, provided the packets arrive in order.
   This mode integrates best with the kernel stack and is described in detail
   in the remaining part of this document
   (``ethtool`` flags ``tls-hw-tx-offload`` and ``tls-hw-rx-offload``).
 * Full TCP NIC offload mode (``TLS_HW_RECORD``) - a mode of operation where
   the NIC driver and firmware replace the kernel networking stack
   with their own TCP handling. It is not usable in production environments
   that rely on the Linux networking stack, for example on any firewalling
   abilities or QoS and packet scheduling (``ethtool`` flag ``tls-hw-record``).

The operation mode is selected automatically based on device configuration;
offload opt-in or opt-out on a per-connection basis is not currently supported.

TX
--

At a high level, user write requests are turned into a scatter list. The TLS ULP
intercepts them, inserts record framing, performs encryption (in ``TLS_SW``
mode) and then hands the modified scatter list to the TCP layer. From this
point on the TCP stack proceeds as normal.

In ``TLS_HW`` mode the encryption is not performed in the TLS ULP.
Instead, packets reach a device driver, which marks them
for crypto offload based on the socket the packet is attached to
and sends them to the device for encryption and transmission.

RX
--

On the receive side, if the device handled decryption and authentication
successfully, the driver will set the decrypted bit in the associated
:c:type:`struct sk_buff <sk_buff>`. The packets reach the TCP stack and
are handled normally. ``ktls`` is informed when data is queued to the socket
and the ``strparser`` mechanism is used to delineate the records. Upon a read
request, records are retrieved from the socket and passed to the decryption
routine. If the device decrypted all the segments of the record the decryption
is skipped, otherwise the software path handles decryption.
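
The per-record decision described above can be modelled in user space roughly
as follows. This is only a sketch: ``rx_classify_record`` and the enum are
hypothetical names, and the array of flags stands in for the decrypted bit of
each segment making up one record.

.. code-block:: c

  #include <stdbool.h>
  #include <stddef.h>

  enum rx_record_action {
      RX_SKIP_DECRYPT,  /* device decrypted every segment of the record */
      RX_SW_DECRYPT,    /* at least one segment is still ciphertext */
  };

  /* seg_decrypted[i] models the decrypted bit of the i-th segment. */
  static enum rx_record_action
  rx_classify_record(const bool *seg_decrypted, size_t nsegs)
  {
      for (size_t i = 0; i < nsegs; i++)
          if (!seg_decrypted[i])
              return RX_SW_DECRYPT;
      return RX_SKIP_DECRYPT;
  }
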

.. kernel-figure:: tls-offload-layers.svg
   :alt: TLS offload layers
   :align: center
   :figwidth: 28em

   Layers of Kernel TLS stack

Device configuration
====================

During driver initialization the device sets the ``NETIF_F_HW_TLS_RX`` and
``NETIF_F_HW_TLS_TX`` features and installs its
:c:type:`struct tlsdev_ops <tlsdev_ops>`
pointer in the :c:member:`tlsdev_ops` member of the
:c:type:`struct net_device <net_device>`.

When TLS cryptographic connection state is installed on a ``ktls`` socket
(note that it is done twice, once for RX and once for TX direction,
and the two are completely independent), the kernel checks if the underlying
network device is offload-capable and attempts the offload. In case offload
fails the connection is handled entirely in software using the same mechanism
as if the offload was never tried.
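
The fallback behaviour can be pictured with a small user-space model. All
names here (``dev_table``, ``dev_add``, ``install_direction``) are invented
for the illustration, and a full device table is just one assumed reason for
the offload attempt to fail.

.. code-block:: c

  #include <errno.h>
  #include <stddef.h>

  enum conn_mode { MODE_TLS_SW, MODE_TLS_HW };

  /* Hypothetical device connection table with a fixed capacity. */
  struct dev_table {
      size_t used;
      size_t capacity;
  };

  /* Stand-in for the device accepting or rejecting a new connection. */
  static int dev_add(struct dev_table *t)
  {
      if (t->used >= t->capacity)
          return -ENOSPC;
      t->used++;
      return 0;
  }

  /* Called once per direction; RX and TX are decided independently. */
  static enum conn_mode install_direction(struct dev_table *t,
                                          int offload_capable)
  {
      if (offload_capable && dev_add(t) == 0)
          return MODE_TLS_HW;
      return MODE_TLS_SW;  /* same path as if offload was never tried */
  }
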

The offload request is performed via the :c:member:`tls_dev_add` callback of
:c:type:`struct tlsdev_ops <tlsdev_ops>`:

.. code-block:: c

  int (*tls_dev_add)(struct net_device *netdev, struct sock *sk,
                     enum tls_offload_ctx_dir direction,
                     struct tls_crypto_info *crypto_info,
                     u32 start_offload_tcp_sn);

``direction`` indicates whether the cryptographic information is for
the received or transmitted packets. The driver uses the ``sk`` parameter
to retrieve the connection 5-tuple and socket family (IPv4 vs IPv6).
Cryptographic information in ``crypto_info`` includes the key, IV and salt
as well as the TLS record sequence number. ``start_offload_tcp_sn`` indicates
which TCP sequence number corresponds to the beginning of the record with
the sequence number from ``crypto_info``. The driver can add its state
at the end of kernel structures (see :c:member:`driver_state` members
in ``include/net/tls.h``) to avoid additional allocations and pointer
dereferences.
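
The idea of keeping driver state at the end of a core structure can be shown
with a user-space analogue. The structure layouts below are invented for the
example and do not mirror the definitions in ``include/net/tls.h``; the point
is only that a single allocation holds both parts and that the driver reaches
its state without an extra pointer.

.. code-block:: c

  #include <stddef.h>
  #include <stdint.h>
  #include <stdlib.h>

  /* Hypothetical core context with a trailing, 8-byte aligned driver area. */
  struct offload_ctx {
      uint32_t expected_seq;
      uint64_t driver_state[];
  };

  /* Hypothetical driver-private state stored in the trailing area. */
  struct my_drv_state {
      uint32_t hw_handle;
      uint32_t flags;
  };

  /* One zeroed allocation covers the core part and the driver part. */
  static struct offload_ctx *ctx_alloc(size_t drv_size)
  {
      return calloc(1, sizeof(struct offload_ctx) + drv_size);
  }

  static struct my_drv_state *drv_state(struct offload_ctx *ctx)
  {
      return (struct my_drv_state *)ctx->driver_state;
  }
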

TX
--

After TX state is installed, the stack guarantees that the first segment
of the stream will start exactly at the ``start_offload_tcp_sn`` sequence
number, simplifying TCP sequence number matching.

TX offload being fully initialized does not imply that all segments passing
through the driver which belong to the offloaded socket will be after
the expected sequence number and will have kernel record information.
In particular, already encrypted data may have been queued to the socket
before installing the connection state in the kernel.
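
A driver therefore needs a wraparound-safe way to tell whether a segment
starts before the offload point. A minimal sketch, using hypothetical helper
names and the usual modulo-2^32 comparison for TCP sequence numbers:

.. code-block:: c

  #include <stdbool.h>
  #include <stdint.h>

  /* True if sequence number a is before b, modulo 2^32. */
  static bool seq_before(uint32_t a, uint32_t b)
  {
      return (int32_t)(a - b) < 0;
  }

  /*
   * A segment that starts before start_offload_tcp_sn was queued before
   * the TX state was installed, so no record information exists for it.
   */
  static bool seg_predates_offload(uint32_t seg_seq,
                                   uint32_t start_offload_tcp_sn)
  {
      return seq_before(seg_seq, start_offload_tcp_sn);
  }
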

RX
--

In the RX direction the local networking stack has little control over the
segmentation, so the initial record's TCP sequence number may be anywhere
inside the segment.

Normal operation
================

At the minimum the device maintains the following state for each connection, in
each direction:

 * crypto secrets (key, iv, salt)
 * crypto processing state (partial blocks, partial authentication tag, etc.)
 * record metadata (sequence number, processing offset and length)
 * expected TCP sequence number

There are no guarantees on record length or record segmentation. In particular
segments may start at any point of a record and contain any number of records.
Assuming segments are received in order, the device should be able to perform
crypto operations and authentication regardless of segmentation. For this
to be possible the device has to keep a small amount of segment-to-segment
state. This includes at least:

 * partial headers (if a segment carried only a part of the TLS header)
 * partial data block
 * partial authentication tag (all data had been seen but part of the
   authentication tag has to be written or read from the subsequent segment)

Record reassembly is not necessary for TLS offload. If the packets arrive
in order the device should be able to handle them separately and make
forward progress.
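
To make the segment-to-segment state more concrete, the sketch below
delineates records in an in-order byte stream without reassembling them.
It tracks only a partial header and the remaining payload length; crypto
state is left out. The names are hypothetical, and the 5-byte header layout
(type, two version bytes, big-endian length) is the standard TLS record
header.

.. code-block:: c

  #include <stddef.h>
  #include <stdint.h>

  #define TLS_HDR_LEN 5

  struct rec_parser {
      uint8_t hdr[TLS_HDR_LEN];
      size_t hdr_have;    /* header bytes collected so far */
      size_t body_left;   /* payload bytes still expected */
      unsigned records;   /* records seen in full */
  };

  /* Feed one in-order segment; state carries over to the next call. */
  static void rec_parser_feed(struct rec_parser *p,
                              const uint8_t *seg, size_t len)
  {
      size_t i = 0;

      while (i < len) {
          if (p->hdr_have < TLS_HDR_LEN) {
              /* Possibly partial header: collect it byte by byte. */
              p->hdr[p->hdr_have++] = seg[i++];
              if (p->hdr_have == TLS_HDR_LEN) {
                  p->body_left = ((size_t)p->hdr[3] << 8) | p->hdr[4];
                  if (p->body_left == 0) {
                      p->records++;
                      p->hdr_have = 0;
                  }
              }
              continue;
          }

          /* Consume as much payload as this segment carries. */
          size_t n = len - i < p->body_left ? len - i : p->body_left;

          i += n;
          p->body_left -= n;
          if (p->body_left == 0) {
              p->records++;
              p->hdr_have = 0;
          }
      }
  }

A zero-initialized ``struct rec_parser`` is the starting state for a stream
that begins on a record boundary.
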

TX
--

The kernel stack performs record framing, reserving space for the authentication
tag and populating all other TLS header and trailer fields.

Both the device and the driver maintain expected TCP sequence numbers
due to the possibility of retransmissions and the lack of software fallback
once the packet reaches the device.
For segments passed in order, the driver marks the packets with
a connection identifier (note that a 5-tuple lookup is insufficient to identify
packets requiring HW offload, see the :ref:`5tuple_problems` section)
and hands them to the device. The device identifies the packet as requiring
TLS handling and confirms the sequence number matches its expectation.
The device performs encryption and authentication of the record data.
It replaces the authentication tag and TCP checksum with correct values.

RX
--

Before a packet is DMAed to the host (but after the NIC's embedded switching
and packet transformation functions) the device validates the Layer 4
checksum and performs a 5-tuple lookup to find any TLS connection the packet
may belong to (technically a 4-tuple
lookup is sufficient - IP addresses and TCP port numbers, as the protocol
is always TCP). If a connection is matched the device confirms that the TCP
sequence number is the expected one and proceeds to TLS handling (record
delineation, decryption, authentication for each record in the packet). The
device leaves the record framing unmodified; the stack takes care of record
decapsulation. The device indicates successful handling of TLS offload in the
per-packet context (descriptor) passed to the host.
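
The order of these receive-side checks can be summarized in a short model.
It is deliberately simplified: the types and function names are hypothetical,
only IPv4 is covered, and a real device would look the flow up in a table
rather than compare against a single entry.

.. code-block:: c

  #include <stdbool.h>
  #include <stdint.h>

  /* 4-tuple is enough because the protocol is always TCP. */
  struct flow4 {
      uint32_t saddr, daddr;
      uint16_t sport, dport;
  };

  struct rx_conn {
      struct flow4 flow;
      uint32_t expected_seq;
  };

  static bool flow4_equal(const struct flow4 *a, const struct flow4 *b)
  {
      return a->saddr == b->saddr && a->daddr == b->daddr &&
             a->sport == b->sport && a->dport == b->dport;
  }

  /* True if the device should attempt TLS handling for this packet. */
  static bool rx_should_handle(const struct rx_conn *c,
                               const struct flow4 *pkt_flow,
                               uint32_t pkt_seq, bool l4_csum_ok)
  {
      return l4_csum_ok &&
             flow4_equal(&c->flow, pkt_flow) &&
             pkt_seq == c->expected_seq;
  }
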

Upon reception of a TLS offloaded packet, the driver sets
the :c:member:`decrypted` mark in the :c:type:`struct sk_buff <sk_buff>`
corresponding to the segment. The networking stack makes sure decrypted
and non-decrypted segments do not get coalesced (e.g. by GRO or socket layer)
and takes care of partial decryption.

Resync handling
===============

In the presence of packet drops or network packet reordering, the device may
lose synchronization with the TLS stream, and require a resync with the kernel's
TCP stack.

Note that resync is only attempted for connections which were successfully
added to the device table and are in ``TLS_HW`` mode. For example,
if the table was full when cryptographic state was installed in the kernel,
such a connection will never get offloaded. Therefore the resync request
does not carry any cryptographic connection state.

TX
--

Segments transmitted from an offloaded socket can get out of sync
in similar ways to the receive side: retransmissions and local drops
are possible, though network reorders are not. There are currently
two mechanisms for dealing with out of order segments.

Crypto state rebuilding
~~~~~~~~~~~~~~~~~~~~~~~

Whenever an out of order segment is transmitted the driver provides
the device with enough information to perform cryptographic operations.
Most likely this means that the part of the record preceding the current
segment has to be passed to the device as part of the packet context,
together with its TCP sequence number and TLS record number. The device
can then initialize its crypto state, process and discard the preceding
data (to be able to insert the authentication tag) and move on to handling
the actual packet.

In this mode, depending on the implementation, the driver can either ask
for a continuation with the crypto state and the new sequence number
(next expected segment is the one after the out of order one), or continue
with the previous stream state - assuming that the out of order segment
was just a retransmission. The former is simpler and does not require
retransmission detection, therefore it is the recommended method until
such time as it is proven inefficient.
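
How much of the record precedes an out of order segment follows directly
from the record's starting TCP sequence number. The sketch below computes
the context a driver might hand to the device; the structures are
hypothetical, and the segment is assumed to start inside the given record.

.. code-block:: c

  #include <stdint.h>

  /* Hypothetical bookkeeping for one transmitted record. */
  struct tx_record {
      uint32_t start_seq;  /* TCP sequence number of the record start */
      uint32_t len;        /* record length including framing */
      uint64_t rec_no;     /* TLS record number */
  };

  /* Hypothetical packet context for rebuilding crypto state. */
  struct tx_rebuild_ctx {
      uint32_t start_seq;   /* where the preceding data begins */
      uint64_t rec_no;      /* record number to initialize with */
      uint32_t prefix_len;  /* bytes of the record before the segment */
  };

  static struct tx_rebuild_ctx
  tx_rebuild_ctx_for(const struct tx_record *rec, uint32_t seg_seq)
  {
      struct tx_rebuild_ctx c;

      c.start_seq = rec->start_seq;
      c.rec_no = rec->rec_no;
      /* Unsigned subtraction stays correct across sequence wraparound. */
      c.prefix_len = seg_seq - rec->start_seq;
      return c;
  }
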

Next record sync
~~~~~~~~~~~~~~~~

Whenever an out of order segment is detected the driver requests
that the ``ktls`` software fallback code encrypt it. If the segment's
sequence number is lower than expected the driver assumes retransmission
and doesn't change device state. If the segment is in the future, it
may imply a local drop; the driver asks the stack to sync the device
to the next record state and falls back to software.
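
The three cases can be written down as a small decision function. The enum
and function names are hypothetical; the comparison uses the same
modulo-2^32 arithmetic as other TCP sequence number checks.

.. code-block:: c

  #include <stdint.h>

  enum tx_seg_action {
      TX_OFFLOAD,              /* in order: let the device encrypt */
      TX_FALLBACK,             /* retransmission: software encrypts,
                                  device state is left alone */
      TX_FALLBACK_AND_RESYNC,  /* in the future: software encrypts and
                                  a resync to the next record is requested */
  };

  static enum tx_seg_action tx_classify(uint32_t seg_seq,
                                        uint32_t expected_seq)
  {
      int32_t delta = (int32_t)(seg_seq - expected_seq);

      if (delta == 0)
          return TX_OFFLOAD;
      if (delta < 0)
          return TX_FALLBACK;
      return TX_FALLBACK_AND_RESYNC;
  }
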

A resync request is indicated with:

.. code-block:: c

  void tls_offload_tx_resync_request(struct sock *sk, u32 got_seq, u32 exp_seq)

Until the resync is complete the driver should not access its expected TCP
sequence number (as it will be updated from a different context).
The following helper should be used to test if the resync is complete:

.. code-block:: c

  bool tls_offload_tx_resync_pending(struct sock *sk)

The next time ``ktls`` pushes a record it will first send its TCP sequence
number and TLS record number to the driver. The stack will also make sure that
the new record will start on a segment boundary (like it does when
the connection is initially added).

RX
--

A small number of RX reorder events may not require a full resynchronization.
In particular the device should not lose synchronization
when the record boundary can be recovered:

.. kernel-figure:: tls-offload-reorder-good.svg
   :alt: reorder of non-header segment
   :align: center

   Reorder of non-header segment

Green segments are successfully decrypted, blue ones are passed
as received on the wire, red stripes mark the start of new records.

In the above case segment 1 is received and decrypted successfully.
Segment 2 was dropped so 3 arrives out of order. The device knows
the next record starts inside 3, based on the record length in segment 1.
Segment 3 is passed untouched, because due to the lack of data from segment 2
the remainder of the previous record inside segment 3 cannot be handled.
The device can, however, collect the authentication algorithm's state
and partial block from the new record in segment 3 and when 4 and 5
arrive continue decryption. Finally when 2 arrives it's completely outside
of the expected window of the device so it's passed as is without special
handling. The ``ktls`` software fallback handles the decryption of the record
spanning segments 1, 2 and 3. The device did not get out of sync,
even though two segments did not get decrypted.

Kernel synchronization may be necessary if the lost segment contained
a record header and arrived after the next record header has already passed:

.. kernel-figure:: tls-offload-reorder-bad.svg
   :alt: reorder of header segment
   :align: center

   Reorder of segment with a TLS header

In this example segment 2 gets dropped, and it contains a record header.
The device can only detect that segment 4 also contains a TLS header
if it knows the length of the previous record from segment 2. In this case
the device will lose synchronization with the stream.

Stream scan resynchronization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When the device gets out of sync and the stream reaches TCP sequence
numbers more than a max size record past the expected TCP sequence number,
the device starts scanning for a known header pattern. For example,
for TLS 1.2 and TLS 1.3 subsequent bytes of value ``0x03 0x03`` occur
in the SSL/TLS version field of the header. Once the pattern is matched
the device continues attempting to parse headers at expected locations
(based on the length fields at guessed locations).
Whenever the expected location does not contain a valid header the scan
is restarted.
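
A software model of such a scan might look like the sketch below. It is not
taken from any device implementation: the function names are hypothetical,
the only pattern checked is the ``0x03 0x03`` version field, and the number
of follow-up headers required before trusting a guess (``chain``) is an
assumed tuning parameter.

.. code-block:: c

  #include <stddef.h>
  #include <stdint.h>

  #define TLS_HDR_LEN 5

  /* Version bytes sit right after the one-byte content type. */
  static int hdr_plausible(const uint8_t *p)
  {
      return p[1] == 0x03 && p[2] == 0x03;
  }

  /*
   * Return the offset of the first candidate header which is followed by
   * `chain` further plausible headers at the locations implied by the
   * length fields, or -1 if the buffer holds no such candidate.
   */
  static long scan_for_header(const uint8_t *buf, size_t len, unsigned chain)
  {
      for (size_t start = 0; start + TLS_HDR_LEN <= len; start++) {
          size_t off = start;
          unsigned matched = 0;

          while (off + TLS_HDR_LEN <= len && hdr_plausible(buf + off)) {
              if (++matched > chain)
                  return (long)start;
              /* Jump to where the next header is expected. */
              off += TLS_HDR_LEN +
                     (((size_t)buf[off + 3] << 8) | buf[off + 4]);
          }
          /* Mismatch or end of data: restart from the next byte. */
      }
      return -1;
  }
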

When the header is matched the device sends a confirmation request
to the kernel, asking if the guessed location is correct (if a TLS record
really starts there), and which record sequence number the given header had.
The kernel confirms the guessed location was correct and tells the device
the record sequence number. Meanwhile, the device had been parsing
and counting all records since the just-confirmed one; it adds the number
of records it had seen to the record number provided by the kernel.
At this point the device is in sync and can resume decryption at the next
segment boundary.
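
The record number fix-up amounts to a single addition. In the sketch below
the names are hypothetical, and the counter is assumed to count only the
headers parsed after the guessed one, so the result is the record number of
the most recently parsed header.

.. code-block:: c

  #include <stdint.h>

  struct rx_resync {
      uint64_t recs_since_guess;  /* headers parsed after the guessed one */
  };

  /* The device latched onto a candidate header and asked the kernel. */
  static void rx_resync_guess(struct rx_resync *r)
  {
      r->recs_since_guess = 0;
  }

  /* Another header was parsed while waiting for the confirmation. */
  static void rx_resync_next_header(struct rx_resync *r)
  {
      r->recs_since_guess++;
  }

  /* Kernel confirmed the guess and supplied its record number. */
  static uint64_t rx_resync_confirm(const struct rx_resync *r,
                                    uint64_t confirmed_rec_no)
  {
      return confirmed_rec_no + r->recs_since_guess;
  }
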

In a pathological case the device may latch onto a sequence of matching
headers and never hear back from the kernel (there is no negative
confirmation from the kernel). The implementation may choose to periodically
restart the scan. Given how unlikely a falsely-matching stream is, however,
periodic restart is not deemed necessary.

Special care has to be taken if the confirmation request is passed
asynchronously to the packet stream, as the record may get processed
by the kernel before the confirmation request.

Stack-driven resynchronization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The driver may also request the stack to perform resynchronization
whenever it sees the records are no longer getting decrypted.
If the connection is configured in this mode the stack automatically
schedules resynchronization after it has received two completely encrypted
records.

The stack waits for the socket to drain and informs the device about
the next expected record number and its TCP sequence number. If the
records continue to be received fully encrypted the stack retries the
synchronization with an exponential back off (first after 2 encrypted
records, then after 4 records, after 8, after 16... up until every
128 records).
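
The back off schedule can be expressed as a tiny helper. This is a model of
the sequence described above, not the stack's actual code, and the function
name is made up.

.. code-block:: c

  /*
   * Given the current threshold (0 before the first attempt), return the
   * number of fully encrypted records to wait for before the next retry.
   */
  static unsigned next_resync_threshold(unsigned cur)
  {
      if (cur == 0)
          return 2;
      return cur >= 128 ? 128 : cur * 2;
  }

Starting from 0 this yields 2, 4, 8, 16, 32, 64, 128 and then stays at 128.
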
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 352)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 353) Error handling
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 354) ==============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 355)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 356) TX
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 357) --
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 358)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 359) Packets may be redirected or rerouted by the stack to a different
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 360) device than the selected TLS offload device. The stack will handle
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 361) such a condition using the :c:func:`sk_validate_xmit_skb` helper
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 362) (TLS offload code installs :c:func:`tls_validate_xmit_skb` at this hook).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 363) Offload maintains information about all records until the data is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 364) fully acknowledged, so if skbs reach the wrong device they can be handled
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 365) by software fallback.
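
The decision made at this hook can be pictured with the simplified user-space
model below. It is a sketch under stated assumptions rather than the kernel
code: all types and names are hypothetical, and the only logic it captures is
the comparison between the egress device and the device holding the offload
state.

.. code-block:: c

    /* Hypothetical stand-ins for a network device and an offloaded
     * socket; these are not kernel types.
     */
    struct model_netdev {
        int ifindex;
    };

    struct model_tls_sock {
        const struct model_netdev *offload_dev; /* holds the TLS state */
    };

    enum model_xmit_action {
        MODEL_XMIT_PASS_TO_DEVICE,  /* device encrypts the payload */
        MODEL_XMIT_SW_FALLBACK,     /* encrypt in software first */
    };

    /* Pick the transmit action for a packet of an offloaded socket that
     * is about to leave through @egress.
     */
    enum model_xmit_action
    model_validate_xmit(const struct model_tls_sock *sk,
                        const struct model_netdev *egress)
    {
        if (egress == sk->offload_dev)
            return MODEL_XMIT_PASS_TO_DEVICE;

        /* Wrong device: the record information kept until the data is
         * acknowledged allows the payload to be encrypted in software.
         */
        return MODEL_XMIT_SW_FALLBACK;
    }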
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 366)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 367) Any device TLS offload handling error on the transmission side must result
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 368) in the packet being dropped. For example, if a packet gets out of order
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 369) due to a bug in the stack or the device, reaches the device and cannot
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 370) be encrypted, that packet must be dropped.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 371)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 372) RX
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 373) --
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 374)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 375) If the device encounters any problems with TLS offload on the receive
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 376) side, it should pass the packet to the host's networking stack as it was
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 377) received on the wire.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 378)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 379) For example, an authentication failure for any record in the segment should
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 380) result in passing the unmodified packet to the software fallback. This means
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 381) packets should not be modified "in place". Splitting segments to handle partial
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 382) decryption is not advised. In other words, either all records in the packet
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 383) have been handled successfully and authenticated, or the packet has to be passed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 384) to the host's stack as it was on the wire (recovering the original packet in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 385) driver is sufficient if the device provides a precise error).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 386)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 387) The Linux networking stack does not provide a way of reporting per-packet
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 388) decryption and authentication errors; packets with errors must simply not
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 389) have the :c:member:`decrypted` mark set.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 390)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 391) A packet should also not be handled by the TLS offload if it contains
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 392) incorrect checksums.
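
The all-or-nothing rule for the receive side can be summarized by the
user-space sketch below. The types and the function are hypothetical and only
illustrate the rule; they do not correspond to any kernel or driver structure.
Treating a segment with no handled records as not decrypted is an assumption
of this sketch.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical per-record result reported by a device. */
    struct model_rec_result {
        bool decrypted;      /* payload was decrypted by the device */
        bool authenticated;  /* authentication succeeded */
    };

    /* Hypothetical summary of one received segment. */
    struct model_rx_segment {
        bool csum_ok;                        /* checksums are correct */
        const struct model_rec_result *recs; /* records in the segment */
        size_t n_recs;
    };

    /* Returns true if the packet may be passed up with the decrypted
     * mark set. Otherwise it has to reach the stack exactly as it was
     * on the wire, with the mark clear, so that the software fallback
     * handles it.
     */
    bool model_rx_may_mark_decrypted(const struct model_rx_segment *seg)
    {
        size_t i;

        if (!seg->csum_ok || seg->n_recs == 0)
            return false;

        for (i = 0; i < seg->n_recs; i++) {
            if (!seg->recs[i].decrypted || !seg->recs[i].authenticated)
                return false;
        }
        return true;
    }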
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 393)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 394) Performance metrics
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 395) ===================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 396)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 397) TLS offload can be characterized by the following basic metrics:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 398)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 399) * max connection count
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 400) * connection installation rate
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 401) * connection installation latency
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 402) * total cryptographic performance
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 403)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 404) Note that each TCP connection requires a TLS session in both directions, so
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 405) performance may be reported treating each direction separately.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 406)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 407) Max connection count
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 408) --------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 409)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 410) The number of connections a device can support can be exposed via
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 411) the ``devlink resource`` API.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 412)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 413) Total cryptographic performance
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 414) -------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 415)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 416) Offload performance may depend on segment and record size.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 417)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 418) Overload of the cryptographic subsystem of the device should not have
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 419) significant performance impact on non-offloaded streams.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 420)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 421) Statistics
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 422) ==========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 423)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 424) The following minimum set of TLS-related statistics should be reported
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 425) by the driver:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 426)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 427) * ``rx_tls_decrypted_packets`` - number of successfully decrypted RX packets
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 428) which were part of a TLS stream.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 429) * ``rx_tls_decrypted_bytes`` - number of TLS payload bytes in RX packets
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 430) which were successfully decrypted.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 431) * ``rx_tls_ctx`` - number of TLS RX HW offload contexts added to device for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 432) decryption.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 433) * ``rx_tls_del`` - number of TLS RX HW offload contexts deleted from device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 434) (connection has finished).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 435) * ``rx_tls_resync_req_pkt`` - number of received TLS packets with a resync
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 436) request.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 437) * ``rx_tls_resync_req_start`` - number of times the TLS async resync request
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 438) was started.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 439) * ``rx_tls_resync_req_end`` - number of times the TLS async resync request
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 440) properly ended by providing the HW tracked tcp-seq.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 441) * ``rx_tls_resync_req_skip`` - number of times the TLS async resync request
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 442) procedure was started but not properly ended.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 443) * ``rx_tls_resync_res_ok`` - number of times the TLS resync response call to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 444) the driver was successfully handled.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 445) * ``rx_tls_resync_res_skip`` - number of times the TLS resync response call to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 446) the driver was terminated unsuccessfully.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 447) * ``rx_tls_err`` - number of RX packets which were part of a TLS stream
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 448) but were not decrypted due to an unexpected error in the state machine.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 449) * ``tx_tls_encrypted_packets`` - number of TX packets passed to the device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 450) for encryption of their TLS payload.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 451) * ``tx_tls_encrypted_bytes`` - number of TLS payload bytes in TX packets
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 452) passed to the device for encryption.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 453) * ``tx_tls_ctx`` - number of TLS TX HW offload contexts added to device for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 454) encryption.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 455) * ``tx_tls_ooo`` - number of TX packets which were part of a TLS stream
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 456) but did not arrive in the expected order.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 457) * ``tx_tls_skip_no_sync_data`` - number of TX packets which were part of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 458) a TLS stream and arrived out-of-order, but skipped the HW offload routine
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 459) and went to the regular transmit flow as they were retransmissions of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 460) connection handshake.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 461) * ``tx_tls_drop_no_sync_data`` - number of TX packets which were part of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 462) a TLS stream and were dropped because they arrived out of order and the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 463) associated record could not be found.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 464) * ``tx_tls_drop_bypass_req`` - number of TX packets which were part of a TLS
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 465) stream and were dropped because they contained both data already encrypted by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 466) software and data that expects hardware crypto offload.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 467)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 468) Notable corner cases, exceptions and additional requirements
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 469) ============================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 470)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 471) .. _5tuple_problems:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 472)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 473) 5-tuple matching limitations
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 474) ----------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 475)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 476) The device can only recognize received packets based on the 5-tuple
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 477) of the socket. Current ``ktls`` implementation will not offload sockets
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 478) routed through software interfaces such as those used for tunneling
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 479) or virtual networking. However, many packet transformations performed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 480) by the networking stack (most notably any BPF logic) do not require
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 481) any intermediate software device, therefore a 5-tuple match may
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 482) consistently miss at the device level. In such cases the device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 483) should still be able to perform TX offload (encryption) and should
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 484) fall back cleanly to software decryption (RX).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 485)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 486) Out of order
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 487) ------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 488)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 489) Introducing extra processing in NICs should not cause packets to be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 490) transmitted or received out of order. For example, pure ACK packets
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 491) should not be reordered with respect to data segments.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 492)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 493) Ingress reorder
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 494) ---------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 495)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 496) A device is permitted to perform packet reordering for consecutive
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 497) TCP segments (i.e. placing packets in the correct order) but any form
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 498) of additional buffering is disallowed.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 499)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 500) Coexistence with standard networking offload features
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 501) -----------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 502)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 503) Offloaded ``ktls`` sockets should support standard TCP stack features
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 504) transparently. Enabling device TLS offload should not cause any difference
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 505) in packets as seen on the wire.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 506)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 507) Transport layer transparency
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 508) ----------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 509)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 510) The device should not modify any packet headers for the purpose
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 511) of simplifying TLS offload.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 512)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 513) The device should not depend on any packet headers beyond what is strictly
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 514) necessary for TLS offload.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 515)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 516) Segment drops
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 517) -------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 518)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 519) Dropping packets is acceptable only in the event of catastrophic
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 520) system errors and should never be used as an error handling mechanism
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 521) in cases arising from normal operation. In other words, reliance
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 522) on TCP retransmissions to handle corner cases is not acceptable.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 523)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 524) TLS device features
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 525) -------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 526)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 527) Drivers should ignore changes to the TLS device feature flags.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 528) These flags will be acted upon accordingly by the core ``ktls`` code.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 529) TLS device feature flags only control adding of new TLS connection
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 530) offloads; old connections will remain active after the flags are cleared.