Internet-Draft | OPT-M | May 2023 |
Song, et al. | Expires 1 December 2023 |
This document describes an on-path telemetry method using packet-marking, referred to as PBT-M. Similar to IOAM DEX, PBT-M does not carry the telemetry data in user packets but sends the telemetry data through a dedicated packet. However, PBT-M does not require an extra instruction header; instead, it claims a bit in existing header fields or uses some other equivalent means as a trigger for telemetry data processing and collection. Due to this feature, PBT-M raises some unique issues that need to be considered for its application in different networks. This document describes the high-level scheme, summarizes the common requirements and issues, and provides recommendations for solutions. PBT-M is complementary to the other on-path telemetry schemes.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119][RFC8174] when, and only when, they appear in all capitals, as shown here.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 1 December 2023.¶
Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
To gain detailed data plane visibility to support effective network OAM, it is essential to be able to examine the trace of user packets along their forwarding paths. Such on-path flow data reflect the state and status of each user packet's real-time experience and provide valuable information for network monitoring, measurement, and diagnosis.¶
The telemetry data include, but are not limited to, the detailed forwarding path, the timestamp/latency at each network node, and, in case of packet drop, the drop location and reason. The emerging programmable data plane devices allow user-defined data collection or conditional data collection based on trigger events. Such on-path flow data are from and about the live user traffic and complement the data acquired through other passive and active OAM mechanisms such as IPFIX [RFC7011] and ICMP [RFC4560].¶
This document describes PBT-M, a new on-path telemetry technique which complements IOAM Trace [RFC9197] and IOAM DEX [RFC9326]. PBT-M does not require a telemetry instruction header; it requires only a trigger bit in some existing header field or some equivalent means. Due to this feature, the seemingly simple scheme raises some unique issues that need to be considered for successful application. This document serves as a central location to archive the challenges common to PBT-M and provides solution recommendations, aiming to eliminate duplicated efforts when applying PBT-M in different network scenarios.¶
As the name suggests, PBT-M needs only a marking bit in the existing headers of user packets (or some equivalent means) to trigger telemetry data collection and export. The sketch of PBT-M is as follows. If on-path data need to be collected for a user packet, the packet is marked at the path head node. At each PBT-M-aware node on the path, if the mark is detected, a telemetry data packet (i.e., a dedicated OAM packet triggered by the marked user packet, also called a postcard) is generated and sent to a collector. Meanwhile, the original user packet is forwarded without delay or alteration. The telemetry data packet contains the data requested and configured by the management plane. Once the collector receives the postcards for a single user packet from different path nodes, it can infer the packet's forwarding path and analyze the data set. The path end node is configured to un-mark the packets, restoring their original format, if necessary.¶
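The scheme sketched above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the class and function names (`Packet`, `Postcard`, `Collector`, `forward`) are hypothetical, the mark is modeled as a boolean rather than a real header bit, and each node is assumed to record the TTL it sees before decrementing it.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    flow_id: str
    packet_id: int
    ttl: int = 64
    marked: bool = False


@dataclass
class Postcard:
    """Dedicated OAM packet exported by one node for one marked user packet."""
    node_id: str
    flow_id: str
    packet_id: int
    ttl: int


@dataclass
class Collector:
    postcards: list = field(default_factory=list)

    def receive(self, postcard: Postcard) -> None:
        self.postcards.append(postcard)


def forward(packet: Packet, path: list, collector: Collector, aware_nodes: set) -> Packet:
    """Forward a packet along `path`; only PBT-M-aware nodes export postcards."""
    packet.marked = True  # the head node marks the packet
    for node_id in path:
        if packet.marked and node_id in aware_nodes:
            # Export a postcard; the user packet itself carries no telemetry data.
            collector.receive(
                Postcard(node_id, packet.flow_id, packet.packet_id, packet.ttl)
            )
        packet.ttl -= 1  # normal forwarding continues regardless of the mark
    packet.marked = False  # the end node restores the original format
    return packet
```

A node that is not in `aware_nodes` simply forwards the packet, so the collector receives a partial but still usable set of postcards.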
The overall architecture of PBT-M is depicted in Figure 1.¶
The advantages of PBT-M are summarized as follows.¶
Although PBT-M is simple and has many advantages, it also introduces a few new requirements and challenges due to its unique feature.¶
To address the above requirements and challenges, this document offers considerations and recommendations for implementing and applying PBT-M.¶
To trigger the path-associated data collection, in most cases, a single bit from some existing header field is sufficient. When no such bit is available, other packet-marking techniques may be needed. We discuss several possible application scenarios.¶
The marking method for other protocols (e.g., IPv6) is subject to further study and is out of scope of this document.¶
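As an illustration of single-bit marking, the following sketch sets, tests, and clears one bit in an 8-bit flags field. The bit position `MARK_MASK` and the 8-bit field width are assumptions made for the example only; this document does not assign a specific bit.

```python
MARK_MASK = 0x04  # hypothetical bit position inside an 8-bit flags field


def mark(flags: int) -> int:
    """Head node: set the trigger bit, leaving all other bits untouched."""
    return flags | MARK_MASK


def is_marked(flags: int) -> bool:
    """Transit node: test the trigger bit."""
    return bool(flags & MARK_MASK)


def unmark(flags: int) -> int:
    """End node: clear the trigger bit to restore the original field value."""
    return flags & ~MARK_MASK & 0xFF
```

Because marking and un-marking touch only the one bit, the remaining bits of the field keep their original meaning along the path.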
In case the path that a flow traverses is unknown in advance, all PBT-M-aware nodes in an application domain should by default be configured to react to the marked packets by exporting some basic data, such as node ID and TTL, before a data set template for that flow is configured. This way, the management plane can learn the flow path dynamically from the postcard packets and defer the collection of more comprehensive data until it can configure only the relevant nodes.¶
If the management plane wants to collect the on-path data for some flow, it does not need to mark every packet of the flow. Marking only a subset reduces the data redundancy, the workload for network devices and data collectors, and the network bandwidth consumption. It is therefore recommended to configure the head node(s) with a sampling probability or time interval for the flow packet marking. When the first marked packet is forwarded in the network, the PBT-M-aware nodes will export the basic data set to the collector. Hence, the flow path is identified. If other data types need to be collected, the management plane can further configure the data set's template to the target nodes on the flow's path. The PBT-M-aware nodes collect and export data accordingly if the packet is marked and a data set template is present.¶
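The head-node marking decision could look like the following sketch, which supports either a sampling probability or a minimum time interval between marked packets. The class name `MarkingSampler` and its parameters are hypothetical; a real device would typically implement this in the data plane.

```python
import random


class MarkingSampler:
    """Decide at the head node whether a flow packet should be marked."""

    def __init__(self, probability=None, interval_s=None, rng=None):
        if (probability is None) == (interval_s is None):
            raise ValueError("configure exactly one of probability or interval_s")
        self.probability = probability
        self.interval_s = interval_s
        self.rng = rng or random.Random()
        self._last_mark = None  # time of the most recently marked packet

    def should_mark(self, now: float) -> bool:
        if self.probability is not None:
            # Probabilistic sampling: mark each packet independently.
            return self.rng.random() < self.probability
        # Interval sampling: mark at most one packet per interval.
        if self._last_mark is None or now - self._last_mark >= self.interval_s:
            self._last_mark = now
            return True
        return False
```

Interval-based sampling also gives the collector a predictable gap between marked packets of the same flow.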
If the flow path is changed for any reason, the new path can be quickly learned by the collector. Consequently, the management plane controller can be directed to configure the nodes on the new path. The outdated configuration can be automatically timed out or explicitly revoked by the management plane controller.¶
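One way a controller might track which nodes currently hold a data set template, with automatic timeout of outdated entries, is sketched below. The class `TemplateConfigTable` and the fixed lifetime are assumptions for illustration; the document does not prescribe a particular timeout mechanism.

```python
class TemplateConfigTable:
    """Track per-node template configuration with a soft-state lifetime."""

    def __init__(self, lifetime_s: float):
        self.lifetime_s = lifetime_s
        self._expiry = {}  # node_id -> time at which the configuration expires

    def refresh(self, path, now: float) -> None:
        """Called whenever the collector reports the current path of the flow."""
        for node_id in path:
            self._expiry[node_id] = now + self.lifetime_s

    def active_nodes(self, now: float) -> set:
        """Drop expired entries and return the nodes that are still configured."""
        self._expiry = {n: t for n, t in self._expiry.items() if t > now}
        return set(self._expiry)
```

After a path change, nodes on the old path stop being refreshed and age out, while nodes on the new path are added on the next report.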
For a marked user packet, each PBT-M-aware node will send an independent OAM packet. The collector needs to correlate all the OAM packets corresponding to the user packet. Once this is done, the TTL (or the timestamp, if the network time is synchronized) can be used to infer the flow forwarding path. Because PBT-M lacks the explicit identifiers available in IOAM DEX, the OAM packet correlation must rely on other means.¶
The first possible approach is to require that the exported data in the OAM packets include the flow ID plus the user packet ID extracted from the marked user packet. For example, the flow ID can be the 5-tuple of the user traffic, and the user packet ID can be some unique information pertaining to a user packet (e.g., the sequence number of a TCP packet). Alternatively, a hash digest of the part of the packet that is invariant during forwarding (e.g., excluding the TTL and checksum fields of an IPv4 header) can serve as both the flow ID and the packet ID. The possibility of hash collision is negligible since the set of correlated OAM packets can only appear in a very short time frame.¶
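The hash digest alternative can be sketched as follows. The example zeroes the TTL (byte 8) and header checksum (bytes 10 and 11) of an IPv4 header before hashing, so that every node on the path computes the same value. The choice of SHA-256, the 16-hex-digit truncation, and the helper `build_header` are assumptions for illustration; other fields that may change in transit in a given network (for example DSCP) would also need to be masked.

```python
import hashlib
import struct


def packet_digest(ipv4_header: bytes, payload_prefix: bytes = b"") -> str:
    """Digest of the invariant part of a packet (TTL and checksum masked)."""
    if len(ipv4_header) < 20:
        raise ValueError("IPv4 header must be at least 20 bytes")
    header = bytearray(ipv4_header)
    header[8] = 0                 # TTL changes at every hop
    header[10:12] = b"\x00\x00"   # header checksum changes with the TTL
    return hashlib.sha256(bytes(header) + payload_prefix).hexdigest()[:16]


def build_header(ttl: int, checksum: int, ident: int) -> bytes:
    """Illustrative 20-byte IPv4 header; the checksum is not validated here."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 40, ident, 0, ttl, 6, checksum,
        bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]),
    )
```

Two copies of the same packet observed at different hops differ only in TTL and checksum, so they yield the same digest, while a packet with a different Identification value yields a different one.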
If the packet marking interval is made large enough, the flow ID alone is enough to identify a user packet. As a result, it can be safely assumed that all the exported OAM packets for the same flow during a short time interval belong to the same user packet.¶
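When the marking interval is large, correlation by flow ID alone might be sketched as below: postcards of one flow that arrive within a short window are attributed to the same user packet and then ordered by TTL. The dictionary keys (`flow_id`, `node_id`, `ttl`, `rx_time`) and the window parameter are hypothetical.

```python
from collections import defaultdict


def _path(group):
    """Order one packet's postcards by decreasing TTL to recover the path."""
    ordered = sorted(group, key=lambda pc: pc["ttl"], reverse=True)
    return [pc["node_id"] for pc in ordered]


def correlate_by_flow(postcards, window_s: float):
    """Group postcards per flow into bursts no longer than `window_s` seconds.

    Returns a list of (flow_id, path) tuples, one per inferred user packet.
    """
    by_flow = defaultdict(list)
    for pc in postcards:
        by_flow[pc["flow_id"]].append(pc)

    results = []
    for flow_id, cards in by_flow.items():
        cards.sort(key=lambda pc: pc["rx_time"])
        group = []
        for pc in cards:
            # A postcard arriving after the window starts a new user packet.
            if group and pc["rx_time"] - group[0]["rx_time"] > window_s:
                results.append((flow_id, _path(group)))
                group = []
            group.append(pc)
        if group:
            results.append((flow_id, _path(group)))
    return results
```

The window must be much shorter than the marking interval for the assumption in the text to hold.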
The second approach requires the network time to be synchronized. In this case, the flow ID plus the timestamp at each node can also be used to infer the OAM packet affiliation. For the OAM packets from the same flow, the collector only needs to sort them based on the timestamp. However, errors may occur under some circumstances. For example, suppose two consecutive user packets from the same flow are marked, but one exported OAM packet from a node is lost. It is then difficult for the collector to decide to which user packet the remaining OAM packet from that node is related. In many cases, such a rare error has no catastrophic consequence and is therefore tolerable. Again, a larger sampling gap can mitigate this problem.¶
PBT-M should not be applied to all the packets all the time. It is better suited to an interactive environment where the network telemetry applications dynamically decide which subset of traffic is under scrutiny. The network devices can limit the packet marking rate through sampling and metering. The OAM packets can be distributed to different servers to balance the processing load.¶
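With synchronized clocks, ordering the postcards of one user packet by timestamp yields both the path and the per-hop delays. The function below is a minimal sketch; the tuple layout `(node_id, timestamp_ns)` is an assumption, and the input is assumed to be already restricted to a single user packet.

```python
def order_by_timestamp(postcards):
    """Sort (node_id, timestamp_ns) pairs and derive path and per-hop delays."""
    ordered = sorted(postcards, key=lambda p: p[1])
    path = [node_id for node_id, _ in ordered]
    # Delay between consecutive reporting nodes, in nanoseconds.
    hop_delays = [
        (a[0], b[0], b[1] - a[1]) for a, b in zip(ordered, ordered[1:])
    ]
    return path, hop_delays


path, delays = order_by_timestamp([("B", 1500), ("A", 1000), ("C", 2300)])
```

If a postcard is missing, the corresponding delay simply covers more than one hop, which is consistent with the tolerance for occasional errors described above.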
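The metering and load distribution mentioned above might be realized as follows. The token bucket and the CRC32-based collector selection are illustrative choices, not mechanisms mandated by this document, and the names `TokenBucket` and `pick_collector` are hypothetical.

```python
import zlib


class TokenBucket:
    """Limit the rate at which packets are marked (or postcards exported)."""

    def __init__(self, rate_per_s: float, burst: int):
        self.rate_per_s = rate_per_s
        self.burst = burst
        self.tokens = float(burst)
        self._last = None

    def allow(self, now: float) -> bool:
        if self._last is not None:
            # Refill in proportion to elapsed time, capped at the burst size.
            refill = (now - self._last) * self.rate_per_s
            self.tokens = min(self.burst, self.tokens + refill)
        self._last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def pick_collector(flow_id: str, collectors: list) -> str:
    """Send all postcards of a flow to the same collector, chosen by hash."""
    return collectors[zlib.crc32(flow_id.encode()) % len(collectors)]
```

Hashing on the flow ID keeps all postcards of one flow on one server, so correlation does not require cross-server state.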
Because PBT-M sends telemetry data by dedicated OAM packets, it allows data aggregation and compression. Each node can process the generated raw data according to the configured local data-export policies. Such policies may specify how raw data is used to calculate performance metrics, e.g., max, min, mean, percentile, etc.¶
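A local data-export policy that reduces raw samples to summary metrics could be sketched as follows. The function name `aggregate` and the nearest-rank percentile definition are assumptions for illustration; other percentile definitions are equally valid.

```python
import statistics


def aggregate(samples, percentile: int = 95) -> dict:
    """Summarize raw per-packet measurements before export."""
    if not samples:
        raise ValueError("no samples to aggregate")
    ordered = sorted(samples)
    # Nearest-rank percentile, computed with integer ceiling division.
    rank = max(1, (percentile * len(ordered) + 99) // 100)
    return {
        "min": ordered[0],
        "max": ordered[-1],
        "mean": statistics.fmean(ordered),
        f"p{percentile}": ordered[rank - 1],
    }
```

Exporting one such summary per interval instead of one postcard per marked packet trades per-packet detail for a lower export load.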
It is also possible to customize the data collection on each node to reduce the data exporting load. For example, if only end-to-end latency rather than the per-hop delay is of interest to the application, then only the head and tail nodes need to be configured to export the timestamps while the other on-path nodes are just configured to collect the other routine data.¶
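The end-to-end latency example can be expressed as a per-node export configuration in which only the head and tail nodes export timestamps. The field names and the function `build_export_config` are hypothetical.

```python
def build_export_config(path, routine_fields=("node_id", "ttl")) -> dict:
    """Return the fields each node on `path` should export."""
    config = {node_id: list(routine_fields) for node_id in path}
    # Only the head and tail nodes need timestamps for end-to-end latency.
    for edge in (path[0], path[-1]):
        if "timestamp" not in config[edge]:
            config[edge].append("timestamp")
    return config


def end_to_end_latency(head_timestamp_ns: int, tail_timestamp_ns: int) -> int:
    """Latency between head and tail, assuming synchronized clocks."""
    return tail_timestamp_ns - head_timestamp_ns
```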
Combining the above recommendations, PBT-M can be made flexible and efficient.¶
Given that even an incomplete set of OAM packets for a user packet is useful for network monitoring and measurement, PBT-M is ideal for incremental deployment. A node which has not been updated to support PBT-M SHOULD ignore the trigger and continue to forward any marked packet normally.¶
It is also possible for a node to not export certain data items for various reasons (e.g., node busy or data unavailable).¶
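In a partially upgraded network, the collector can still locate hops that did not report by looking for gaps in the TTL sequence. The sketch below assumes that every hop decrements the TTL by exactly one, which may not hold across tunnels; the function name `missing_hops` is hypothetical.

```python
def missing_hops(postcards):
    """Find gaps between reporting nodes from (node_id, ttl) pairs.

    Returns (previous_node, next_node, hops_without_postcard) tuples.
    """
    ordered = sorted(postcards, key=lambda p: p[1], reverse=True)
    gaps = []
    for (node_a, ttl_a), (node_b, ttl_b) in zip(ordered, ordered[1:]):
        skipped = ttl_a - ttl_b - 1
        if skipped > 0:
            gaps.append((node_a, node_b, skipped))
    return gaps
```

A gap does not distinguish a node that lacks PBT-M support from one whose postcard was lost, so repeated observations are needed before drawing conclusions.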
Access lists with an optional sampler [RFC5476] should be configured and attached at the ingress of the PBT-M encapsulation node to select the intended flows for PBT-M. A flow packet sampling policy meeting the application requirement should also be configured.¶
A telemetry data template pertaining to a flow or a node should be configured to define the type and format of the data to be collected.¶
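A telemetry data template can be thought of as a list of data items that a node fills in from its local state. The sketch below falls back to a basic data set (node ID and TTL) when no template is configured and silently skips items that are unavailable. The field names and the function `collect` are assumptions for illustration.

```python
BASIC_TEMPLATE = ("node_id", "ttl")  # default data set before any template exists


def collect(node_state: dict, template=None) -> dict:
    """Build the data set to export for one marked packet."""
    fields = template or BASIC_TEMPLATE
    # Items the node cannot provide are omitted rather than treated as errors.
    return {name: node_state[name] for name in fields if name in node_state}


state = {"node_id": "B", "ttl": 63, "queue_depth": 12, "timestamp_ns": 1500}
basic = collect(state)
detailed = collect(state, ("node_id", "queue_depth", "egress_port"))
```

Here `egress_port` is requested but not present in the node state, so it is left out of the exported data set.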
The OAM packet format should also be configured. In particular, the flow data should be exported at each participating node using IPFIX [RFC7011].¶
The data decomposition can be achieved either on the PBT-M-aware node exporting the data or at the IPFIX data collection. [I-D.spiegel-ippm-ioam-rawexport] describes how data is exported when it is decomposed at the IPFIX data collection. When the data is decomposed on the PBT-M-aware node, it can be aggregated according to Section 5 of [RFC7015]. The following IPFIX entities are of interest to describe the relationship to the forwarding topology and the control-plane.¶
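For orientation, the fixed 16-octet IPFIX Message Header defined in [RFC7011] can be packed as shown below. This sketch covers only the message header; the Template and Data Sets that would carry the PBT-M data are not shown, and the function name and sample values are hypothetical.

```python
import struct

IPFIX_VERSION = 10        # version number carried in every IPFIX message
IPFIX_HEADER_LEN = 16     # octets


def ipfix_message_header(payload_len: int, export_time: int,
                         sequence: int, observation_domain_id: int) -> bytes:
    """Pack version, total length, export time, sequence number, domain ID."""
    return struct.pack(
        "!HHIII",
        IPFIX_VERSION,
        IPFIX_HEADER_LEN + payload_len,  # length includes the header itself
        export_time,                     # seconds since the UNIX epoch
        sequence,
        observation_domain_id,
    )


header = ipfix_message_header(
    payload_len=20, export_time=1685577600, sequence=1, observation_domain_id=42
)
```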
PBT-M has been used for SRv6 OAM [RFC9259]. Currently, the MPLS Open Design Team is investigating network action support on the MPLS data plane [I-D.andersson-mpls-mna-fwk]. The challenge has been to preserve the existing MPLS architecture and backwards compatibility without excessively increasing the depth of the MPLS label stack with a variety of functional special-purpose labels and network action indicators (similar in concept to the MPLS Entropy Label Indicator (ELI) and Entropy Label (EL) added to the label stack), whether the MPLS extension headers are carried in-stack or post-stack.¶
Reference Augmented Forwarding (RAF) [I-D.raszuk-mpls-raf-fwk] utilizes In Stack Data (ISD) with parity to Entropy Label stack {TL,RFI,RFV,AL} and control plane extension to distribute special network actions and forwarding behaviors.¶
The MPLS Design Team may come up with other alternatives to carry network actions and PBT-M can be supported as a use case.¶
With Segment Routing (SR-MPLS and SRv6), the Maximum SID Depth (MSD) and the PMTU of an SR Policy are critical issues for SR path instantiation by a controller. PBT-M can help make OPT viable for operators because it avoids carrying telemetry data in-situ along the SR-TE policy path.¶
This draft provides an optimization that fills gaps in IOAM DEX: it uses existing mechanisms for packet-marking triggers and for flow path discovery, which avoids data plane complexity and helps mitigate SR MSD and PMTU issues.¶
Several security issues need to be considered.¶
No requirement for IANA is identified.¶
We thank Clarence Filsfils, Ahmed Abdelsalam, Robert Raszuk, and Alfred Morton, who provided valuable suggestions and comments that helped improve this draft.¶