Internet-Draft | SRv6 and MPLS interworking | July 2023 |
Agrawal, et al. | Expires 11 January 2024 | [Page] |
This document describes SRv6 and MPLS/SR-MPLS interworking and co-existence procedures.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 11 January 2024.¶
Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
The incremental deployment of SRv6 into existing networks requires SRv6 to interwork and co-exist with SR-MPLS/MPLS. This document introduces interworking scenarios and building blocks for solutions that interconnect them.¶
This document assumes SR-MPLS-IPv4 for MPLS domains, but the design works equally for SR-MPLS-IPv6, LDP-IPv4/IPv6 and RSVP-TE-MPLS label binding protocols.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
A multi-domain network (Figure 1) can be generalized as a central domain C with many leaf domains around it. Specifically, the document looks at a service flow from an ingress PE in an ingress leaf domain (LI), through the C domain and up to an egress PE of the egress leaf domain (LE). Each domain runs its own IGP instance. A domain has a single data plane type applicable both for its overlay and its underlay.¶
There are various SRv6 and SR-MPLS-IPv4 interworking scenarios possible.¶
The scenarios below cover various cascades of SRv6 and MPLS networks, e.g., SR-MPLS-IPv4 <-> SRv6 <-> SR-MPLS-IPv4 <-> SRv6 <-> SR-MPLS-IPv4, though not all combinations are described, for brevity.¶
Provider edge devices run MPLS-based [RFC4364] or SRv6 Service SID-based [RFC9252] BGP L3 (e.g., VPN) or L2 (e.g., EVPN) services through service Route Reflectors. Service endpoint signaling through border routers and the corresponding forwarding state provide interworking over the intermediate transport domain.¶
SRv6 over MPLS (6oM)¶
MPLS over SRv6 (Mo6)¶
Note: The easiest and most probable deployment is ships in the night, i.e., supporting dual stack and IPv4 MPLS in each domain.¶
L3/L2 service signaling discontinuity, i.e., an SRv6 service SID-based PE interworks with a BGP MPLS-based PE for service connectivity. L3/L2 service BGP signaling and forwarding state provide interworking over the intermediate domain.¶
The following terms used within this document are defined in [RFC8402]: Segment Routing, SR-MPLS, SRv6, SR Domain, Segment ID (SID), SRv6 SID, Prefix-SID.¶
Domain: Without loss of generality, a domain is assumed to be instantiated by a single IGP instance, or by a network within an IGP if there is a clear separation of the data plane.¶
Node k has a classic IPv6 loopback address Ak::1/128.¶
A SID at node k with locator block B and function F is represented by B:k:F::¶
A SID list is represented as <S1, S2, S3> where S1 is the first SID to visit, S2 is the second SID to visit and S3 is the last SID to visit along the SR path.¶
(SA,DA) (S3, S2, S1; SL) represents an IPv6 packet with:¶
IPv6 header with source address SA, destination address DA and SRH as next-header¶
SRH with SID list <S1, S2, S3> with SegmentsLeft = SL¶
Note the difference between the <> and () symbols: <S1, S2, S3> represents a SID list where S1 is the first SID and S3 is the last SID to traverse. (S3, S2, S1; SL) represents the same SID list but encoded in the SRH format where the rightmost SID in the SRH is the first SID and the leftmost SID in the SRH is the last SID. When referring to an SR policy in a high-level use-case, it is simpler to use the <S1, S2, S3> notation. When referring to an illustration of the detailed packet behavior, the (S3, S2, S1; SL) notation is more convenient.¶
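The relationship between the two notations can be sketched in a few lines of Python. This is only an illustration of the notation for a non-reduced SRH; the function names `encode_srh` and `active_sid` are hypothetical and do not come from this document.

```python
from typing import List, Tuple


def encode_srh(sid_list: List[str]) -> Tuple[Tuple[str, ...], int]:
    """Map a SID list <S1, ..., Sn> to the (Sn, ..., S1; SL) notation.

    The SRH stores the SIDs in reverse visiting order, so the last SID to
    visit sits at index 0 and the first SID to visit sits at index n-1.
    Segments Left initially points at the first SID to visit.
    """
    if not sid_list:
        raise ValueError("SID list must not be empty")
    return tuple(reversed(sid_list)), len(sid_list) - 1


def active_sid(srh: Tuple[str, ...], segments_left: int) -> str:
    """Return the SID that Segments Left currently points at."""
    return srh[segments_left]


srh, sl = encode_srh(["S1", "S2", "S3"])
```

With the SID list <S1, S2, S3>, the sketch yields the SRH (S3, S2, S1; SL=2), and the active SID is S1.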
This document introduces new SRv6 SID behaviors. These behaviors are executed on border routers between the SRv6 and MPLS domains.¶
The "Endpoint with decapsulation and MPLS table lookup" behavior.¶
The End.DTM SID MUST be the last segment in an SR Policy, and a SID instance is associated with an MPLS table.¶
When N receives a packet destined to S and S is a local End.DTM SID, N does:¶
S01. When an SRH is processed {
S02.   If (Segments Left != 0) {
S03.      Send an ICMP Parameter Problem to the Source Address,
          Code 0 (Erroneous header field encountered),
          Pointer set to the Segments Left field,
          interrupt packet processing and discard the packet.
S04.   }
S05.   Proceed to process the next header in the packet
S06. }

When processing the Upper-layer header of a packet matching a FIB entry locally instantiated as an End.DTM SID, N does:

S01. If (Upper-Layer Header type == 137(MPLS) ) {
S02.    Remove the outer IPv6 Header with all its extension headers
S03.    Set the packet's associated FIB table to T
S04.    Submit the packet to the MPLS FIB lookup for
        transmission according to the lookup result.
S05. } Else {
S06.    Process as per [ietf-spring-srv6-network-programming] section 4.1.1
S07. }¶
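The End.DTM steps above can be sketched as a small Python function. This is a minimal illustration and not a normative implementation: the `Ipv6Packet` type, the action strings and the dictionary-based MPLS table are assumptions made for the example, and the label stack is assumed to be non-empty when the upper-layer header is MPLS.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

MPLS_IN_IP = 137  # IP protocol number for MPLS-in-IP (RFC 4023)


@dataclass
class Ipv6Packet:
    dst: str                      # IPv6 destination address (the End.DTM SID)
    segments_left: Optional[int]  # None when the packet carries no SRH
    upper_layer: int              # protocol number of the upper-layer header
    labels: List[int]             # encapsulated MPLS label stack, top label first


def end_dtm(pkt: Ipv6Packet,
            mpls_table: Dict[int, str]) -> Tuple[str, Optional[str]]:
    """Return (action, detail) for a packet whose DA is a local End.DTM SID."""
    # SRH processing: End.DTM must be the last segment.
    if pkt.segments_left is not None and pkt.segments_left != 0:
        return ("icmp_parameter_problem", "pointer=Segments Left")
    # Upper-layer processing: anything other than MPLS is left to the
    # generic upper-layer handling referenced in step S06.
    if pkt.upper_layer != MPLS_IN_IP:
        return ("process_per_network_programming_4_1_1", None)
    # The outer IPv6 header and its extension headers are removed here;
    # the top label is then looked up in the MPLS table bound to the SID.
    top_label = pkt.labels[0]
    return ("mpls_lookup", mpls_table.get(top_label))
```

For example, a packet with Segments Left 0, upper-layer type 137 and label stack (16008, 16010) is handed to the MPLS lookup for label 16008.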
The "Endpoint with decapsulation and MPLS label push" behavior.¶
The End.DPM SID MUST be the last segment in an SR Policy, and a SID instance is associated with an MPLS label stack.¶
When N receives a packet destined to S and S is a local End.DPM SID, N does:¶
S01. When an SRH is processed {
S02.   If (Segments Left != 0) {
S03.      Send an ICMP Parameter Problem to the Source Address,
          Code 0 (Erroneous header field encountered),
          Pointer set to the Segments Left field,
          interrupt packet processing and discard the packet.
S04.   }
S05.   Proceed to process the next header in the packet
S06. }

When processing the Upper-layer header of a packet matching a FIB entry locally instantiated as an End.DPM SID, N does:

S01. Remove the outer IPv6 Header with all its extension headers
S02. Push the MPLS label stack associated with S
S03. Submit the packet to the MPLS engine for transmission¶
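The End.DPM steps can be sketched in the same way. This is an illustrative sketch only: the function name, the action strings and the `inner_labels` parameter (modelling a decapsulated payload that may itself start with MPLS labels, such as a service label) are assumptions made for the example.

```python
from typing import List, Optional, Sequence, Tuple


def end_dpm(segments_left: Optional[int],
            bound_stack: Sequence[int],
            inner_labels: Sequence[int] = ()) -> Tuple[str, List[int]]:
    """Return (action, resulting label stack) for a local End.DPM SID.

    segments_left is None when the packet carries no SRH.
    bound_stack is the MPLS label stack associated with the SID,
    top label first.
    """
    # SRH processing: End.DPM must be the last segment.
    if segments_left is not None and segments_left != 0:
        return ("icmp_parameter_problem", [])
    # The outer IPv6 header and its extension headers are removed, then the
    # label stack bound to the SID is pushed on top of the payload.
    return ("mpls_forward", list(bound_stack) + list(inner_labels))
```

For instance, a SID bound to the label stack (16010) turns a decapsulated payload carrying a hypothetical service label 30001 into an MPLS packet with stack (16010, 30001).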
The H.Encaps.M behavior encapsulates a received MPLS label stack [RFC3032] packet in an IPv6 header with an SRH. Together, the MPLS label stack and its payload become the payload of the new IPv6 packet. The Next Header field of the SRH MUST be set to 137 [RFC4023].¶
The H.Encaps.M.Red behavior is an optimization of the H.Encaps.M behavior. H.Encaps.M.Red reduces the length of the SRH by excluding the first SID from the SRH of the pushed IPv6 header. The first SID is only placed in the Destination Address field of the pushed IPv6 header. The push of the SRH MAY be omitted when the SRv6 Policy only contains one segment and there is no need to use any flag, tag or TLV. In such a case, the Next Header field of the IPv6 header MUST be set to 137 [RFC4023].¶
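The difference between the two encapsulation behaviors can be illustrated with a short Python sketch. The dictionary layout and the function name `h_encaps_m` are assumptions made for the example. The sketch always omits the SRH in the reduced single-segment case, whereas the text above only says that it MAY be omitted.

```python
from typing import Dict, List

IPV6_ROUTING_HEADER = 43  # Next Header value for an IPv6 Routing header
MPLS_IN_IP = 137          # Next Header value for MPLS-in-IP (RFC 4023)


def h_encaps_m(src: str, sid_list: List[str], labels: List[int],
               reduced: bool = False) -> Dict:
    """Encapsulate an MPLS label stack in IPv6 for the SID list <S1, ..., Sn>."""
    if not sid_list:
        raise ValueError("SID list must not be empty")
    # The SRH stores the SIDs in reverse visiting order.
    segments = list(reversed(sid_list))
    if reduced:
        # H.Encaps.M.Red: the first SID is carried only in the DA.
        segments = segments[:-1]
    srh = None
    if segments:
        srh = {
            "segments": segments,
            "segments_left": len(sid_list) - 1,
            "next_header": MPLS_IN_IP,
        }
    return {
        "src": src,
        "dst": sid_list[0],
        # Without an SRH, the IPv6 header itself announces the MPLS payload.
        "next_header": IPV6_ROUTING_HEADER if srh else MPLS_IN_IP,
        "srh": srh,
        "payload_labels": list(labels),
    }
```

Calling `h_encaps_m("A:4::", ["B:5:E::", "B:7:DTM::"], [16008, 16010], reduced=True)` gives a packet with DA B:5:E:: and an SRH that carries only B:7:DTM:: with Segments Left set to 1.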
A Binding Segment (BSID) is bound to an SR Policy [RFC8402]. Further, an SR-MPLS label can be bound to an SRv6 Policy and an SRv6 SID can be bound to an SR-MPLS Policy. The IW SR-PCE solution (Section 7.1.1) leverages these BSIDs as segments of the SR Policy in the headend domain to represent an intermediate domain of a different data plane type. In summary, an intermediate domain of a different data plane type is represented in the SID list by a BSID of the ingress domain's data plane type.¶
Figure 1 shows the reference multi-domain network topology and Section 2 describes it. The procedures in this section are illustrated using that topology.¶
The following is assumed for the data plane support of the various nodes:¶
A VPN route is advertised via service RRs (S-RR) between an egress PE (node 10) and an ingress PE (node 1).¶
For illustration, the SRGB starts at 16000 and the prefix SID of a node is 16000 plus the node number.¶
As described in Section 2.1.1, transport IW requires:¶
This draft enhances two well-known solutions to achieve the above:¶
This procedure provides a best-effort path, as well as a path that satisfies an intent (e.g., low latency), across multiple domains. Service routes (VPN/EVPN) are received on the ingress PE with a color extended community from the egress PE. A color is a 32-bit numerical value that associates an SR Policy with an intent [RFC9256]. The ingress PE does not know how to compute the traffic-engineered path through the multi-domain network to the egress PE, so it requests the path from the SR-PCE. The SR-PCE is aware of the interworking requirement at border nodes because it is fed with BGP-LS topological information from each domain. It programs a policy specific to the intermediate domain's data plane on border nodes for the given intent and, leveraging the BSIDs of Section 6, represents it in the end-to-end path SID list on the ingress PE.¶
The sections below describe 6oM and Mo6 IW with SR-PCE.¶
A service prefix (e.g., VPN or EVPN) is received on the head-end (node 1) with a color extended community (C1) from the egress PE (node 10) with an SRv6 service SID. The PCE computes the (C1,10) path via nodes 2, 5 and 8. It programs an SR policy at border node 4 with a segment list of nodes 5 and 7, bound to an End.BM BSID [RFC8986]. The SR-PCE responds to node 1 with the SRv6 segments along the required SLA path, including the End.BM SID at node 4 to traverse the SR-MPLS-IPv4 C domain.¶
For example, the SR-PCE creates an SR-MPLS policy (C1,7) at node 4 with segments <16005,16007>. It is bound to the End.BM behavior with the SRv6 BSID B:4:BM-C1-7::¶
The data plane operations for the above-mentioned interworking example are:¶
Node 1 performs SRv6 function H.Encaps.Red with VPN service SID and SRv6 Policy (C1,10):¶
Packet leaving node 1 IPv6 ((A:1::, B:2:E::) (B:10::DT4, B:8:E::, B:4:BM-C1-7:: ; SL=3))¶
Node 2 performs End function¶
Packet leaving node 2 IPv6 ((A:1::, B:4:BM-C1-7::) (B:10::DT4, B:8:E::, B:4:BM-C1-7:: ; SL=2))¶
Node 4 (border router) performs End.BM function¶
Packet leaving node 4 MPLS (16005,16007,2) IPv6 ((A:1::, B:8:E::) (B:10::DT4, B:8:E::, B:4:BM-C1-7:: ; SL=1))¶
Node 7 performs a native IPv6 lookup due to the PHP behavior for 16007¶
Packet leaving node 7 IPv6 ((A:1::, B:8:E::) (B:10::DT4, B:8:E::, B:4:BM-C1-7:: ; SL=1))¶
Node 8 performs End(PSP) function¶
Packet leaving node 8 IPv6 ((A:1::, B:10::DT4))¶
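The 6oM packet walk above can be replayed with a small Python sketch that tracks the Destination Address, the reduced SRH and Segments Left at each SRv6 node. The dictionary-based packet model and the helper names `end` and `end_bm` are assumptions made for the illustration. The MPLS hops inside the C domain are not modelled individually.

```python
from typing import Dict, List


def end(pkt: Dict, psp: bool = False) -> Dict:
    """End behavior: decrement Segments Left and update the DA.

    With PSP, the SRH is removed once Segments Left reaches 0.
    """
    sl = pkt["sl"] - 1
    out = {**pkt, "sl": sl, "da": pkt["srh"][sl]}
    if psp and sl == 0:
        out["srh"], out["sl"] = None, None
    return out


def end_bm(pkt: Dict, labels: List[int]) -> Dict:
    """End.BM behavior: End processing, then push the bound SR-MPLS policy."""
    out = end(pkt)
    out["labels"] = list(labels)
    return out


# Node 1: H.Encaps.Red, so the first SID (B:2:E::) is only in the DA.
p1 = {"sa": "A:1::", "da": "B:2:E::",
      "srh": ["B:10::DT4", "B:8:E::", "B:4:BM-C1-7::"], "sl": 3, "labels": []}
p2 = end(p1)                          # node 2: End
p4 = end_bm(p2, [16005, 16007, 2])    # node 4: End.BM with policy (C1,7)
p7 = {**p4, "labels": []}             # node 7: labels gone, native IPv6 lookup
p8 = end(p7, psp=True)                # node 8: End with PSP
```

After the last step the packet carries no SRH and its Destination Address is the service SID B:10::DT4, matching the packet shown leaving node 8.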
Refer to Section 2.1.1 for the Mo6 scenario. An MPLS service prefix (e.g., VPN or EVPN) is received on the head-end (node 1) with a color extended community (C1) from the egress PE (node 10). The PCE computes the color-aware C1 path via nodes 2, 5 and 8. At border node 4 it programs an SRv6 policy bound to an MPLS BSID, with an SRv6 segment list along the required color-aware path whose last segment has the End.DTM behavior (Section 4.1). The SR-PCE responds to node 1 with an MPLS segment list that includes the MPLS BSID of the SRv6 policy at node 4 to traverse the SRv6 core domain.¶
For example, the SR-PCE creates an SRv6 policy (C1,7) at node 4 with segments <B:5:E::,B:7:DTM::>. It is bound to MPLS BSID 24407.¶
The data plane operations for the above-mentioned interworking example are:¶
Node 1 performs MPLS label stack encapsulation with VPN label and SR-MPLS Policy (C1,10):¶
Packet leaving node 1 towards 2 (Note: PHP of node 2 prefix SID): MPLS packet (16004,24407,16008,16010,vpn_label)¶
Node 2 forwards traffic towards 4 (PHP of 16004)¶
Packet leaving node 2 MPLS packet (24407,16008,16010,vpn_label)¶
Node 4 steers MPLS traffic into SRv6 policy bound to 24407¶
Packet leaving node 4 IPv6 ((A:4::, B:5:E::) (B:7:DTM:: ; SL=1) NH=137) MPLS (16008,16010,vpn_label)¶
Node 7 receives an IPv6 packet with DA=B:7:DTM::. It performs the End.DTM behavior to remove the IPv6 header and looks up label 16008 in the MPLS table.¶
Packet leaving node 7 towards node 8 (PHP of 16008): MPLS packet (16010,vpn_label)¶
Node 8 forwards traffic towards 10 (PHP of 16010)¶
Packet leaving node 8 MPLS packet (vpn_label)¶
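The Mo6 label stack evolution above can be replayed with a short Python sketch. The helper names and the dictionary that models the IPv6 encapsulation at node 4 are assumptions made for the illustration. The prefix SIDs follow the convention of this document (SRGB base 16000 plus node number).

```python
from typing import List, Union

Label = Union[int, str]   # "vpn_label" stands in for the service label

SRGB_BASE = 16000
BSID_AT_NODE_4 = 24407
SRV6_POLICY_AT_NODE_4 = ["B:5:E::", "B:7:DTM::"]


def prefix_sid(node: int) -> int:
    """Prefix SID of a node under the illustration's SRGB convention."""
    return SRGB_BASE + node


def pop_expected(stack: List[Label], expected: Label) -> List[Label]:
    """Pop the top label after checking it is the one the node expects."""
    if not stack or stack[0] != expected:
        raise ValueError(f"expected top label {expected}, got {stack[:1]}")
    return stack[1:]


# Node 1: the prefix SID of node 2 is already removed by PHP.
leaving_1 = [prefix_sid(4), BSID_AT_NODE_4, prefix_sid(8), prefix_sid(10),
             "vpn_label"]
# Node 2: penultimate hop for node 4, pops 16004.
leaving_2 = pop_expected(leaving_1, prefix_sid(4))
# Node 4: pops the BSID and steers the rest into the SRv6 policy
# (reduced SRH: the first SID is only in the DA).
leaving_4 = {
    "dst": SRV6_POLICY_AT_NODE_4[0],
    "srh": list(reversed(SRV6_POLICY_AT_NODE_4[1:])),
    "sl": len(SRV6_POLICY_AT_NODE_4) - 1,
    "labels": pop_expected(leaving_2, BSID_AT_NODE_4),
}
# Node 7: End.DTM decapsulates, looks up 16008 and pops it (PHP toward 8).
leaving_7 = pop_expected(leaving_4["labels"], prefix_sid(8))
# Node 8: penultimate hop for node 10, pops 16010.
leaving_8 = pop_expected(leaving_7, prefix_sid(10))
```

The final stack contains only the service label, matching the packet shown leaving node 8.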
Procedures described below build upon BGP 3107 [I-D.ietf-mpls-seamless-mpls] and [RFC4798] to advertise transport reachability for PE IPv4 loopbacks or SRv6 locators across a multi-domain network. The procedures leverage the existing AFI/SAFI IPv6 Unicast (2/1) and BGP-LU (1/4, 2/4). Next-hop-self on border routers provides independence from the intra-domain tunnel technology used in different domains.¶
The sections below describe 6oM and Mo6 IW with BGP procedures for best effort paths to a locator or loopback prefix. The procedures are equally applicable to intent aware paths, i.e., locator assigned for a given intent, for instance from an IGP-FlexAlgo. They are also applicable to color-aware routes [I-D.ietf-idr-bgp-car] recursing over intent aware intra-domain paths.¶
Refer to Section 2.1.1 for the 6oM scenario. SRv6-based L3/L2 BGP services are signaled with an SRv6 Service SID between PEs through Service RRs with no color extended community. Ingress PEs need reachability to the remote locator to send traffic to the SRv6 service SID.¶
The ingress border router advertises remote locators, or their summary, in the LI domain. The options to advertise are:¶
Control plane example:¶
Routing Protocol(RP) @10:¶
RP @ 7:¶
RP @ 4:¶
RP @ 1:¶
FIB state¶
@1: IPv4 VRF V/v => H.Encaps.red <B:4:END::, B:10:DT4::> with SRH, SRH.NH=IPv4
@4: IPv6 Table: B:4:END:: => Update DA with B:10:DT4::, set IPv6.NH=IPv4, pop the SRH
@4: IPv6 Table: B:10::/48 => push MPLS label 2 (Explicit NULL), push MPLS Label 16007
@7: MPLS label 2 => pop and lookup next IPv6 DA
@7: IPv6 Table B:10::/48 => forward via ISIS path to 10
@10: IPv6 Table B:10:DT4:: => pop the outer header and lookup the inner IPv4 DA in the VRF¶
Refer to Section 2.1.1 for the Mo6 scenario. MPLS-based L3/L2 BGP services are signaled with the IPv4 next hop of the PE through Service RRs with no color extended community. The ingress PE needs labelled reachability to the remote PE IPv4 loopback address advertised as the next hop of the service routes.¶
BGP-LU [RFC8277] advertises the IPv4 PE loopbacks. Next-hop-self is performed on border routers.¶
The following are the options and protocol extensions to tunnel the IPv4 PE loopback LSP through the SRv6 C domain:¶
Intuitive solution for an MPLS-minded operator¶
Existing BGP-LU updates between border routers signal the SRv6 SID associated with the End.DTM behavior. [I-D.agrawal-bess-bgp-srv6-mpls-interworking] proposes the "SRv6 tunnel for label route" TLV of the BGP Prefix-SID Attribute to signal the SRv6 SID used to tunnel, through the SRv6/IPv6 domain, an MPLS packet whose top label is the label in the NLRI. The control plane and the corresponding FIB state to achieve such tunneling are described below:¶
Control plane example¶
Routing Protocol(RP) @10:¶
RP @ 7:¶
RP @ 4:¶
RP @ 1:¶
Forwarding state at different nodes:¶
@1: IPv4 VRF: V/v => out label=vpn_label, next hop=IPv4 address of node 10
@1: IPv4 table: IPv4 address of node 10 => out label=16010, next hop=node4
@1: IPv4 table: IPv4 address of node 4 => out label=16004, next hop=interface to reach 2
@4: MPLS Table: 16010 => out label=16010, H.Encaps.M.red with DA=B:7:DTM::
@4: IPv6 table: B:7::/48 => next hop=interface to reach 5
@7: SRv6 My SID table: B:7:DTM:: => decaps IPv6 header and lookup top label.
@7: MPLS table: 16010 => out label=16010, next hop=interface to reach 8
@10: MPLS table: vpn label => pop label and lookup the inner IPv4 DA in the VRF¶
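The three entries at node 1 resolve recursively into a single label stack. The following Python sketch illustrates that recursion under simplifying assumptions: the route keys (`"V/v"`, `"node10"`, `"node4"`), the interface name `"if-to-2"` and the function `resolve` are hypothetical names chosen for the example, and the VRF and global tables are merged into one dictionary.

```python
from typing import Dict, List, Tuple, Union

Label = Union[int, str]   # "vpn_label" stands in for the service label

# Forwarding entries at node 1, keyed by prefix or next-hop name.
ROUTES_AT_NODE_1: Dict[str, Dict] = {
    "V/v":    {"label": "vpn_label", "next_hop": "node10"},
    "node10": {"label": 16010,       "next_hop": "node4"},
    "node4":  {"label": 16004,       "next_hop": "if-to-2"},
}


def resolve(prefix: str, routes: Dict[str, Dict]) -> Tuple[List[Label], str]:
    """Resolve a prefix recursively and return (label stack, interface).

    Each level of recursion pushes its label on top of the labels found so
    far, so the innermost (service) label ends up at the bottom.
    """
    stack: List[Label] = []
    seen = set()
    next_hop = prefix
    while next_hop in routes:
        if next_hop in seen:
            raise ValueError(f"resolution loop at {next_hop}")
        seen.add(next_hop)
        entry = routes[next_hop]
        stack.insert(0, entry["label"])
        next_hop = entry["next_hop"]
    return stack, next_hop


stack, out_interface = resolve("V/v", ROUTES_AT_NODE_1)
```

Under these assumptions the packet leaves node 1 toward node 2 with the label stack (16004, 16010, vpn_label).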
During transition, when the MPLS data plane is still enabled in the C domain, an ABR that does not understand the "SRv6 tunnel for label route" TLV in the BGP Prefix-SID Attribute, or that is directed by operator-configured local policy, can continue MPLS encapsulation using the label in the NLRI and the LSP to the next hop.¶
For each PE IPv4 loopback address, the existing BGP 3107 label cross-connect on the area border router is replaced by a label to SRv6 SID cross-connect, or vice versa. In effect, this creates a translation from the 3107 label to an SRv6 SID at the ingress of the SRv6 domain and from the SRv6 SID to a 3107 label at the egress.¶
Section 2.2 of [I-D.agrawal-bess-bgp-srv6-mpls-interworking] describes how existing BGP advertisement can signal SRv6 SID associated with DPM behavior from egress to ingress border router.¶
As described in Section 2.1.2, service IW needs BGP SRv6-based L2/L3 PEs to interwork with BGP MPLS-based L2/L3 PEs.¶
There are a number of different ways of handling this scenario as detailed below.¶
A gateway is a router that supports both BGP SRv6-based L2/L3 services and BGP MPLS-based L2/L3 services for a service instance (e.g., L3 VRF, EVPN EVI). It terminates the service encapsulation and performs an L2/L3 destination lookup in the service instance.¶
A pair of border routers can act as gateways for redundancy. The solution can scale horizontally by distributing service instances among them.¶
This is similar to the inter-AS option B procedures described in [RFC4364], except that the service label cross-connect on the border router is replaced with a service label to SRv6 service SID translation, or vice versa, on the IW node.¶
Translation of certain L2 service-specific information (e.g., control word) is out of scope. It will be covered in a separate document.¶
In addition, the draft also addresses migration and co-existence of SRv6 and SR-MPLS-IPv4. Co-existence means a network that supports both SRv6 and MPLS in a given domain. This may be a transient state when a brownfield SR-MPLS-IPv4 network upgrades to SRv6 (migration), or a permanent state when some devices are not capable of SRv6 but support native IPv6 and SR-MPLS-IPv4.¶
These procedures will be detailed in a future revision.¶
Convergence on failure of border routers can be achieved by well-known methods for the BGP inter-domain routing approach:¶
This document introduces two new SRv6 Endpoint behaviors, "End.DTM" and "End.DPM". IANA is requested to assign identifier values in the "SRv6 Endpoint Behaviors" sub-registry under the "Segment Routing Parameters" registry.¶
+-------------+--------+-------------------------+------------------+
| Value       | Hex    | Endpoint behavior       | Reference        |
+-------------+--------+-------------------------+------------------+
| TBD         | TBD    | End.DTM                 | <this document>  |
+-------------+--------+-------------------------+------------------+
| TBD         | TBD    | End.DPM                 | <this document>  |
+-------------+--------+-------------------------+------------------+¶
The authors would like to acknowledge Kamran Raza, Dhananjaya Rao, Stephane Litkowski, Pablo Camarillo and Ketan Talaulikar.¶