Internet-Draft | Data Generation and Optimization for DTN | July 2023
Li, et al. | Expires 8 January 2024
A Digital Twin Network (DTN) can be used as a secure and cost-effective environment in which network operators evaluate network performance in various what-if scenarios. Recently, AI models, especially neural networks, have been applied to DTN performance modeling. The quality of deep learning models mainly depends on two aspects: model architecture and data. This memo focuses on how to improve the model from the data perspective.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on 8 January 2024.
Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.
A digital twin is a virtual instance of a physical system (twin) that is continually updated with the latter's performance, maintenance, and health status data throughout the physical system's life cycle. A Digital Twin Network (DTN) is a digital twin used in the context of networking [I-D.irtf-nmrg-network-digital-twin-arch]. A DTN can be used as a secure and cost-effective environment in which network operators evaluate network performance in various what-if scenarios. Recently, AI models, especially neural networks, have been applied to DTN performance modeling.
The quality of AI models mainly depends on two aspects: model architecture and data. This memo focuses on the data aspect, because the quality of training data directly affects the accuracy and generalization ability of the model. It describes how to design data generation and optimization methods for DTN performance modeling: generating simulated network data to address the shortage of practical data, and selecting high-quality data from various data sources.
Performance modeling is vital in DTN and is involved in typical network management scenarios such as planning, operation, optimization, and upgrade. Recent studies, such as RouteNet [RouteNet] and MimicNet [MimicNet], have applied AI models to DTN performance modeling. AI is a data-driven technology whose performance depends heavily on data quality.
Network data sources are diverse and vary in quality, which makes it difficult to use them directly as training data for DTN performance models:
Therefore, DTN performance modeling needs data generation and optimization methods that generate simulated network data to address the shortage of practical data and select high-quality data from multi-source data. High-quality data is accurate, diverse, and fits the actual characteristics of practical data. Training with high-quality data can improve the accuracy and generalization of DTN performance models.
Figure 1 shows the framework of data generation and optimization for DTN performance modeling. It includes two stages: the data generation stage and the data optimization stage.
The data generation stage generates candidate data (simulated network data) to compensate for the shortage of practical data from production networks. This stage first generates network configurations and then imports them into data generators to produce the candidate data.
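The two steps of the data generation stage can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not defined by this memo: the NetworkConfig fields, the random-topology procedure, the parameter values, and in particular toy_data_generator, which merely stands in for a real data generator such as a network simulator.

```python
import random
from dataclasses import dataclass


@dataclass
class NetworkConfig:
    """One hypothetical network configuration; the fields are illustrative."""
    links: list           # undirected links as (node_a, node_b), node_a < node_b
    traffic: dict         # (src, dst) -> offered load in Mbit/s
    link_capacity: float  # capacity of every link in Mbit/s


def generate_config(num_nodes, extra_links, max_load, rng):
    """Step 1: draw a random connected topology and a random traffic matrix."""
    # A random spanning tree guarantees that the topology is connected.
    links = {(rng.randrange(node), node) for node in range(1, num_nodes)}
    # Add a few extra links so that more than one path can exist.
    unused = [(a, b) for a in range(num_nodes)
              for b in range(a + 1, num_nodes) if (a, b) not in links]
    links.update(rng.sample(unused, min(extra_links, len(unused))))
    traffic = {(s, d): rng.uniform(0.0, max_load)
               for s in range(num_nodes) for d in range(num_nodes) if s != d}
    return NetworkConfig(sorted(links), traffic, link_capacity=100.0)


def toy_data_generator(config):
    """Step 2: placeholder for a real data generator (assumed, not from the memo).

    It only reports the mean offered load per link as a fraction of capacity.
    A real generator would return performance measurements instead.
    """
    total_load = sum(config.traffic.values())
    load_ratio = total_load / (len(config.links) * config.link_capacity)
    return {"config": config, "load_ratio": load_ratio}


def generate_candidate_data(num_samples, seed=0):
    """Run both steps: generate configurations, then feed them to the generator."""
    rng = random.Random(seed)
    configs = [generate_config(num_nodes=5, extra_links=2, max_load=10.0, rng=rng)
               for _ in range(num_samples)]
    return [toy_data_generator(cfg) for cfg in configs]
```

Calling generate_candidate_data(100) would return 100 candidate samples, each pairing a configuration with the placeholder output. Seeding the random number generator keeps the candidate set reproducible, which may help when comparing optimization methods on the same candidates.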
The data optimization stage optimizes the candidate data from various sources by selecting the high-quality data among them.
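This section does not prescribe a particular selection method. As one possible illustration only, the Python sketch below assumes that every sample has already been summarized as a numeric feature vector, that a small set of practical samples is available as a reference, and that "high quality" can be approximated by closeness to practical data combined with a minimum spacing between selected samples for diversity. The function names, the Euclidean distance metric, and the min_gap parameter are all hypothetical.

```python
import math


def nearest_distance(sample, references):
    """Euclidean distance from `sample` to its closest reference vector."""
    return min(math.dist(sample, ref) for ref in references)


def select_high_quality(candidates, practical, keep, min_gap=0.0):
    """Keep up to `keep` candidates that best fit the practical data.

    Candidates are ranked by distance to the nearest practical sample (fit).
    A candidate is skipped if it lies closer than `min_gap` to one that is
    already selected (a crude diversity check, assumed for illustration).
    """
    ranked = sorted(candidates, key=lambda c: nearest_distance(c, practical))
    selected = []
    for cand in ranked:
        if len(selected) == keep:
            break
        if all(math.dist(cand, chosen) >= min_gap for chosen in selected):
            selected.append(cand)
    return selected


practical = [(1.0, 1.0)]
candidates = [(1.0, 1.1), (1.0, 1.12), (5.0, 5.0), (2.0, 1.0)]
chosen = select_high_quality(candidates, practical, keep=2, min_gap=0.5)
# chosen == [(1.0, 1.1), (2.0, 1.0)]: (1.0, 1.12) is dropped as a near-duplicate
```

With min_gap=0.0 the diversity check is disabled and the function simply returns the `keep` candidates nearest to the practical data. Other quality criteria, such as label accuracy, would need additional checks that this sketch does not attempt.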
Several topics related to data generation and optimization for DTN performance modeling require further discussion.
This document has no requests to IANA.