Aswentong/Guangzhou, July 01, 2022 -- Why do 5G core networks and telecom clouds need observability? In 2021, 5G core networks suffered several failures that were wide in scope, long in duration, and disruptive to society. In April 2021, a Canadian company experienced a nationwide mobile network failure lasting 26 hours. With no means of rapid fault localization, the fault could not be located and cleared in a short time.
The stable operation of 5G communication networks is an important foundation for the stable operation of society. The 5G core network is the hub and brain of the 5G communication network, and the key to operating, maintaining, and safeguarding the network as a whole. The events above show two things: the operational stability of the 5G core network needs to improve, and existing capabilities for fault monitoring, rapid localization, and rapid recovery fall short.
The operation and maintenance dilemma of 5G core network
Spruce Network has found that the main operation and maintenance pain points of the 5G core network today are: (1) the cloud network is a black box; (2) operation and maintenance are technically difficult; (3) fault responsibility and fault boundaries are hard to determine; (4) it is hard to keep the cloud platform aligned with the needs of the services it carries. Here are some practical examples.
Example 1: When the PCF fails, the core network operations team may contact the cloud platform team: "Our PCF1 service is abnormal and there is an alarm on the server, please deal with it immediately." The cloud platform operations team may think instead: "The alarm on the server seems to have nothing to do with the abnormal PCF service. Is there something wrong with the PCF software?"
Example 2: After a VNF is upgraded, its service becomes abnormal. The core network operations team investigates for a long time but cannot find the cause. Only at the end is an alarm found on the virtual machine and reported to the telecom cloud operations team.
The root cause is that the 5G core network and the telecom cloud are built on general-purpose x86 hardware plus cloud and container technology in order to decouple software from hardware. The stability of such a stack is uncertain, so relative reliability depends more heavily on the dynamic and elastic capabilities of the cloud and containers. At the same time, the 5G core network is carried entirely on overlay and underlay networks, and the complex microservices inside each network element (NE) are interconnected through a full-mesh network. The boundaries and paths between NEs are unclear, and the network is essentially a "black box". Observability is therefore very important for the reliable operation of the 5G core network.
In recent years, cloud-native observability has become an important approach to business reliability in the IT field, and "observability = reliability" has largely become the common understanding in IT operation and maintenance. Put simply, cloud-native observability is the ability to diagnose the internal running state of a complex business system quickly and effectively. After nearly ten years of development, Spruce Network has gradually moved from its core SDN technology to network automation and observability, and is committed to solving the core pain point of diagnosing cloud-native applications. Its DeepFlow products have accumulated extensive practical experience across industries and have helped several enterprises build multi-dimensional, integrated observability platforms.
In a 5G core network, the number of IP nodes in operation has grown more than a hundredfold. Containerized microservice pods form a full-mesh network, which turns the internal network of the entire 5G core network into a "black box".
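To make the scale of this concrete, the short sketch below counts the possible pairwise paths in a full mesh before and after a hundredfold growth in nodes. The starting figure of 50 endpoints is a hypothetical number chosen only for illustration; it does not come from any measured network.

```python
def full_mesh_paths(nodes: int) -> int:
    """Return the number of distinct node pairs in a full mesh of `nodes` endpoints."""
    return nodes * (nodes - 1) // 2


# Hypothetical endpoint count, for illustration only.
nodes_before = 50
# "More than a hundredfold" growth, taken here as exactly 100x.
nodes_after = nodes_before * 100

paths_before = full_mesh_paths(nodes_before)
paths_after = full_mesh_paths(nodes_after)

print(f"{nodes_before} nodes -> {paths_before} possible paths")
print(f"{nodes_after} nodes -> {paths_after} possible paths")
print(f"path growth: about {paths_after / paths_before:.0f}x")
```

Because the number of pairs grows roughly with the square of the node count, a hundredfold increase in nodes means on the order of ten thousand times more paths that could need inspection.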
For 5G core networks, DeepFlow collector technology can provide comprehensive observability of network and applications without relying on the log, metric, and user-tracing output capabilities of the 5GC vendor. DeepFlow's data analysis capability also makes it possible to monitor the overall performance of 5G core network elements and cloud platforms from macro to micro, at different levels and in different dimensions.
For example, cloud platform operations personnel can use the network-wide host view to monitor the traffic exchange topology and service access performance between cloud resource pools and hosts. From the host view of a single NE, they can monitor how the microservices and modules of that NE's VNF software (such as AMF or SMF) are distributed across hosts, as well as the communication topology and communication performance between hosts, and quickly discover anomalies in the host dimension.
Core network operations personnel, for their part, can observe the service communication relationships and performance of the SBI ports on VNF NEs across the whole network in the VNF dimension. In the dimension of the microservice pods inside a VNF, they can observe communication relationships and performance at pod granularity. Through pod-to-pod full-stack network path tracing, they can observe the traffic path from any client to any server in the cloud network.
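The idea of viewing the same traffic in the VNF dimension or the pod dimension can be sketched as a simple aggregation over flow records. This is a minimal illustration only, not DeepFlow's implementation or API: the `Flow` record, its field names, the sample pod names, and the `build_topology` helper are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """Hypothetical flow record; field names are assumptions for this sketch."""
    src_vnf: str
    src_pod: str
    dst_vnf: str
    dst_pod: str
    latency_ms: float
    error: bool


def build_topology(flows, level):
    """Aggregate flows into edges keyed by (source, destination) at 'vnf' or 'pod' level."""
    if level not in ("vnf", "pod"):
        raise ValueError("level must be 'vnf' or 'pod'")
    stats = defaultdict(lambda: {"count": 0, "errors": 0, "total_ms": 0.0})
    for flow in flows:
        key = (getattr(flow, f"src_{level}"), getattr(flow, f"dst_{level}"))
        edge = stats[key]
        edge["count"] += 1
        edge["errors"] += int(flow.error)
        edge["total_ms"] += flow.latency_ms
    # Turn raw counters into per-edge metrics.
    return {
        key: {
            "count": edge["count"],
            "error_rate": edge["errors"] / edge["count"],
            "avg_ms": edge["total_ms"] / edge["count"],
        }
        for key, edge in stats.items()
    }


# Made-up sample data for illustration.
FLOWS = [
    Flow("AMF", "amf-pod-1", "SMF", "smf-pod-1", 4.0, False),
    Flow("AMF", "amf-pod-1", "SMF", "smf-pod-2", 6.0, False),
    Flow("AMF", "amf-pod-2", "SMF", "smf-pod-1", 20.0, True),
    Flow("SMF", "smf-pod-1", "PCF", "pcf-pod-1", 3.0, False),
]

vnf_view = build_topology(FLOWS, "vnf")
pod_view = build_topology(FLOWS, "pod")

for (src, dst), metrics in sorted(pod_view.items()):
    print(src, "->", dst, metrics)
```

In this sample the VNF view shows a single AMF-to-SMF edge with an elevated average latency, while the pod view narrows the slow, failing traffic down to one specific pod pair, which is the kind of drill-down the two dimensions are meant to support.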
3. End-to-end tracing of service access from application to network
In addition to full-stack path tracing, DeepFlow implements end-to-end tracing across application and network for each service access at the application layer. Operations personnel can search, analyze, and trace any access through the system to quickly detect application anomalies. The system reconstructs the end-to-end service call relationship "client process -> service 1 process -> service 2 process -> ... -> service n process". It correlates the latency and anomalies of each service call, including the critical path of each call's traffic through the cloud network and the corresponding latency metrics. In this way DeepFlow unifies application and network observability, down to the granularity of each individual service access.
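As a rough sketch of what correlating a call chain with its latency can look like, the example below orders call records into a "client -> service 1 -> service 2 -> ..." chain and splits each call's latency into network time and time spent in the service itself. The `Call` record, its fields, and the sample values are hypothetical, and the sketch assumes a strictly linear chain in which each call has at most one downstream call; it is not a description of how DeepFlow stores or computes traces.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Call:
    """Hypothetical call record; field names are assumptions for this sketch."""
    call_id: str
    parent_id: Optional[str]  # None marks the call issued by the original client
    client: str
    server: str
    app_ms: float      # total latency as seen by the calling process
    network_ms: float  # part of app_ms spent in the cloud network


def call_chain(calls):
    """Order calls from the root call down a single linear chain."""
    by_parent = {call.parent_id: call for call in calls}
    chain = []
    current = by_parent.get(None)
    while current is not None:
        chain.append(current)
        current = by_parent.get(current.call_id)
    return chain


def service_self_time(chain):
    """Per server: total latency minus network time minus the downstream call."""
    result = {}
    for index, call in enumerate(chain):
        downstream_ms = chain[index + 1].app_ms if index + 1 < len(chain) else 0.0
        result[call.server] = call.app_ms - call.network_ms - downstream_ms
    return result


def slowest_network_hop(chain):
    """Return the call whose traffic spent the longest time in the network."""
    return max(chain, key=lambda call: call.network_ms)


# Made-up sample data, deliberately out of order.
CALLS = [
    Call("c2", "c1", "service-1", "service-2", 40.0, 30.0),
    Call("c3", "c2", "service-2", "service-3", 5.0, 1.0),
    Call("c1", None, "client", "service-1", 50.0, 2.0),
]

chain = call_chain(CALLS)
print(" -> ".join([chain[0].client] + [call.server for call in chain]))
print(service_self_time(chain))
print("slowest network hop:", slowest_network_hop(chain).call_id)
```

In the sample data, most of the 50 ms end-to-end latency turns out to sit in the network between service-1 and service-2 rather than in any service process, which is the kind of application-versus-network attribution the paragraph above describes.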
In the 5G core network field, monitoring, operation and maintenance, and security all present new challenges. As an IT network solution vendor, Spruce Network will step up its innovation efforts, focus on the development of 5G business, and continue to contribute to the high-quality development of 5G in China.