AI-based automated assurance: a necessity for reliable 5G
Bill Kaufmann, director of product management, network assurance and analytics at Blue Planet, explores the role of AI and automation in a 5G world.
In the 5G world, automation is a must. Communications Service Providers (CSPs) are striving to deliver the dynamic, ultra-fast services we have come to know and want, but traditional manual processes no longer suffice. Automation is becoming a necessity, especially as the rollout of 5G continues. To deliver the cloud-like network experiences customers want, CSPs need to use Artificial Intelligence (AI) and Machine Learning (ML) to automate the trouble-to-resolve process.
CSPs face several challenges as they work to roll out 5G and transition to cloud-based, virtual infrastructure. First, technologies like 5G, edge computing and SD-WAN are creating much more dynamic networks with more distributed workloads. These networks, services and technologies therefore need completely different types of support. To complicate matters further, newly introduced protocols and network architectures can increase the time it takes to remediate issues.
CSPs are also managing more continuous software releases in their networks and operations. As a result, interoperability issues are surfacing between different suppliers, application providers, and the underlying infrastructure, creating more outages for end users. It doesn’t help that CSPs are facing a shortage of skilled software operators and that existing staff must adapt to all the changes that are occurring. Operators normally build excess capacity into their networks for peak loads, but cloud fundamentally changes this, requiring the adoption of agile DevOps practices and rapid introduction of new services.
These factors are making it more difficult than ever for operators to address issues using traditional, manually intensive processes, and they also drive up operational costs. To address this, CSPs need to shift from traditional methods of assurance, such as root-cause analysis and event-based monitoring, to automated assurance, which uses AI to harness large data sets within the operator’s environment and then applies ML algorithms for continuous learning. This can help CSPs identify problems before they affect customers.
Taking steps towards automation
In this new 5G era, CSPs still need to collect, correlate, and visualise network events, and then fix problems. But now they must do it on a massive scale, which requires an additional level of intelligent automation that only AI and ML can provide. Incorporating these technologies into service assurance won’t happen instantly; however, CSPs can introduce the new capabilities gradually.
The first step is providing analytics insights to operations teams, who can then verify that the results are accurate and implement responses manually. Over time, as the intelligence is validated, it can be built into automated workflows. ML tools will be key to the automation step. These work by analysing the rapidly growing number of performance metrics, alarms, and user actions that telco networks generate daily and observing patterns over time.
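As a rough illustration of what "observing patterns over time" can mean at its simplest, the Python sketch below flags a metric sample that deviates sharply from its recent history. It is a minimal statistical baseline rather than a production ML model, and the function name, window size, threshold, and latency values are all assumptions made for the example.

```python
from statistics import mean, stdev

def find_anomalies(samples, window=5, threshold=3.0):
    """Return indices of samples that deviate strongly from the trailing window."""
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        # A perfectly flat history gives no scale to compare against, so skip it.
        if sigma == 0:
            continue
        # Flag the sample if it sits more than `threshold` deviations from the mean.
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies

# Hypothetical latency readings in milliseconds; index 6 is an obvious spike.
latency_ms = [10.1, 10.3, 9.9, 10.2, 10.0, 10.1, 25.0, 10.2]
print(find_anomalies(latency_ms))  # [6]
```

In the staged approach described above, output like this would first go to an operations team for validation before any automated response is attached to it.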
Based on this continuous learning, closed-loop management can be implemented to automate the processing of new events as they occur. Closed-loop automation continuously assesses network conditions, traffic and performance demands, and resource availability to determine the best path for traffic to take. The process, which introduces ML into operations, relies on analytics, policy, and orchestration to enable self-optimisation.
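One iteration of such a loop might look like the Python sketch below: measured path state stands in for analytics, a pair of limits stands in for policy, and the returned action is what would be handed to an orchestrator. The class, function names, thresholds, and path figures are hypothetical and chosen only to show the shape of the decision.

```python
from dataclasses import dataclass

@dataclass
class PathState:
    name: str
    latency_ms: float
    utilisation: float  # fraction of capacity in use, 0.0 to 1.0

def choose_path(paths, max_latency_ms, max_utilisation):
    """Policy step: the lowest-latency path that meets both limits, or None."""
    eligible = [
        p for p in paths
        if p.latency_ms <= max_latency_ms and p.utilisation <= max_utilisation
    ]
    return min(eligible, key=lambda p: p.latency_ms, default=None)

def closed_loop_step(current, paths, max_latency_ms=20.0, max_utilisation=0.8):
    """One loop iteration: assess the paths, decide, and return an action."""
    best = choose_path(paths, max_latency_ms, max_utilisation)
    if best is None:
        # No path satisfies the policy, so hand the problem to a human.
        return ("escalate", current)
    if best.name == current:
        return ("keep", current)
    return ("reroute", best.name)

# Hypothetical measurements: path A is congested, path C is too slow.
paths = [
    PathState("A", 12.0, 0.95),
    PathState("B", 15.0, 0.40),
    PathState("C", 30.0, 0.10),
]
print(closed_loop_step("A", paths))  # ('reroute', 'B')
```

Returning an action instead of applying it directly keeps the decision separate from the orchestration step, which makes it easier to run the loop in a recommend-only mode while the intelligence is still being validated.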
Network slicing as a use case
Automated assurance becomes especially important as CSPs virtualise their core networks and introduce 5G network slicing, which allows operators to deliver dynamic services that have different requirements for throughput, latency, and availability over the same underlying infrastructure. As CSPs deploy IP/MPLS in their backhaul networks, changes in routers at the infrastructure layer can have unintended consequences on network slices and, as a result, on end-user experiences.
When networks are disrupted and degraded without an easy-to-identify cause, the question becomes how to make changes to the network and routing in real time and resolve issues effectively, while ensuring that different types of users, services, and their SLAs are not impacted.
ML tools can help troubleshoot and resolve these types of issues efficiently. They can help CSPs identify network configuration changes that might be risky, recommend which services to shift to a preferred path, and suggest how to implement the change. They can also facilitate faster problem resolution by creating correlations between a problem seen in the network and similar issues that have occurred in the past.
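To make the idea of correlating a new problem with past issues concrete, the Python sketch below ranks earlier incidents by how much their alarm sets overlap with the current one, using Jaccard similarity. This is one simple stand-in for the correlation step, not a description of any particular product, and the incident records, alarm names, and resolutions are invented for illustration.

```python
def jaccard(a, b):
    """Overlap between two alarm sets, from 0.0 (disjoint) to 1.0 (identical)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def most_similar_incidents(current_alarms, history, top_n=2):
    """Rank past incidents by alarm overlap, dropping those with none."""
    scored = [
        (jaccard(current_alarms, inc["alarms"]), inc["id"], inc["resolution"])
        for inc in history
    ]
    scored = [s for s in scored if s[0] > 0]
    return sorted(scored, key=lambda s: s[0], reverse=True)[:top_n]

# Hypothetical incident history; identifiers and resolutions are illustrative.
history = [
    {"id": "INC-1", "alarms": {"LINK_DOWN", "BGP_FLAP"},
     "resolution": "replace optic"},
    {"id": "INC-2", "alarms": {"HIGH_LATENCY", "QUEUE_DROP", "BGP_FLAP"},
     "resolution": "roll back QoS policy"},
    {"id": "INC-3", "alarms": {"FAN_FAIL"},
     "resolution": "replace fan tray"},
]

matches = most_similar_incidents({"HIGH_LATENCY", "QUEUE_DROP"}, history)
for score, incident_id, resolution in matches:
    print(f"{incident_id}: similarity {score:.2f}, previously fixed by: {resolution}")
```

Here only INC-2 shares alarms with the current problem, so its earlier resolution is the one surfaced to the engineer as a starting point.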
The good news is that help is on hand for CSPs looking to work toward this automated assurance nirvana. By combining domain orchestration with automated network analytics, operators can build a unified view of their network, from the application layer to the virtual infrastructure and servers that support it, down to the transport network that interconnects all the devices. They can then push any required changes into the network. The result is an unprecedented level of end-to-end visibility and control.