The success of IoT depends on two key elements of the IoT ecosystem: the performance of the things (sensor-equipped devices) and the performance of the networks that connect them. The latter poses the greater risk to the Service Provider, because the devices often require very reliable, high-quality connectivity to support the services offered to high-revenue industry verticals. These customers have zero or extremely low tolerance for connectivity issues because they run IoT services that are life-critical (e.g. remote healthcare and autonomous cars), mission-critical (e.g. agriculture, fleet management and robotized factories) or otherwise important consumer services (e.g. utilities, energy savings and Smart Home). Such services depend on 100% reliable connections between devices that, in most cases, are not built to tolerate link failures, call drops, loss of signal, poor coverage, noise, latency, errors or faults. The list of possible connectivity issues is long and a challenge for Service Providers.
IoT Service Providers will need high assurance of connectivity, device to device, thing to thing.
In a hyper-connected world, failed devices or connections might not only violate SLAs, with financial implications, but, more importantly, might also impact life-critical or mission-critical communications, as mentioned. Certain network mesh topologies with high availability and built-in redundancy could reduce the impact of such failures. However, IoT Service Providers need assurance systems to discover, interpret and manage connectivity issues in real time. Traditional network and service assurance will need to be raised to a higher level for IoT connectivity assurance. Some quick approaches to achieve this would be:
- Proactive approach to problem-solving
- Pre-emptive maintenance
- Uptime protection to support SLAs
- Alternative routes for device connections
- Reliability improvements
- Inter-technology interoperability
- Maintaining low latency
Keeping an eye on the above key areas will largely ensure high connectivity, whether the IoT communication happens over WiFi, NB-IoT, LoRa, mobile or broadband networks, or a combination of these.
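To make the uptime-protection point concrete, the short Python sketch below converts an availability target into the downtime it allows over a period. The function name, the 30-day period and the sample targets are illustrative assumptions, not values taken from any particular SLA.

```python
def downtime_budget_minutes(availability_pct: float, period_days: float = 30.0) -> float:
    """Return the minutes of downtime permitted in the period at the given availability target.

    Hypothetical helper for illustration; real SLAs define their own
    measurement periods and exclusions.
    """
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability_pct must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (1.0 - availability_pct / 100.0)


if __name__ == "__main__":
    # Assumed sample targets; each extra "nine" cuts the budget tenfold.
    for target in (99.0, 99.9, 99.99, 100.0):
        budget = downtime_budget_minutes(target)
        print(f"{target}% availability allows {budget:.2f} minutes of downtime per 30 days")
```

Under these assumptions, a 99.9% target leaves roughly 43 minutes of downtime per 30 days, while a 100% target leaves none, which is why the approaches listed above have to work together.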
In finer detail, IoT connectivity assurance requires all of the following assurance components:
- Network performance management
- Network fault management
- Service quality management
- Vertical (customer) experience management
- Device (customer) experience management
- Analytics (including Machine Learning and statistical analysis)
However, to set priorities in the early days of IoT operations, IoT connectivity assurance could begin with real-time network fault management, followed by proactive device and network performance management, and then by corrective systems using closed-loop automation. Simple analytics will, in some cases, need to be introduced right from the beginning to ensure QoS, followed by machine-learning-based analytics to derive higher intelligence for assurance and new business purposes. This sequence of IoT assurance management will help ensure that the IoT network performs at a very high level, stays fault-free and always connected, and delivers on the promise of 100% connectivity and 100% reliability.
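As a minimal sketch of how simple analytics can feed closed-loop automation, the Python example below watches a rolling average of latency samples and triggers a corrective action when the average crosses a threshold. The class name, the threshold, the window size and the "reroute" action are all hypothetical. A production assurance system would use its own policies and actuators.

```python
from collections import deque
from statistics import mean
from typing import Callable, Iterable, List, Tuple


class LatencyMonitor:
    """Rolling-average latency check (illustrative, not a product API)."""

    def __init__(self, threshold_ms: float, window: int = 5) -> None:
        self.threshold_ms = threshold_ms
        self.samples: deque = deque(maxlen=window)

    def observe(self, latency_ms: float) -> bool:
        """Record a sample; return True when a full window averages above the threshold."""
        self.samples.append(latency_ms)
        window_full = len(self.samples) == self.samples.maxlen
        return window_full and mean(self.samples) > self.threshold_ms


def run_closed_loop(
    latencies: Iterable[float],
    monitor: LatencyMonitor,
    corrective_action: Callable[[], str],
) -> List[Tuple[int, str]]:
    """Feed samples to the monitor and apply the corrective action on each violation."""
    actions = []
    for index, latency in enumerate(latencies):
        if monitor.observe(latency):
            actions.append((index, corrective_action()))
            # Start a fresh window so one incident does not trigger repeatedly.
            monitor.samples.clear()
    return actions


if __name__ == "__main__":
    # Assumed sample data: latency degrades, then recovers after the action.
    samples = [50, 60, 70, 150, 200, 250, 60]
    monitor = LatencyMonitor(threshold_ms=100, window=3)
    print(run_closed_loop(samples, monitor, lambda: "reroute"))
```

Clearing the window after each action is one possible design choice. It trades some sensitivity for fewer duplicate actions while the network settles.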
Another aspect to consider is that managing the interconnections of billions of IoT devices will require almost as many connections to be assured, which means the volumes of data monitored, stored and filtered are much higher. To do this, the Operations Center will require support from other teams and significant collaboration between different functions. The commercial teams (Enterprise Sales, IoT Partnership team, Customer Care) will need to build up their capabilities to address the different needs of the diverse industry verticals that they serve. SLAs and feedback from the IoT Service Providers will ensure that suitable industry-specific or customer-specific KPIs, dashboards and analytics are built by the operations teams.
The technical operations team will typically be involved in monitoring the following:
- Devices classified by their industry vertical
- Volume and nature of communication between the devices, between devices and controllers and between humans and devices
- Performance of IoT gateways (data aggregation levels)
- End-to-end IoT network/service analytics
- End-to-end network capacity/congestion metrics
Some key connectivity assurance revamps involve the use of analytics and automation in all aspects of assurance. The following Operation Center techniques highlight the need for analytics and automation in assuring IoT networks/services:
- Predicting IoT failures: Managing IoT traffic by using fault data and analytics to forecast patterns and prevent IoT network/service/device failures. This includes building dashboards for service availability, incident/unavailability breakdown by region/location, and geolocation-based service impact
- Automating root cause analysis: Automated root cause analysis identifies the parent alarm and its relationship with network elements, reducing mean time to repair
- Predictive assurance to protect SLAs: Machine learning, when integrated with fault and performance data, offers a powerful predictive management capability to anticipate problems and helps CSPs protect their customer SLAs
- Service Quality Management with automation: With SQM able to identify policy violations and trigger closed-loop actions, many service-related problems can be proactively identified and corrective actions can be taken in the network
- Automating fault management: Automating network outage recoveries and integrating fault management with the OSS ecosystem (Trouble-Ticket, Inventory, Orchestrators, SQM, CRM, Work Force Management, etc.) will lead to an automated, Zero-touch Operation Center
- Automation and analytics for network optimization: Capacity and congestion trends in the multi-technology IoT networks can help the network planners identify capacity needs well in time, and also dynamically re-distribute capacity using automation techniques
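The automated root cause analysis item above can be illustrated with a small Python sketch. It assumes a hypothetical topology map from each network element to its upstream element, and treats the highest alarmed element on each upstream path as the parent alarm. Real assurance systems use far richer correlation rules, so this is a simplification for illustration only. All element names are invented.

```python
from typing import Dict, List, Optional, Set


def find_parent_alarms(
    upstream: Dict[str, Optional[str]], alarmed: Set[str]
) -> Dict[str, List[str]]:
    """Group alarmed elements under their highest alarmed upstream element.

    `upstream` maps each element to the element it depends on (None at the top).
    Returns {parent_alarm: [child alarms explained by it]}.
    """
    groups: Dict[str, List[str]] = {}
    for element in sorted(alarmed):
        root = element
        visited = {element}
        node = upstream.get(element)
        # Walk towards the core, remembering the highest element that is also alarmed.
        while node is not None and node not in visited:
            if node in alarmed:
                root = node
            visited.add(node)
            node = upstream.get(node)
        groups.setdefault(root, [])
        if element != root:
            groups[root].append(element)
    return groups


if __name__ == "__main__":
    # Assumed example topology: sensors behind gateways, gateways behind a core node.
    topology = {
        "core-1": None,
        "gateway-1": "core-1",
        "gateway-2": "core-1",
        "sensor-a": "gateway-1",
        "sensor-b": "gateway-1",
        "sensor-c": "gateway-2",
    }
    active_alarms = {"gateway-1", "sensor-a", "sensor-b", "sensor-c"}
    print(find_parent_alarms(topology, active_alarms))
```

In this example the two sensor alarms behind `gateway-1` collapse into a single parent alarm, so operators would handle one incident instead of three. That reduction is what shortens mean time to repair.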
The above cases of integration of analytics and automation with connectivity assurance will drive changes in the way assurance information is visualized and actioned in the Operation Centers. IoT Operation Centers will thus be rendered more agile, flexible and self-healing, and will quickly evolve to predictive fault/performance management – much needed to maintain the high reliability and high availability of IoT connections.
In the assurance systems, changes to the underlying mediation and processing engines will be introduced to ensure stability and reliability as data volumes shoot up and the need to derive higher intelligence from data grows. The assurance systems need to accept new data types, process them by type of vertical, operate closer to real time and offer cloud capabilities.
In summary, for a successful launch and long-term connectivity assurance of IoT services, an overhaul of assurance systems is mandatory, as is the introduction of analytics and automation.