eRAN7.0
Capacity Monitoring Guide
Issue: DraftA
Date: 2014-1-20
HUAWEI TECHNOLOGIES CO., LTD.
Copyright © Huawei Technologies Co., Ltd. 2014. All rights reserved. No part of this document may be reproduced or transmitted in any form or by any means without prior written consent of Huawei Technologies Co., Ltd.
Trademarks and Permissions
HUAWEI and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd. All other trademarks and trade names mentioned in this document are the property of their respective holders.
Notice
The purchased products, services and features are stipulated by the contract made between Huawei and the customer. All or part of the products, services and features described in this document may not be within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information, and recommendations in this document are provided "AS IS" without warranties, guarantees or representations of any kind, either express or implied.
The information in this document is subject to change without notice. Every effort has been made in the preparation of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this document do not constitute a warranty of any kind, express or implied.
Huawei Technologies Co., Ltd.
Address: Huawei Industrial Base, Bantian, Longgang, Shenzhen 518129, People's Republic of China
Website: http://www.huawei.com
Email: [email protected]
Issue DraftA (2014-1-20)
Huawei Proprietary and Confidential Copyright © Huawei Technologies Co., Ltd.
About This Document
Purpose
Growing traffic in mobile networks requires more and more resources, and a lack of resources affects user experience. This document provides guidelines on LTE FDD capacity monitoring, including details on how to identify resource allocation problems and how to monitor network resource usage. Capacity monitoring provides reference data for network reconfiguration and capacity expansion and enables maintenance personnel to take measures before resource insufficiency affects network QoS and user experience.
NOTE
For definitions of the man-machine language (MML) commands, parameters, alarms, and performance counters mentioned in this document, see the "Operation and Maintenance" part in 3900 Series LTE eNodeB Product Documentation for eNodeB base station, BTS3202E Product Documentation for BTS3202E base station, and BTS3203E LTE Product Documentation for BTS3203E base station.
For the BTS3202E and the BTS3203E LTE, the main control unit, transmission unit, and baseband unit share the CPU because they are integrated into the same board, called BTS3202E board or BTS3203E LTE board. The main control board and the baseband board mentioned in this document correspond to the BTS3202E board or BTS3203E LTE board, and the CPU usage of the main control board corresponds to that of the BTS3202E board or BTS3203E LTE board.
This document is not applicable to scenarios with large capacity and heavy traffic. For guidelines in such scenarios, contact Huawei technical support.
Product Versions
The following table lists the product version related to this document.

| Product Name | Product Version |
| --- | --- |
| DBS3900, BTS3900, BTS3900A, BTS3900L, BTS3900AL, BTS3202E, BTS3203E | V100R009C00. The mapping single-mode base station version is eNodeB: V100R007C00. |
Intended Audience
This document is intended for:
Field engineers
Network planning engineers
Change History
This section describes changes in each issue of this document.
Draft A (2014-1-20)
This is the first draft.
Contents
About This Document
1 Overview
1.1 Network Resources
1.2 Capacity Monitoring Methods
2 Capacity Monitoring
2.1 Introduction
2.2 Downlink User Perception
2.2.1 Monitoring Principles
2.2.2 Monitoring Methods
2.2.3 Suggested Measures
2.3 PRACH Resource Usage
2.3.1 Monitoring Principles
2.3.2 Monitoring Methods
2.3.3 Suggested Measures
2.4 PDCCH Resource Usage
2.4.1 Monitoring Principles
2.4.2 Monitoring Methods
2.4.3 Suggested Measures
2.5 Connected User License Usage
2.5.1 Monitoring Principles
2.5.2 Monitoring Methods
2.5.3 Suggested Measures
2.6 Paging Resource Usage
2.6.1 Monitoring Principles
2.6.2 Monitoring Methods
2.6.3 Suggested Measures
2.7 Main-Control-Board CPU Usage
2.7.1 Monitoring Principles
2.7.2 Monitoring Methods
2.7.3 Suggested Measures
2.8 LBBP CPU Usage
2.8.1 Monitoring Principles
2.8.2 Monitoring Methods
2.8.3 Suggested Measures
2.9 Transport Resource Group Usage
2.9.1 Monitoring Principles
2.9.2 Monitoring Methods
2.9.3 Suggested Measures
2.10 Ethernet Port Traffic
2.10.1 Monitoring Principles
2.10.2 Monitoring Methods
2.10.3 Suggested Measures
3 Resource Allocation Problem Identification
3.1 Resource Congestion Indicators
3.1.1 RRC Resource Congestion Rate
3.1.2 E-RAB Resource Congestion Rate
3.2 Resource Allocation Problem Identification Process
4 Related Counters
1 Overview
This chapter describes the types of network resources to be monitored and the method of performing capacity monitoring.
1.1 Network Resources
Figure 1-1 shows the network resources to be monitored.
Figure 1-1 Network resources to be monitored
Table 1-1 describes the types of network resources to be monitored and impacts of resource insufficiency on the system.
Table 1-1 Network resources

| Resource Type | Resource | Meaning | Impact of Resource Insufficiency on the System | Monitoring Item |
| --- | --- | --- | --- | --- |
| Cell resources | Physical resource blocks (PRBs) | Bandwidth consumed on the air interface | Users may fail to be admitted, and experience of admitted users is affected. | Downlink User Perception |
| Cell resources | Physical random access channel (PRACH) resources | Random access preambles carried on the PRACH | Access delays are prolonged, or even access attempts fail. | PRACH Resource Usage |
| Cell resources | Physical downlink control channel (PDCCH) resources | Downlink control channel resources | Uplink and downlink scheduling delays are prolonged, and user experience is affected. | PDCCH Resource Usage |
| eNodeB resources | Connected user license | Maximum permissible number of users in RRC_CONNECTED mode | New services cannot be admitted, and experience of admitted users is affected. | Connected User License Usage |
| eNodeB resources | Paging resources | eNodeB paging capacity | Paging messages may be lost, affecting user experience. | Paging Resource Usage |
| eNodeB resources | Main-control-board CPU | Processing capability of the main control board of the eNodeB | KPIs deteriorate. | Main-Control-Board CPU Usage |
| eNodeB resources | LTE baseband process unit (LBBP) CPU | Processing capability of the LBBP board | KPIs deteriorate. | LBBP CPU Usage |
| eNodeB resources | Transport resource groups | eNodeB logical transport resources | Packets may be lost, affecting user experience. | Transport Resource Group Usage |
| eNodeB resources | Ethernet ports | eNodeB physical transport resources | Packets may be lost, affecting user experience. | Ethernet Port Traffic |
1.2 Capacity Monitoring Methods
Capacity monitoring can be implemented using the following two methods:
Daily monitoring for prediction: Counters are used to indicate the load or usage of various types of resources on the LTE network. Thresholds for resource consumption are specified so that preventive measures such as reconfiguration and expansion can be taken to prevent network congestion when the consumption of a type of resource continually exceeds the threshold. For details, see chapter 2 "Capacity Monitoring."
Problem-driven analysis: This method helps identify whether a problem indicated by counters is caused by network congestion through in-depth analysis. With this method, problems can be precisely located so that users can work out a proper network optimization and expansion solution. For details, see chapter 3 "Resource Allocation Problem Identification."
Thresholds defined for capacity monitoring in this document are generally lower than those for alarm triggering so that risks of resource insufficiency can be detected as early as possible.
Thresholds given in this document apply to networks experiencing a steady growth. Thresholds are determined based on experiences. For example, the connected user license usage threshold 60% is specified based on the peak-to-average ratio (about 1.5:1). When the average usage reaches 60%, the peak usage approaches 100%. Threshold determining considers both average and peak values. Telecom operators can define thresholds based on the actual situation.
Telecom operators are encouraged to formulate an optimization solution for resource capacity based on prediction and analysis for networks that are experiencing fast development, scheduled to deploy new services, or about to employ new charging plans. If you require services related to resource capacity optimization, such as prediction, evaluation, optimization, reconfiguration, and capacity expansion, contact Huawei technical support.
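The relationship between the 60% threshold and the peak-to-average ratio described above can be checked with a short calculation. The following is a minimal sketch, not part of any Huawei tool; the function name and parameter names are hypothetical and chosen only for illustration.

```python
def estimated_peak_usage(average_usage_pct: float,
                         peak_to_average_ratio: float = 1.5) -> float:
    """Estimate peak usage (percent) from average usage.

    The default ratio of 1.5 is the approximate peak-to-average ratio
    quoted in this document; the function itself is a hypothetical helper.
    """
    return average_usage_pct * peak_to_average_ratio


# With the ratio of about 1.5:1, an average usage of 60% corresponds to
# an estimated peak usage of 90%, which is close to full usage.
peak_at_threshold = estimated_peak_usage(60.0)
```

An operator whose network shows a different peak-to-average ratio could pass that ratio instead and pick an average-usage threshold that keeps the estimated peak below 100%.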
2 Capacity Monitoring
This chapter describes monitoring principles and methods, as well as related counters, of all types of service resources. Information about how to locate resource bottlenecks and the related handling suggestions are also provided. Note that resource insufficiency may be determined by usage of more than one type of service resource. For example, a resource bottleneck can be claimed only when both connected user license usage and main-control-board CPU usage exceed the predefined thresholds.
2.1 Introduction
To monitor counters accurately, you need to determine the busy hours of the system. You are advised to define busy hours as the period of a day during which the system or a cell consumes the most resources.
Table 2-1 describes types of resources to be monitored, thresholds, and handling suggestions.
Table 2-1 Types of resources to be monitored, thresholds, and handling suggestions

| Resource Type | Monitoring Item | Conditions | Handling Suggestions |
| --- | --- | --- | --- |
| Cell resources | Downlink User Perception | Downlink PRB usage ≥ 70% and downlink user-perceived rate < 2 Mbit/s (default value, user-configurable) | Add carriers or eNodeBs. |
| Cell resources | PRACH Resource Usage | Usage of preambles for contention-based access ≥ 75% | Enable the adaptive backoff or resource adjustment algorithm for the PRACH. |
| Cell resources | PRACH Resource Usage | Usage of preambles for non-contention-based access ≥ 75% | Enable the PRACH resource adjustment algorithm and reuse of dedicated preambles between UEs. |
| Cell resources | PDCCH Resource Usage | CCE usage ≥ 80% and uplink or downlink PRB usage < 90% | Set PDCCH Symbol Number Adjust Switch to On. |
| Cell resources | PDCCH Resource Usage | CCE usage ≥ 80% and uplink or downlink PRB usage ≥ 90% | No handling is required. |
| eNodeB resources | Connected User License Usage | Connected user license usage ≥ 60% and main-control-board CPU usage < 60% | Add licenses. |
| eNodeB resources | Connected User License Usage | Connected user license usage ≥ 60% and main-control-board CPU usage ≥ 60% | Add eNodeBs. |
| eNodeB resources | Paging Resource Usage | Percentage of paging messages received on the S1 interface ≥ 60% or number of discarded paging messages in a day ≥ 1500 | Decrease the number of cells in the tracking area list (TAL) that the congested cell belongs to. |
| eNodeB resources | Main-Control-Board CPU Usage | Average main-control-board CPU usage ≥ 60% or percentage of times that the CPU usage reaches or exceeds 85% ≥ 5% | Expand the control-plane capacity of the eNodeB. |
| eNodeB resources | LBBP CPU Usage | Average LBBP CPU usage ≥ 60% or percentage of times that the CPU usage reaches or exceeds 85% ≥ 5% | Expand the user-plane capacity of the eNodeB. |
| eNodeB resources | Transport Resource Group Usage | Packet loss rate ≥ 0.05%, proportion of average transmission rate to configured bandwidth ≥ 80%, or proportion of maximum transmission rate to configured bandwidth ≥ 90% | Expand the bandwidth of the transport resource group. |
| eNodeB resources | Ethernet Port Traffic | Proportion of average transmission rate to allocated bandwidth ≥ 70% or proportion of maximum transmission rate to allocated bandwidth ≥ 85% | Expand the eNodeB transmission capacity. |
2.2 Downlink User Perception
2.2.1 Monitoring Principles
Growing traffic leads to a continuous increase in PRB usage. When the PRB usage approaches 100%, user-perceived rates decrease. Because the downlink is the major concern in an LTE network, this document describes only how to monitor downlink user perception. The monitoring principles also apply to the uplink.
2.2.2 Monitoring Methods
The following items are used in monitoring this case:
Downlink PRB usage = L.ChMeas.PRB.DL.Used.Avg/L.ChMeas.PRB.DL.Avail x 100%
Downlink user-perceived rate (Mbit/s) = L.Thrp.bits.DL/L.Thrp.Time.DL/1000
where
L.ChMeas.PRB.DL.Used.Avg indicates the average number of used downlink PRBs.
L.ChMeas.PRB.DL.Avail indicates the number of available downlink PRBs.
L.Thrp.bits.DL indicates the total throughput of downlink data transmitted at the PDCP layer in a cell.
L.Thrp.Time.DL indicates the duration for transmitting downlink data at the PDCP layer in a cell.
2.2.3 Suggested Measures
Add carriers or eNodeBs if both of the following conditions are met:
Downlink PRB usage ≥ 70%
Downlink user-perceived rate < a user-defined threshold (default value: 2 Mbit/s)
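The two formulas in section 2.2.2 and the conditions above can be sketched as follows. This is an illustrative sketch only: the function and parameter names are hypothetical, the counter values must be exported from the performance management system by the reader, and the division by 1000 is taken from the formula as written (the unit of L.Thrp.Time.DL is not stated in this section).

```python
def downlink_prb_usage(prb_dl_used_avg: float, prb_dl_avail: float) -> float:
    """Downlink PRB usage in percent.

    Inputs correspond to L.ChMeas.PRB.DL.Used.Avg and L.ChMeas.PRB.DL.Avail.
    """
    if prb_dl_avail <= 0:
        raise ValueError("prb_dl_avail must be positive")
    return 100.0 * prb_dl_used_avg / prb_dl_avail


def downlink_user_rate_mbps(thrp_bits_dl: float, thrp_time_dl: float) -> float:
    """Downlink user-perceived rate: L.Thrp.bits.DL/L.Thrp.Time.DL/1000."""
    if thrp_time_dl <= 0:
        raise ValueError("thrp_time_dl must be positive")
    return thrp_bits_dl / thrp_time_dl / 1000.0


def needs_carrier_or_enodeb(prb_usage_pct: float,
                            user_rate_mbps: float,
                            rate_threshold_mbps: float = 2.0) -> bool:
    """True if both conditions in section 2.2.3 are met.

    rate_threshold_mbps defaults to the document's 2 Mbit/s default value.
    """
    return prb_usage_pct >= 70.0 and user_rate_mbps < rate_threshold_mbps
```

For example, with hypothetical busy-hour values of 75 used PRBs out of 100 and a user-perceived rate of 1.5 Mbit/s, `needs_carrier_or_enodeb(75.0, 1.5)` evaluates to `True`.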
2.3 PRACH Resource Usage
2.3.1 Monitoring Principles
The PRACH transmits preambles during random access procedures. If the number of contention-based random access attempts in a second reaches or exceeds N, the preamble conflict probability and access delay increase. The value of N is determined during preamble design based on factors such as the requirement that the preamble conflict probability be less than 1%. If more than 100 non-contention-based random access attempts are initiated per second, dedicated preambles become insufficient and the eNodeB instructs the UE to initiate contention-based random access instead, increasing the access delay for the UE. In handover scenarios, the handover procedure is prolonged.
2.3.2 Monitoring Methods
The following items are used in monitoring this case:
Random preamble usage = (L.RA.GrpA.Att + L.RA.GrpB.Att)/3600/N x 100%
Dedicated preamble usage = L.RA.Dedicate.Att/3600/100 x 100%
where
L.RA.GrpA.Att indicates the number of times that random preambles in group A are received.
L.RA.GrpB.Att indicates the number of times that random preambles in group B are received.
L.RA.Dedicate.Att indicates the number of times that dedicated preambles are received.
The value of N varies as follows:
− If the system bandwidth is 15 MHz or 20 MHz, N is 100.
− If the system bandwidth is 5 MHz or 10 MHz and the PRACH resource adjustment algorithm is disabled, N is 50.
− If the system bandwidth is 5 MHz or 10 MHz and the PRACH resource adjustment algorithm is enabled, N is 100.
To check whether the PRACH resource adjustment algorithm is enabled, run the LST CELLALGOSWITCH command to query the value of RachAlgoSwitch.
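The preamble usage formulas and the selection of N can be sketched as follows. The function names are hypothetical, and the sketch assumes, as the division by 3600 in the formulas suggests, that the counters cover a one-hour measurement period. Bandwidths other than those listed above are rejected because this section gives no value of N for them.

```python
def contention_capacity_n(bandwidth_mhz: int, rach_adj_enabled: bool) -> int:
    """Return N, following the rules listed for the system bandwidth."""
    if bandwidth_mhz in (15, 20):
        return 100
    if bandwidth_mhz in (5, 10):
        return 100 if rach_adj_enabled else 50
    raise ValueError("no value of N is given for this bandwidth")


def random_preamble_usage(grp_a_att: int, grp_b_att: int, n: int) -> float:
    """(L.RA.GrpA.Att + L.RA.GrpB.Att)/3600/N x 100%, in percent."""
    return 100.0 * (grp_a_att + grp_b_att) / (3600 * n)


def dedicated_preamble_usage(dedicate_att: int) -> float:
    """L.RA.Dedicate.Att/3600/100 x 100%, in percent."""
    return 100.0 * dedicate_att / (3600 * 100)
```

For a hypothetical 10 MHz cell with the PRACH resource adjustment algorithm disabled, `contention_capacity_n(10, False)` gives N = 50, so 135,000 contention-based attempts in the hour correspond to a random preamble usage of 75%.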
2.3.3 Suggested Measures
You are advised to take the following measures:
If the random preamble usage reaches or exceeds 75% for X days (three days by default) in a week, enable the adaptive backoff function by running the following command to help reduce the peak RACH load and average access delay:
MOD CELLALGOSWITCH: LocalCellId=x, RachAlgoSwitch=BackOffSwitch-1;
If the system bandwidth is 5 MHz or 10 MHz, it is good practice to enable the PRACH resource adjustment algorithm by running the following command:
MOD CELLALGOSWITCH: LocalCellId=x,RachAlgoSwitch=RachAdjSwitch-1;
If the dedicated preamble usage reaches or exceeds 75% for X days (three days by default) in a week, enable the PRACH resource adjustment algorithm and reuse of dedicated preambles between UEs by running the following command:
MOD CELLALGOSWITCH: LocalCellId=x,RachAlgoSwitch=RachAdjSwitch-1,RachAlgoSwitch=MaksIdxSwitch-1;
This helps reduce the probability of UEs initiating contention-based random access in the case of dedicated preamble insufficiency and therefore helps reduce the access delay.
2.4 PDCCH Resource Usage
2.4.1 Monitoring Principles
This capacity indicator measures the number of control channel elements (CCEs) that can be used by the PDCCH. In each radio frame, CCEs must be allocated to the uplink and downlink UEs to be scheduled and to common control signaling. PDCCH CCEs must be properly configured and allocated to minimize downlink control overheads as well as to ensure satisfactory user-plane throughput.
If PDCCH symbols are insufficient, CCEs may fail to be allocated to UEs to be scheduled, which will result in a long service delay and unsatisfactory user experience.
If PDCCH symbols are excessive, which indicates that the usage of PDCCH CCEs is low, the resources that can be used by the PDSCH decrease. This also results in low spectral efficiency.
If the value of PDCCH Symbol Number Adjust Switch is On, you do not need to monitor PDCCH resource usage. The reason is that the eNodeB automatically adjusts the number of PDCCH symbols based on the CCE load to meet the CCE requirement while preventing excessive PDSCH resource consumption. You can run the LST CELLPDCCHALGO command to query the setting of PDCCH Symbol Number Adjust Switch.
2.4.2 Monitoring Methods
The following item is used in monitoring this case:
CCE usage = (L.ChMeas.CCE.CommUsed + L.ChMeas.CCE.ULUsed + L.ChMeas.CCE.DLUsed)/L.ChMeas.CCE.Avail x 100%
where
L.ChMeas.CCE.CommUsed indicates the number of PDCCH CCEs used for common signaling.
L.ChMeas.CCE.ULUsed indicates the number of PDCCH CCEs used for uplink scheduling.
L.ChMeas.CCE.DLUsed indicates the number of PDCCH CCEs used for downlink scheduling.
L.ChMeas.CCE.Avail indicates the number of available CCEs.
2.4.3 Suggested Measures
Measures to be taken also depend on the PRB usage. If the CCE usage reaches or exceeds 80% for X days (three days by default) in a week, take the following measures:
If the uplink or downlink PRB usage is less than 90% and the value of PDCCH Symbol Number Adjust Switch is Off, turn on the switch by running the following command:
MOD CELLPDCCHALGO: LocalCellId=x, PdcchSymNumSwitch=ON;
If the uplink or downlink PRB usage reaches or exceeds 90%, no handling is required.
For details about uplink or downlink PRB usage, see section 2.2 "Downlink User Perception".
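The CCE usage formula and the decision in section 2.4.3 can be sketched as follows. The names are hypothetical. The sketch takes a single PRB usage value, because this section does not say whether the uplink value, the downlink value, or both must be below 90%; the caller is assumed to pass whichever value the operator has chosen to apply.

```python
def cce_usage(comm_used: int, ul_used: int, dl_used: int, avail: int) -> float:
    """CCE usage in percent.

    Inputs correspond to L.ChMeas.CCE.CommUsed, L.ChMeas.CCE.ULUsed,
    L.ChMeas.CCE.DLUsed, and L.ChMeas.CCE.Avail.
    """
    if avail <= 0:
        raise ValueError("avail must be positive")
    return 100.0 * (comm_used + ul_used + dl_used) / avail


def pdcch_measure(cce_usage_pct: float, prb_usage_pct: float,
                  symbol_adjust_switch_on: bool) -> str:
    """Map the measured values to the measure suggested in section 2.4.3."""
    if symbol_adjust_switch_on:
        # The eNodeB already adapts the number of PDCCH symbols to the CCE load.
        return "monitoring not required"
    if cce_usage_pct >= 80.0 and prb_usage_pct < 90.0:
        return "turn on PDCCH Symbol Number Adjust Switch"
    return "no handling required"
```

The returned strings are placeholders for illustration; a real script would more likely return an enumeration or trigger a work order. The "X days in a week" condition is not modeled here and would need to be applied to the daily results.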
2.5 Connected User License Usage
2.5.1 Monitoring Principles
The connected user license specifies the maximum permissible number of users in RRC_CONNECTED mode. If the connected user license usage exceeds a preconfigured threshold, users may fail to access the network.
2.5.2 Monitoring Methods
The following item is used in monitoring this case:
Connected user license usage = ∑L.Traffic.User.Avg/Licensed number of connected users x 100%
where
L.Traffic.User.Avg indicates the average number of connected users in a cell. ∑L.Traffic.User.Avg indicates the sum of the average number of connected users in all cells under an eNodeB.
The licensed number of connected users can be queried by running the following command: DSP LICENSE: FUNCTIONTYPE=eNodeB;
In the command output, the value of LLT1ACTU01 in the Allocated column is the licensed number of connected users.
2.5.3 Suggested Measures
Measures to be taken also depend on the main-control-board CPU usage. If the connected user license usage reaches or exceeds 60% for X days (three days by default) in a week, you are advised to take the following measures:
If the main-control-board CPU usage is less than 60%, increase the licensed limit.
If the main-control-board CPU usage reaches or exceeds 60%, add an eNodeB.
For details about main-control-board CPU usage, see section 2.7 "Main-Control-Board CPU Usage."
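The license usage formula and the two-way decision above can be sketched as follows. The function names are hypothetical; the per-cell averages stand for L.Traffic.User.Avg in each cell of the eNodeB, and the licensed number is the value read from the DSP LICENSE output.

```python
from typing import Sequence


def connected_user_license_usage(avg_users_per_cell: Sequence[float],
                                 licensed_users: int) -> float:
    """Sum of L.Traffic.User.Avg over all cells / licensed users x 100%."""
    if licensed_users <= 0:
        raise ValueError("licensed_users must be positive")
    return 100.0 * sum(avg_users_per_cell) / licensed_users


def license_measure(license_usage_pct: float, mcb_cpu_usage_pct: float) -> str:
    """Map the measured values to the measure suggested in section 2.5.3."""
    if license_usage_pct < 60.0:
        return "no handling required"
    if mcb_cpu_usage_pct < 60.0:
        return "increase the licensed limit"
    return "add an eNodeB"
```

With hypothetical values of three cells averaging 100, 150, and 200 connected users and a license for 600 users, the usage is 75%. The suggested measure then depends only on whether the main-control-board CPU usage is below 60%.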
2.6 Paging Resource Usage
2.6.1 Monitoring Principles
The eNodeB and BTS3202E or BTS3203E LTE can process a maximum of 750 and 500 paging messages per second, respectively. If the number of paging messages exceeds that capacity, paging messages sent from the eNodeB to UEs may be discarded, which leads to a decrease in the call completion rate.
2.6.2 Monitoring Methods
The following items are used in monitoring this case:
Percentage of paging messages received over the S1 interface = L.Paging.S1.Rx/3600/Maximum number of paging messages that can be processed per second x 100%
L.Paging.Dis.Num
where
L.Paging.S1.Rx indicates the number of paging messages received over the S1 interface.
L.Paging.Dis.Num indicates the number of paging messages discarded over the Uu interface.
2.6.3 Suggested Measures
You are advised to decrease the number of cells in the tracking area list (TAL) that the congested cell belongs to if either of the following conditions is met for X days (three days by default) in a week:
The percentage of paging messages received by the eNodeB over the S1 interface reaches or exceeds 60%.
1500 or more paging messages from the mobility management entity (MME) to UEs are discarded in a day.
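The paging formula and the two conditions above can be sketched as follows. The names are hypothetical. The sketch assumes that L.Paging.S1.Rx is collected over one hour (as the division by 3600 suggests) and that the caller supplies the daily total of L.Paging.Dis.Num.

```python
# Maximum number of paging messages processed per second, from section 2.6.1.
MAX_PAGING_PER_SECOND_ENODEB = 750
MAX_PAGING_PER_SECOND_BTS320X = 500  # BTS3202E or BTS3203E LTE


def paging_s1_usage(paging_s1_rx: int, max_per_second: int) -> float:
    """L.Paging.S1.Rx/3600/maximum paging messages per second x 100%."""
    if max_per_second <= 0:
        raise ValueError("max_per_second must be positive")
    return 100.0 * paging_s1_rx / (3600 * max_per_second)


def paging_day_exceeds(s1_usage_pct: float, discarded_in_day: int) -> bool:
    """True if either condition in section 2.6.3 is met on a given day."""
    return s1_usage_pct >= 60.0 or discarded_in_day >= 1500
```

For a hypothetical eNodeB that receives 1,620,000 paging messages in the busy hour, `paging_s1_usage(1_620_000, MAX_PAGING_PER_SECOND_ENODEB)` is 60%, which meets the first condition for that day.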
2.7 Main-Control-Board CPU Usage
2.7.1 Monitoring Principles
The CPU usage reflects the busy level of the eNodeB. If the main-control-board CPUs are busy processing control plane or user plane data, signaling-related KPIs may deteriorate, and users may experience a low access success rate, low E-RAB setup success rate, or high service drop rate. Operators can determine whether KPI deterioration is caused by insufficient main-control-board CPU processing capability or poor radio conditions. The evaluation is as follows:
If the MCS measurement and initial-transmission failure measurement indicate that the channel quality is poor, KPI deterioration may not be caused by main-control-board CPU overload but by deterioration in channel quality.
If the KPIs deteriorate and the main-control-board CPU usage exceeds a preconfigured threshold, you are advised to perform capacity expansion according to section 2.7.3 "Suggested Measures."
2.7.2 Monitoring Methods
The following items are used in monitoring this case:
VS.Board.CPUload.Mean
Percentage of times that the main-control-board CPU usage reaches or exceeds a preconfigured threshold (85%) = VS.Board.CPULoad.CumulativeHighloadCount/3600 x 100%
where
VS.Board.CPUload.Mean indicates the average main-control-board CPU usage.
VS.Board.CPULoad.CumulativeHighloadCount indicates the number of times that the main-control-board CPU usage exceeds a preconfigured threshold.
2.7.3 Suggested Measures
The main-control-board CPU becomes overloaded if either of the following conditions is met for X days (three days by default) in a week:
The average main-control-board CPU usage reaches or exceeds 60%.
The percentage of times that the main-control-board CPU usage reaches or exceeds 85% is greater than or equal to 5%.
When the main-control-board CPU is overloaded, you are advised to add an eNodeB and connect it to the evolved packet core (EPC) through a new S1 interface.
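The overload rule above, including the "X days in a week" condition, can be sketched as follows. The names are hypothetical. Each day is represented by the busy-hour values of VS.Board.CPUload.Mean and VS.Board.CPULoad.CumulativeHighloadCount, and the sketch assumes a one-hour measurement period, as the division by 3600 suggests.

```python
from typing import Sequence, Tuple


def highload_percentage(cumulative_highload_count: int) -> float:
    """VS.Board.CPULoad.CumulativeHighloadCount/3600 x 100%, in percent."""
    return 100.0 * cumulative_highload_count / 3600


def cpu_day_exceeds(mean_cpu_pct: float, cumulative_highload_count: int) -> bool:
    """True if either condition is met: mean >= 60% or high-load share >= 5%."""
    return (mean_cpu_pct >= 60.0
            or highload_percentage(cumulative_highload_count) >= 5.0)


def cpu_overloaded(week: Sequence[Tuple[float, int]], min_days: int = 3) -> bool:
    """True if the daily condition is met on at least min_days days.

    week holds one (mean CPU usage, high-load count) pair per day;
    min_days is X, three by default.
    """
    days = sum(1 for mean, count in week if cpu_day_exceeds(mean, count))
    return days >= min_days
```

Because section 2.8 uses the same counters and thresholds for the LBBP, the same helper could be applied to LBBP measurements, under the same assumptions.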
2.8 LBBP CPU Usage
2.8.1 Monitoring Principles
If the eNodeB receives too much traffic volume, which is expressed either in bit/s or packet/s, the LBBP CPU responsible for user plane processing is heavily loaded. As a result, the eNodeB has a low RRC connection setup success rate, low E-RAB setup success rate, low handover success rate, and high service drop rate.
2.8.2 Monitoring Methods
The following items are used in monitoring this case:
VS.Board.CPUload.Mean
Percentage of times that the LBBP CPU usage reaches or exceeds a preconfigured threshold (85%) = VS.Board.CPULoad.CumulativeHighloadCount/3600 x 100%
where
VS.Board.CPUload.Mean indicates the average LBBP CPU usage.
VS.Board.CPULoad.CumulativeHighloadCount indicates the number of times that the LBBP CPU usage exceeds a preconfigured threshold.
2.8.3 Suggested Measures
The LBBP CPU becomes overloaded if either of the following conditions is met for X days (three days by default) in a week:
The average LBBP CPU usage reaches or exceeds 60%.
The percentage of times that the LBBP CPU usage reaches or exceeds 85% is greater than or equal to 5%.
When the LBBP CPU is overloaded, you are advised to perform capacity expansion on the eNodeB user plane as follows:
If the LBBP is an LBBPc, replace the LBBPc with an LBBPd.
Add an LBBP to share the network load, and then determine whether to move existing cells or add new cells based on the number of UEs. The capacity expansion methods are as follows:
− If the radio resources are sufficient (that is, the usage of each type of radio resources is lower than the threshold), move cells from the existing LBBP to the new LBBP.
− If the radio resources are insufficient, set up new cells on the new LBBP.
If the eNodeB has multiple LBBPs and one of them is overloaded, move cells from the overloaded LBBP to an LBBP with a lighter load. LBBP load can be indicated by the following:
− Average CPU usage
− Percentage of times that the CPU usage reaches or exceeds a preconfigured threshold
− Number of cells established on an LBBP
If the eNodeB already has the maximum of six LBBPs and more LBBPs are required, add an eNodeB.
2.9 Transport Resource Group Usage
2.9.1 Monitoring Principles
A transport resource group carries a set of data streams, which can be local data or forwarded data. Local data is classified into control plane, user plane, operation and maintenance (OM), and IP clock data. Forwarded data is not divided into different types. If a transport resource group is congested, it cannot transmit or forward data, which affects service provision. Transport resource groups that carry user plane data are the monitored objects. Figure 2-1 shows the position of the transport resource group in the TCP/IP model.
Figure 2-1 The position of the transport resource group
2.9.2 Monitoring Methods
The following items are used to monitor transport resource group usage:
Packet loss rate = VS.RscGroup.TxDropPkts/VS.RscGroup.TxPkts x 100%
Proportion of the average transmission rate to the configured bandwidth = VS.RscGroup.TxMeanSpeed/Bandwidth configured for the transport resource group x 100%
Proportion of the maximum transmission rate to the configured bandwidth = VS.RscGroup.TxMaxSpeed/Bandwidth configured for the transport resource group x 100%
where
VS.RscGroup.TxDropPkts indicates the number of packets discarded because of transmission failures of a transport resource group.
VS.RscGroup.TxPkts indicates the number of packets transmitted by a transport resource group.
VS.RscGroup.TxMeanSpeed indicates the average transmission rate of a transport resource group.
VS.RscGroup.TxMaxSpeed indicates the maximum transmission rate of a transport resource group.
The bandwidth configured for a transport resource group can be queried by running the following command:
DSP RSCGRP: CN=x, SRN=x, SN=x, BEAR=xx, SBT=xxxx, PT=xxx;
In the command output, the value of Tx Bandwidth is the bandwidth configured for the transport resource group.
2.9.3 Suggested Measures
A transport resource group is congested if any of the following conditions is met:
The packet loss rate reaches or exceeds 0.05% for five days in a week.
The proportion of the average transmission rate to the configured bandwidth reaches or exceeds 80% for five days in a week.
The proportion of the maximum transmission rate to the configured bandwidth reaches or exceeds 90% for two days in a week.
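The three congestion conditions can be evaluated together, as in the following Python sketch. The `DailyRscGroupReading` container and the function names are hypothetical, one reading per day is assumed, the rates are assumed to use the same unit as the configured Tx Bandwidth, and "for five days" is read as "on at least five days".

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class DailyRscGroupReading:
    """One day's counters for a transport resource group (hypothetical container)."""
    tx_drop_pkts: int     # VS.RscGroup.TxDropPkts
    tx_pkts: int          # VS.RscGroup.TxPkts
    tx_mean_speed: float  # VS.RscGroup.TxMeanSpeed
    tx_max_speed: float   # VS.RscGroup.TxMaxSpeed


def percent(numerator: float, denominator: float) -> float:
    """Return numerator/denominator x 100%, or 0 when the denominator is 0."""
    return numerator * 100.0 / denominator if denominator else 0.0


def is_group_congested(week: Sequence[DailyRscGroupReading],
                       tx_bandwidth: float) -> bool:
    """Apply the packet loss, average rate, and maximum rate conditions."""
    loss_days = sum(percent(r.tx_drop_pkts, r.tx_pkts) >= 0.05 for r in week)
    mean_days = sum(percent(r.tx_mean_speed, tx_bandwidth) >= 80.0 for r in week)
    max_days = sum(percent(r.tx_max_speed, tx_bandwidth) >= 90.0 for r in week)
    return loss_days >= 5 or mean_days >= 5 or max_days >= 2
```

Here `tx_bandwidth` stands for the Tx Bandwidth value reported by DSP RSCGRP.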
When a transport resource group is congested, you are advised to expand the bandwidth of the transport resource group. The following is an example command:
MOD RSCGRP: CN=x, SRN=x, SN=x, BEAR=IP, SBT=BASE_BOARD, PT=ETH, PN=x, RSCGRPID=x, RU=x, TXBW=xxxx, RXBW=xxxx;
If the problem persists after the bandwidth adjustment, you are advised to expand the eNodeB bandwidth.
2.10 Ethernet Port Traffic
2.10.1 Monitoring Principles
The Ethernet port traffic is the channel traffic at the physical layer, including uplink and downlink traffic. The eNodeB Ethernet port traffic reflects the throughput and communication quality of the Ethernet ports on the main control board of the eNodeB. Based on the monitoring results, you can determine whether the transmission capacity allocated by an operator for the S1 and X2 interfaces on the eNodeB meets the requirements for uplink and downlink transmissions.
2.10.2 Monitoring Methods
The following items are used to monitor Ethernet port traffic:
(Item 1) Proportion of the average uplink transmission rate to the allocated bandwidth = VS.FEGE.TxMeanSpeed/Allocated bandwidth x 100%
(Item 2) Proportion of the maximum uplink transmission rate to the allocated bandwidth = VS.FEGE.TxMaxSpeed/Allocated bandwidth x 100%
(Item 3) Proportion of the average downlink reception rate to the allocated bandwidth = VS.FEGE.RxMeanSpeed/Allocated bandwidth x 100%
(Item 4) Proportion of the maximum downlink reception rate to the allocated bandwidth = VS.FEGE.RxMaxSpeed/Allocated bandwidth x 100%
where
VS.FEGE.TxMeanSpeed indicates the average transmission rate of an Ethernet port.
VS.FEGE.TxMaxSpeed indicates the maximum transmission rate of an Ethernet port.
VS.FEGE.RxMeanSpeed indicates the average reception rate of an Ethernet port.
VS.FEGE.RxMaxSpeed indicates the maximum reception rate of an Ethernet port.
The allocated bandwidth can be queried by referring to Table 2-2.
Table 2-2 Allocated bandwidth
Value of LR Switch | Main Control Board | Allocated Bandwidth
Disable | UMPT | 1 Gbit/s
Disable | LMPT | For items 1 and 2: 360 Mbit/s; for items 3 and 4: 540 Mbit/s
Disable | BTS3202E board or BTS3203E LTE board | For items 1 and 2: 60 Mbit/s; for items 3 and 4: 178 Mbit/s
Enable | UMPT, LMPT | For items 1 and 2: value of UL Committed Information Rate (Kbit/s); for items 3 and 4: value of DL Committed Information Rate (Kbit/s)
You can run the LST LR command to query the values of LR Switch, UL Committed Information Rate (Kbit/s), and DL Committed Information Rate (Kbit/s).
The types of main control boards can be queried by running the following command:
DSP BRD: CN=x, SRN=x, SN=x;
In the command output, the value of Config Type is the type of the main control board.
2.10.3 Suggested Measures
You are advised to perform transmission capacity expansion if either of the following conditions is met:
The proportion of the average uplink transmission rate (or downlink reception rate) to the allocated bandwidth reaches or exceeds 70% for at least five days in a week. The allocated bandwidth is 750 Mbit/s by default. The bandwidth actually allocated can be obtained from the operator.
The proportion of the maximum uplink transmission rate (or downlink reception rate) to the allocated bandwidth reaches or exceeds 85% for at least two days in a week.
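As an illustration, the two expansion conditions can be checked per direction with a short Python sketch. The function names and the per-day `(mean, max)` tuples are hypothetical, and the rates are assumed to use the same unit as the allocated bandwidth.

```python
from typing import Sequence, Tuple

# One (mean_speed, max_speed) pair per day, in the unit of the allocated bandwidth.
DailyRates = Tuple[float, float]


def direction_needs_expansion(days: Sequence[DailyRates], allocated_bw: float,
                              mean_limit_pct: float = 70.0,
                              max_limit_pct: float = 85.0) -> bool:
    """Check one direction (uplink transmission or downlink reception)."""
    # Days on which the average rate reaches or exceeds 70% of the allocation.
    mean_days = sum(mean * 100.0 / allocated_bw >= mean_limit_pct
                    for mean, _ in days)
    # Days on which the maximum rate reaches or exceeds 85% of the allocation.
    max_days = sum(peak * 100.0 / allocated_bw >= max_limit_pct
                   for _, peak in days)
    return mean_days >= 5 or max_days >= 2


def port_needs_expansion(tx_days: Sequence[DailyRates],
                         rx_days: Sequence[DailyRates],
                         ul_allocated_bw: float,
                         dl_allocated_bw: float) -> bool:
    """Items 1 and 2 use the uplink allocation; items 3 and 4 the downlink one."""
    return (direction_needs_expansion(tx_days, ul_allocated_bw)
            or direction_needs_expansion(rx_days, dl_allocated_bw))
```

The `tx_days` values correspond to VS.FEGE.TxMeanSpeed and VS.FEGE.TxMaxSpeed, and the `rx_days` values to VS.FEGE.RxMeanSpeed and VS.FEGE.RxMaxSpeed.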
3 Resource Allocation Problem Identification
This chapter describes how to identify resource allocation problems. Network abnormalities can be found through KPI monitoring. If a KPI deteriorates, analyze the access counters (RRC resource congestion rate and E-RAB resource congestion rate) to check whether the deterioration is caused by resource congestion.
3.1 Resource Congestion Indicators
Resource congestion indicators (such as the RRC resource congestion rate and E-RAB resource congestion rate) can be used to check whether the network is congested. Table 3-1 lists the counters related to KPIs.
Table 3-1 Counters related to KPIs
Performance Counter | Description
L.RRC.ConnReq.Att | Number of RRC Connection Request messages received from UEs in a cell (excluding retransmitted messages)
L.RRC.ConnReq.Succ | Number of RRC Connection Setup Complete messages received from UEs in a cell
L.E-RAB.AttEst | Number of E-UTRAN radio access bearer (E-RAB) setup attempts initiated by UEs in a cell
L.E-RAB.SuccEst | Number of successful E-RAB setups initiated by UEs in a cell
L.E-RAB.AbnormRel | Number of times that the eNodeB abnormally releases E-RABs that are transmitting data in a cell
L.E-RAB.NormRel | Number of times that the eNodeB normally releases E-RABs in a cell
3.1.1 RRC Resource Congestion Rate
The RRC resource congestion rate is a cell-level indicator. It is calculated using the following formula:
RRC resource congestion rate = L.RRC.SetupFail.ResFail/L.RRC.ConnReq.Att x 100%
where
L.RRC.SetupFail.ResFail indicates the number of RRC connection setup failures due to resource allocation failures.
L.RRC.ConnReq.Att indicates the number of RRC connection setup requests.
If the RRC resource congestion rate is higher than 0.2%, KPI deterioration is caused by resource congestion.
3.1.2 E-RAB Resource Congestion Rate
The E-RAB resource congestion rate is a cell-level indicator. It is calculated using the following formula:
E-RAB resource congestion rate = L.E-RAB.FailEst.NoRadioRes/L.E-RAB.AttEst x 100%
where
L.E-RAB.FailEst.NoRadioRes indicates the number of E-RAB setup failures due to radio resource insufficiency.
L.E-RAB.AttEst indicates the number of E-RAB setup attempts.
If the E-RAB resource congestion rate is higher than 0.2%, KPI deterioration is caused by resource congestion.
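Both congestion rates follow the same pattern, so they can be computed per cell and compared with the 0.2% threshold in one pass. The following Python sketch assumes the counters have already been exported into a dictionary keyed by cell name; that layout and the function names are hypothetical.

```python
from typing import Dict, Tuple

THRESHOLD_PCT = 0.2  # congestion threshold stated in 3.1.1 and 3.1.2


def congestion_rate_pct(resource_failures: int, attempts: int) -> float:
    """Return failures/attempts x 100%, or 0 when there were no attempts."""
    if attempts == 0:
        return 0.0
    return resource_failures * 100.0 / attempts


def congested_cells(
    counters: Dict[str, Dict[str, int]],
    threshold_pct: float = THRESHOLD_PCT,
) -> Dict[str, Tuple[float, float]]:
    """Map each congested cell to its (RRC rate, E-RAB rate) in percent."""
    flagged = {}
    for cell, values in counters.items():
        rrc = congestion_rate_pct(values["L.RRC.SetupFail.ResFail"],
                                  values["L.RRC.ConnReq.Att"])
        erab = congestion_rate_pct(values["L.E-RAB.FailEst.NoRadioRes"],
                                   values["L.E-RAB.AttEst"])
        # A cell is flagged when either rate is higher than the threshold.
        if rrc > threshold_pct or erab > threshold_pct:
            flagged[cell] = (rrc, erab)
    return flagged
```

A cell with no setup attempts is reported with a rate of 0 rather than raising a division error; this is a design choice of the sketch, not something the guide specifies.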
3.2 Resource Allocation Problem Identification Process
Figure 3-1 shows the resource allocation problem identification process.
Figure 3-1 Resource allocation problem identification process
The fault location procedure begins with identifying abnormal KPIs, followed by selecting the top N cells and performing a KPI analysis on them. Cell congestion mainly results from insufficient system resources. Bottlenecks can be detected by analyzing the access counters (RRC resource congestion rate and E-RAB resource congestion rate).
4 Related Counters
Table 4-1 lists counters involved in capacity monitoring.
Table 4-1 Counters involved in capacity monitoring
Resource Type | Counter Name | Description
PRBs | L.ChMeas.PRB.DL.Used.Avg | Average number of used downlink PRBs
PRBs | L.ChMeas.PRB.DL.Avail | Number of available downlink PRBs
PRBs | L.Thrp.bits.DL | Total downlink traffic volume for PDCP SDUs in a cell
PRBs | L.Thrp.Time.DL | Total transmit duration of downlink PDCP SDUs in a cell
PRACH resources | L.RA.GrpA.Att | Number of times the contention preamble in group A is received
PRACH resources | L.RA.GrpB.Att | Number of times the contention preamble in group B is received
PRACH resources | L.RA.Dedicate.Att | Number of times the non-contention-based preamble is received
PDCCH resources | L.ChMeas.CCE.CommUsed | Number of PDCCH CCEs used for common DCI
PDCCH resources | L.ChMeas.CCE.ULUsed | Number of PDCCH CCEs used for uplink DCI
PDCCH resources | L.ChMeas.CCE.DLUsed | Number of PDCCH CCEs used for downlink DCI
PDCCH resources | L.ChMeas.CCE.Avail | Number of available CCEs
Connected user | L.Traffic.User.Avg | Average number of users in a cell
Paging resources | L.Paging.S1.Rx | Number of received paging messages over the S1 interface in a cell
Paging resources | L.Paging.Dis.Num | Number of discarded paging messages from the MME to UEs due to flow control in a cell
Board CPU resources | VS.Board.CPUload.Mean | Average board CPU usage
Board CPU resources | VS.Board.CPULoad.CumulativeHighloadCount | Number of times that the CPU usage of boards exceeds the preconfigured threshold
Transport resource groups | VS.RscGroup.TxPkts | Number of packets successfully transmitted by the resource group
Transport resource groups | VS.RscGroup.TxDropPkts | Number of packets discarded by the resource group due to transmission failures
Transport resource groups | VS.RscGroup.TxMaxSpeed | Maximum transmit rate of the resource group
Transport resource groups | VS.RscGroup.TxMeanSpeed | Average transmit rate of the resource group
Ethernet ports | VS.FEGE.TxMaxSpeed | Maximum transmit rate on the Ethernet port
Ethernet ports | VS.FEGE.TxMeanSpeed | Average transmit rate on the Ethernet port
Ethernet ports | VS.FEGE.RxMaxSpeed | Maximum receive rate on the Ethernet port
Ethernet ports | VS.FEGE.RxMeanSpeed | Average receive rate on the Ethernet port