EVALUATING AND IMPROVING THE QUALITY OF SERVICE OF SECOND-GENERATION CELLULAR SYSTEMS

Issue Date: September 2004

Abstract: This paper provides an insight into network performance management and quality of service (QoS) of matured second generation (2G) cellular systems (after the pre-/post-launch testing and optimization phase). It identifies the components of QoS and the available mechanisms to analyze and evaluate them. The paper also identifies important key performance indicators (KPIs) that need to be monitored and optimized and provides a way to collect and classify data for analysis. Finally, the most common QoS shortfalls and possible solutions are discussed.
By 1992, many European countries had operational second generation (2G) global system for mobile communication (GSM) systems, and GSM started to attract interest worldwide. GSM proved to be a major commercial success for system manufacturers and network operators, many of which enjoyed exponential growth until the end of the decade. The most valuable and limited resource of GSM is the available frequency spectrum, which limits the system capacity. The successful take-up of GSM services led to continuous development of sophisticated algorithms to maximize the system capacity. This caused a substantial technological evolution of GSM, with annual (and often biannual) releases of new functionality, which have increased the complexity of the system. One example is the evolution of the Ericsson™ “locating” algorithm, which is very capacity-efficient but quite complex; several operators have been unable to fully exploit the benefits of this functionality. Underutilization of available functionality, coupled with an exponential increase in subscriber numbers, resulted in many operators overdimensioning their base station subsystems (BSSs) with continuous aggressive deployment of new base stations. Thus, constant change and evolution of GSM networks have necessitated the continuous optimization of the offered quality of service (QoS).
Michael Pipikakis [email protected]
© 2004 Bechtel Corporation. All rights reserved.

Many publications on GSM describe the system, its architecture, and its evolution. However, limited sources document QoS, network performance management, and optimization. Many European operators currently enjoy very good network performance, and the industry has developed GSM optimization expertise (mainly through trial and error), but this expertise is not fully documented. There is usually more than one solution to a problem, which (unlike for design or site acquisition) makes it difficult to proceduralize optimization techniques and problem solutions. Engineers need to be open-minded, with good analytical skills and a good understanding of the overall system and its individual components. Performance management and QoS optimization are subjects that cannot be fully taught. Expertise must be gained through trial and error, in an attempt to maintain optimum and constant QoS offered by dynamic and ever-changing GSM networks.

This paper focuses on 2G QoS, as well as the advantages and disadvantages of each mechanism available to monitor, analyze, and improve it. The paper also describes the most common QoS shortfalls and provides improvement recommendations, which serve as a useful reference in performance analysis and optimization for specific projects.
WHAT IS QUALITY OF SERVICE?
Overall QoS for 2G, 2.5G, and 3G systems comprises three important components, all of which need to be constantly monitored and optimized as networks change in response to increasing coverage and capacity demands:
• Accessibility – getting on the system
• Retainability – staying on the system
• Connection quality – having a good service experience while using the system
ABBREVIATIONS, ACRONYMS, AND TERMS
BCCH broadcast control channel
BLER block error rate
BSC base station controller
BSIC base station identity code
BSS base station subsystem
BTS base transceiver station
CCSR call completion success rate
CFR congestion failure rate
CRH cell reselect hysteresis
CRO cell reselect offset
CTR cell traffic recording
DCR dropped call rate
DTCHR dropped traffic channel rate
DXC digital cross connect
EIR equipment identity register
GPRS general packet radio service
GSM global system for mobile communication
HSN hopping sequence number
HSR handover success rate
KPI key performance indicator
LAC location area code
LAPD link access protocol on the D-channel
MHT mean holding time
MSC mobile switching center
OFTEL Office of Telecommunications
QoS quality of service
SDCCH standalone dedicated control channel
SDCCHSR SDCCH success rate
SMS short message service
TBF temporary block flow

QUALITY OF SERVICE EVALUATION
The three mechanisms available to monitor, analyze, and evaluate QoS and take corrective actions are customer complaints, drive tests, and network statistics, all three of which are described below. Each mechanism has certain advantages and disadvantages, usually with conflicting priorities for limited optimization resources.

Customer Complaints
Advantages
• Real problems experienced by customers using the service
• Decision-forming/influential
Disadvantages
• Subjective
• Often vague with little supporting data
• Often received too late to react to the situation
• Require filtering by customer service before being handled by the engineering department

Drive Tests
Advantages
• Real calls
• Cause of failure can be identified
• Good for benchmarking
• Good for network pre-launch tuning (startups and new deployment projects)
Disadvantages
• Low volumes/statistically insignificant
• One terminal type
• Only ground level and in-car service
• Predetermined routes, calling patterns only
• Labor-intensive analysis

Network Statistics
Advantages
• All calls can be monitored
• Trends can be measured, by specific geographical areas of interest or for the entire network
• Trends are stable
Disadvantages
• Indicate problems but not their causes or solutions
• Do not differentiate customer value
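The "low volumes/statistically insignificant" drawback listed for drive tests can be illustrated with a small calculation. The sketch below is not taken from the paper: it assumes that call outcomes are independent and uses the standard normal-approximation confidence interval for a proportion. The function name and the call counts are hypothetical and chosen only for illustration.

```python
import math


def proportion_interval(successes: int, attempts: int, z: float = 1.96):
    """Normal-approximation confidence interval for a success rate.

    z = 1.96 corresponds to roughly 95% confidence.
    """
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    p = successes / attempts
    half_width = z * math.sqrt(p * (1.0 - p) / attempts)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Illustrative figures only: the same 97% success rate at two sample sizes.
for label, ok, n in [
    ("drive test", 194, 200),
    ("network statistics", 194_000, 200_000),
]:
    low, high = proportion_interval(ok, n)
    print(f"{label}: {n} calls, interval {low:.4f} to {high:.4f}")
```

With a few hundred test calls the interval spans several percentage points, so small changes in the measured success rate cannot be separated from sampling noise. With the call volumes seen in network statistics the interval becomes much narrower.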
Bechtel Telecommunications Technical Journal
Established GSM operators use clearly defined network QoS key performance indicators (KPIs) with target thresholds to be achieved. The KPI thresholds are usually revised once a year, and new goals are set as the business priorities change. Network performance management and optimization activities ensure that QoS targets are met. For underperforming areas (sections of the network failing the KPI thresholds), optimization projects are initiated. Using all available methods, these projects fully analyze the performance of the area to understand the problems and take corrective actions.

In such optimization projects, a combination of customer complaints, drive tests, and network statistics is used. Usually, statistical analysis and customer complaints are used to identify problems, while drive tests are used to verify them and/or the solution(s). However, drive tests alone cannot be relied on to provide insight into the offered service. Drive tests can only provide an indicator of QoS for traffic that is highly mobile and at ground level. A large proportion of traffic offered via mature networks is static and often originates at higher-than-ground levels. In several mature European networks there is, on average, only one handover per call, which indicates the static nature of traffic.

This makes statistics the most useful mechanism for identifying QoS shortfalls. However, experience is required in recognizing problem trends, identifying the causes, and taking corrective actions. This, in turn, requires good knowledge of the system, analytical skills, and experience in network performance management and optimization. Nevertheless, using statistical analysis properly and to the fullest extent possible can significantly improve QoS.
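The comparison of KPIs against target thresholds described above can be sketched as follows. This is a minimal illustration, not an operator's actual procedure: the KPI names, the threshold values, and the area names are all hypothetical, and real targets would come from the operator's own yearly goals.

```python
# Hypothetical targets: "min" means the KPI must not fall below the limit,
# "max" means it must not exceed it. The values are illustrative only.
TARGETS = {
    "ccsr": ("min", 0.97),
    "dcr": ("max", 0.015),
    "cfr": ("max", 0.02),
}


def failing_kpis(values, targets=TARGETS):
    """Return the names of the KPIs in `values` that miss their target."""
    failed = []
    for name, (kind, limit) in targets.items():
        value = values.get(name)
        if value is None:
            continue  # KPI not reported for this area
        if (kind == "min" and value < limit) or (kind == "max" and value > limit):
            failed.append(name)
    return failed


def underperforming_areas(kpis_by_area, targets=TARGETS):
    """Map each area that misses at least one target to its failing KPIs."""
    result = {}
    for area, values in kpis_by_area.items():
        failed = failing_kpis(values, targets)
        if failed:
            result[area] = failed
    return result


# Hypothetical area-level KPI values.
kpis = {
    "North": {"ccsr": 0.98, "dcr": 0.010, "cfr": 0.01},
    "South": {"ccsr": 0.95, "dcr": 0.021, "cfr": 0.01},
}
print(underperforming_areas(kpis))
```

A list like this only identifies candidate areas for an optimization project. As the text notes, finding the cause still requires analysis of the statistics together with complaints and drive tests.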
WHAT NEEDS TO BE MONITORED AND OPTIMIZED?
The trends of several KPIs must be closely monitored. A summary of the most important KPIs that can have an impact on the offered QoS follows.

Circuit Switched (CS) – Voice
• DCR: The dropped call rate (DCR) provides the customer-perceived dropout performance. It is calculated over the entire network or a geographical area and not on a per-cell basis, because a call cannot be statistically related to just one cell, due to handovers.
September 2004 • Volume 2, Number 2
• Minute-Erlang/Drop: This KPI indicates the average time between dropped calls. It is the carried traffic, expressed in minute-Erlangs, divided by the total number of drops, and it is inversely proportional to DCR. It is a good way to evaluate the effectiveness of optimization activities because it takes into account the carried traffic and is more sensitive to changes than DCR.
• CFR: The congestion failure rate (CFR) indicates the failure rate of assignments due to congestion and can be used on a cell basis for engineering, planning, and troubleshooting purposes and on an area basis to provide a measure of the customer-perceived traffic congestion. GSM operators have developed sophisticated CFR formulas to account for the effects of features such as directed retry and cell load-sharing when measuring customer-experienced congestion.
• CCSR: The call completion success rate (CCSR) can be derived either from network statistics or from drive test statistics. It takes into account the fact that all failures are either drops or unsuccessful call set-ups. The total number of failures is divided by the total number of call attempts to give the failure rate, and CCSR is the complement of that rate. It is a good method to use to evaluate the network accessibility and retainability as perceived by the customers. In the United Kingdom, the Office of Telecommunications (OFTEL), a governing body, uses CCSR from drive tests to declare the best network for QoS. Every 6 months, all network operators make approximately 22,000 calls while driving 305 pre-defined routes with clearly defined call patterns. At the end of the cycle, the operators submit a summary of the results and all drive-test files to OFTEL.
• DTCHR: The dropped traffic channel rate (DTCHR) indicates the drops at the cell level. It is used for engineering purposes only (and not for reporting), to identify cells with high drops. Optimizing these cells improves DCR and CCSR.
• SDCCHSR: The standalone dedicated control channel success rate (SDCCHSR) indicates the rate of successful air interface signaling channel assignments and is used for engineering purposes only, to optimize cells with high failure rate. Optimizing such cells improves CCSR.
• HSR: Handover success rate (HSR) indicates the success of handovers. Minimizing handover failures improves DCR.
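The area-level voice KPIs above can be sketched in a few lines of code. This is a simplified illustration under stated assumptions, not the formulas used by any operator or vendor: the counter names are hypothetical, DCR is assumed to be drops divided by successfully established calls, and the adjustments for features such as directed retry are ignored.

```python
from dataclasses import dataclass


@dataclass
class AreaCounters:
    """Hypothetical counters summed over an area for one observation period."""
    call_attempts: int            # all call set-up attempts
    setup_failures: int           # attempts that never became an established call
    dropped_calls: int            # established calls that ended abnormally
    traffic_erlang_hours: float   # carried traffic in Erlang-hours
    handover_attempts: int
    handover_successes: int


def dropped_call_rate(c: AreaCounters) -> float:
    """Drops as a fraction of established calls (assumed definition)."""
    established = c.call_attempts - c.setup_failures
    return c.dropped_calls / established if established else 0.0


def minute_erlangs_per_drop(c: AreaCounters) -> float:
    """Carried traffic in minute-Erlangs divided by the number of drops."""
    minute_erlangs = c.traffic_erlang_hours * 60.0
    return minute_erlangs / c.dropped_calls if c.dropped_calls else float("inf")


def call_completion_success_rate(c: AreaCounters) -> float:
    """One minus (set-up failures plus drops) over all call attempts."""
    if not c.call_attempts:
        return 0.0
    failures = c.setup_failures + c.dropped_calls
    return 1.0 - failures / c.call_attempts


def handover_success_rate(c: AreaCounters) -> float:
    """Successful handovers as a fraction of handover attempts."""
    if not c.handover_attempts:
        return 1.0
    return c.handover_successes / c.handover_attempts


# Illustrative values only.
area = AreaCounters(
    call_attempts=10_000,
    setup_failures=200,
    dropped_calls=98,
    traffic_erlang_hours=250.0,
    handover_attempts=12_000,
    handover_successes=11_640,
)
print(f"DCR  = {dropped_call_rate(area):.2%}")
print(f"Minute-Erlang/Drop = {minute_erlangs_per_drop(area):.1f}")
print(f"CCSR = {call_completion_success_rate(area):.2%}")
print(f"HSR  = {handover_success_rate(area):.2%}")
```

Because Minute-Erlang/Drop divides by the raw number of drops, removing even a handful of drops changes it visibly, which is consistent with the text's remark that it is more sensitive to changes than DCR.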
Packet Switched (PS) – Data (GPRS)
• Cell Throughput: Cell throughput is an end-to-end KPI used at the cell and network levels to indicate data throughput.
• RTT: Round-trip time (RTT) is the end-to-end delay of the data connection. Decreasing the RTT increases throughput.
• TBF Multiplexing: Temporary block flow (TBF) multiplexing indicates the number of users sharing each time slot of general packet radio service (GPRS) resources. A high number of users per time slot decreases the data throughput.
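The two relationships stated above, that more users per time slot and a longer RTT both reduce throughput, can be illustrated with a rough model. The sketch below rests on simplifying assumptions that are not from the paper: time-slot capacity is shared evenly among multiplexed users, and the transport protocol is window-limited, so that at most one window of data is delivered per round trip. The function names, the per-slot rate, and the window size are hypothetical.

```python
def per_user_throughput_kbps(slot_rate_kbps: float, slots: int,
                             users_per_slot: float) -> float:
    """Even-sharing estimate: slot capacity divided among multiplexed users."""
    if users_per_slot <= 0:
        raise ValueError("users_per_slot must be positive")
    return slot_rate_kbps * slots / users_per_slot


def window_limited_throughput_kbps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound when at most one window of data is sent per round trip."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_bytes * 8 / rtt_seconds / 1000.0


# Illustrative figures: an assumed 12 kbit/s per time slot and 3 time slots.
for users in (1, 2, 4):
    rate = per_user_throughput_kbps(12.0, 3, users)
    print(f"{users} user(s) per slot: {rate:.1f} kbit/s per user")

# Illustrative figures: an assumed 8192-byte window.
for rtt in (1.0, 0.5):
    bound = window_limited_throughput_kbps(8192, rtt)
    print(f"RTT {rtt:.1f} s: at most {bound:.1f} kbit/s")
```

In practice the achieved throughput is the smaller of the radio-limited and the window-limited figures, and real scheduling is rarely perfectly even, so these numbers are upper estimates only.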
ORGANIZING STATISTICAL DATA PRIOR TO ANALYSIS
As shown in Figure 1, performance and configuration data are collected in the switching nodes and usually aggregated into a statistics database and a configuration database, respectively. The statistics database is divided into “object types,” which correspond to different equipment or system function blocks. Each object type contains several event counters. The basic time unit for data collection is 15 minutes, i.e., the base station controller (BSC) uploads the entire object counter data to the statistics database every 15 minutes. Proposed methods for organizing and classifying the available data follow.

Observation Time Intervals
When manipulating statistical data, it is important to define appropriate time frames within which the data will be gathered and processed. The following observation time intervals are suggested for statistical evaluation:
• Hour: Hourly statistics give a detailed picture of network performance. They are useful to help spot temporary problems and identify trends.
• Peak Hour: Peak hour statistics are of great significance, because they correspond to the time of heavy utilization of network resources. In a way, they provide the “worst-case” scenario.
• Day: Daily statistics are introduced to provide a way of averaging temporary fluctuations of hourly data. Problems can be identified and corrective actions triggered with more confidence. Trends with daily values are also used for reporting and benchmarking.
• Online: Online statistics provide almost real-time monitoring of the network, if this is necessary. Statistics can be obtained directly from the switching node, where outputs are available every 15 minutes.

Classification by Network Level
As shown in Table 1, the monitoring process and statistical analysis take place at different levels:
• Network-wide: The entire network (to provide a “global” overview)
• Geographical Area or Region: All cells belonging to specified geographical regions (to obtain and compare results for performance in different areas)
• City: All cells belonging to specified major cities (to obtain and compare results for performance in different cities)
• BSC: All cells belonging to certain switching nodes (to obtain switching node-related statistics and compare performance of different nodes)
• Cell: Individual cells as well as neighboring cell relationships

Classification by Resource Type or Event
Statistics can be classified by resource type or the events they refer to. Both user-defined formulas and “raw” counters are grouped into one of the following categories:
• Random access channel measurements
• Standalone dedicated control channel (SDCCH) measurements
• Traffic channel (TCH) measurements
• Idle channel measurements
• Handover measurements
• Subscriber disconnection measurements
• Link access protocol on the D-channel (LAPD) signaling measurements
• BSC measurements

Figure 1. Collection of Performance and Configuration Data
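The observation time intervals described above can be derived from the 15-minute counter uploads. The sketch below is a minimal illustration using only the Python standard library. The sample layout (period start, cell, attempts, failures), the cell name, and the counter values are hypothetical, and the peak hour is assumed to be the hour with the most attempts, which may differ from an operator's own definition.

```python
from collections import defaultdict
from datetime import date, datetime


def hourly_totals(samples):
    """Sum 15-minute (start, cell, attempts, failures) samples per cell and hour."""
    totals = defaultdict(lambda: [0, 0])
    for start, cell, attempts, failures in samples:
        hour = start.replace(minute=0, second=0, microsecond=0)
        totals[(cell, hour)][0] += attempts
        totals[(cell, hour)][1] += failures
    return {key: (a, f) for key, (a, f) in totals.items()}


def peak_hour(hourly, cell):
    """Hour with the most attempts for one cell (assumed peak-hour definition)."""
    per_hour = {hour: a for (c, hour), (a, _) in hourly.items() if c == cell}
    return max(per_hour, key=per_hour.get)


def failure_rate(hourly, cell, day=None, hour=None):
    """Failures over attempts for a cell, optionally limited to a day or an hour."""
    attempts = failures = 0
    for (c, h), (a, f) in hourly.items():
        if c != cell:
            continue
        if day is not None and h.date() != day:
            continue
        if hour is not None and h != hour:
            continue
        attempts += a
        failures += f
    return failures / attempts if attempts else 0.0


# Hypothetical 15-minute samples for one cell: a quiet hour and a busy hour.
samples = [
    (datetime(2004, 9, 1, 9, m), "A", a, f)
    for m, a, f in [(0, 100, 2), (15, 120, 3), (30, 110, 1), (45, 130, 4)]
] + [
    (datetime(2004, 9, 1, 17, m), "A", a, f)
    for m, a, f in [(0, 200, 30), (15, 220, 40), (30, 210, 35), (45, 190, 25)]
]

hourly = hourly_totals(samples)
busy = peak_hour(hourly, "A")
print("peak hour:", busy)
print("peak-hour failure rate:", round(failure_rate(hourly, "A", hour=busy), 4))
print("daily failure rate:", round(failure_rate(hourly, "A", day=date(2004, 9, 1)), 4))
```

The counters are summed first and the ratio is taken afterwards. Averaging the per-period ratios instead would give quiet periods the same weight as busy ones and could hide peak-hour problems. The same summation can be applied over the cells of a BSC, city, region, or the whole network to obtain the other classification levels.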
Table 1. Examples of QoS KPIs and Target Thresholds
Performance Target Range
CCSR from Drive Tests
CCSR from Drive Tests
Area or Region
CCSR from Drive Tests
% of Cells with Dropped TCH >2%
% of Cells with BH CFR >10%
% of Cells with HSR