Access Failures Troubleshooting Workshop
October 26, 2020 | Author: Anonymous | Category: N/A
Huawei Workshop Troubleshooting Access Failures
May 17th, 2011 www.huawei.com
HUAWEI TECHNOLOGIES CO., LTD.
Huawei Confidential
Contents
• Call Setup Procedure (step by step, all protocols)
• General Causes of failures
• How to chase and solve specific access failures:
  - RRC Access Failure Troubleshooting
  - Paging Access Failure Troubleshooting
  - RACH Access Failure Troubleshooting
Mobile Terminated Call Setup Procedure (I)
Network elements: UE, Node B, RNC, MSC / VLR, MGW
1. IAM (ISUP)
2. PAGING (RANAP)
3. PCH: PCCH: PAGING TYPE 1 (RRC)
   Here the UE starts to send PRACH preambles, waits for the AICH and then sends the RACH message.
4. RACH: CCCH: RRC CONNECTION REQUEST (RRC)
   Here the RNC performs a DRD decision and a CAC decision for RRC.
5. RADIO LINK SETUP REQUEST (NBAP)
   Node B: Start RX.
6. RADIO LINK SETUP RESPONSE (NBAP)
7. ESTABLISHMENT REQUEST (AAL2) (ALCAP)
8. ESTABLISHMENT CONFIRM (AAL2) (ALCAP)
9. DOWNLINK SYNCHRONISATION (DCH-FP)
10. UPLINK SYNCHRONISATION (DCH-FP)
    Node B: Start TX. Here the Node-B starts the RL with DL transmission.
11. FACH: CCCH: RRC CONNECTION SETUP (RRC)
    Can be either RRC Connection Setup (to this cell, or inter-frequency to another cell when DRD applies) or Reject.
12. SYNCH IND (L1)
    Here the UE performs DL synchronization (using N312=1, T312=1, N313=20 and T313=3). Then the UE starts UL transmission and the Node-B detects UL synch (based on N_INSYNCIND=8, N_OUTOFSYNCIND=8, TRLFAILURE=20).
13. RADIO LINK RESTORE INDICATION (NBAP)
14. DCCH: RRC CONNECTION SETUP COMPLETE (RRC)
Mobile Terminated Call Setup Procedure (II)
Network elements: UE, Node B, RNC, MSC / VLR, MGW
15. DCCH: INITIAL DT [ PAGING RESPONSE ] (RRC)
16. SCCP CONNECTION RQ [ INITIAL UE MESSAGE [ PAGING RESPONSE ] ] (SCCP)
17. SCCP CONNECTION CONFIRM (SCCP)
18. COMMON ID (RANAP)
19. SECURITY MODE COMMAND (RANAP)
20. SECURITY MODE COMMAND (RRC)
21. SECURITY MODE COMPLETE (RRC)
22. SECURITY MODE COMPLETE (RANAP)
23. DT [ SETUP ] (RANAP)
24. DCCH: DLDT [ SETUP ] (RRC)
25. DCCH: ULDT [ CALL CONFIRMED ] (RRC)
26. DT [ CALL CONFIRMED ] (RANAP)
27. BINDING ID, SPEECH CODE TYPE, B PARTY ROUTE
28. RAB ASSIGNMENT REQUEST (RANAP)
    Here the RNC performs a DRD decision and a CAC decision for RAB.
Mobile Terminated Call Setup Procedure (III)
Network elements: UE, Node B, RNC, MSC / VLR, MGW
29. ESTABLISHMENT REQUEST (AAL2) (ALCAP)
30. ESTABLISHMENT CONFIRM (AAL2) (ALCAP)
31. RADIO LINK RECONFIG PREPARE (NBAP)
32. RADIO LINK RECONFIG READY (NBAP)
33. ESTABLISHMENT REQUEST (AAL2) (ALCAP)
34. ESTABLISHMENT CONFIRM (AAL2) (ALCAP)
35. RADIO LINK RECONFIG COMMIT (NBAP)
36. DCCH: RADIO BEARER SETUP (RRC)
37. RAB ASSIGNMENT RESPONSE (RANAP)
38. DCCH: RADIO BEARER SETUP COMPLETE (RRC)
39. DCCH: ULDT [ ALERTING ] (RRC)
40. DT [ ALERTING ] (RANAP)
41. ACM (ISUP)
42. DCCH: ULDT [ CONNECT ] (RRC)
43. DT [ CONNECT ] (RANAP)
44. OPEN CONNECTION
45. DT [ CONNECT ACK ] (RANAP)
46. DCCH: DLDT [ CONNECT ACK ] (RRC)
47. ANS (CONNECT) (ISUP)
Terminating UEs are considered to be in a call after the CC Connect ACK.
Originating UEs are considered to be in a call after the CC Connect message.
General Causes of failures (I)
• RF Reasons
• Radio Parameter Problems
• Miscellaneous causes
General Causes of failures - RF reasons (II)
• Poor DL coverage: The "fake coverage" phenomenon (the user sees the 3G icon on the screen in idle but cannot connect to any service). The cause could be overshooting cells but also excessive values of Qqualmin like -22 dB. Solution: Adjust the antenna azimuth and down tilt, add repeaters and RRUs, add micro cells. Any user should get a better signal than EcIo = -18 dB.
• Lack of dominance (no clear best server): Continuous change of best server leads to RRC failures and RAB failures. Solution: Establish a best server everywhere, with clear dominance.
• Poor UL coverage: The UE does not have enough TX power to communicate with the Node-B (even when there is low UL traffic on the cell). Solution: Adjust the antenna azimuth and down tilt, add repeaters, reduce CPICH power.
• Strong UL interference: Due to external interference or high UL traffic (the cell shrinking phenomenon). The UE will not be able to increase the preamble power beyond 21 dBm and the RACH will fail, or synch will fail later. Solution: Up to the operator's decision (implement more tilt, reduce CPICH power, chase the external source of interference or increase the number of Node-Bs to cope with traffic).
• Strong DL interference: Usually due to an overshooting cell, external interference, or high DL traffic on this cell and surrounding cells. The UE will miss the AI message for RACH and will fail to establish a call, or will fail to get synch in DL. Solution: Improve the best server area (strong dominance).
• RF radiating system problems:
  - Antenna footprint not touching the ground properly: sites over 120 m high with tilts around 3 degrees. More than 3/4 of the antenna pattern will not touch the ground with a decent signal level. Most calls are handled on side lobes.
  - RF jumpers (feeding the antennas with the RRU signal) are too long (they should be no more than 3 meters; we have seen cases in --- with 10 meters of ½" jumpers). This definitely leads to high noise factors and call setup failures. UL and DL coverage is also very much limited.
• Missing neighbours: Leads to call setup failures due to poor signal.
Poor DL coverage
The "fake coverage" phenomenon: the user gets the 3G icon on the screen in idle but cannot complete either an RRC connection or a RAB setup. The cause is overshooting cells but also excessive values of Qqualmin like -22 dB.
Solution: Adjust the antenna azimuth and down tilt, add repeaters and RRUs, add micro cells, improve the best server, change Qqualmin. Any user should get a better signal than EcIo = -18 dB. If this level cannot be achieved, it is better to display "no service" on the user's screen.

• When -22 dB < EcIo < -18 dB: 20% < RRC_SR < 80%. User experience: 3G icon, no 3G bar, no service accessibility. User's perception: very negative.
• When -18 dB < EcIo < -16 dB: 80% < RRC_SR < 95%. User experience: 3G icon, 3G signal bar, good service accessibility. User's perception: positive.
• When -16 dB < EcIo < -2 dB: RRC_SR > 95%. User experience: 3G icon, 3G signal bars, great service accessibility. User's perception: very positive.

Qqualmin trade-offs (PRO / CONS / Comments):
• Qqualmin = -22 dB
  - PRO: The user always sees the 3G icon on the phone's screen (although it is "fake" coverage, the user does not always attempt to use the service). Maximum traffic possible.
  - CONS: Bad customer experience but less NW signalling. Not all call attempts are counted (no clear perception of accessibility).
  - Comments: Will grab all extreme traffic, leading quickly to DL power congestion and accessibility issues.
• Qqualmin = -18 dB
  - PRO: The user will not always have the 3G icon on the phone's screen (but when the icon is present, service is 100% accessible). Potential traffic decrease.
  - CONS: Great and real customer experience but increased signalling (coverage lost). All call attempts are counted (better performance perception of accessibility); because of this, the RRC_SR KPI may (or may not) improve.
  - Comments: No more "fake coverage". Decrease in DL power congestion.
• Qqualmin = -20 dB
  - Comments: Qqualmin = -20 dB is suggested by Huawei as a trade-off solution.
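The EcIo bands on this slide can be sketched as a small lookup, which may help when tagging drive-test samples. This is only an illustration: the function name is hypothetical, and the handling of the exact boundary values (-18 dB, -16 dB) is an assumption, since the slide gives open intervals only.

```python
def expected_rrc_sr_band(ecio_db):
    """Return the (low %, high %) RRC success-rate band quoted on the slide.

    Boundary values are assigned to the lower band by assumption.
    Returns None outside the -22 dB .. -2 dB range covered by the slide.
    """
    if -22 < ecio_db <= -18:
        return (20, 80)   # "fake coverage" zone
    if -18 < ecio_db <= -16:
        return (80, 95)
    if -16 < ecio_db <= -2:
        return (95, 100)
    return None


for sample in (-20.0, -17.0, -10.0):
    print(sample, expected_rrc_sr_band(sample))
```

A sample at -20 dB falls in the 20% to 80% band, which is the zone the slide describes as fake coverage.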
General Causes of failures – Radio Parameter Problems (II)
• Excessive values in object UCELLSELRESEL. Examples: Qqualmin1, IDLEQHYST1S>3.
• Improper settings of access parameters: no discrepancies found in UCELLACCESSRESTRICT.
• Inappropriate settings of preamble power ramp step and retransmission times: the current set of parameters is NOK (PREAMBLERETRANSMAX=20, CONSTANTVALUE=-20, PowerRampStep=2, Mmax=8).
• Inappropriate setting of adjacent cells for UINTRAFREQNCELL: Qoffset1sn and Qoffset2sn out of the range (-4 dB; +4 dB). Wrong settings for Sintra (like 0 dB) and Sinter (also like 0 dB).
• Inappropriate settings of synchronization parameters: synch and out-of-synch parameters for UL (N_INSYNC_IND=8, N_OUTSYNC_IND=8 and T_RLFAILURE=20) and DL (T312=1, N312=1, N313=D20, T313=3 and N315=D20). Please remember that call re-establishment is activated for both UL and DL (great KPIs but only acceptable user perception).
• Unsuitable power allocation rate for DL common channels: no discrepancies found (PSCHPower, SSCHPower, BCHPower, MaxFachPower, PCHPower, AICHPowerOffset, PICHPowerOffset).
• Unsuitable initial power of uplink and downlink dedicated channels: no discrepancies found for UL (DPCCH_Initial_Power = PCPICHPower - CPICH_RSCP + Uplink interference + DefaultConstantValue) and DL initial SIR target.
• Unsuitable setting of uplink initial SIR target value of dedicated channel: no discrepancies found for DL initial SIR target.
• Inappropriate setting of adjacent cells for UINTERFREQNCELL:
  - When 1900 and 850 MHz have a significant azimuth difference, why is there DRD towards just one 1900 cell and not towards the other 1900 cell as well?
  - Why are Qoffset1sn and Qoffset2sn out of the range (-4; +4), on top of IdleQhyst1s > 2?
General Causes of failures – Miscellaneous causes (II)
• Transmission issues (fluctuating PATH, high BER, reduced capacity, routers down in the IP cloud).
• Alarms on cells, on Node-Bs, on RNC, on transmission.
• Planning issues: traffic not properly shared between layers and NodeBs, lack of a clear best server (no dominance), paging congestion due to a LAC splitting issue.
• Radio congestion:
  - CE
  - DL Power
  - UL Power
  - R99 Codes
  - Iub bandwidth
  - SPU bottleneck
  - WMPT board bottleneck
How to identify and solve different issues?
RRC Access Failure Troubleshooting (I)
Checklist (each question is a YES/NO decision; the note says what to do):
• Is the cell/NodeB/RNC configuration the correct one?
  This check should be done daily (automatically and network-wide) based on a defined template.
• Were there any alarms on the investigated cells (or any of their neighbouring cells, intra or inter)?
  Every morning there should be an email listing the cells unavailable on the previous day and the duration of unavailability.
• Is it a repetitive failure or a "one time" event?
  If it is a one-time event, please wait one more day before concluding. It could be a social event.
• If it is a repetitive failure (according to past KPI values), is it a slow degradation (with traffic increase) or an event (serious degradation from a specific moment)?
  If it is an event, go back to that day, see what was changed at that time and reconsider that change.
• Is the SHO factor less than 50%?
  If not, please review the cell's best server area, tilt, azimuth and CPICH power.
• Does this cell fully overlap with its neighbouring cells (i.e. there is no direction in which a user can move without having good coverage)? Is any user, in any indoor environment within the footprint of this cell, able to get a decent RSCP and EcIo?
  If no, review your targeted coverage and accept the current limitations and constraints due to location and/or number of Node-Bs.
• Is the height of the antenna less than 100 m?
  If no, please do not expect a good RRC success rate.
• Is the total tilt of the cell more than 3 degrees of downtilt?
  If no (and the footprint is on flat terrain), please take immediate action to increase the downtilt. The antenna RF pattern is hardly touching the ground and users are handled on side lobes. DL power issues will occur.
• Is the cell idle sintrasearch=127?
  If no, please do not expect a good RRC success rate.
is the cell idleQoffset1sn4dB
is the cell Idle idleQoffset2sn4dB
is the cell idleQhyst14dB
is the cell idleQhyst24dB
RRC Access Failure Troubleshooting (II)
Date: ----

Counter (sum)                        RNC1      RNC2      RNC3      RNC4
VS.RRC.AttConnEstab.Sum           6154003   7115377   5397822   1920647
Cell.RRC.Att.Fail                   46702     73768     87275     12471
VS.RRC.Rej.Redir.Service                0         0         0         0
VS.RRC.Rej.ULIUBBand.Cong               0         0         0         0
VS.RRC.Rej.ULPower.Cong                14        10         0         0
VS.RRC.Rej.DLPower.Cong               135       118      1965       290
VS.RRC.Rej.DLIUBBand.Cong               0         0         0         0
VS.RRC.Rej.ULCE.Cong                 1144      1352       721       507
VS.RRC.Rej.DLCE.Cong                  822        11         0         0
VS.RRC.Rej.Code.Cong                   12        41       372         0
VS.RRC.Rej.RL.Fail                     30        50       419         0
VS.RRC.Rej.TNL.Fail                     0         0         0         0
VS.RRC.FailConnEstab.Cong            2343      1552      3070       802
VS.RRC.Rej.Sum                       2373      1602      3489       802
VS.RRC.SetupConnEstab             6151630   7113775   5394333   1919845
VS.RRC.FailConnEstab.NoReply        44006     71818     83566     11667
RRC.SuccConnEstab.sum             6107301   7041609   5310547   1908176

Note on VS.RRC.FailConnEstab.NoReply: this is where most of the RRC failures occur, indicating poor UL coverage (overshooting).
Conclusion: Most RRC failures (over 90%) are due to RRC no reply. For RRC issues, focus first on overshooting cells (to solve no reply) and second on congested cells.
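The "over 90%" conclusion can be reproduced from the counters on this slide. The sketch below copies the Cell.RRC.Att.Fail, VS.RRC.FailConnEstab.NoReply and VS.RRC.Rej.Sum rows; the dictionary layout and function name are illustrative assumptions, not part of any Huawei tool.

```python
# Counter values copied from the slide; the structure is hypothetical.
rrc_counters = {
    "RNC1": {"att_fail": 46702, "no_reply": 44006, "rej_sum": 2373},
    "RNC2": {"att_fail": 73768, "no_reply": 71818, "rej_sum": 1602},
    "RNC3": {"att_fail": 87275, "no_reply": 83566, "rej_sum": 3489},
    "RNC4": {"att_fail": 12471, "no_reply": 11667, "rej_sum": 802},
}


def failure_shares(counters):
    """Share of all RRC attempt failures caused by no-reply and by rejects."""
    total = counters["att_fail"]
    return {
        "no_reply": counters["no_reply"] / total,
        "rejected": counters["rej_sum"] / total,
    }


for rnc, counters in rrc_counters.items():
    shares = failure_shares(counters)
    print(f"{rnc}: no reply {shares['no_reply']:.1%}, rejected {shares['rejected']:.1%}")
```

On these figures the no-reply share stays above 93% for every RNC, which is why the slide puts overshooting cells ahead of congestion.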
RRC Access Failure Troubleshooting (III)
Identify the top N cells (more than 2000 RRC attempts per day and a success rate below 98%).
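A minimal sketch of the top N selection, assuming per-cell daily counters are already exported as dictionaries. The field names (cell, attempts, successes), the ranking by absolute failure count and the sample values are made up for illustration; only the two thresholds come from the slide.

```python
def top_n_cells(cells, n=10, min_attempts=2000, max_success_rate=98.0):
    """Pick cells with enough traffic and a poor RRC success rate,
    ranked by number of failed attempts (ranking rule is an assumption)."""
    worst = [
        c for c in cells
        if c["attempts"] > min_attempts
        and 100.0 * c["successes"] / c["attempts"] < max_success_rate
    ]
    worst.sort(key=lambda c: c["attempts"] - c["successes"], reverse=True)
    return worst[:n]


# Made-up sample data, one entry per cell for one day.
sample_cells = [
    {"cell": "A", "attempts": 5000, "successes": 4800},  # 96.0%
    {"cell": "B", "attempts": 1000, "successes": 900},   # too little traffic
    {"cell": "C", "attempts": 3000, "successes": 2990},  # above 98%
    {"cell": "D", "attempts": 2500, "successes": 2200},  # 88.0%
]

print([c["cell"] for c in top_n_cells(sample_cells)])
```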
Identify whether the RRC failures for a cell are due to the SPU (check the ADD NODEB command to find the SPU for a cell/Node-B).
YES
NO
SPU board is the issue when (VS.RRC.SuccConnEstabCPU / VS.RRC.AttConnEstabCPU) > VS.RRC.AttConnEstab.Sum
Solution 1: Recheck the configuration (the IPPATHs of the NodeB should have the same capacity as the transmission path; same for the paired one).
Solution 2: Run the PING IP command on the IP of the NodeB to detect congestion on the IuB.
Solution 1: (After SPH226) MOD UCELLALGOSWITCH: CellId=xxxxx, RsvdPara1=RSVDBIT5-1; (will improve the CS success rate, will degrade the PS success rate).
Solution 2: Offload traffic.
VS.CellFACHUEs>25
Solution: Offload traffic
VS.CRNCIubBytesFACH.Tx or VS.CRNCIubBytesPSR99.CCH.Tx are flat in time( limited)
Solution1: Offload traffic
FACH Channel utilisation>80%
Solution1: Offload traffic
NO
Identify if more than 10% of failures for a cell are due to VS.RRC.FailConnEstab.NoReply :
YES VS.MaxRTWP - VS.MeanRTWP > 10 dB
Check missing neighbours
NO If RRC Estab SR for whole RNC128 frames Frame offset =-7860 chips PBP=1 K=1 (there„s only one S-CCPCH that carries PCH) PI=Np=36 PO= (IMSI)mod128+ n* 128 -7860 chips
Paging Access Failure Troubleshooting-(V)
PICH frame structure:
• A group of bits bi=1 means there is a paging and the UE should read its very first paging occasion.
• A group of bits bi=0 means there is no paging and the UE can go back to idle until the next paging indicators.
• More bits inside a PI mean a greater probability of decoding the paging indicator, but less paging channel capacity and more power consumption for the UE.
• Fewer bits mean a lower probability for the UE to decode the paging indicator, but longer UE battery life.
• The best solution is a middle way: PI=36.
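The trade-off above can be made concrete with a short sketch. It assumes the standard 3GPP PICH layout, in which 288 bits of each frame carry paging indicators and Np is one of 18, 36, 72 or 144, and it uses the simplified paging occasion rule quoted in this deck for K=1, PBP=1 and a 128-frame DRX cycle (PO = IMSI mod 128 + n * 128). The function names are hypothetical, and the chip-level frame offset and SFN wrap-around are ignored.

```python
PICH_PI_BITS = 288            # bits per PICH frame that carry paging indicators
ALLOWED_NP = (18, 36, 72, 144)


def bits_per_pi(np_per_frame):
    """Number of PICH bits repeated for each paging indicator."""
    if np_per_frame not in ALLOWED_NP:
        raise ValueError(f"Np must be one of {ALLOWED_NP}")
    return PICH_PI_BITS // np_per_frame


def paging_occasions(imsi, drx_cycle_frames=128, count=3):
    """First few paging occasion frame numbers for K=1 and PBP=1."""
    first = imsi % drx_cycle_frames
    return [first + n * drx_cycle_frames for n in range(count)]


for np_value in ALLOWED_NP:
    print(f"Np={np_value}: {bits_per_pi(np_value)} bits per PI")

print(paging_occasions(imsi=1000))
```

With Np=36 each paging indicator is spread over 8 bits, midway between the most robust setting (16 bits at Np=18) and the highest-capacity one (2 bits at Np=144).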
Paging Access Failure Troubleshooting-(VII)
From all this information, what do you need to know?
• There are several places where paging can get congested: the Iu interface, the IuB interface, RNC boards, or the PCH channel. The PICH is the only channel that is never congested!
• Check with the CN how many paging repetitions are configured and how UEs are paged: by IMSI or by TMSI. If the first paging fails, how many repetitions follow? Is the last paging network-wide or LAC-wide only?
• --- is currently facing PCH channel load: all smartphones are in Cell_PCH state. In this state a UE can only receive paging and cannot transmit any data. Any paging for such a UE is sent specifically to its cell. How does the RNC know where such a UE is? By cell update! Every time the UE changes cell in Cell_PCH there is a Cell Update, a Cell Update Confirm and a UTRAN Mobility Information Confirm. That means the RNC is aware of the new location of the UE.
• How high is the paging success rate now in the --- network?
• What solutions do we have to offload the PCH channel?
  - LAC split
  - Page by TMSI
  - Reduce ping-pongs (and reselections)
  - Improve the best server area and reduce overlapping
RACH Access Failure Troubleshooting (I)
• Why are RACH parameters VERY important? Because they strongly impact user experience (also called E2E, end-to-end user experience).
• There are no performance indicators for RACH, only estimates based on RTWP, RACH channel load, etc.
Enough performance counters
RACH Access Failure Troubleshooting (II)
[Figure: PRACH preamble power ramping on the uplink (UE/PRACH), from Preamble 1 to Preamble n, followed by the message part. Labels: Max_TX_power_on_PRACH, NB01, Power_Ramp_Step, Preamble_Initial_Power, Pp-m, AICH_Transmission_Timing, Preamble_Retrans_Max, Mmax.]
The answer on the AICH must be a specific positive response to the specific RACH preamble sent.

Parameters for RACH/PRACH:
• NBO1 (0 ≤ NBO1min ≤ NBO1 ≤ NBO1max) is the time between two power ramping steps of the preamble within the same preamble cycle.
• Preamble_Retrans_Max is the maximum number of preambles that can be sent in a cycle.
• Mmax is the maximum number of preamble cycles.
• Preamble_Initial_Power = Primary CPICH TX power – CPICH_RSCP + UL interference + Constant Value
• Constant Value is an initial value used to set the first preamble power; it is usually -24.
• UL interference is the latest value broadcast by the NodeB in SIB7. The UE needs to decode this value before being able to transmit on RACH.
• Power_Ramp_Step is how much the preamble power is increased after each missing acknowledgement on the AICH.
• Power offset Pp-m = Pmessage-control – Ppreamble, measured in dB, between the power of the last transmitted preamble and the control part of the random-access message.
• AICH_Transmission_Timing is the time at which the RACH message must be transmitted after a positive AICH was received (there are other parameters too).

RACH is a common transport channel in the uplink. RACHs are always mapped one-to-one onto physical channels (PRACHs), i.e. there is no physical layer multiplexing of RACHs, and there can only be one RACH TrCH and no other TrCH in a RACH CCTrCH. Service multiplexing is handled by the MAC layer. Several RACHs/PRACHs may be configured in one cell. If more than one PRACH is configured in a cell, the UE performs PRACH selection.

RACH message mandatory parameters:
- UE identity (IMSI, IMSI+LAI, TMSI, or IMEI when no USIM is inserted)
- RRC establishment cause (31 causes)
- Radio bearer ID (AS or NAS; UM, TM or AM)
- Release 5 indicator
- Measurement results on RACH (like EcNo of the serving cell)
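The open-loop formula and the ramping parameters above can be sketched as follows. The input values in the example (CPICH power, RSCP, UL interference) are invented for illustration, the 21 dBm cap is taken from the UE limit mentioned earlier in the deck, and the function names are hypothetical.

```python
def preamble_initial_power(cpich_tx_dbm, cpich_rscp_dbm,
                           ul_interference_dbm, constant_value_db):
    """Preamble_Initial_Power = CPICH TX power - CPICH_RSCP
    + UL interference + Constant Value (all in dB/dBm)."""
    return cpich_tx_dbm - cpich_rscp_dbm + ul_interference_dbm + constant_value_db


def preamble_ramp(initial_dbm, ramp_step_db, retrans_max, max_tx_dbm):
    """Preamble powers within one cycle, capped at the UE maximum TX power."""
    return [
        min(initial_dbm + i * ramp_step_db, max_tx_dbm)
        for i in range(retrans_max)
    ]


# Invented radio conditions; parameter values as quoted for the current network.
initial = preamble_initial_power(
    cpich_tx_dbm=33, cpich_rscp_dbm=-100,
    ul_interference_dbm=-105, constant_value_db=-20,
)
ramp = preamble_ramp(initial, ramp_step_db=2, retrans_max=20, max_tx_dbm=21)

print("initial preamble power:", initial, "dBm")
print("ramp:", ramp)
```

In this example the UE reaches its maximum power on the eighth preamble and keeps transmitting at full power for the rest of the cycle, which illustrates how a long ramp can raise uplink noise for the whole cell.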
RACH Access Failure Troubleshooting-(III)
From all this information, what do you need to know?
• Current RACH parameters are not optimal: they allow the UE to raise its power 20 dB above the RTWP (CONSTANTVALUE=-20, PREAMBLERETRANSMAX=20, POWERRAMPSTEP=2). Because of this the RTWP increases, which in turn makes the RACH power increase, and so on (an avalanche effect). It is better to have a longer call setup time for one UE (RACH failures due to missing neighbour relations of overshooting cells) than to have the entire cell shrunk because one UE is not able to transmit its RACH message.
• Missing neighbours, lack of a best server area and poor UL coverage strongly influence the RACH success rate.
• The cell radius is now set to 29 km. Make sure there are no UEs at a larger distance (path distance), otherwise they will fail on RACH.
• Spreaders inside the Node-B are limited. Multipath (long distance) is costly in resource consumption, so RACH messages might be missed.
Preliminary conclusions
• Most attempt failures are related to planning.
• Plenty of attempt failures are not recorded in the performance file (when EcIo is worse than -18 dB, very few RACH attempts "reach" the Node-Bs).
Thank you www.huawei.com