ERAB Failure Reason
accessibility. After the UE has completed the RRC connection setup, which was explained
in my previous article, LTE KPI Optimization: RRC Success Rate, it needs a bearer
assigned to it before it can initiate services. The bearer can be default (usually data, QCI9)
or dedicated (for example VoLTE, QCI1). During initial access, the default bearer is added,
and default bearers make up the major portion of the total ERABs.
First, let's understand how the ERAB KPI is defined and where it is pegged. After the
UE sends the RRC Connection Setup Complete message to the eNB, the eNB sends an S1
Initial UE Message to the MME indicating the purpose of the UE (Attach, TAU, CSFB,
Service Request etc.) and its credentials. Once the MME receives this message and decides
that a bearer is required, it sends an Initial Context Setup Request to the eNB. This
message is counted as the ERAB Attempt, as it contains the bearers to be added
along with their QCI values. The eNB adds the DRB (Data Radio Bearer) based on the
bearer profile in the Initial Context Setup Request. But before the eNB can add bearers,
it needs to activate security for the connection. This is done with the Security Mode
Command, which carries the ciphering and integrity protection algorithms. The eNB then
sends an RRC Connection Reconfiguration message to the UE, which adds the DRB and
includes its configuration, such as the bearer identity and the PDCP and RLC
configuration (AM/UM etc.). SRB2 is also added by this message. The UE receives these
messages, reconfigures the connection and responds with Security Mode Complete and RRC
Connection Reconfiguration Complete messages. When the eNB receives them, it sends an
Initial Context Setup Response to the MME, and this message is counted as the ERAB Success.
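The pegging points described above can be sketched as a simple count over an S1 message
trace. This is only an illustration: the message labels and the function name are
hypothetical, and real eNB counters are vendor specific and usually count per ERAB rather
than per message.

```python
from collections import Counter

# Hypothetical message labels; real trace tools use their own naming.
ATTEMPT_MSG = "InitialContextSetupRequest"   # pegged as ERAB Attempt
SUCCESS_MSG = "InitialContextSetupResponse"  # pegged as ERAB Success


def erab_setup_success_rate(s1_messages):
    """Return (attempts, successes, rate) for a list of S1 message names.

    rate is None when the trace contains no attempts.
    """
    counts = Counter(s1_messages)
    attempts = counts[ATTEMPT_MSG]
    successes = counts[SUCCESS_MSG]
    rate = successes / attempts if attempts else None
    return attempts, successes, rate


# Three setup attempts, one of which ends in a failure message.
trace = [
    "InitialContextSetupRequest", "InitialContextSetupResponse",
    "InitialContextSetupRequest", "InitialContextSetupFailure",
    "InitialContextSetupRequest", "InitialContextSetupResponse",
]
attempts, successes, rate = erab_setup_success_rate(trace)
print(attempts, successes, rate)
```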
Let’s have an in-depth look at both of them and find ways to tackle them.
Consider a UE that receives the Security Mode Command but fails to maintain the radio
connection afterwards. This can happen in the following two scenarios:
1. N310 exceeded and T310 expired
N310 is a count of consecutive out-of-sync indications. As a simplified view, each
indication corresponds to an interval of 200 ms of PDCCH decoding failures, so if the UE
fails to decode PDCCH for 200 ms, that counts as one N310. If the N310 value is 2, the UE
will have exceeded the configured N310 threshold after failing to decode PDCCH for
400 ms. Once N310 has been exceeded, the UE starts timer T310, and if the UE is unable to
regain the connection (it still cannot decode PDCCH) before T310 expires, it will
initiate RRC ReEstablishment. Let’s understand this with an example. With N310 of 2 and
T310 of 500 ms, the UE will initiate RRC Connection ReEstablishment after 900 ms
(400 ms for N310 + 500 ms for T310).
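The arithmetic in this example can be written as a small helper. It follows the
simplified model used in this article, where each N310 count is treated as a 200 ms
interval; the function and parameter names are assumptions made for illustration.

```python
def time_to_reestablishment_ms(n310, t310_ms, out_of_sync_interval_ms=200):
    """Time from the start of PDCCH decoding failure until the UE initiates
    RRC ReEstablishment, using the article's simplified model:
    N310 intervals of failed decoding followed by a full T310 run.
    """
    return n310 * out_of_sync_interval_ms + t310_ms


# N310 = 2 and T310 = 500 ms, as in the example above.
print(time_to_reestablishment_ms(n310=2, t310_ms=500))
```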
2. Maximum RLC retransmission count exceeded
Consider that the UE receives both the Security Mode Command and the RRC
Connection Reconfiguration message. Now it has to transmit the Security Mode
Complete and RRC Connection Reconfiguration Complete messages in uplink. However,
if the eNB fails to decode these responses, it will send a NACK to the UE, or it may
send nothing at all if it fails to even detect the transmissions. The RLC layer in
the UE is configured to resend a message that is not acknowledged, so it will keep
resending until a valid acknowledgement is received. But the RLC cannot resend the
same message indefinitely; it has an upper limit on retransmissions.
Once that limit is reached, the RLC will not retransmit again and the UE will consider
the radio link to be compromised. This will trigger an RRC ReEstablishment Request.
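The retransmission limit can be illustrated with a short sketch. It is a simplified
model and not the RLC procedure from the specification: the exact counting rules are not
reproduced here, and the function name, return labels and the default limit of 8 are
assumptions for illustration.

```python
def deliver_pdu(ack_per_attempt, max_retx=8):
    """Simulate delivery of one RLC AM PDU.

    ack_per_attempt: iterable of booleans, one per transmission attempt,
    True when that attempt was acknowledged. Attempt 0 is the initial
    transmission, so attempt k is the k-th retransmission.
    Returns (status, retransmissions_used).
    """
    retx = 0
    for attempt, acked in enumerate(ack_per_attempt):
        retx = attempt
        if acked:
            return "delivered", retx
        if retx >= max_retx:
            # Limit reached without an acknowledgement: the UE treats the
            # radio link as failed and attempts RRC ReEstablishment.
            return "max_retx_reached", retx
    return "pending", retx


print(deliver_pdu([False, False, True]))   # acknowledged on the 2nd retransmission
print(deliver_pdu([False] * 20))           # never acknowledged
```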
However, in both these cases, the RRC ReEstablishment Request will be rejected by the
eNB, since processing this request requires a valid UE context at the eNB. Because
the UE did not respond to the Security Mode Command, the eNB does not consider
the context to be active yet and rejects the RRC ReEstablishment Request. At the same
time, the eNB will send an Initial Context Setup Failure to the MME, indicating an ERAB
Setup Failure.
Optimization
Such issues can be reduced by increasing the N310 and T310 values. For instance, if
N310 is increased from 2 to 6 and T310 is increased from 500 ms to 1000 ms,
the UE will wait for 2200 ms instead of the previous 900 ms, and there is a better
chance that N311 will be triggered. N311 is the in-sync counter, so it is the opposite
of N310, and T310 stops if N311 is triggered. If N311 is 1, the UE needs 100 ms of
successful PDCCH decoding to stop T310. So, the larger the values of N310 and T310, the
higher the probability of triggering N311 before the UE declares a radio link failure.
Similarly, if the RLC retransmission count threshold is increased from 8 to 16, the
RLC will retransmit up to 16 times instead of 8, which increases the probability that
the eNB decodes the message and an RLF is avoided.
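To get a rough feel for the effect of a higher retransmission threshold, the sketch
below computes the chance that at least one transmission is decoded. It assumes that
every attempt succeeds independently with the same probability, which is optimistic in
poor coverage where failures tend to be correlated; the function name and the 10%
per-attempt success figure are illustrative assumptions, not measured values.

```python
def p_delivered(per_attempt_success, max_retx):
    """Probability that at least one of (1 initial + max_retx retransmissions)
    attempts is decoded, assuming independent attempts."""
    attempts = max_retx + 1
    return 1 - (1 - per_attempt_success) ** attempts


# Compare a threshold of 8 against 16 for a weak uplink.
for threshold in (8, 16):
    print(threshold, round(p_delivered(0.10, threshold), 2))
```

Under these assumptions the delivery probability rises from roughly 0.61 to roughly 0.83,
at the cost of the UE spending longer on a link that may not recover.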
No Response From UE
In this case, the UE receives the Security Mode Command and the RRC Connection
Reconfiguration messages in downlink but does not respond to them in uplink. This can
result in Inactivity Timer expiry, after which the eNB sends a UE Context Release
Request to the MME during the ERAB setup phase, causing the ERAB setup to fail.
Let’s see why this scenario happens in live networks. Once a UE receives a
downlink message which needs a response, it needs an uplink allocation to send that
response. To get an uplink allocation, the UE sends the eNB a Scheduling Request
Indicator, or SRI. The UE sends the SRI based on the SRI Configuration shared with it in
the RRC Connection Setup message. The SRI Configuration tells the UE the periodicity of
the SRI and determines the subframe in which the UE will send it. So, the eNB will look
for that UE’s SRI in that subframe only, and when it finds one, it allocates an uplink
resource to the UE by instructing the UE on the PDCCH.
Now, the vendors have moved to adaptive SRI intervals, which can result in a new SRI
configuration being sent in the RRC Connection Reconfiguration message. There are UEs
that do not support this change of SRI configuration and keep using the old one. Once
such a UE has received the Security Mode Command and the RRC Connection Reconfiguration
messages in downlink and wants to respond in uplink, it has to send an SRI first. The UE
will send the SRI according to the old SRI Configuration shared in the RRC Connection
Setup message, while the eNB will be looking for it in the subframe defined by the SRI
Configuration of the RRC Connection Reconfiguration message. As a result, the eNB will
consider that there is no response from the UE, and once the inactivity timer expires,
the ERAB setup will fail.
This can also happen if the UE is in poor coverage or if the PUCCH has high
interference. The UE will keep sending SRIs in the correct location on PUCCH but the
eNB might not be able to read them resulting in a similar scenario as explained above.
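The SRI configuration mismatch described above can be visualised by listing the
subframes each side uses. The sketch assumes an SRI occasion is any subframe whose
absolute index (10 x system frame number + subframe number) minus the configured offset
is a multiple of the periodicity; the periodicity and offset values are made up for
illustration.

```python
def sri_occasions(periodicity, offset, num_subframes):
    """Absolute subframe indices in [0, num_subframes) that are SRI occasions."""
    return {sf for sf in range(num_subframes)
            if (sf - offset) % periodicity == 0}


# Old configuration from RRC Connection Setup (still used by the UE).
ue_occasions = sri_occasions(periodicity=10, offset=3, num_subframes=80)
# New configuration from RRC Connection Reconfiguration (monitored by the eNB).
enb_occasions = sri_occasions(periodicity=20, offset=8, num_subframes=80)

print(sorted(ue_occasions))
print(sorted(enb_occasions))
print(ue_occasions & enb_occasions)  # subframes where both sides agree
```

With these example values the two sets never intersect, so the eNB never sees an SRI
from this UE even though the UE keeps sending one every 10 ms.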
Optimization
If such a scenario is observed consistently, it will be a good idea to shift from an adaptive
SRI period to a fixed SRI period. This will avoid reconfiguring the SRI periodicity and will
prevent this issue.
Also, using PUCCH enhancements like IRC on PUCCH can help reduce the probability of
such issues.
This rare issue is seen in networks when a QCI configured for UM (Unacknowledged Mode
of RLC) is used for UEs which do not support UM. A common example is QCI7, a Non-GBR
QCI defined for live streaming or voice services, which usually works in UM. But there
are many UEs which do not support UM, and the eNB simply fails to add a UM bearer for
them. This issue can be seen from the counters, which will show that ERAB failures on
the radio interface are happening only on QCI7 or on any other QCI which is set to UM.
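One way to spot this pattern in counters is to compare the radio-interface failure ratio
per QCI against the RLC mode configured for that QCI. The sketch below is hypothetical:
the counter layout, the thresholds and the sample numbers are assumptions, and real
counter names depend on the vendor.

```python
def suspect_um_qcis(counters, rlc_mode, min_attempts=100, max_fail_ratio=0.05):
    """Return QCIs configured as UM whose radio failure ratio looks abnormal.

    counters: {qci: (erab_attempts, radio_failures)}
    rlc_mode: {qci: "AM" or "UM"}
    """
    suspects = []
    for qci, (attempts, failures) in sorted(counters.items()):
        if attempts < min_attempts:
            continue  # too few samples to judge
        if rlc_mode.get(qci) == "UM" and failures / attempts > max_fail_ratio:
            suspects.append(qci)
    return suspects


# Made-up sample: QCI7 runs in UM and fails far more often than the others.
counters = {1: (800, 2), 7: (500, 60), 9: (10000, 20)}
rlc_mode = {1: "UM", 7: "UM", 9: "AM"}
print(suspect_um_qcis(counters, rlc_mode))
```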
Optimization
Simply changing the RLC mode for the QCI from UM to AM should solve this issue.
Another issue that is a bit rare is the Security Mode Failure. In this case, the UE
receives the Security Mode Command from the eNB but responds with a Security Mode
Failure message. Consequently, the eNB sends an Initial Context Setup Failure to the MME,
resulting in an ERAB setup failure. This happens if the security configuration on the
eNB is not supported by the UE, or sometimes if the UE cannot handle the Security Mode
Command and the RRC Connection Reconfiguration together. In most cases, this turns out
to be a terminal issue.
MME Induced ERAB Setup Failures
Let’s have a look at the MME induced ERAB failures. This may come as a surprise, but
most of the MME induced ERAB setup failures in commercial networks are actually
caused by the radio interface and not the MME. I know it is hard to understand, but those
of you who have delved into RRC and S1 traces will understand it more clearly
once I explain this issue.
As explained in the section above, when the UE experiences an RLF after receiving the
Security Mode Command, it can try RRC ReEstablishment, which tells the eNB
that there was an RLF on the UE’s side. Now consider a UE that experiences an RLF before
it receives the Security Mode Command. The UE can only send an RRC ReEstablishment
Request after security is activated, so in this case it cannot send one.
If the UE experiences an RLF after the RRC Connection Setup Complete message and
before the Security Mode Command, it will go to idle and retry with a new RRC connection
by sending another RRC Connection Request. Let’s say that the UE sends this RRC
Connection Request to another eNB (eNB2), which starts processing it. eNB2 does not
know that eNB1 already has an ERAB setup process going on for this UE. eNB2 will send
an S1 Initial UE Message to the MME for this UE, and the MME will see that it already
has another ERAB setup process going on with eNB1. Before the MME can initiate the new
ERAB setup process by sending an Initial Context Setup Request to eNB2, it needs to
stop the process on eNB1, as it cannot keep separate contexts for the same UE
on two different eNBs. As a result, the MME will send a UE Context Release Command
to eNB1, asking it to abort the ERAB setup process. eNB1 is still trying to find the UE
over the air interface, and once it receives the UE Context Release Command from the
MME, it will consider that the MME aborted the ERAB setup. eNB1 will send an Initial
Context Setup Failure to the MME and the ERAB setup on eNB1 will be pegged as an MME
induced failure. However, the failure was actually caused by a radio issue that eNB1
had no way of detecting.
This issue can also happen if the UE sends the second RRC Connection Request to the
same eNB or even to the same cell. At RRC level, the eNB does not check the TMSI value,
and the UE is referenced by its C-RNTI. So, if the same UE sends another RRC
Connection Request to the same eNB, the eNB will allocate a new C-RNTI and consider it
a new connection. But when the eNB sends the S1 Initial UE Message to the MME, the MME
will check the TMSI and send a UE Context Release Command for the previous session,
resulting in an ERAB setup failure on the first process.
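The MME behaviour described in the last two paragraphs can be sketched as a lookup keyed
by TMSI. This is a toy model under stated assumptions and not an S1AP implementation: the
class, method and message label names are hypothetical, and the MME is assumed to always
decide that a bearer is required.

```python
class MmeSketch:
    """Toy MME that keeps at most one UE context per TMSI."""

    def __init__(self):
        self.context_by_tmsi = {}  # tmsi -> (enb_id, enb_ue_id)
        self.sent = []             # messages the MME would send, in order

    def on_initial_ue_message(self, tmsi, enb_id, enb_ue_id):
        old = self.context_by_tmsi.get(tmsi)
        if old is not None:
            # Same UE already has a setup in progress: abort the old one.
            # The old eNB (or old session) pegs this as MME induced.
            self.sent.append(("UEContextReleaseCommand",) + old)
        self.context_by_tmsi[tmsi] = (enb_id, enb_ue_id)
        self.sent.append(("InitialContextSetupRequest", enb_id, enb_ue_id))


mme = MmeSketch()
mme.on_initial_ue_message("tmsi-A", "eNB1", 1)  # first attempt, then RLF
mme.on_initial_ue_message("tmsi-A", "eNB2", 7)  # UE retries on another eNB
for message in mme.sent:
    print(message)
```

The same release is triggered when the second request arrives through the same eNB,
because the lookup uses the TMSI and not the eNB or the C-RNTI.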
Another scenario that can cause an MME induced ERAB Setup failure is the Initial Context
Setup Timer on the MME. If that timer is set to a small value and the eNB is still
waiting for the UE to respond to the Security Mode Command, the MME will send a UE
Context Release Command due to timeout. This will also result in an MME induced ERAB
Setup Failure.
Optimization
There is no parameter-level optimization for the first scenario (the UE retrying on
another or the same eNB after an RLF), as it is purely a coverage issue; only coverage
enhancement through physical or soft changes can mitigate it. The second scenario (the
MME timeout) can be minimized by increasing the Initial Context Setup Timer on the
MME.
In case of any queries or feedback, please drop a comment below and I would love to
respond and help.