This is a complex operation that involves following the possible branches a bpf program can take and calculate all the possible states a register can take for all given possible branches a program can take.
To do this, the verifier has a bpf_reg_state structure. Particularly, the verifier keeps track of the possible minimum and maximum values a register can take when it is interpreted as an uint64_t, int64_t, uint32_t, int32_t.
Additionally, the verifier keeps track of all the bits that are known to be set on a register, this is tracked in the var_off field of bpf_reg_state . Var_off works by keeping a u64 value that acts as a mask, for any given bit that is known to the verifier, the mask will take the value of 0 at that bit’s position and the bit’s value will be tracked in the value member of the var_off.
To better illustrate this, here are a few examples from the verifier’s perspective of the limits of a register for a few given operations. For simplicity we will only illustrate the limits for when the register gets interpreted as a signed 32 bit value. U64, S64 and U32 values would follow a similar idea:
Operation |
S32 min value |
S32 max value |
Var off (value, mask) |
Explanation |
R1 = input() |
S32_MIN (0x80000000) |
S32_MAX (0x7FFFFFFF) |
(0, 0xFFFFFFFF) |
If R1 takes an arbitrary value from the user, the verifier knows nothing about it, therefore the limits are the absolute maximum/minimum.Note that the mask has all 1’s indicating no bit has known value |
R1 = 1337 |
1337 |
1337 |
(1337, 0x0) |
If the verifier is initialized to a constant value, then everything about it is known, and the limits and var off are adjusted accordingly |
R1 = input()R1 |= 2 |
0x80000002 |
S32_MAX (0x7FFFFFFF) |
(2, 0xFFFFFFFD)(hex(FD) == binary(1111 1101) |
Here R1 has the second bit set via the or operation. This gives the verifier enough information to update the possible minimum value as well as marking one bit in the var off mask as known |
After doing some changes to buzzer and implementing a new fuzzing strategy guided by coverage, we noticed the following log
REG INVARIANTS VIOLATION (true_reg1): range bounds violation u64=[0xffffffff, 0xfffffffe] s64=[0xffffffff, 0xfffffffe] u32=[0xffffffff, 0xfffffffe] s32=[0xffffffff, 0xfffffffe] var_off=(0xffffffff, 0x0)
REG INVARIANTS VIOLATION (true_reg2): range bounds violation u64=[0xffffffffffffffff, 0xfffffffffffffffe] s64=[0xffffffffffffffff, 0xfffffffffffffffe] u32=[0xffffffff, 0xfffffffe] s32=[0xffffffff, 0xfffffffe] var_off=(0xffffffffffffffff, 0x0)
REG INVARIANTS VIOLATION (false_reg2): const tnum out of sync with range bounds u64=[0x0, 0xffffffffffffffff] s64=[0x8000000000000000, 0x7fffffffffffffff] u32=[0x0, 0xffffffff] s32=[0x80000000, 0x7fffffff] var_off=(0xffffffffffffffff, 0x0)
Upon closer inspection, it became clear that there was a problem with the limit tracking of the registers; The minimum value is greater than the maximum value. And while this ended up not being too relevant for the bug that turned into a vuln, it led us in the right direction.
This error code is produced here, which in turn is called by the set_min_max function.
After reducing the bpf poc produced by buzzer, the program responsible for producing this error message looked similar to this (note: the read from map is oversimplified because it is not relevant to the bug):
R1 = read_from_map() // The verifier knows nothing about R1
R1 |= 2 // The verifier knows that bit 2 is set but knows nothing about the rest
if R1 != 0x7ffffffd goto b1:
Exit // False branch
b1:
R0 = 0 // True branch
Exit
After the or operation, the verifier knows the following about R1
Operation |
S32 min value |
S32 max value |
var_off |
R1 = input()R1 |= 2 |
0x80000002 |
S32_MAX (0x7FFFFFFF) |
(2, 0xFFFFFFFD)(hex(FD) == binary(1111 1101) |
When the verifier analyzes a branch operation, it splits the register states into the registers of the true branch (true_reg1, true_reg2) and the registers of the false branch (false_reg1, false_reg2).
For the program above, since the second value in the conditional is a constant, the verifier creates a “fake” register initialized to the value of 0x7ffffffd
Further Analysis
While this exploit gives a powerful primitive to read/write arbitrary kernel memory. That said, the vulnerability is only reachable if an attacker has the CAP_BPF capability. This can be either an application or a container which as of now is not as widespread.
Timeline
Date reported: 06/07/2024
Date fixed: 06/13/2024
Date disclosed: 07/15/2024
Summary
A bug in the verifier’s register limit tracking was found by using https://github.com/google/buzzer that allows an attacker to trick the eBPF verifier into thinking a register has a value different from the one it takes when executing the program.
Using this bug, an LPE exploit was written allowing the attacker to gain an arbitrary kernel memory R/W primitive that can then be used to gain a full system compromise.
Severity
Moderate -
Proof of Concept
This is a complex operation that involves following the possible branches a bpf program can take and calculate all the possible states a register can take for all given possible branches a program can take.
To do this, the verifier has a bpf_reg_state structure. Particularly, the verifier keeps track of the possible minimum and maximum values a register can take when it is interpreted as an uint64_t, int64_t, uint32_t, int32_t.
Additionally, the verifier keeps track of all the bits that are known to be set on a register, this is tracked in the var_off field of bpf_reg_state . Var_off works by keeping a u64 value that acts as a mask, for any given bit that is known to the verifier, the mask will take the value of 0 at that bit’s position and the bit’s value will be tracked in the value member of the var_off.
To better illustrate this, here are a few examples from the verifier’s perspective of the limits of a register for a few given operations. For simplicity we will only illustrate the limits for when the register gets interpreted as a signed 32 bit value. U64, S64 and U32 values would follow a similar idea:
After doing some changes to buzzer and implementing a new fuzzing strategy guided by coverage, we noticed the following log
Upon closer inspection, it became clear that there was a problem with the limit tracking of the registers; The minimum value is greater than the maximum value. And while this ended up not being too relevant for the bug that turned into a vuln, it led us in the right direction.
This error code is produced here, which in turn is called by the set_min_max function.
After reducing the bpf poc produced by buzzer, the program responsible for producing this error message looked similar to this (note: the read from map is oversimplified because it is not relevant to the bug):
After the or operation, the verifier knows the following about R1
When the verifier analyzes a branch operation, it splits the register states into the registers of the true branch (true_reg1, true_reg2) and the registers of the false branch (false_reg1, false_reg2).
For the program above, since the second value in the conditional is a constant, the verifier creates a “fake” register initialized to the value of 0x7ffffffd
Further Analysis
While this exploit gives a powerful primitive to read/write arbitrary kernel memory. That said, the vulnerability is only reachable if an attacker has the CAP_BPF capability. This can be either an application or a container which as of now is not as widespread.
Timeline
Date reported: 06/07/2024
Date fixed: 06/13/2024
Date disclosed: 07/15/2024