This is a follow-up to yesterday's blog post, "Extract the system libraries on macOS Big Sur", in which I explained how to extract the system libraries from the dyld shared cache. Although you can successfully disassemble these extracted libraries, there's still a problem: the otool command-line tool fails to understand many Objective-C references in the disassembly.
Let's take a look at an example from my favorite framework, AppKit. The following is from the implementation of the -[NSApplication init] method. In order to call [super init], the implementation has to get the NSApplication Objective-C class. For convenience I use my own command-line tool riptool, which is a wrapper around otool that resolves rip-relative addresses.
00007fff235f7284 movq 0x5f789a0d(%rip) [0x7fff82d80c98], %rax ## Objc class ref: bad class ref
00007fff235f728b movq %rax, 0x8(%rdi)
Oh no, "bad class ref", that's bad! Compare with AppKit from macOS 10.14.6 Mojave (still my primary system):
0000000000003eb0 movq 0x10d3b11(%rip) [0x10d79c8], %rax ## Objc class ref: NSApplication
0000000000003eb7 movq %rax, 0x8(%rdi)
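To make the bracketed addresses in both listings concrete, here is a minimal Python sketch of rip-relative resolution. It is not how riptool is implemented; the function name is hypothetical, and the instruction length is simply inferred from the gap to the next address in each listing. On x86-64, %rip points at the next instruction, so the target is the instruction address plus its length plus the displacement.

```python
def resolve_rip_relative(insn_addr: int, insn_len: int, disp: int) -> int:
    """Return the absolute address referenced by a rip-relative operand.

    %rip holds the address of the *next* instruction, so the target is
    the instruction's address + its length + the displacement.
    (The displacement is a signed 32-bit value in general; both examples
    here happen to be positive.)
    """
    return insn_addr + insn_len + disp


# Instruction lengths are inferred from the next address in each listing.
big_sur = resolve_rip_relative(0x7fff235f7284, 0x7fff235f728b - 0x7fff235f7284, 0x5f789a0d)
mojave = resolve_rip_relative(0x3eb0, 0x3eb7 - 0x3eb0, 0x10d3b11)

print(hex(big_sur))  # 0x7fff82d80c98
print(hex(mojave))   # 0x10d79c8
```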
On Mojave, the address 0x10d79c8 is in the __objc_superrefs section, as otool -l /System/Library/Frameworks/AppKit.framework/AppKit shows:
sectname __objc_superrefs
segname __DATA
addr 0x00000000010d77f0
size 0x00000000000032a0
If we look at that section with otool -s __DATA __objc_superrefs /System/Library/Frameworks/AppKit.framework/AppKit, we see this:
00000000010d79c0 68 d1 0e 01 00 00 00 00 98 be 0f 01 00 00 00 00
The 8 bytes at address 0x10d79c8 (the second half of the row above, which starts at 0x10d79c0) hold the little-endian value 0x10fbe98. And nm -n /System/Library/Frameworks/AppKit.framework/AppKit confirms that this is a pointer to the NSApplication Objective-C class:
00000000010fbe98 S _OBJC_CLASS_$_NSApplication
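Reading the pointer out of the hex dump by hand is easy to get wrong, so here is a small Python sketch that does it. The helper name read_pointer is hypothetical, and it assumes a row in the format shown above: a start address followed by space-separated bytes.

```python
def read_pointer(row: str, address: int) -> int:
    """Read the 8-byte little-endian value at `address` from one hex-dump row.

    `row` is assumed to be a start address followed by space-separated
    hex bytes, as in the otool -s output above.
    """
    fields = row.split()
    row_start = int(fields[0], 16)
    data = bytes.fromhex("".join(fields[1:]))
    offset = address - row_start
    if offset < 0 or offset + 8 > len(data):
        raise ValueError("address is not fully contained in this row")
    return int.from_bytes(data[offset:offset + 8], "little")


mojave_row = "00000000010d79c0 68 d1 0e 01 00 00 00 00 98 be 0f 01 00 00 00 00"
print(hex(read_pointer(mojave_row, 0x10d79c8)))  # 0x10fbe98
```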
This class is in the __objc_data section:
sectname __objc_data
segname __DATA
addr 0x00000000010eb980
size 0x0000000000022d80
That's what we see on Mojave. What about on Big Sur? The address 0x7fff82d80c98, the "bad class ref", is also in the __objc_superrefs section, like on Mojave:
sectname __objc_superrefs
segname __DATA
addr 0x00007fff82d80aa8
size 0x0000000000003480
Here's what the section looks like on Big Sur:
00007fff82d80c98 f8 5b da 62 00 02 00 00 60 d6 d9 62 00 02 00 00
The 8 bytes at address 0x7fff82d80c98 hold the little-endian value 0x20062da5bf8, whose low 32 bits are 0x62da5bf8. But that's not a direct pointer to the NSApplication Objective-C class, which is located here:
00007fff82da5bf8 S _OBJC_CLASS_$_NSApplication
Still, the class is in the __objc_data section:
sectname __objc_data
segname __DATA
addr 0x00007fff82d96310
size 0x0000000000024040
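Each "is in the section" claim above is just a range check against the addr and size fields that otool -l prints. A minimal Python sketch follows, with a hypothetical Section type; all numbers are copied from the listings above.

```python
from typing import NamedTuple


class Section(NamedTuple):
    name: str
    addr: int
    size: int

    def contains(self, address: int) -> bool:
        # A section covers the half-open range [addr, addr + size).
        return self.addr <= address < self.addr + self.size


mojave_superrefs = Section("__objc_superrefs", 0x10d77f0, 0x32a0)
mojave_data = Section("__objc_data", 0x10eb980, 0x22d80)
big_sur_superrefs = Section("__objc_superrefs", 0x7fff82d80aa8, 0x3480)
big_sur_data = Section("__objc_data", 0x7fff82d96310, 0x24040)

checks = [
    mojave_superrefs.contains(0x10d79c8),          # Mojave class ref slot
    mojave_data.contains(0x10fbe98),               # Mojave NSApplication class
    big_sur_superrefs.contains(0x7fff82d80c98),    # Big Sur "bad class ref" slot
    big_sur_data.contains(0x7fff82da5bf8),         # Big Sur NSApplication class
]
print(all(checks))
```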
Doing the math: 0x7fff82da5bf8 - 0x62da5bf8 = 0x7FFF20000000. That's an interesting address, because it's not even in AppKit! If you run dyld_shared_cache_util -list -vmaddr, you find:
0x7FFF23450000 /System/Library/Frameworks/AppKit.framework/Versions/C/AppKit
My theory is that 0x7FFF20000000 is the address of the beginning of the dyld shared cache. Here's the first library in the list:
0x7FFF20040000 /System/Library/Accounts/Notification/INDAccountNotificationPlugin.bundle/Contents/MacOS/INDAccountNotificationPlugin
I don't know why it's offset by 0x40000 from the beginning, but close enough! It seems that prior to Big Sur, Objective-C references in a Mach-O file are offsets from the beginning of the file, whereas on Big Sur, Objective-C references in a Mach-O file are offsets from the beginning of the dyld shared cache. Roughly speaking.
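The arithmetic behind this theory can be sketched in a few lines of Python. Two things here are assumptions on my part, not established facts: that only the low 32 bits of the stored 8-byte value (which decodes in full to 0x20062da5bf8) matter for the offset, and that the inferred base applies to other entries too. The rebase helper is hypothetical.

```python
# The 8 bytes stored at 0x7fff82d80c98 on Big Sur, from the hex dump above.
raw = int.from_bytes(bytes.fromhex("f85bda6200020000"), "little")
low32 = raw & 0xFFFFFFFF          # assumption: only the low 32 bits are the offset

class_addr = 0x7fff82da5bf8       # _OBJC_CLASS_$_NSApplication
inferred_base = class_addr - low32

first_image = 0x7FFF20040000      # first library in the dyld_shared_cache_util list
appkit = 0x7FFF23450000           # AppKit's address in the same list


def rebase(stored: int, base: int = inferred_base) -> int:
    """Turn a stored value into an address, under the assumptions above."""
    return base + (stored & 0xFFFFFFFF)


print(hex(raw))                          # 0x20062da5bf8
print(hex(inferred_base))                # 0x7fff20000000
print(hex(first_image - inferred_base))  # 0x40000
print(inferred_base < appkit)            # True: the base lies below AppKit

# The neighboring entry in the same row (60 d6 d9 62 00 02 00 00) rebases to an
# address inside Big Sur's __objc_data range, which is consistent with the
# theory but does not prove it.
neighbor = rebase(int.from_bytes(bytes.fromhex("60d6d96200020000"), "little"))
print(hex(neighbor))                     # 0x7fff82d9d660
```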
I'm sharing this information with the hope that it helps to create better tools for disassembling system libraries on macOS Big Sur. It's all very preliminary, so anything I say is subject to revision or correction and cannot be used against me in a court of law or public opinion.