I recently needed to step through some handwritten assembly on my MacBook, and found the setup extremely quirky. There’s little documentation, so I had to read through endless man pages and Stack Overflow discussions to make things work. I’m documenting the caveats for my future self and for anyone who wants to hack on assembly on their Macs.

NOTE: This guide was tested on macOS 14.1.1 running on an Intel CPU. The procedure (as well as the assembly itself) will differ on Macs with Apple silicon, which use a different instruction set.

TL;DR See the article repository

Installing the Necessary Tools

You’ll have to install the Xcode command-line tools by running xcode-select --install in your terminal. It’s entirely possible that this package is already installed on your machine, since lots of programs like Homebrew require it to work.
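
If you’re not sure whether the tools are already there, a quick check along these lines should tell you. This is only a sketch: check_tool is a helper name I made up, and the list is simply the set of programs used later in this post.

```shell
# Report whether a given tool can be found on PATH.
# check_tool is a hypothetical helper, not part of any Apple tooling.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# The assembler, linker, dSYM generator and debugger used in this post.
for tool in as ld dsymutil lldb; do
    check_tool "$tool"
done
```

Running xcode-select -p is another option: it prints the active developer directory when the tools are installed.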

Writing macOS Assembly

I’ll use this ‘hello-world’ program to demonstrate some of macOS’s quirks:

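.intel_syntax noprefix

.section __DATA,__data # .data
# .ascii rather than .asciz, so the size below excludes a trailing NUL byte
message: .ascii "Hello, world!\n"
.set message_size, . - message

.section __TEXT,__text # .text
.global entry
entry:
	# write(1, message, message_size)
	# BSD syscall numbers carry the 0x02000000 class prefix
	mov rax, 0x02000004
	mov rdi, 1
	# RIP-relative load of the message address (position-independent)
	lea rsi, message[rip]
	mov rdx, offset message_size
	syscall
	# exit(0)
	mov rax, 0x02000001
	mov rdi, 0
	syscall

.global _main
_main:
	# entry never returns, since it ends with the exit syscall
	call entry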

First things first, if you prefer AT&T syntax, then I suggest contacting a certified psychiatrist in your area. For anyone else, the correct syntax flavor can be selected using the .intel_syntax directive. The noprefix argument allows using registers without the % prefix.

Mach-O expects different section names than Linux ELF. .text is __TEXT,__text and .data is __DATA,__data. The different section names are documented here. You could also just use the .text and .data directives provided by the assembler if you don’t want to specify full section names.

Recent macOS versions disabled support for 32-bit executables, so you’ll need to write position-independent code. Note the lea rsi, message[rip] instruction, which computes the address relative to the instruction pointer. If you instead try loading the message address using mov, you’ll get an error saying that 32-bit absolute addressing is no longer supported.

Program execution will begin at _main. As far as I can tell, this is the linker’s default entry symbol (ld has an -e flag for choosing a different one), but do your own research. I’ll also address the entry subroutine in a minute.

Building

Here are the necessary build commands in a sample Makefile:

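.PHONY: all clean

# Build the executable and its debug symbols
all: main main.dSYM

clean:
	rm -rf main.o main main.dSYM

# $< is the first prerequisite, $@ is the target
main.o: main.s
	as -g -o $@ $<

main: main.o
	ld -L /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib -lSystem -o $@ $^

main.dSYM: main
	dsymutil $<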

I used the -g assembler flag to generate debug symbols in the object file.

The most annoying part is linking. Apparently, someone at Apple thought that it was not a good idea to add the most core OS library (-lSystem) into the linker’s default search path, so you’ll need to add it manually using the -L flag.
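
If the SDK lives somewhere else on your machine (inside a full Xcode install, for example), you can ask xcrun for its location rather than hard-coding the path. A minimal sketch, assuming xcrun is available: sdk_lib_dir is a hypothetical helper name, and the fallback is simply the path from the Makefile above.

```shell
# Print the directory that should contain the libSystem linker stub.
# sdk_lib_dir is a made-up helper; it falls back to the path used in
# this post when xcrun is not available.
sdk_lib_dir() {
    if command -v xcrun >/dev/null 2>&1; then
        echo "$(xcrun --show-sdk-path)/usr/lib"
    else
        echo "/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib"
    fi
}

# Print the link command so it can be inspected before running it.
echo "ld -L $(sdk_lib_dir) -lSystem -o main main.o"
```

I print the command instead of running it so the resolved path can be checked first; once it looks right, the same expression can go straight into the ld invocation.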

macOS made a peculiar design choice for debugging symbols. Instead of living directly in the executable, they go into a companion dSYM bundle (main.dSYM here, which is actually a directory). This bundle can be generated using the dsymutil tool.

Debugging

LLDB is the system debugger on macOS. It’s very similar to GDB, but its commands are better structured. It’s very well documented here.

lldb

Once again, if you want to use the correct assembly syntax, you’ll need to run settings set target.x86-disassembly-flavor intel. You can do that interactively, or you can write it in .lldbinit. Save this file in the current working directory and load it using --local-lldbinit, or save it under ~ and it will get loaded automatically when you enter the debugger.
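
Creating the project-local file can be scripted. Here’s a small sketch that appends the setting only if it isn’t already present; add_intel_flavor and lldbinit_line are names I picked for illustration.

```shell
# The one setting this post needs in .lldbinit.
lldbinit_line='settings set target.x86-disassembly-flavor intel'

# Append the setting to the given file unless an identical line exists.
# add_intel_flavor is a hypothetical helper name.
add_intel_flavor() {
    grep -qxF "$lldbinit_line" "$1" 2>/dev/null ||
        printf '%s\n' "$lldbinit_line" >> "$1"
}

# Project-local init file in the current working directory.
add_intel_flavor ./.lldbinit
```

After that, start the debugger from the same directory with lldb --local-lldbinit main.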

I’ve noticed that if you set your breakpoint at _main, then LLDB will skip over it entirely. I’m not sure why this happens, but I assume that LLDB is configured to skip over system library code by default. The obvious workaround is to create a separate subroutine and call it from _main. Also, I tried to use main as the subroutine name, but LLDB complained that the breakpoint was ambiguous. I presume this symbol is defined by libc or some other core library.
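
To make the session repeatable, the debugger commands can go into a file that LLDB reads on startup. This is a rough sketch rather than a recipe: entry.lldb is a file name I made up, ./main is assumed to be the binary built by the Makefile above, and exactly where the breakpoint lands inside entry may vary.

```shell
# Write the debugger commands to a file (entry.lldb is a made-up name).
cat > entry.lldb <<'EOF'
settings set target.x86-disassembly-flavor intel
breakpoint set --name entry
run
disassemble
thread step-inst
register read rax rdi rsi rdx
EOF

# Only start the debugger when both lldb and the built binary exist.
if command -v lldb >/dev/null 2>&1 && [ -x ./main ]; then
    lldb --batch --source entry.lldb ./main
fi
```

Dropping --batch leaves you at the LLDB prompt after the commands have run, which is handier for stepping through the rest of the subroutine by hand.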