Dude, where's my main?
So I was writing a debugger. You know the sort of thing: breakpoints,
stepping, checking the value of variables. It was going wonderfully until I
tried to debug my debugger with my debugger. main was just... gone. It
would run, but trying to set a breakpoint on it crashed the program. It took
me weeks to figure out why. This is that story.
# Introduction
The program we're going to use to demonstrate the debugger is a classic C "Hello, world!":
// programs/hello.c
#include <stdio.h>

int main(void) {
    printf("Hello, world!\n");
    return 0;
}
Let's compile this program with debug information and do a few basic
operations in the debugger. The debugger is written in Rust, so we invoke it
through Rust's build tool, cargo:
$ gcc -g programs/hello.c -o programs/hello
$ cargo run -- programs/hello
> symbol main
0x401c60 main
> register rip
rip 0x401b30
> break main
> cont
> register rip
rip 0x401c60
> cont
Hello, world!
Our "Hello, world!" program is initially in a stopped state at some address
that isn't main, so we set a breakpoint on main, continue, and then
verify that we are at the first instruction of main. Then we continue the
program and we see it print out "Hello, world!" as expected. Neat!
This got me thinking: maybe I could debug my debugger in my debugger. That would be awesome.
$ cargo run -- target/debug/rust-debugger
> symbol main
0xd2f90 main
> register rip
rip 0x7fb95aa0c100
> break main
libc err: Input/output error
Oh. What happened?
# Understanding the problem
The error message we get comes from a piece of code in my debugger that takes
any error code we get from a libc function and runs it through strerror. So
which libc function did it come from?
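To make that concrete, here is a minimal sketch of turning a raw error code into text. It is not the debugger's actual code: it uses the standard library's io::Error in place of a direct strerror call, and the helper name describe_errno is made up for illustration. Rust appends its own "(os error N)" suffix to the libc message.

```rust
use std::io;

/// Hypothetical helper: turn a raw errno value into the text the platform
/// associates with it. std asks libc for the message, much like strerror.
fn describe_errno(errno: i32) -> String {
    io::Error::from_raw_os_error(errno).to_string()
}

fn main() {
    // EIO is errno 5 on Linux.
    println!("libc err: {}", describe_errno(5));
}
```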
# ptrace
The only libc function that gets called when setting a breakpoint is ptrace.
ptrace is the system call that debuggers on Linux use to control the
processes they're debugging. With it you can read and write memory, read and
write registers, advance by one instruction, and loads more.
When you set a breakpoint in the debugger, the debugger writes 0xCC to the
address we want to pause execution at. 0xCC is the x86 int3 instruction,
which triggers a software interrupt intended for debuggers. It stops the
debuggee and returns control to the debugger.
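One detail worth sketching: ptrace's PTRACE_PEEKTEXT and PTRACE_POKETEXT requests move a whole machine word at a time (8 bytes on x86-64), so writing a single 0xCC byte means reading the word, swapping its lowest-addressed byte, and writing it back. The snippet below shows only that byte juggling, with no ptrace calls. The function names are mine, not the debugger's, and the sample bytes are the first eight bytes of main from the objdump output later in this post.

```rust
/// Opcode of the x86 `int3` breakpoint instruction.
const INT3: u64 = 0xCC;

/// Replace the lowest-addressed byte of a little-endian word with `int3`.
/// On x86-64 that byte is the least significant byte of the u64.
fn insert_breakpoint(original_word: u64) -> u64 {
    (original_word & !0xFF) | INT3
}

/// Put back the byte that `insert_breakpoint` overwrote.
fn remove_breakpoint(patched_word: u64, original_word: u64) -> u64 {
    (patched_word & !0xFF) | (original_word & 0xFF)
}

fn main() {
    // The word a PTRACE_PEEKTEXT at the breakpoint address might return.
    let original = u64::from_le_bytes([0x48, 0x83, 0xec, 0x18, 0x8a, 0x05, 0x14, 0xe5]);
    let patched = insert_breakpoint(original);
    println!("original: {:#018x}", original);
    println!("patched:  {:#018x}", patched);
    println!("restored: {:#018x}", remove_breakpoint(patched, original));
}
```

A debugger has to remember the original word (or at least the original byte) for every breakpoint, otherwise it can't restore the instruction when the breakpoint is hit or removed.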
The error string "Input/output error" corresponds to the error code EIO.
Checking the man page for ptrace and looking for EIO tells us ptrace will
return EIO when:
Request is invalid, or an attempt was made to read from or write to an invalid area in the tracer's or the tracee's memory, or there was a word-alignment violation, or an invalid signal was specified during a restart request.
We can rule out the request being invalid, as it worked fine in our other program. Same for word-alignment violation. We're not doing anything funky with signals, so that leaves us with a read or write to an invalid area in the debuggee's memory.
Let's look again at the output we got in our debugger:
> symbol main
0xd2f90 main
> register rip
rip 0x7fb95aa0c100
There's a huge difference between the address of the main symbol and the
value of the instruction pointer at the start of the program's execution. The
difference was much smaller when debugging our "Hello, world!" program, so
where does the address for main come from?
# Debug info
There are a couple of ways to get the address of a function in a compiled
program. The method I went with is to use the debugging info compilers
provide you with when you compile with the -g flag.
This debug info is in a format called DWARF. DWARF is a big and complex beast, and with good reason! It has to describe where functions have been inlined, which file each function was defined in, where functions start and end, where to find every local variable in every function, and a whole lot more.
We're only interested in where functions begin. DWARF debug info is in a
tree-like structure, much like code is, and each type of node is identified
with a "tag." The tag that represents a function definition is
DW_TAG_subprogram:
$ readelf --debug-dump programs/hello
[...]
<1><2a>: Abbrev Number: 2 (DW_TAG_subprogram)
<2b> DW_AT_low_pc : 0x401c60
<33> DW_AT_high_pc : 0x11
<39> DW_AT_name : main
[...]
The value of DW_AT_low_pc looks familiar. It's the address we found main
at when we ran this program in our debugger, and that's because this is
exactly where we get it from. Let's check for main in our debugger:
$ readelf --debug-dump target/debug/rust-debugger
[...]
<2><804db>: Abbrev Number: 16 (DW_TAG_subprogram)
<804dc> DW_AT_low_pc : 0xd2f90
<804e4> DW_AT_high_pc : 0x2ef
<804ee> DW_AT_name : main
[...]
That address matches up as well, except this time it's wrong. What's
different about rust-debugger for this to not be where main ends up?
I was stumped by this for a good while before noticing a difference in the
output of readelf -h:
$ readelf -h programs/hello
ELF Header:
Magic: 7f 45 4c 46 02 01 01 03 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - GNU
ABI Version: 0
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x401b30
$ readelf -h target/debug/rust-debugger
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: DYN (Shared object file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x93160
What caught my attention was the Type field. In programs/hello it's a
plain old executable, but in target/debug/rust-debugger it's a shared
object. Aren't shared objects the type for libraries? Why would an executable
be reporting itself as a shared object?
It turns out that this is how "position-independent executables" look. They're executables that have been compiled in such a way that they can live in any part of the address space. This is achieved by making sure there are no absolute memory references. Instead, all memory references are done by making them relative to the current instruction pointer.
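The same check readelf -h does can be sketched in a few lines, which is handy if a debugger wants to know up front whether symbol addresses will need adjusting. This is an illustrative sketch, not code from the debugger: the struct and function names are made up, it only handles little-endian ELF64 files, and it relies on the ELF header layout (e_type at byte offset 16, e_entry at byte offset 24). Note that ET_DYN is also what real shared libraries report, so on its own it doesn't distinguish a PIE from a library.

```rust
use std::env;
use std::fs::File;
use std::io::Read;

const ET_EXEC: u16 = 2; // fixed-address executable
const ET_DYN: u16 = 3; // shared object or position-independent executable

/// Hypothetical summary of the two header fields this post cares about.
#[derive(Debug, PartialEq)]
struct ElfSummary {
    is_dyn: bool,
    entry: u64,
}

/// Pull e_type and e_entry out of the start of a little-endian ELF64 file.
fn summarize_elf64(header: &[u8]) -> Option<ElfSummary> {
    // Magic number, then EI_CLASS == 2 (64-bit) and EI_DATA == 1 (little endian).
    if header.len() < 32 || !header.starts_with(b"\x7fELF") || header[4] != 2 || header[5] != 1 {
        return None;
    }
    let e_type = u16::from_le_bytes([header[16], header[17]]);
    let mut entry = [0u8; 8];
    entry.copy_from_slice(&header[24..32]);
    let entry = u64::from_le_bytes(entry);
    match e_type {
        ET_EXEC => Some(ElfSummary { is_dyn: false, entry }),
        ET_DYN => Some(ElfSummary { is_dyn: true, entry }),
        _ => None,
    }
}

fn main() {
    // Inspect the file given on the command line, or this program itself.
    let path = env::args().nth(1).unwrap_or_else(|| String::from("/proc/self/exe"));
    let mut header = [0u8; 64];
    let result = File::open(&path).and_then(|mut f| f.read_exact(&mut header));
    match result {
        Ok(()) => println!("{}: {:?}", path, summarize_elf64(&header)),
        Err(e) => eprintln!("{}: {}", path, e),
    }
}
```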
This triggered an old memory: is this somehow related to "address space layout randomisation" (ASLR)? I've done some reverse engineering challenges in my time, and know that one of the ways operating systems try to protect programs from attackers is to load them into a random part of the address space. How could I confirm this?
$ cargo run -- target/debug/rust-debugger
> register rip
rip 0x7f11d4997100
$ cargo run -- target/debug/rust-debugger
> register rip
rip 0x7f810cd16100
$ cargo run -- target/debug/rust-debugger
> register rip
rip 0x7fc08a2b8100
Well, that's good enough for me. It looks like it's only random in a portion of the address space. Still, it changes each time. Can we disable this?
It turns out we can! Linux has the concept of "personalities" for processes. From what I can see in the man page this seems to be some sort of compatibility layer for running programs meant for non-Linux systems. Fortunately for us, it also lets us disable ASLR. Children inherit their parent's personality, so we can do the equivalent of the following C snippet in our Rust code:
#include <sys/personality.h>

int main() {
    personality(ADDR_NO_RANDOMIZE);
}
Our child should be loaded at a non-random address. Let's try it!
$ cargo run -- target/debug/rust-debugger
> register rip
rip 0x7ffff7fd3100
$ cargo run -- target/debug/rust-debugger
> register rip
rip 0x7ffff7fd3100
$ cargo run -- target/debug/rust-debugger
> register rip
rip 0x7ffff7fd3100
The randomization is gone, but our process is still loaded at an address that doesn't line up with the addresses in the binary. This isn't the answer to our problem.
# Finding the solution
My search eventually led me to this patch to gdb from 2009. It's a
patch introducing support for position-independent executables. My thinking
was that gdb works on these binaries, so there must be something it does
to correct the symbol addresses.
Combing through the patch, I found a function called
svr4_relocate_main_executable. It has a huge comment above it that explains
some mathematics you can do when the entrypoint of the running process is not
the same as the entrypoint given in the binary on disk. Promising.
However, the comment also explains that this mathematics doesn't work if your binary has an "interp section."
$ readelf -S target/debug/rust-debugger | grep interp
[ 1] .interp PROGBITS 00000000000002e0 000002e0
Rats. I spent some time trying to get it to work despite this, but I got nowhere. I would either get the same error as before, or no error but the breakpoint would never get hit. I assume I was modifying a valid but arbitrary part of memory.
A little more searching later and I stumbled on this StackOverflow answer. At the very end of it the user mentions how GDB knows where the binary has been loaded:
This begs the question: how does GDB discover where ld-linux itself has been loaded. The kernel tells it via the auxiliary vector, AT_BASE entry.
# Auxiliary vector
I'd never heard the term "auxiliary vector" before but at this point I was happy to follow any lead.
When the operating system loads a program from disk into memory and prepares
to execute it, it passes in some key pieces of information. Most people are
familiar with the arguments passed in from the command line (argv) and
environment variables (envp). These values are found on the stack, just
above where the stack begins when execution is passed to main.
Just above argv and envp, though, is our newly discovered auxv. We can
use the environment variable LD_SHOW_AUXV=1 to print the contents of the
auxiliary vector for any program:
$ LD_SHOW_AUXV=1 ls
AT_BASE: 0x7f0790d8f000
AT_FLAGS: 0x0
AT_ENTRY: 0x5641572c6260
AT_PLATFORM: x86_64
[...]
Cargo.lock programs/ target/
Cargo.toml src/
I've trimmed the output for brevity. There's a lot of info in there. The
things that are most interesting to us are the AT_BASE and AT_ENTRY
values. Let's fire up our debugger with this environment variable and see
what it says:
$ LD_SHOW_AUXV=1 cargo run -- target/debug/rust-debugger
AT_BASE: 0x7f80cd45f000
AT_ENTRY: 0x55ba24f62160
> cont
AT_BASE: 0x7ffff7fd1000
AT_ENTRY: 0x5555555d9160
The first auxv values are those of the debugger process, and we have
to continue the debuggee process in order to see its auxiliary vector because
it starts in a paused state. Armed with this, and the knowledge it won't move
around because we've disabled ASLR, we can try setting some breakpoints:
> register rip
rip 0x7ffff7fd3100
> break 5555555d9160
> cont
> register rip
rip 0x5555555d9160
I won't lie, I lost my shit when the breakpoint worked. We're not there yet,
though. Are we actually at main? How would we know?
A function of the debugger I haven't mentioned until now is being able to
disassemble things. If you run disas with no arguments, it disassembles
from the current instruction pointer:
> disas
0x5555555d9160 f30f1efa endbr64
0x5555555d9164 31ed xor ebp,ebp
0x5555555d9166 4989d1 mov r9,rdx
0x5555555d9169 5e pop rsi
0x5555555d916a 4889e2 mov rdx,rsp
0x5555555d916d 4883e4f0 and rsp,0FFFFFFFFFFFFFFF0h
[...]
We can use the objdump command to see if this matches up with our main
function:
$ objdump -d target/debug/rust-debugger | less
00000000000b0a60 <main>:
b0a60: 48 83 ec 18 sub $0x18,%rsp
b0a64: 8a 05 14 e5 48 00 mov 0x48e514(%rip),%al
b0a6a: 48 63 cf movslq %edi,%rcx
b0a6d: 48 8d 3d ac f8 ff ff lea -0x754(%rip),%rdi
b0a74: 48 89 74 24 10 mov %rsi,0x10(%rsp)
b0a79: 48 89 ce mov %rcx,%rsi
b0a7c: 48 8b 54 24 10 mov 0x10(%rsp),%rdx
b0a81: 88 44 24 0f mov %al,0xf(%rsp)
b0a85: e8 d6 6d ff ff callq a7860 <_ZN3std2rt10lang_start17h004f6ce3435c8ab1E>
b0a8a: 48 83 c4 18 add $0x18,%rsp
b0a8e: c3 retq
What?! They don't match. How can this be? I was so sure.
It took me a few minutes to calm down from the shock before I remembered that
programs don't actually start at main. There's a little bit of code that
runs before main called _start:
$ objdump -d target/debug/rust-debugger | less
0000000000085160 <_start>:
85160: f3 0f 1e fa endbr64
85164: 31 ed xor %ebp,%ebp
85166: 49 89 d1 mov %rdx,%r9
85169: 5e pop %rsi
8516a: 48 89 e2 mov %rsp,%rdx
8516d: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
[...]
It matches! Yes!
Instruction-pointer-relative addresses only work if the whole executable
gets shifted by the same amount, so we can do a little bit of mathematics
here to figure out what the address of main should be:
0x5555555d9160 - 0x85160 + 0xb0a60 = 0x555555604a60
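Written as code, that sum is a one-liner. The sketch below is illustrative (the function name relocate is made up, not taken from the debugger): it works out how far the executable was shifted from the two entry points, then applies the same shift to any address read from the file.

```rust
/// Translate an address from the file on disk into the running process.
///
/// `file_entry` is the entry point recorded in the binary (the address of
/// _start) and `runtime_entry` is where that same code ended up in memory.
fn relocate(file_addr: u64, file_entry: u64, runtime_entry: u64) -> u64 {
    // How far the whole executable was shifted. For a non-PIE binary the two
    // entry points are equal and the shift is zero. Wrapping arithmetic keeps
    // odd inputs from panicking in debug builds.
    let shift = runtime_entry.wrapping_sub(file_entry);
    file_addr.wrapping_add(shift)
}

fn main() {
    let runtime_entry = 0x5555555d9160; // AT_ENTRY of the debuggee
    let file_entry = 0x85160; // _start in the objdump output
    let main_in_file = 0xb0a60; // main in the objdump output
    // prints 0x555555604a60
    println!("{:#x}", relocate(main_in_file, file_entry, runtime_entry));
}
```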
Let's try it:
> break 555555604a60
> cont
> register rip
rip 0x555555604a60
> disas
0x555555604a60 4883ec18 sub rsp,18h
0x555555604a64 8a0514e54800 mov al,[rel 555555A92F7Eh]
0x555555604a6a 4863cf movsxd rcx,edi
0x555555604a6d 488d3dacf8ffff lea rdi,[rel 555555604320h]
0x555555604a74 4889742410 mov [rsp+10h],rsi
0x555555604a79 4889ce mov rsi,rcx
0x555555604a7c 488b542410 mov rdx,[rsp+10h]
0x555555604a81 8844240f mov [rsp+0Fh],al
0x555555604a85 e8d66dffff call 00005555555FB860h
0x555555604a8a 4883c418 add rsp,18h
0x555555604a8e c3 ret
We did it! \o/
# Wrapping up
I learned a lot on this journey. I learned that Linux processes have "personalities" that modify their runtime environment. I learned that executables can be position-independent just like libraries, and this is part of how address space layout randomisation works.
Last and certainly not least, I learned that Linux tells a program things
about its runtime environment through a thing called an "auxiliary vector,"
which lives on the stack just above the environment variables, and that's how
gdb knows where to find main in position-independent executables.
Now that I know all of this, I can go ahead and fix my breakpoint code so that it works with position-independent executables.
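For a flavour of what such a fix might involve, here is a rough sketch and not necessarily what my commits do. Rather than reading AT_ENTRY off the terminal, a debugger can read the raw auxiliary vector of a process from /proc/<pid>/auxv. On x86-64 it's a list of (key, value) pairs of 64-bit integers ending in an AT_NULL key. The function name auxv_lookup is made up, and the demo reads /proc/self/auxv purely so it can run on its own; a debugger would use the debuggee's pid instead.

```rust
use std::fs;

// Keys from <elf.h>.
const AT_NULL: u64 = 0; // end of the vector
const AT_BASE: u64 = 7; // base address of the program interpreter
const AT_ENTRY: u64 = 9; // entry point of the program

/// Find one entry in a raw auxiliary vector made of native-endian u64 pairs.
fn auxv_lookup(raw: &[u8], wanted: u64) -> Option<u64> {
    for pair in raw.chunks_exact(16) {
        let mut key = [0u8; 8];
        let mut value = [0u8; 8];
        key.copy_from_slice(&pair[..8]);
        value.copy_from_slice(&pair[8..]);
        let key = u64::from_ne_bytes(key);
        if key == AT_NULL {
            break; // terminator: nothing after this is part of the vector
        }
        if key == wanted {
            return Some(u64::from_ne_bytes(value));
        }
    }
    None
}

fn main() {
    // A debugger would read /proc/<debuggee pid>/auxv here.
    match fs::read("/proc/self/auxv") {
        Ok(raw) => {
            println!("AT_BASE:  {:x?}", auxv_lookup(&raw, AT_BASE));
            println!("AT_ENTRY: {:x?}", auxv_lookup(&raw, AT_ENTRY));
        }
        Err(e) => eprintln!("could not read auxv: {}", e),
    }
}
```

Subtracting the entry point stored in the ELF header from AT_ENTRY gives the amount the executable was shifted by, which can then be added to every address taken from the debug info.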
# Appendix
The code for the debugger is available on GitHub, and if you're interested in the exact state things were in when I wrote this article you can check it out like so:
$ git clone https://github.com/samwho/rust-debugger
$ cd rust-debugger
$ git checkout 00bcdd4772496913d1417feb7f8886d40aed46d5
The commits I made that fixed our debugger for position-independent executables are here and here.