_by abadidea_ - (this is now also on the corporate blog with an expanded introduction: () )

# No source code? _No problem!_ #

This is aimed at beginners in static analysis. The binary we examine is non-malicious and non-obfuscated, and is not run through the highest optimization settings of the compiler. We will start at line one and proceed linearly, just to get a feel for how to read decompiled code.

For this tutorial you will need a strong knowledge of C, only the slightest familiarity with assembly, the ability to understand Unix man pages, and (), which is $29. It is only for OSX, but it is literally cheaper to buy a Mac Mini and Hopper than the (naturally more mature and well-featured) x86 Hex-Rays decompiler. Heck, you could get an iMac and still have change for a coffee.

Hopper is a disassembler with a very-close-to-C "pseudocode" decompiler that does not roundtrip with your C compiler but is quite good for examining other people's binaries. It supports both 32-bit and 64-bit executables for Windows and OSX (no Linux or iOS/ARM support yet). At the time of writing, the newest version is 2.2.0; the App Store is still holding it hostage for review, but you can also buy the app directly from the creator. It is still under active development, and (like every other decompiler) is not perfect, so learning how to spot when it goes awry is an important facet of making use of it.

(I am not affiliated with Hopper or its creator nor am I receiving a handsome sum to promote it. Honest.)

We will be looking at an ancient piece of C code for Unix called metamail, which was a mimetype helper that received base64-encoded attachments from email clients and opened the appropriate viewing program. Having reviewed it before, I strongly recommend that you do not use it on a production system (not that there is any use for it in this millennium). The source code is (), but it is so old that I had to change a few functions to return 0 instead of void to get llvm to compile it for OSX. I also ran the `strip` utility to remove any debug symbol names of internal functions that may be there, which would make it too easy (you will almost never see these in distributed closed-source programs).

When you open metamail with the "Read Executable" button in the upper left, you will be dumped here:

![Entrypoint]()

This is the prologue before we reach C's `main`. Skip down to the first `call` instruction. Place your cursor over the `sub_1000018b0` symbol - the address of the first function called, which presumably is `main` - and press enter to jump to it.

We are now faced with a wall of inscrutable assembly, but there is no need to panic. In the upper right hand corner of Hopper is the magical Pseudo Code button. It will pop up a C-like reconstruction of the function. External library functions will show their names just like in source code, and are your waypoints that will make it much easier to understand what's going on. Internal functions will be named "sub" for "subroutine" followed by their address. Variables are "var" followed by a number. Hopper does not attempt to do much type guessing, nor does it completely write out the use of registers, although it collapses as much as it can manage.

In a 32-bit program you will see `eax, ebx, ecx, edx, esi, edi, ebp, esp`, and in a 64-bit one you will also see `rax, rbx, rcx, rdx, rsi, rdi, rbp, rsp` and some unnamed registers that look like `r` followed by a number (`r8` through `r15`). In optimized code, you may also see partial registers `ax, bx, cx, dx` (16 bits) and `ah/al, bh/bl, ch/cl, dh/dl` (8 bits). If you have never looked at assembly before, just know that the registers are named storage locations inside the processor itself, and compilers try to place variables into registers as much as possible because it is much faster than reading and writing RAM. `esp/rsp` refer to the top of the stack, `ebp/rbp` to the base of the current stack frame, and `eax/rax` is normally where a function's return value is received.

In the disassembly view of a function, you will normally see it move the stack pointer. This creates the storage area for the local variables. Rewriting these from offsets relative to the stack base into names like `var_40` is a nice thing that the decompiler view does for us.

The decompiler view does not support in-place text editing, and disassembly-view comments do not carry over to the decompilation view, so in the following screenshots I have copied the decompiled C to a code editor for annotating with explanatory comments. (You can also directly rename symbols in the disassembler view with 'n' when you have an idea what they're for, but we won't be doing that, to keep the symbol names consistent across screenshots.)

Let's focus on this block for now (first screenshot is of the disassembly, second is of the decompilation):

![Main disasm, first part]()

![Main lines 1-27]()

Note that `var_144` is set to a function pointer.
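If function pointers are rusty, here is a minimal sketch of what a local like `var_144` holding a function's address looks like back in C. Everything in it (`handler_fn`, `double_it`, `negate_it`, `choose_handler`, `apply_handler`) is a made-up name for illustration and has nothing to do with metamail's real source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: a pointer to a function taking and returning int. */
typedef int (*handler_fn)(int);

/* Two hypothetical handlers a program might choose between. */
int double_it(int x) { return x * 2; }
int negate_it(int x) { return -x; }

/* Store the address of a function in a variable, the way a decompiled
   local such as var_144 would be assigned a function's address. */
handler_fn choose_handler(int want_double) {
    return want_double ? double_it : negate_it;
}

/* Call through the pointer. The callee is not named at the call site,
   so you have to trace where the pointer was set to know what runs. */
int apply_handler(handler_fn fn, int value) {
    assert(fn != NULL);
    return fn(value);
}

/* Small self-check showing how the pieces fit together. */
void check_handlers(void) {
    handler_fn var_144 = choose_handler(1);
    assert(apply_handler(var_144, 21) == 42);
}
```

The point to take away for reading decompiled code: when a variable is assigned the address of a function, look for where that variable is later called or passed along, because that is where the function actually gets used.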
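The partial register names mentioned above (`eax`, `ax`, `ah`, `al`) are not separate storage; they are narrower views of the same register. A rough way to picture this in C, using plain masks and shifts on an integer rather than real registers (the function names are invented for this sketch):

```c
#include <assert.h>
#include <stdint.h>

/* eax is the low 32 bits of rax. */
uint32_t view_eax(uint64_t rax) { return (uint32_t)(rax & 0xFFFFFFFFu); }

/* ax is the low 16 bits of eax. */
uint16_t view_ax(uint32_t eax) { return (uint16_t)(eax & 0xFFFFu); }

/* ah is the high byte of ax (bits 8 to 15). */
uint8_t view_ah(uint16_t ax) { return (uint8_t)(ax >> 8); }

/* al is the low byte of ax (bits 0 to 7). */
uint8_t view_al(uint16_t ax) { return (uint8_t)(ax & 0xFFu); }

/* Small self-check: peel one 64-bit value down to its byte-sized views. */
void check_register_views(void) {
    uint64_t rax = 0x1122334455667788ULL;
    uint32_t eax = view_eax(rax);
    uint16_t ax = view_ax(eax);
    assert(eax == 0x55667788u);
    assert(ax == 0x7788u);
    assert(view_ah(ax) == 0x77u);
    assert(view_al(ax) == 0x88u);
}
```

So when optimized code writes to `al` and later reads `eax`, it is touching the same storage location at two different widths.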
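Because internal functions are named "sub" followed by their address, a symbol like `sub_1000018b0` tells you exactly where the function lives. As a small sketch of that convention, the hypothetical helper below (`parse_sub_address` is not part of Hopper or any library) recovers the address from such a name, assuming the digits after `sub_` are hexadecimal, which the letter `b` in the example suggests.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: if `symbol` looks like "sub_<hex digits>", store the
   address in *address and return 1; otherwise return 0. */
int parse_sub_address(const char *symbol, unsigned long long *address) {
    const char *prefix = "sub_";
    size_t prefix_len = strlen(prefix);
    char *end = NULL;

    assert(symbol != NULL && address != NULL);

    /* Must start with "sub_" and continue with at least one hex digit. */
    if (strncmp(symbol, prefix, prefix_len) != 0 ||
        !isxdigit((unsigned char)symbol[prefix_len])) {
        return 0;
    }

    errno = 0;
    unsigned long long value = strtoull(symbol + prefix_len, &end, 16);

    /* Reject trailing junk and values too large to represent. */
    if (*end != '\0' || errno != 0) {
        return 0;
    }

    *address = value;
    return 1;
}
```

Names that do not match, such as an external library function like `printf`, make the helper return 0, which mirrors the distinction drawn above between named external functions and anonymous internal ones.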