Note: This thread took a lot of time and effort for me to write. If you do have some constructive feedback, feel free to leave it, everyone makes mistakes. But if you could please leave out negative, nonconstructive comments it would be appreciated as again, I put time and effort into explaining everything to help those who wish to learn. Thanks c:
Another Note: There are two different syntaxes, Intel syntax and AT&T syntax. We're using Intel. An example of the difference is the very common "mov" instruction. In Intel syntax, the usage is mov destination, source. In AT&T syntax, it's the opposite (mov source, destination). So in this thread if you see "mov ebp, esp", we're moving esp into ebp, not ebp into esp.
Introduction
Why did I write this thread? Because I haven't written a thread in a very long time and I thought this might help some people in the community. Anyhow, not too long ago I looked into x86 assembly and disassembled some C code (you can do this in the Visual Studio IDE, or via IDA Pro otherwise). There's some interesting stuff going on behind that high-level C code, and understanding some lower level assembly allows you to understand what's actually going on behind your program, which will help you in debugging and coding in general (I know it did for me). Note this won't be much help to you if you're looking into PS3 development as the PS3 runs on PowerPC (PPC), not Intel x86.
Endianness
I won't be covering basic binary in this tutorial (I might go into it in a future tutorial if requested, but there are most likely plenty of resources out there already), but endianness is something that is important. Big endian is typically used for network protocols (it's often called network byte order), whereas the Intel x86 processor is little endian, so little endian is what we'll be using.
Side Note
You can actually test if your machine is little endian or big endian with a few lines of C or C++ code.
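Here's a minimal sketch of such a check in plain C. The function name is_little_endian is just something I made up for illustration; it stores a known 32-bit value and looks at which byte ended up at the lowest address.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little endian machine and 0 on a big endian one.
   (is_little_endian is a hypothetical helper name, not a standard function.) */
int is_little_endian(void)
{
    uint32_t value = 1;          /* bytes are 00 00 00 01 in "written" order */
    unsigned char first_byte;

    /* Copy only the byte stored at the lowest address of value. */
    memcpy(&first_byte, &value, 1);

    /* Little endian keeps the least significant byte (0x01) first. */
    return first_byte == 1;
}
```

On an Intel x86 machine this should return 1, since x86 is little endian.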
What endianness deals with is where the most significant byte of a multi-byte value is stored in memory. On big endian machines, the most significant byte is stored at the lowest memory address and the least significant byte at the highest memory address. Conversely, on little endian machines, the most significant byte is stored at the highest memory address and the least significant byte at the lowest address. The order of the bits inside each byte doesn't change.
What is a significant bit (or byte)?
If we think about this in real life terms, if you go to a retail store and the total change you should receive is $56.12, the 5 is the most significant digit. Think of it this way: if the amount you received differed from the amount you should have received by only one digit, would you rather it be the two, the one, the six, or the five? If the 6 were changed to a 2, you'd lose $4, but if the 5 were changed to a 2, you'd lose $30. You'd much rather the 1 or the 2 be different than the 6 or the 5 if you had to choose. The same idea applies to the bits inside a byte and the bytes inside a larger value.
As a little example of how the most and least significant bytes are stored in memory, take the 32-bit value 0x12345678. On a little endian machine it is stored as the bytes 78 56 34 12 going from the lowest address to the highest. On a big endian machine it is stored as 12 34 56 78.
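To make the byte order concrete, here's a small C sketch that writes a 32-bit value into a 4-byte buffer in each order by hand, so it behaves the same no matter what machine you run it on. The function names are my own, chosen only for this illustration.

```c
#include <stdint.h>

/* Little endian layout: least significant byte at the lowest index. */
void store_little_endian(uint32_t value, unsigned char out[4])
{
    out[0] = (unsigned char)(value & 0xFF);
    out[1] = (unsigned char)((value >> 8) & 0xFF);
    out[2] = (unsigned char)((value >> 16) & 0xFF);
    out[3] = (unsigned char)((value >> 24) & 0xFF);
}

/* Big endian layout: most significant byte at the lowest index. */
void store_big_endian(uint32_t value, unsigned char out[4])
{
    out[0] = (unsigned char)((value >> 24) & 0xFF);
    out[1] = (unsigned char)((value >> 16) & 0xFF);
    out[2] = (unsigned char)((value >> 8) & 0xFF);
    out[3] = (unsigned char)(value & 0xFF);
}
```

Calling both with 0x12345678 fills the buffers with 78 56 34 12 and 12 34 56 78 respectively.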
Signed and Unsigned
Quite simply, signed integers are integers that can store values in the negative range as well as the positive range. Unsigned integers are integers that can only store zero and positive values, which excludes the negative range and therefore allows a greater positive range in the same number of bits. An easy way to think of this is that negative numbers have a "sign" (-), whereas positive numbers (although they can be written with one) typically do not (eg. you don't usually see +45, you just see 45 written).
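The bits in memory are the same either way; signed or unsigned only changes how they're interpreted. Here's a quick C sketch of that idea, assuming 32-bit integers. The helper name as_signed is my own.

```c
#include <stdint.h>
#include <string.h>

/* Reinterprets the same 32 bits as a signed value.
   (as_signed is a hypothetical helper name used for illustration.) */
int32_t as_signed(uint32_t bits)
{
    int32_t result;

    /* Copy the raw bytes across without any numeric conversion. */
    memcpy(&result, &bits, sizeof result);
    return result;
}
```

For example, the bit pattern 0xFFFFFFFF is 4294967295 as an unsigned integer but -1 as a signed one, while small values like 45 read the same both ways.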
Registers
Your CPU (Central Processing Unit) has registers, or spaces for data to be stored. There are conventions for these registers that compilers use, as well as those who wish to hand-code assembly. Below are some of the most common register conventions and their basic descriptions.
Register | Description
EAX | Stores subroutine/function return values, you'll see this used right before stack deconstruction and return
EBX | The base pointer to data
ECX | Counter register, typically used for iteration/loops
EDX | IO (Input/Output) pointer
EDI | Destination pointer, used typically for strings
ESI | Source pointer, used typically for strings
ESP | Stack pointer (referring to the memory stack), you'll see this at the top of every subroutine/function
EBP | Stack base pointer, you'll see esp get moved into this register a fair bit, near the top of every subroutine/function
EIP | Pointer to the next instruction
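Now this can get confusing, but registers are classified as caller-save registers and callee-save registers. Caller-save registers (eax, ecx, and edx) may be overwritten by any function you call, so the caller has to save them first if it still needs their values. Callee-save registers (ebx, esi, edi, and ebp) must be preserved by the function being called, so if it uses them it saves them on entry and restores them before returning. A related idea is the calling convention, which decides who cleans up the stack. C typically uses something called cdecl (short for C declaration), and the Win32 API typically uses stdcall (short for standard call). In cdecl (what we're dealing with), the caller is responsible for cleaning up the stack (removing the arguments it pushed). In stdcall, the callee is responsible for cleaning up the stack.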
The Stack
The memory stack is a section of memory where your program stores its local data (variables, pointers, return addresses, etc). Much like a real stack, data is stacked on top of each other and torn down piece by piece (hence the name): the last item pushed is the first one popped. On x86 the stack grows toward lower addresses, so pushing data decreases esp and popping increases it. I won't go into this too much.
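If it helps, here's a toy model of push and pop in C. It is only a sketch: the ToyStack type and function names are made up, the "stack" is a 64-byte array, and there is no overflow checking. What it does show is the stack growing downward, with esp moving by 4 for every 32-bit value.

```c
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64

/* Hypothetical toy stack: esp is an offset into memory, starting at the top. */
typedef struct {
    unsigned char memory[STACK_BYTES];
    uint32_t esp;
} ToyStack;

void toy_init(ToyStack *s)
{
    s->esp = STACK_BYTES;   /* empty stack: esp sits just past the highest byte */
}

/* push: move esp down by 4, then store the value at the new top. */
void toy_push(ToyStack *s, uint32_t value)
{
    s->esp -= 4;
    memcpy(&s->memory[s->esp], &value, 4);
}

/* pop: read the value at the top, then move esp back up by 4. */
uint32_t toy_pop(ToyStack *s)
{
    uint32_t value;
    memcpy(&value, &s->memory[s->esp], 4);
    s->esp += 4;
    return value;
}
```

Pushing two values and popping twice returns them in reverse order, and esp ends up exactly where it started.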
R/M32 Address Forms
This allows you to move register to memory or memory to register, rather than just the conventional register to register. However, it is NOT possible to move memory to memory in a single mov. Below are a few examples of r/m32 forms in practice;
mov eax, [ebx]
mov eax, [ebx + ecx * X] (X can only be 1, 2, 4, or 8)
mov eax, [ebx + ecx * X + Y] (X can only be 1, 2, 4, or 8 // Y is a signed constant of either one byte, -128 to 127, or four bytes, -2^31 to 2^31 - 1)
The most complex form you'll see of this is [base + index * scale + displacement]. These are typically used for arrays, where base = address of the array, index = array index, scale = size of each element, and displacement = a constant offset (for example, a field inside a struct).
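Here's a small C sketch of that address calculation, checked against what the compiler does for a normal array access. The function name effective_address is my own, and it assumes a flat address space where pointers convert cleanly to integers (true on the usual desktop platforms).

```c
#include <stddef.h>
#include <stdint.h>

/* Computes base + index * scale + displacement, the same arithmetic the CPU
   performs for an operand like [ebx + ecx * 4 + 8].
   (effective_address is a hypothetical helper name.) */
uintptr_t effective_address(uintptr_t base, uintptr_t index,
                            uintptr_t scale, intptr_t displacement)
{
    return base + index * scale + (uintptr_t)displacement;
}
```

For an int array called values, effective_address((uintptr_t)values, 3, sizeof(int), 0) gives the same address as &values[3]. With an array of structs, the displacement picks out a field inside the element.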
LEA - Load Effective Address
Load effective address, written as LEA, uses the r/m32 form but for a different purpose. Where square brackets [] in a mov mean "access the memory at this address", in lea they do NOT dereference anything. lea just computes the address and stores that number in the destination, which makes it handy for pointer arithmetic. As an example, say ebx = 0x2, and edx = 0x1000. If we used "lea eax, [edx + ebx * 2]" as an instruction, eax would be set to 0x1004, NOT the value at 0x1004.
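To show the difference between lea and mov with the same bracketed operand, here's a toy C sketch. Everything in it is made up for illustration: toy_memory pretends to be four dwords living at "address" 0x1000, and there is no bounds checking.

```c
#include <stdint.h>

#define TOY_MEMORY_START 0x1000u

/* Pretend memory: toy_memory[i] lives at address TOY_MEMORY_START + 4 * i. */
static const uint32_t toy_memory[4] = {
    0x11111111u, 0xDEADBEEFu, 0x33333333u, 0x44444444u
};

/* lea dest, [base + index * scale]: only the address is computed. */
uint32_t toy_lea(uint32_t base, uint32_t index, uint32_t scale)
{
    return base + index * scale;
}

/* mov dest, [base + index * scale]: compute the address, then read memory there. */
uint32_t toy_mov(uint32_t base, uint32_t index, uint32_t scale)
{
    uint32_t address = base + index * scale;
    return toy_memory[(address - TOY_MEMORY_START) / 4];
}
```

With base = 0x1000, index = 0x2 and scale = 2, toy_lea returns the address 0x1004, while toy_mov returns whatever is stored at 0x1004 (0xDEADBEEF in this pretend memory).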
Some basic flags
*Note this table only lists some flags, not all possible flags.
Flag | Description
ZF (ZeroFlag) | Set if the result of the instruction is 0
SF (SignFlag) | Set to the most significant bit of the result, so one if the result is negative and zero if it is positive
CF (CarryFlag) | Set if the arithmetic carries out of (or borrows into) the most significant bit, i.e. an n-bit result would need the (n+1)th bit
PF (ParityFlag) | Set if the lowest byte of the result contains an even number of one bits
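As a rough sketch of how these flags come out of an instruction, here's a C function that models a 32-bit "add" and works out ZF, SF, CF and PF by hand. The struct and function names are my own, and PF is computed the way the real CPU defines it (even number of one bits in the lowest byte of the result).

```c
#include <stdint.h>

/* Hypothetical container for the result of an add and the flags it sets. */
typedef struct {
    uint32_t result;
    int zf, sf, cf, pf;
} AddFlags;

AddFlags add_with_flags(uint32_t a, uint32_t b)
{
    AddFlags f;
    uint32_t low_byte;
    int ones = 0;

    f.result = a + b;                 /* wraps around at 32 bits, like the register */
    f.zf = (f.result == 0);           /* zero flag: result is exactly 0 */
    f.sf = (int)(f.result >> 31);     /* sign flag: copy of the top bit */
    f.cf = (f.result < a);            /* carry flag: the sum wrapped past 32 bits */

    /* parity flag: count the one bits in the lowest byte only */
    for (low_byte = f.result & 0xFF; low_byte != 0; low_byte >>= 1)
        ones += (int)(low_byte & 1);
    f.pf = (ones % 2 == 0);

    return f;
}
```

Adding 0xFFFFFFFF and 1 is a nice test case: the result wraps to 0, so both ZF and CF are set.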
Debugging in visual studio
Debugging is important, but since this guide is not on debugging I won't go into it extensively. The three main functions you need to know are step into, step over, and step out of.
Step into: Executes the next instruction or statement, and if it is a function call, "steps into" that function so you can follow it line by line.
Step over: Executes the next instruction or statement, but if it is a function call, runs the whole call in one go ("steps over" it) instead of entering it. Nothing is skipped, the code still runs.
Step out of: Runs the rest of the current function and stops right after it returns to its caller.
Another important part of debugging is being able to set breakpoints, and these are very simple to do. Different IDEs may have different methods for doing this, but in Visual Studio you just need to left click in the margin to the left of the line that you wish to break at.
Disassembling C code in Visual Studio
To disassemble C code, set a breakpoint (I prefer right at the main function header), start debugging, and right click on the code view once the breakpoint is hit. In the context menu, click "Go To Disassembly". You'll see a lot of assembly, some of which you probably won't see in this thread. This is because of some things Visual Studio adds to your code, such as runtime sanity checks. You can turn these off in the project settings if you wish, which makes your assembly simpler.
Disassembling C code in IDA Pro
Disassembling code in IDA Pro is fairly straightforward as well. First you must download and install IDA Pro Freeware (non-commercial). When you open it you'll get a dialogue with buttons "New" and "Go", and you can choose either. I'm going to go with "Go". Now you should be able to drag your program's .exe executable into the window and it should disassemble it. If not, just go to File -> Open and browse for your file.
A basic C -> ASM example - calculation function
Below is a basic C program that takes two given numbers (from execution arguments) and multiplies them together, aka a 2D area calculator using X and Y. Below that is its equivalent x86 assembly. This is with optimizations turned off and most sanity checks disabled, so my example contains just the basic assembly (you'll still spot one _RTC_CheckEsp call near the end).
C Code:
#include <stdlib.h>

int calcArea(int x, int y)
{
    return x * y;
}

int main(int argC, char **argv)
{
    int x;
    int y;

    // atoi == string to integer, we'll consider this a blackbox function, we don't care right now how it works
    // note: argv[0] is normally the program's own name, the first real argument is argv[1]
    x = atoi(argv[0]);
    y = atoi(argv[1]);
    return calcArea(x, y);
}
x86 Assembly Code:
calcArea:
00AF1000 push ebp
00AF1001 mov ebp,esp
00AF1003 mov eax,dword ptr [x]
00AF1006 imul eax,dword ptr [y]
00AF100A pop ebp
00AF100B ret
main:
00AF1010 push ebp
00AF1011 mov ebp,esp
00AF1013 sub esp,8
00AF1016 mov dword ptr [ebp-8],0CCCCCCCCh
00AF101D mov dword ptr [ebp-4],0CCCCCCCCh
00AF1024 mov eax,dword ptr [argv]
00AF1027 mov ecx,dword ptr [eax]
00AF1029 push ecx
00AF102A call atoi (0AF10B4h)
00AF102F add esp,4
00AF1032 mov dword ptr [x],eax
00AF1035 mov edx,dword ptr [argv]
00AF1038 mov eax,dword ptr [edx+4]
00AF103B push eax
00AF103C call atoi (0AF10B4h)
00AF1041 add esp,4
00AF1044 mov dword ptr [y],eax
00AF1047 mov ecx,dword ptr [y]
00AF104A push ecx
00AF104B mov edx,dword ptr [x]
00AF104E push edx
00AF104F call calcArea (0AF1000h)
00AF1054 add esp,8
00AF1057 add esp,8
00AF105A cmp ebp,esp
00AF105C call _RTC_CheckEsp (0AF10C0h)
00AF1061 mov esp,ebp
00AF1063 pop ebp
00AF1064 ret
For the first example I will break this down line by line. The rest get a shorter breakdown as I don't want this thread to be too long.
calcArea:
00AF1000 push ebp
00AF1001 mov ebp,esp
You'll see these instructions at the top of functions. The first pushes the caller's ebp value onto the stack, and the second moves the stack pointer (esp) into the ebp register. This saves the previous ebp on the stack before it's overwritten, and gives the function a fixed base (ebp) from which to address its arguments and local variables.
00AF1003 mov eax,dword ptr [x]
00AF1006 imul eax,dword ptr [y]
This moves the variable x into eax. The next instruction, "imul" (we haven't talked about this yet), is a signed multiply: it multiplies eax by y and stores the result in eax. Remember, eax is conventionally the return register. Think of this in terms of high level code "eax *= y" or "eax = eax * y".
00AF100A pop ebp
00AF100B ret
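These instructions clean up a bit. "pop ebp" pops the saved base pointer off the top of the stack and stores it back into the ebp register, which can be thought of as an "undo" of the 'push ebp; mov ebp, esp' instructions. "ret" then pops the return address off the stack into EIP, so execution continues right after the call in main.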
main:
00AF1010 push ebp
00AF1011 mov ebp,esp
Already covered but again, this pushes ebp onto the stack and moves the stack pointer into ebp.
00AF1013 sub esp,8
What we do here is subtract 8 from the stack pointer. This allocates 8 bytes of memory on the stack for variables x and y, 4 bytes to each variable.
00AF1016 mov dword ptr [ebp-8],0CCCCCCCCh
00AF101D mov dword ptr [ebp-4],0CCCCCCCCh
00AF1024 mov eax,dword ptr [argv]
00AF1027 mov ecx,dword ptr [eax]
This part may seem difficult, but what we're trying to do is set up our values. We're storing 0xCCCCCCCC (3435973836 in decimal) in each variable, which is the pattern Visual Studio's debug builds use to mark uninitialized stack memory. We then load argv (a pointer to the array of argument strings) into eax, and load the first entry of that array, argv[0], from [eax] into ecx for our atoi function.
00AF1029 push ecx
00AF102A call atoi (0AF10B4h)
Pushes ecx (the pointer to the argv[0] string) onto the stack as the argument, then calls the atoi function, which converts that string to an integer and leaves the result in eax.
00AF102F add esp,4
We no longer need the argument we pushed for atoi, so we remove it from the stack by adding 4 to the stack pointer (remember, in cdecl the caller cleans up). Basically anything below the stack pointer is considered undefined, and you should NEVER try to access memory below the stack pointer.
00AF1032 mov dword ptr [x],eax
00AF1035 mov edx,dword ptr [argv]
00AF1038 mov eax,dword ptr [edx+4]
00AF103B push eax
00AF103C call atoi (0AF10B4h)
Here we move eax (the return value of atoi) into x, move argv into the edx register, and load the value at edx + 4 into the eax register. Why? Because [edx + 4] is argv[1], as each pointer in the argv array takes up 4 bytes. We then push eax onto the stack and call our atoi function again.
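If the "+ 4" feels like magic, here's a C sketch that fetches an argument the same way the assembly does, by stepping forward a whole number of pointer sizes in bytes. The helper name nth_argument is made up. On 32-bit x86 a pointer is 4 bytes, which is where the 4 comes from; on a 64-bit build the step would be 8.

```c
#include <stddef.h>

/* Returns argv[n] by doing the byte arithmetic manually:
   start at argv and move n * sizeof(char *) bytes forward.
   (nth_argument is a hypothetical helper name.) */
char *nth_argument(char **argv, size_t n)
{
    unsigned char *raw = (unsigned char *)argv + n * sizeof(char *);

    /* Read the pointer stored at that byte offset. */
    return *(char **)raw;
}
```

nth_argument(argv, 1) returns exactly the same pointer as argv[1]; the array syntax is just a nicer way of writing the offset.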
00AF1041 add esp,4
00AF1044 mov dword ptr [y],eax
00AF1047 mov ecx,dword ptr [y]
00AF104A push ecx
00AF104B mov edx,dword ptr [x]
00AF104E push edx
Again we remove the atoi argument by moving the stack pointer, and we also move eax into "y". The compiler then moves "y" into ecx, pushes it onto the stack, and moves "x" into edx, and similarly pushes it onto the stack as well. Notice the arguments are pushed right to left, so "y" goes first and "x" ends up on top.
00AF104F call calcArea (0AF1000h)
00AF1054 add esp,8
00AF1057 add esp,8
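Here we call our written "calcArea" subroutine and then add to the stack pointer twice. The first "add esp,8" removes the two arguments we pushed for calcArea, and the second releases the 8 bytes we reserved for x and y at the top of main.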
00AF105A cmp ebp,esp
00AF105C call _RTC_CheckEsp (0AF10C0h)
We don't really care about these instructions as the compiler generates them because we didn't turn off runtime checks, but what this basically does is check that esp (the stack pointer) is back where it was when the function started, i.e. that the stack is balanced.
Notice that nothing needs to be done with the result of calcArea. It came back in the eax (return) register, and since main returns that same value, eax is simply left alone until the ret.
00AF1061 mov esp,ebp
00AF1063 pop ebp
00AF1064 ret
Again like with our calcArea() function, we restore the caller's ebp by popping it off the top of the stack, but first we move the base pointer back into the stack pointer, which releases anything still allocated on the stack. You'll see this at the end of most functions. We then finally return.
The Jump and Compare Instructions
The jump instruction, as the name implies, allows one to jump to a different instruction. It changes EIP to the given address. Below are some basic jumps;
Short (relative, 1 byte displacement) An example is jmp 0x0E
Near (relative, 4 byte displacement)
Absolute (Hardcoded Address)
Absolute Indirect (r/m32 form)
In case you're curious, an infinite loop is typically jmp -2. A short jump is two bytes long, so jumping back two bytes lands on the jump itself. This is used in malware analysis and reverse engineering if you want the program to freeze at a certain point.
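Here's a small C sketch of how a short jump's target is worked out. The function name is my own. It assumes the 2-byte short form (one opcode byte plus a signed 1-byte displacement), where the displacement is counted from the address of the instruction after the jump.

```c
#include <stdint.h>

/* Target of a 2-byte short jump located at jump_address.
   The displacement rel8 is relative to the NEXT instruction,
   which starts 2 bytes after the jump itself.
   (short_jump_target is a hypothetical helper name.) */
uint32_t short_jump_target(uint32_t jump_address, int8_t rel8)
{
    return jump_address + 2u + (uint32_t)(int32_t)rel8;
}
```

With a displacement of -2 the target is the jump's own address, which is the infinite loop described above. It also matches the conditional example later in this thread: the jne at 00201028 needs a displacement of 7 to reach 00201031.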
We use jumps for loops and conditional blocks of code. For conditional loops and blocks we introduce the compare instruction, which is simply written as "cmp". How do we use this? Simple, cmp val1, val2 (note val1 and val2 are not valid operands, but rather placeholders). Internally cmp subtracts val2 from val1, throws the result away, and only updates the flags. Below is an example
cmp eax, dword ptr[ebp - 8]
Typically right under a compare instruction you immediately have a conditional jump instruction (JCC). Below is a list of common JCC;
*Note: In brackets in the instruction columns is what the abbreviation stands for, it is not part of the instruction.
JCC Instruction | JCC Alternative Instruction | Description
JE (Jump if Equal) | JZ (Jump if Zero) | Jumps if and only if the ZF (ZeroFlag) is set to one.
JNE (Jump if Not Equal) | JNZ (Jump if Not Zero) | Jumps if and only if the ZF (ZeroFlag) is set to zero.
JLE (Jump if Less than or Equal to) | JNG (Jump if Not Greater) | Jumps if ZF (ZeroFlag) is set to one, OR if the SF (SignFlag) does not equal the OF (OverflowFlag).
JNL (Jump if Not Less than) | JGE (Jump if Greater than or Equal to) | Jumps if and only if the SF (SignFlag) equals the OF (OverflowFlag).
JBE (Jump if Below or Equal to) | -- | Jumps if CF (CarryFlag) is set to one, OR if ZF (ZeroFlag) is set to one.
JB (Jump if Below) | -- | Jumps if and only if CF (CarryFlag) is set to one.
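To see how the flag rules in the table turn into "less than" and "below", here's a C sketch that models cmp on 32-bit values and then evaluates each jump condition from the flags. All the names are my own. The "less/greater" jumps treat the operands as signed, while the "below" jumps treat them as unsigned.

```c
#include <stdint.h>

/* Hypothetical holder for the flags a cmp would set. */
typedef struct {
    int zf, sf, cf, of;
} CmpFlags;

/* Models "cmp a, b": compute a - b, keep only the flags. */
CmpFlags toy_cmp(uint32_t a, uint32_t b)
{
    CmpFlags f;
    uint32_t diff = a - b;

    f.zf = (diff == 0);
    f.sf = (int)(diff >> 31);
    f.cf = (a < b);   /* unsigned borrow */
    /* signed overflow: a and b had different signs and the result's sign differs from a's */
    f.of = (int)(((a ^ b) & (a ^ diff)) >> 31);
    return f;
}

/* Each function returns 1 if that jump would be taken. */
int takes_je(CmpFlags f)  { return f.zf; }
int takes_jne(CmpFlags f) { return !f.zf; }
int takes_jle(CmpFlags f) { return f.zf || (f.sf != f.of); }
int takes_jge(CmpFlags f) { return f.sf == f.of; }
int takes_jbe(CmpFlags f) { return f.cf || f.zf; }
int takes_jb(CmpFlags f)  { return f.cf; }
```

Comparing 0xFFFFFFFF with 1 shows the signed/unsigned split: as signed numbers that's -1 versus 1, so JLE is taken, but as unsigned numbers 0xFFFFFFFF is huge, so JB is not.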
C -> ASM example - conditional blocks
C Code:
#include <stdio.h>

int main()
{
    int a = 15, b = 20;

    if(a == b)
        return 1;
    if(a > b)
        return 2;
    if(a < b)
        return 3;
    return -1;
}
x86 Assembly Code:
00201000 push ebp
00201001 mov ebp,esp
00201003 sub esp,8
00201006 mov dword ptr [ebp-8],0CCCCCCCCh
0020100D mov dword ptr [ebp-4],0CCCCCCCCh
00201014 mov dword ptr [a],0Fh
0020101B mov dword ptr [b],14h
00201022 mov eax,dword ptr [a]
00201025 cmp eax,dword ptr [b]
00201028 jne main+31h (201031h)
0020102A mov eax,1
0020102F jmp main+52h (201052h)
00201031 mov ecx,dword ptr [a]
00201034 cmp ecx,dword ptr [b]
00201037 jle main+40h (201040h)
00201039 mov eax,2
0020103E jmp main+52h (201052h)
00201040 mov edx,dword ptr [a]
00201043 cmp edx,dword ptr [b]
00201046 jge main+4Fh (20104Fh)
00201048 mov eax,3
0020104D jmp main+52h (201052h)
0020104F or eax,0FFFFFFFFh
00201052 mov esp,ebp
00201054 pop ebp
00201055 ret
A little breakdown: we allocate space by subtracting 8 from the stack pointer. We then store the uninitialized-memory value in each set of four bytes, and we move 0x0F (15) into "a" and 0x14 (20) into "b". We now move "a" into eax and compare it with "b". Notice the compiler inverts each condition: instead of jumping to the "return" when the condition is true, it jumps over it when the condition is false. So based on this compare, we will "jne" or "jump if not equal" to main + 0x31 (00201031). Otherwise, if they are equal, we move one into eax (because of "return 1;", and eax is the return register). We then jump to main + 0x52 (00201052).
If the first condition is false, then we go to another compare, comparing ecx (var a) with "b". Based on this compare we will "jle" or "jump if less than or equal to" to main + 0x40 (00201040). Otherwise, we will move two into eax, and jump to main + 0x52 (00201052).
Finally we compare edx (var a) with "b". Based on this compare we will "jge" or "jump if greater than or equal to" to main + 0x4F (0020104F), where "or eax,0FFFFFFFFh" sets eax to -1 for "return -1;". That will never happen, as one of the three conditions will always be true, so we should never return -1 in this program. Otherwise, we will move three into eax and jump to main + 0x52 (00201052), where we see the conventional "move base pointer into stack pointer, pop ebp off stack into ebp register, return".
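One way to get a feel for this is to write the C the way the assembly is actually laid out, with gotos standing in for the jumps. This is only a sketch: the function name and labels are my own, and "result" plays the role of eax.

```c
/* Mirrors the jump structure of the disassembly above.
   (classify and the label names are hypothetical, for illustration only.) */
int classify(int a, int b)
{
    int result;                        /* stands in for eax */

    if (a != b) goto check_greater;    /* cmp + jne: skip "return 1" */
    result = 1;
    goto done;                         /* jmp to the function epilogue */

check_greater:
    if (a <= b) goto check_less;       /* cmp + jle: skip "return 2" */
    result = 2;
    goto done;

check_less:
    if (a >= b) goto no_match;         /* cmp + jge: skip "return 3" */
    result = 3;
    goto done;

no_match:
    result = -1;                       /* or eax,0FFFFFFFFh */

done:
    return result;
}
```

Each "if" here tests the opposite of the condition in the original source, which is exactly why the disassembly shows jne, jle and jge for ==, > and <.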
C -> ASM example - a basic for() loop
C Code:
#include <stdio.h>

// testing comment for plugin
int main()
{
    int i;

    for(i = 0; i < 15; i++)
        printf("i = %d\n", i);
}
x86 Assembly Code:
00EB1000 push ebp
00EB1001 mov ebp,esp
00EB1003 push ecx
00EB1004 push esi
00EB1005 mov dword ptr [ebp-4],0CCCCCCCCh
00EB100C mov dword ptr [i],0
00EB1013 jmp main+1Eh (0EB101Eh)
00EB1015 mov eax,dword ptr [i]
00EB1018 add eax,1
00EB101B mov dword ptr [i],eax
00EB101E cmp dword ptr [i],0Fh
00EB1022 jge main+41h (0EB1041h)
00EB1024 mov esi,esp
00EB1026 mov ecx,dword ptr [i]
00EB1029 push ecx
00EB102A push offset _RTC_ErrorLevels-8 (0EB5000h)
00EB102F call dword ptr [__imp__printf (0EB30DCh)]
00EB1035 add esp,8
00EB1038 cmp esi,esp
00EB103A call _RTC_CheckEsp (0EB1060h)
00EB103F jmp main+15h (0EB1015h)
00EB1041 xor eax,eax
00EB1043 pop esi
00EB1044 add esp,4
00EB1047 cmp ebp,esp
00EB1049 call _RTC_CheckEsp (0EB1060h)
00EB104E mov esp,ebp
00EB1050 pop ebp
00EB1051 ret
Another breakdown, yay. First we push the base pointer onto the stack and move the stack pointer into the base pointer register. Then we push ecx. ecx is conventionally the counter register, but that's not why it's here: pushing any 4-byte register is just a compact way of reserving 4 bytes on the stack for "i", and the value pushed doesn't matter. We also push esi onto the stack, because esi is a callee-save register and this function is about to use it. We then store the uninitialized-memory hex value, 0xCCCCCCCC, in the space reserved for "i", and move 0 into "i" as we start the loop at 0.
Before incrementing i, we jump to main + 0x1E (00EB101E), skipping the increment on the first pass. Here we compare "i" with 0xF (15). If the "jge" or "jump if greater than or equal to" is taken, we jump to main + 0x41 (00EB1041), which is past the loop. If not, we save the stack pointer in esi (for the check after the call), move "i" into ecx, and push ecx as the second printf argument. The next push is the address of our format string (the debugger just labels that address with the nearest symbol it knows, _RTC_ErrorLevels-8), and then we call printf. _RTC_CheckEsp is a runtime check implemented by Visual Studio. We don't care about it.
We now add 8 to the stack pointer, removing the two printf arguments, and compare esi and esp to check that the stack pointer is back where it was before the call. We now go back to main + 0x15 (00EB1015) where we increment "i" and reiterate.
Once the loop is done, we exclusive or eax with itself, clearing it out so main returns 0, and we pop the saved esi value off the stack and back into the esi register. We add four to the stack pointer to free the space used by "i", and compare ebp and esp, however again this is for a runtime check to ensure esp and the stack are all intact. Then we clean up the last bit by moving the base pointer into the stack pointer, popping the saved base pointer into the ebp register, and returning.
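Here's the same loop shape written in C with gotos, so you can see why the first jump skips the increment. It's a sketch only: the function and label names are my own, and a counter stands in for the printf call so the function has something to return.

```c
/* Mirrors the layout of the compiled for() loop: init, jump to the check,
   and the increment sitting above the check.
   (count_iterations and the label names are hypothetical.) */
int count_iterations(int limit)
{
    int i;
    int body_runs = 0;             /* stands in for the printf call */

    i = 0;                         /* mov dword ptr [i],0 */
    goto check;                    /* jmp main+1Eh: skip the first increment */

increment:
    i = i + 1;                     /* mov eax,[i] / add eax,1 / mov [i],eax */

check:
    if (i >= limit) goto done;     /* cmp dword ptr [i],0Fh / jge */
    body_runs++;                   /* loop body */
    goto increment;                /* jmp main+15h */

done:
    return body_runs;
}
```

count_iterations(15) runs the body 15 times, for i = 0 through 14, just like the original for() loop.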
C -> ASM example - unsigned multiply and divide
C Code:
#include <stdio.h>

/* Testing Comment
for plugin */
int main()
{
    unsigned int a, b, c;

    a = 0x63;
    b = a * 4;
    c = b / 6;
    return c;
}
x86 Assembly Code:
01081000 push ebp
01081001 mov ebp,esp
01081003 sub esp,0Ch
01081006 mov dword ptr [ebp-0Ch],0CCCCCCCCh
0108100D mov dword ptr [ebp-8],0CCCCCCCCh
01081014 mov dword ptr [ebp-4],0CCCCCCCCh
0108101B mov dword ptr [a],63h
01081022 mov eax,dword ptr [a]
01081025 shl eax,2
01081028 mov dword ptr [b],eax
0108102B mov eax,dword ptr [b]
0108102E xor edx,edx
01081030 mov ecx,6
01081035 div eax,ecx
01081037 mov dword ptr [c],eax
0108103A mov eax,dword ptr [c]
0108103D mov esp,ebp
0108103F pop ebp
01081040 ret
Breakdown time. As always we push the base pointer onto the stack and move the stack pointer into the ebp register, and we also subtract 0x0C from esp... What's 0x0C? Well convert that to decimal, that's 12. Why are we subtracting 12 from esp? Because we have three variables, a, b, and c, and we allocate 4 bytes to each variable. We then move the "uninitialized memory" value into each variable on the stack, and we move 0x63 into a (due to a = 0x63).
Now we move "a" into the eax register, and the next instruction is something I haven't covered in this thread. "shl" means "shift left". For those familiar with bitwise operations, you know what this means. Otherwise, you may not. I won't cover bitwise in this thread, but if it's requested I will. Why this works is because we're multiplying by 4. In binary, the columns are 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, etc. This means that if we multiply by one of those numbers (a power of two), like 16 or 4, we can just shift the bits to the left: shifting left by n multiplies by 2^n, so "shl eax,2" multiplies by 4. Similarly, to divide an unsigned number by a power of two, we can just shift the bits to the right using "shr". For any other number, we have to use a real multiply or divide instruction ("mul"/"div" for unsigned values, "imul"/"idiv" for signed ones). The compiler uses "shl" in this case instead of a multiply because it can, and shifts are typically cheaper.
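A tiny C sketch of the shift trick, if you want to try it yourself. The function names are my own, and the shift count is assumed to be less than 32.

```c
#include <stdint.h>

/* Multiplies by 2^shift using a left shift (what "shl" does). */
uint32_t times_power_of_two(uint32_t value, unsigned shift)
{
    return value << shift;
}

/* Divides an unsigned value by 2^shift using a right shift (what "shr" does). */
uint32_t divide_by_power_of_two(uint32_t value, unsigned shift)
{
    return value >> shift;
}
```

For our program, times_power_of_two(0x63, 2) gives 0x18C (396 in decimal), the same as 0x63 * 4.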
Now we'll move eax into "b", and move "b" back into eax. Now we see an xor instruction, why is that there? Well let's work through it. xor'ing any value with itself always gives zero. Take 00000001 as an example.
00000001
^ 00000001
============
00000000
What this essentially does is "clear" edx so there's no interference with the calculation. "div" divides the 64-bit value edx:eax (edx holds the upper 32 bits) by its operand, so edx has to be zero when we only want to divide the 32-bit value in eax. That's why you generally see it right before an unsigned divide instruction.
Now the compiler moves 6 into ecx, because we're dividing by 6. We now use the "div" instruction, dividing eax ("b") by ecx (6). The quotient ends up in eax and the remainder in edx. You can think of this div operation as "b /= 6" in high-level.
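To show why edx matters, here's a C sketch that models what "div ecx" computes. The names are my own. The real instruction raises an exception if the divisor is 0 or the quotient doesn't fit in 32 bits; this sketch simply assumes neither happens.

```c
#include <stdint.h>

/* Hypothetical holder for the two outputs of div. */
typedef struct {
    uint32_t quotient;    /* ends up in eax */
    uint32_t remainder;   /* ends up in edx */
} DivResult;

/* Models "div ecx": the dividend is the 64-bit value edx:eax. */
DivResult toy_div(uint32_t edx, uint32_t eax, uint32_t ecx)
{
    uint64_t dividend = ((uint64_t)edx << 32) | eax;   /* edx is the upper half */
    DivResult r;

    r.quotient  = (uint32_t)(dividend / ecx);
    r.remainder = (uint32_t)(dividend % ecx);
    return r;
}
```

With edx cleared, toy_div(0, 0x18C, 6) gives a quotient of 66 and a remainder of 0, which is the value our program returns. If edx had been left holding a stray 1, the dividend would be a completely different number and so would the answer.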
Now we just move eax into "c", put "c" back into eax for our return register, and clean up the stack. And we're done.
Conclusion
This thread took forever to write, but I hope it helped some people understand the basics of x86 assembly. I also included things such as basic debugging so those who wish can go and write their own C programs and disassemble. While you can of course hand-code assembly, this thread is more for understanding how basic programs and pieces of code work at the fundamental level, but if you wish to get into hand-coding assembly, by all means, I hope this thread helps!
All the material in this thread is from my notes from when I was learning assembly. If you wish to trace back to where I got my notes for this thread and possibly learn a bit more yourself, it was from an online course by Xeno Kovah.