Calling Conventions

Introduction

Every program relies on calling conventions, usually without the programmer even realizing it. But before saying more about them, we must first make sure we understand what happens when a function gets called. Let's say we have a function named "add" that we define and call like this:

int add(int a, int b, int c, int d) {
    int e = a + b + c + d;
    return e;
}

add(1, 2, 3, 4);  /* the call we will follow */

When the code is translated to 32-bit x86 assembly, the parameters of the function are first pushed to the stack (the order depends on the calling convention). Let's say the parameters are pushed in reverse order, so first the number 4 is pushed, then 3, then 2 and finally the number 1. After that, the CALL instruction pushes the return address, that is, the address of the instruction following the CALL (the value EIP holds at that moment), and jumps to the function. The function must then save the old EBP register on the stack and set up its own stack frame. Normally, the EBP and ESP registers delimit the stack frame: ESP points to the top of the stack, while EBP points to the base of the frame, from which the parameters are reached at positive offsets and the local variables at negative offsets. After the old EBP is pushed, EBP is set to the current value of ESP. Then space for the local variables is reserved by decreasing ESP, which moves the top of the stack toward lower addresses. At that point the stack looks like the following:

Low Memory Addresses
-----------------------
| e |
-----------------------
| old EBP |
-----------------------
| return address |
-----------------------
| a |
-----------------------
| b |
-----------------------
| c |
-----------------------
| d |
-----------------------
High Memory Addresses

Once everything is in place, the function computes the local variable e, stores it in its stack slot and then returns its value to the caller (typically in the EAX register). ESP is set back to the value held in EBP, which discards the local variables, and the old EBP is popped from the stack. The return address is then popped and the function jumps to it, so the caller can continue executing.
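
To make the layout concrete, the following C sketch mirrors the diagram with a struct and computes each slot's offset from EBP. The names AddFrame and ebp_offset are our own illustrative choices, and the sketch assumes 4-byte slots and a single local variable; a real compiler is free to arrange its locals differently.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical mirror of the stack diagram, lowest address first. */
typedef struct {
    uint32_t e;               /* local variable */
    uint32_t old_ebp;         /* saved EBP; EBP itself points at this slot */
    uint32_t return_address;  /* pushed by CALL */
    uint32_t a, b, c, d;      /* parameters, pushed in reverse order */
} AddFrame;

/* Byte offset of a slot relative to EBP, given its offset inside AddFrame. */
static int ebp_offset(size_t member_offset) {
    assert(member_offset < sizeof(AddFrame));
    return (int)member_offset - (int)offsetof(AddFrame, old_ebp);
}
```

With this model, ebp_offset(offsetof(AddFrame, a)) evaluates to 8 and ebp_offset(offsetof(AddFrame, e)) to -4, which corresponds to the [ebp+8] and [ebp-4] operands commonly seen in disassembly.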

Calling a Function

We've seen what happens when a function is called, but many details of that process depend on the calling convention in use. Are the parameters passed on the stack or in registers? Are they removed from the stack by the called function or by the caller? Are they pushed in normal or reverse order? The answer is simple: it depends on the calling convention. One thing is certain: the caller and the called function must agree on the calling convention, because otherwise the called function cannot know where the caller put the arguments. Suppose the caller pushes the parameters in normal order, but the called function expects them in reverse order. In our example that happens to work, because "a+b+c+d" and "d+c+b+a" give the same result, but for a function whose result depends on the order of its arguments, such as a subtraction, the result would be wrong.

Keep in mind that both the caller and the called function must use the same calling convention; otherwise strange things may happen, such as functions returning invalid data, or a corrupted stack followed by a crash.

We must also be aware that it is the compiler's job to apply the same calling convention to both the caller and the called function when compiling the program.
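
The effect of such a disagreement can be sketched in C with a tiny simulated stack. Everything here is hypothetical: callee_subtract stands for a called function that expects its first argument at the top of the stack, and call_subtract plays a caller that pushes in one of two orders.

```c
#include <assert.h>

typedef enum { PUSH_RIGHT_TO_LEFT, PUSH_LEFT_TO_RIGHT } PushOrder;

/* Hypothetical called function: it assumes args[0] (the top of the stack)
   is the first argument, which is what right-to-left pushing produces. */
static int callee_subtract(const int *args) {
    return args[0] - args[1];
}

/* Hypothetical caller: pushes a and b in the requested order onto a
   two-slot stack that grows toward index 0, then "calls" the function. */
static int call_subtract(PushOrder order, int a, int b) {
    int stack[2];
    int top = 2;
    if (order == PUSH_RIGHT_TO_LEFT) {
        stack[--top] = b;
        stack[--top] = a;
    } else {
        stack[--top] = a;
        stack[--top] = b;
    }
    assert(top == 0);
    return callee_subtract(stack);
}
```

When the caller pushes right to left, call_subtract(PUSH_RIGHT_TO_LEFT, 10, 3) computes 10 - 3. With the other order the called function silently computes 3 - 10 instead, and nothing in the generated code would flag the mistake.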

There are a number of calling conventions out there and in this article we'll try to describe most of them, their similarities and their differences. We ask ourselves the following questions when describing each of the calling conventions:

1) In which order are the parameters passed to the function: normal or reverse order?

2) How is the result of the called function passed back to the caller function?

3) Who removes the parameters from the stack: the caller or the called function?

The most common calling conventions for C/C++ are: stdcall, cdecl, fastcall, and thiscall. Let's take a look at the assembly instructions that need to be executed during each function call:

First we need to push the parameters to the stack so that they are available to the function. Keeping the reverse order from the introduction, the call add(1,2,3,4) pushes the last argument first:

push 4
push 3
push 2
push 1

Note that we can also pass parameters to the function in registers, but in that case we must be sure that the function expects to find them there. Next we call the function with a CALL instruction, which pushes the return address (the address of the instruction after the CALL) onto the stack and jumps to the function. This is needed so that the called function knows where to return once it is done executing. The assembly code looks like the following:

call function

Once the function is called, we need to initialize the stack frame, which we can do with this assembly code:

push ebp
mov ebp, esp
sub esp, 0x10

We first save the old EBP value on the stack with the push instruction. Then we overwrite the EBP register with the current value of ESP, so that we can reach the parameters we previously pushed at fixed offsets from EBP (the first one is at [ebp+8], above the saved EBP and the return address). Afterwards we subtract 0x10 bytes from ESP, which tells us that the function needs 0x10 == 16 bytes for its local variables. And we're done; the function body can start executing. Once the function has finished, it needs to return a value. Usually the value is returned in the EAX register, so we copy it there from wherever it is. The following instruction copies the value from EBX to EAX to be returned as the result of the function:

mov eax, ebx

After that, we need to clean the stack. We can do that with the code below:

mov esp,ebp
pop ebp

We're restoring ESP and EBP to the values they had on entry to the function, so that the caller finds its own stack frame intact and the program can continue normally. Then the function returns to the caller by executing the instruction below:

ret

Finally, we must also remove the parameters passed to the called function. At the moment of return the parameters are still on the stack, because the called function didn't remove them, so it's the caller's job to do that; this is also the last step of the function call:

add esp, 0x10

Notice that we're adding 0x10 == 16 bytes to ESP, which releases the four numbers passed to the function (1, 2, 3 and 4). This has nothing to do with the 16 bytes subtracted for the local variables; it concerns only the parameters. It's merely a coincidence that exactly four numbers of 4 bytes each were passed to the function, which together occupy 16 bytes.
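
The whole sequence can be replayed on a toy machine written in C. This is only a sketch under our own assumptions: the Machine struct, its helper functions and the placeholder return address 0x1234 are invented for illustration, and the "stack" is a small array rather than real memory. Each step is annotated with the assembly instruction it imitates.

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_WORDS = 64, WORD = 4 };

/* Toy machine: ESP and EBP hold byte addresses into a small stack. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    uint32_t esp, ebp, eax;
} Machine;

static void machine_init(Machine *m) {
    m->esp = m->ebp = STACK_WORDS * WORD;  /* empty stack */
    m->eax = 0;
}

static void push(Machine *m, uint32_t v) {
    assert(m->esp >= WORD);
    m->esp -= WORD;
    m->mem[m->esp / WORD] = v;
}

static uint32_t pop(Machine *m) {
    assert(m->esp < STACK_WORDS * WORD);
    uint32_t v = m->mem[m->esp / WORD];
    m->esp += WORD;
    return v;
}

static uint32_t load(const Machine *m, uint32_t addr) {
    assert(addr % WORD == 0 && addr < STACK_WORDS * WORD);
    return m->mem[addr / WORD];
}

/* The called function: prologue, body, epilogue, ret. */
static void callee_add(Machine *m) {
    push(m, m->ebp);                 /* push ebp      */
    m->ebp = m->esp;                 /* mov ebp, esp  */
    m->esp -= 0x10;                  /* sub esp, 0x10 */
    m->eax = load(m, m->ebp + 8) + load(m, m->ebp + 12)
           + load(m, m->ebp + 16) + load(m, m->ebp + 20);
    m->esp = m->ebp;                 /* mov esp, ebp  */
    m->ebp = pop(m);                 /* pop ebp       */
    (void)pop(m);                    /* ret: pops the return address */
}

/* The caller: push the arguments, call, then clean up the parameters. */
static uint32_t caller(Machine *m) {
    push(m, 4);
    push(m, 3);
    push(m, 2);
    push(m, 1);
    push(m, 0x1234);                 /* call: placeholder return address */
    callee_add(m);
    m->esp += 0x10;                  /* add esp, 0x10 */
    return m->eax;
}
```

After machine_init, a call to caller leaves the sum in eax and brings esp and ebp back to their starting values, which is exactly the balance a calling convention has to guarantee.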

Calling Conventions

So far we've discussed how a function is called beneath the surface. We need to understand that logic completely if we are going to talk about calling conventions, which are standardized ways of calling a function that prevent erroneous behavior.

Now it's time to actually talk about the different calling conventions. In the previous section we described how to call a function; that basic concept is the same for all calling conventions. What differs is where the arguments are placed and which side performs each of the basic steps.

Let's look at the most common calling conventions. The __cdecl calling convention is shown in the picture below. We can see that the arguments of the function are pushed on the stack just before the function is called. The called function then initializes its frame pointer and does its job. Upon completion it tears down its stack frame and returns to the caller function. The caller function then adds 16 bytes to ESP to remove the pushed function arguments from the stack.

image0

The picture below presents the __stdcall calling convention, which is basically the same as __cdecl, except that it is the called function that cleans the stack (notice the "ret 10" instruction, which pops the return address and then releases 0x10 bytes of parameters; it replaces the caller's "add esp,10"). Because the cleanup code is emitted only once, inside the function, a program that uses __stdcall tends to be slightly smaller than one that uses __cdecl, where the cleanup code must be generated after every call to the function (outside of the function).

image1
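
The difference in who releases the parameters can be sketched in C by tracking ESP alone. The Cpu struct and the function names below are our own, hypothetical helpers; the sketch assumes a 4-byte return address and ignores the stack frame of the called function, which balances itself in both conventions.

```c
#include <assert.h>
#include <stdint.h>

/* Only ESP matters for this sketch, so the toy CPU tracks nothing else. */
typedef struct { uint32_t esp; } Cpu;

/* ret imm: pop the 4-byte return address, then release imm more bytes. */
static void ret_n(Cpu *cpu, uint32_t imm) {
    cpu->esp += 4 + imm;
}

/* Common part: the caller pushes the arguments, CALL pushes the return address. */
static void push_args_and_call(Cpu *cpu, uint32_t arg_bytes) {
    assert(cpu->esp >= arg_bytes + 4);
    cpu->esp -= arg_bytes;
    cpu->esp -= 4;
}

static uint32_t esp_after_cdecl_call(uint32_t esp, uint32_t arg_bytes) {
    Cpu cpu = { esp };
    push_args_and_call(&cpu, arg_bytes);
    ret_n(&cpu, 0);           /* called function: plain ret */
    cpu.esp += arg_bytes;     /* caller: add esp, arg_bytes */
    return cpu.esp;
}

static uint32_t esp_after_stdcall_call(uint32_t esp, uint32_t arg_bytes) {
    Cpu cpu = { esp };
    push_args_and_call(&cpu, arg_bytes);
    ret_n(&cpu, arg_bytes);   /* called function: ret arg_bytes */
    return cpu.esp;           /* nothing left for the caller to do */
}
```

Both functions return the ESP value they started with; the only difference is which side performs the final adjustment.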

Note that any calling convention where the called function cleans the stack can only be used when the called function knows how many bytes of parameters it received; otherwise it could not know how much to remove. Such conventions are therefore limited to functions that accept a fixed number of parameters, while functions with a variable number of parameters, such as printf, need a convention where the caller cleans the stack, like __cdecl.
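
A variable-argument function in C shows why. In the sketch below, sum_ints is a hypothetical function of our own; it can only find its arguments because the caller tells it, through the count parameter, how many integers follow. The compiled body is the same for every call, so it has no fixed parameter size it could release on return.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical variadic function: adds up `count` int arguments.
   The function has no other way to learn how many arguments were passed. */
static int sum_ints(int count, ...) {
    assert(count >= 0);
    va_list ap;
    va_start(ap, count);
    int total = 0;
    for (int i = 0; i < count; i++) {
        total += va_arg(ap, int);
    }
    va_end(ap);
    return total;
}
```

The calls sum_ints(4, 1, 2, 3, 4) and sum_ints(2, 5, 7) pass different amounts of data to the same function, and only each call site knows how much it passed.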

There's one more calling convention that we will mention: __fastcall, which tries to pass as many arguments as it can in registers (instead of pushing them to the stack). We can see that in the picture below:

image2

Notice that the picture passes argument 1 in EAX, argument 2 in EBX, argument 3 in ECX and argument 4 in EDX. This isn't usually the case: the number of registers is limited, so not every argument can be passed in a register. Usually only the first two arguments are passed in registers (ECX and EDX in Microsoft's __fastcall) and the others are pushed to the stack. Passing arguments this way is generally faster, because the arguments don't have to be written to the stack and read back from it, and memory accesses are slow in comparison to register operations.
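
The usual split between registers and stack can be written down as a small C sketch. It assumes the Microsoft-style rule mentioned above (first two arguments in ECX and EDX) and that every argument is a 4-byte integer; the type and function names are invented for illustration.

```c
#include <assert.h>

typedef enum { LOC_ECX, LOC_EDX, LOC_STACK } ArgLocation;

/* Where the argument with the given zero-based index is passed,
   assuming 4-byte integer arguments only. */
static ArgLocation fastcall_location(int arg_index) {
    assert(arg_index >= 0);
    if (arg_index == 0) return LOC_ECX;
    if (arg_index == 1) return LOC_EDX;
    return LOC_STACK;
}

/* Number of parameter bytes that still end up on the stack. */
static int fastcall_stack_bytes(int arg_count) {
    assert(arg_count >= 0);
    int bytes = 0;
    for (int i = 0; i < arg_count; i++) {
        if (fastcall_location(i) == LOC_STACK) {
            bytes += 4;
        }
    }
    return bytes;
}
```

For our four-argument add function, this rule leaves only the last two arguments (8 bytes) on the stack instead of all 16 bytes.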

Finally, there is the __thiscall calling convention, which is used by member functions in object-oriented programs (C++ in particular), because they need a reference to the object they operate on: the this pointer. Let's take a look at the picture below:

image3

We can see that the calling convention is almost exactly the same as __stdcall, except that it also passes the this pointer to the function in the ECX register.
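
Conceptually, a member function is an ordinary function with one hidden extra parameter. The C sketch below spells that parameter out; Counter and Counter_add are hypothetical names standing in for a C++ class and its member function, and the explicit this_ptr argument plays the role of the value that __thiscall would place in ECX.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a C++ class with a single data member. */
typedef struct {
    int total;
} Counter;

/* Stand-in for the member function `int Counter::add(int amount)`.
   The object pointer, implicit in C++, is an explicit parameter here. */
static int Counter_add(Counter *this_ptr, int amount) {
    assert(this_ptr != NULL);
    this_ptr->total += amount;
    return this_ptr->total;
}
```

A C++ call such as counter.add(5) then corresponds to Counter_add(&counter, 5): the object's address travels with every call, only in a register rather than on the stack.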

Conclusion

Remember that a compiler is free to call a function however it wants when both the caller and the called function are under its control; it doesn't have to follow any particular rules as long as the two understand each other. This is why the calling conventions we encounter can differ between programs built with different compilers.

Calling conventions matter when reverse engineering a program. We should try to identify the calling convention of every function that interests us, because then we know exactly how the rest of the program interacts with that function, and we can write our own assembly code that calls it successfully.

A special note applies to exported system functions that are accessible via shared libraries. Those libraries are compiled in advance; we can't just take their source code and compile it again, because all we have are the binaries that are ready to be used. Whenever we call such a function, the compiler must know exactly which calling convention it uses, so that it can generate a matching call; otherwise everything would fall to pieces.

I hope you now have a good understanding of what calling conventions are and why they are useful in everyday programming life.

