Presenting Registers

Introduction

Let's present all of the registers, as seen in OllyDbg:

Let's explain this picture in more detail. At the top, the general-purpose registers are shown. EBP and ESP are generally used to manage stack frames, while the program can use the remaining registers (EAX, EBX, ECX, EDX, ESI and EDI) however it wants. Next comes the EIP register, which is particularly important because it holds the address of the next instruction to be executed. Then there are eight flags from the EFLAGS register: C, P, A, Z, S, T, D and O. In the next column, the segment registers are shown: ES, CS, SS, DS, FS and GS. The entry "CS 001B 32bit 0(FFFFFFFF)" means that the code segment register contains the selector 001B, which refers to a 32-bit segment that starts at 0x00000000 and has a limit of 0xFFFFFFFF (so the segment spans the entire address space).

The EFLAGS Register

The EFLAGS register is a 32-bit register whose bits act as individual flags, each with a meaning of its own (some of the bits are reserved). Among them are eight flags that describe the state of the processor at any given time. Notice the flags shown in the OllyDbg Registers window: C, P, A, Z, S, T, D and O are all part of the EFLAGS register. In this section of the article, we'll describe each of those flags and try to understand why they are used. The letters above correspond to the following:

CF (carry flag)
PF (parity flag)
AF (adjust flag)
ZF (zero flag)
SF (sign flag)
TF (trap flag)
DF (direction flag)
OF (overflow flag)
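
To make the mapping between the letters and the register concrete, here is a minimal sketch that pulls those eight flags out of a raw EFLAGS value. The bit positions are taken from Intel's documentation rather than from this article, and the helper name describe_eflags is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// One entry per flag shown in the OllyDbg Registers window.
struct FlagBit {
    char letter;   // letter used by OllyDbg
    unsigned bit;  // bit index inside EFLAGS (per Intel's documentation)
};

constexpr FlagBit kFlags[] = {
    {'C', 0}, {'P', 2}, {'A', 4}, {'Z', 6},
    {'S', 7}, {'T', 8}, {'D', 10}, {'O', 11},
};

// Hypothetical helper: render an EFLAGS value as "C=0 P=1 ...".
std::string describe_eflags(std::uint32_t eflags) {
    std::string out;
    for (const FlagBit& f : kFlags) {
        if (!out.empty()) out += ' ';
        out += f.letter;
        out += '=';
        out += ((eflags >> f.bit) & 1u) ? '1' : '0';
    }
    return out;
}

// Usage example: 0x246 has bits 1, 2, 6 and 9 set, so only P and Z show as 1.
void example_usage() {
    assert(describe_eflags(0x246) == "C=0 P=1 A=0 Z=1 S=0 T=0 D=0 O=0");
}
```

Bits that are not in the table (for example the interrupt flag at bit 9) are simply ignored by this sketch.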

The carry flag indicates that the result of adding two numbers can't be represented in the destination register. The same is true for the overflow flag; the only difference is that CF signals the overflow of an unsigned addition, while OF signals the overflow of a signed addition. Let's look at an example. Below is C++ code written specifically for this purpose:

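#include "stdafx.h"
#include <stdio.h>

int _tmain(int argc, _TCHAR* argv[])
{
    int cf;
    int of;
    __asm {
        ; save the registers we are about to overwrite
        push eax
        push ebx

        ; OF: the sum overflows into the sign bit
        mov eax, 0x40000000
        mov ebx, 0x40000000
        add eax, ebx
        mov [of], eax

        ; CF: the sum carries out of the 32-bit register
        mov eax, 0x80000000
        mov ebx, 0x80000000
        add eax, ebx
        mov [cf], eax

        ; restore the saved registers
        pop ebx
        pop eax
    };
    printf("SOF: %d\n", of);
    printf("UOF: %u\n", of);
    printf("SCF: %d\n", cf);
    printf("UCF: %u\n", cf);

    getchar();
    return 0;
}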

In the example above, we first declare two integers, cf and of. After that, there's an __asm{} block where we've written the assembly code directly. In the assembly code we save the values of eax and ebx and restore them later, which ensures that we don't accidentally overwrite a value that is needed for the successful execution of the program. We can also see two blocks: one for the overflow flag and the other for the carry flag. In the first block, we store the value 0x40000000 in the registers eax and ebx, add the two together and store the result in the 'of' variable. In the second block, we store the value 0x80000000 in the registers eax and ebx, add the two together and store the result in the 'cf' variable. After the __asm block, we print both values in a signed and an unsigned manner.

In the output, SOF is the result of the overflow example printed as a signed number, while UOF is the same result printed as an unsigned number. The same goes for SCF and UCF in the carry example. Let's try to figure out what's happening in the code above. In the first example, we're adding two 0x40000000 values together, which results in 0x80000000 and sets bit 31 (the most significant bit) to 1. If we treat the number as signed, it's actually a negative number, as we can see in the SOF line (the negative number was printed because we used the '%d' specifier in the printf function call, which prints a signed decimal number). Alternatively, we can treat the number as unsigned, which means we need to use the '%u' specifier that displays it as an unsigned decimal integer. In that case, bit 31 doesn't encode the sign, because unsigned integers are never negative. Since we added two positive numbers and bit 31 of the result ended up set, the OF flag is set: the sum overflowed into the sign bit.
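
The difference between the two printf specifiers can also be reproduced without inline assembly. The following is a small sketch, not the code from the example above; the helper names as_signed and both_views are invented for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Reinterpret the same 32 bits as a signed integer, the way "%d" shows them.
std::int32_t as_signed(std::uint32_t bits) {
    std::int32_t value;
    std::memcpy(&value, &bits, sizeof value);  // copy the bit pattern unchanged
    return value;
}

// Hypothetical helper: format one bit pattern in both views.
std::string both_views(std::uint32_t bits) {
    return "signed: " + std::to_string(as_signed(bits)) +
           ", unsigned: " + std::to_string(bits);
}

// Usage example: 0x80000000 is the result of 0x40000000 + 0x40000000.
void example_usage() {
    assert(both_views(0x80000000u) ==
           "signed: -2147483648, unsigned: 2147483648");
}
```

The bits never change; only the interpretation chosen by the caller does.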

In the second example, we're adding two 0x80000000 values together, which produces a carry into bit 32. That bit cannot be represented in a 32-bit register (we'd need a 33-bit register, which we don't have). This is why bit 32 is discarded and CF is set. The result is 0 in both the signed and the unsigned view, because if the leading 1 of 0x100000000 is discarded, we're left with only 0x00000000, which is exactly the number zero.
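
The two additions can be modelled in portable C++ to see where CF and OF come from. This is only a sketch of the usual rules for a 32-bit add; the struct AddResult and the function add32 are names made up here, not something OllyDbg or the compiler provides.

```cpp
#include <cassert>
#include <cstdint>

struct AddResult {
    std::uint32_t value;  // low 32 bits of the sum (what ends up in eax)
    bool cf;              // carry out of bit 31: unsigned overflow
    bool of;              // signed overflow
};

// Model of "add eax, ebx" for the result, CF and OF only.
AddResult add32(std::uint32_t a, std::uint32_t b) {
    const std::uint64_t wide = static_cast<std::uint64_t>(a) + b;
    const std::uint32_t value = static_cast<std::uint32_t>(wide);
    const bool cf = (wide >> 32) != 0;
    // Signed overflow: both operands share a sign bit and the result's differs.
    const bool of = (((a ^ value) & (b ^ value)) >> 31) != 0;
    return {value, cf, of};
}

// Usage example with the two additions from the article.
void example_usage() {
    const AddResult first = add32(0x40000000u, 0x40000000u);
    assert(first.value == 0x80000000u && !first.cf && first.of);

    const AddResult second = add32(0x80000000u, 0x80000000u);
    assert(second.value == 0u && second.cf);
}
```

Under this rule the second addition would report OF as well, since two negative operands produce a non-negative result.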

If we take a look at the instructions we've inputted in the __asm block in the executable with OllyDbg, we'll find the following:

We've already set a breakpoint on the instruction at 0x004136EE, which is the first instruction of our __asm block. After executing the first "add eax, ebx" instruction, the SF and OF flags are set, as can be seen in the picture below. This is exactly what we've been talking about: OF is the flag that matters for signed operations, and since the sum overflowed into bit 31 (the sign bit), this flag is set. Note that the flags changed by the previous instruction are shown in red.

After executing the second "add eax, ebx" instruction, the CF flag is set, as can be seen in the picture below. This is because the sum no longer fits in the 32-bit register, so the full value can't be represented anymore. Also notice that the value in the eax register is 0x00000000, as we've previously discussed.

But there are also other flags we haven't talked about yet. The zero flag is set whenever the result of an addition, subtraction or other operation is zero, and is cleared whenever the result is not zero. The sign flag is set if the most significant bit of the result is 1. If the sign flag is set, the result is negative when interpreted as a signed number; otherwise it's non-negative. The parity flag is set if the number of 1 bits in the least significant byte of the result is even. The adjust flag is set when an arithmetic carry or borrow has been generated out of the 4 least-significant bits [1]. The direction flag controls string operations: when it is set, the addresses are decremented after each step, and when it is cleared, they are incremented.
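
As a rough illustration of the rules for ZF, SF, PF and AF, the sketch below computes them for a 32-bit addition. It is a simplified model written for this article, and the names ResultFlags and flags_after_add are hypothetical.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

struct ResultFlags {
    bool zf;  // result is zero
    bool sf;  // most significant bit of the result
    bool pf;  // even number of 1 bits in the low byte
    bool af;  // carry out of bit 3 into bit 4
};

// Model of the four flags after a 32-bit "add".
ResultFlags flags_after_add(std::uint32_t a, std::uint32_t b) {
    const std::uint32_t r = a + b;  // wraps modulo 2^32
    ResultFlags f;
    f.zf = (r == 0);
    f.sf = (r >> 31) != 0;
    f.pf = std::bitset<8>(r & 0xFFu).count() % 2 == 0;
    f.af = ((a ^ b ^ r) & 0x10u) != 0;
    return f;
}

// Usage example: 0x0F + 0x01 = 0x10 carries out of the low nibble.
void example_usage() {
    const ResultFlags f = flags_after_add(0x0Fu, 0x01u);
    assert(!f.zf && !f.sf && !f.pf && f.af);
}
```

Note that a result of zero has no 1 bits at all, which counts as even, so PF is set together with ZF.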

Segment Registers

There are several segment registers that are used by the operating system. They are listed below:

stack segment (SS): pointer to the stack of the current program
code segment (CS): pointer to the code of the current program
data segment (DS): pointer to the data of the current program
extra segment (ES): pointer to the extra data being used by the current program
F segment (FS): pointer to more extra data
G segment (GS): pointer to more extra data

The segment registers are often left out of introductions to x86 assembly, but they are very important nevertheless. Segment registers are used to refer to the memory used by the currently executing program. The segment registers can be seen in the picture below:

The segment registers actually contain a selector, which holds an index into a descriptor table. The entry at that index is a descriptor that describes one segment of memory. Thus, the segment registers are used to access the virtual address space of the process. The segments are typically set up so that they can access the entire 4GB address space. We won't go into the details of how segment registers are used; just remember that they are needed because of the segmented memory model.
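
To show what "index into the descriptor table" means in practice, here is a short sketch that splits a selector such as the 001B value shown for CS into its fields. The field layout follows Intel's documentation; the struct Selector and the function decode_selector are names chosen for this example.

```cpp
#include <cassert>
#include <cstdint>

struct Selector {
    unsigned index;    // bits 15..3: entry number in the descriptor table
    bool local_table;  // bit 2: false = global table (GDT), true = local (LDT)
    unsigned rpl;      // bits 1..0: requested privilege level
};

// Split a 16-bit segment selector into its three fields.
Selector decode_selector(std::uint16_t sel) {
    Selector s;
    s.index = sel >> 3;
    s.local_table = ((sel >> 2) & 1u) != 0;
    s.rpl = sel & 3u;
    return s;
}

// Usage example: 0x001B is binary 11011, so index 3, GDT, privilege level 3.
void example_usage() {
    const Selector cs = decode_selector(0x001B);
    assert(cs.index == 3 && !cs.local_table && cs.rpl == 3);
}
```

A requested privilege level of 3 is what one would expect for code running in user mode.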

FPU Registers

If you're reading this, then you're probably aware that floating point operations on the CPU have traditionally taken considerably more time than their integer counterparts. Because of this, processors contain an FPU (Floating Point Unit), which has 8 registers named ST0, ST1, ST2, ST3, ST4, ST5, ST6 and ST7 that are 80 bits wide. Those registers are typically loaded with floating point numbers that are either 32 bits or 64 bits long in memory. In a C++ program, these correspond to the types float and double, respectively.
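
The 32-bit and 64-bit sizes can be checked directly in C++. The sketch below assumes the usual IEEE 754 layout for float and double (verified with static_assert), and float_rounding_error is a hypothetical helper that shows how much a value changes when it is stored in only 32 bits.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Assumption: float and double are IEEE 754 binary32 and binary64 here.
static_assert(std::numeric_limits<float>::is_iec559, "IEEE 754 float expected");
static_assert(sizeof(float) == 4, "float is expected to be 32 bits long");
static_assert(sizeof(double) == 8, "double is expected to be 64 bits long");

// Hypothetical helper: distance between x and its nearest 32-bit float.
double float_rounding_error(double x) {
    const float narrow = static_cast<float>(x);
    return std::fabs(x - static_cast<double>(narrow));
}

// Usage example: 0.5 fits exactly in a float, 123.88 does not.
void example_usage() {
    assert(float_rounding_error(0.5) == 0.0);
    assert(float_rounding_error(123.88) > 0.0);
}
```

The error for 123.88 is tiny, which is why a float is good enough for the example that follows.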

Let's take a look at the following example written in C++:

#include "stdafx.h"
#include <stdio.h>

int _tmain(int argc, _TCHAR* argv[])
{
    float num = 123.88f;
    __asm {
        fld [num]   ; load num into ST0
        fsqrt       ; ST0 = square root of ST0
        fst [num]   ; store ST0 back into num
    };
    printf("Number: %f\n", num);

    getchar();
    return 0;
}

We're saving the constant floating point number 123.88 into the variable num. Then comes the __asm block, which calculates the square root of the floating point number stored in the variable num. The fld instruction loads the value of the variable num into the register ST0. The fsqrt instruction computes the square root of the value in ST0 and stores the result back into ST0. The fst instruction then saves the value from the ST0 register into the variable num. After the __asm block, we're printing the value of the floating point number stored in the variable num, which should contain the square root of the number 123.88. When we compile and run the program, the output will look like this:

Let's use Wolfram Alpha to calculate the square root of 123.88. To do that, we visit http://www.wolframalpha.com, type sqrt(123.88) into the input box and press enter. The result is shown in the picture below:

We can see that we've gotten the same result, 11.1301, which is the square root of the floating point number 123.88. This shows that our program works as expected. Let's take a look at how OllyDbg presents the instructions:

We've already set a breakpoint at the address 0x004113E7, which is exactly the address of the fld instruction that we've written in the __asm block in the C++ code. The previous instruction at address 0x004113DE sets the ST0 register value to 123.88, as can be seen in the picture below:

After executing the fsqrt instruction, the value in the ST0 register is the square root of the floating point number 123.88, which can be seen in the picture below:

Once we're done executing that instruction, we're saving the value back into the variable num and printing it to the console window.
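
For readers without a compiler that supports the __asm block, the same calculation can be approximated in portable C++. This is a sketch that mirrors the three steps under the assumption that long double is the widest floating point type available; sqrt_like_fpu is a made-up name, and the compiler is free to use instructions other than the x87 ones.

```cpp
#include <cassert>
#include <cmath>

// Portable stand-in for the fld / fsqrt / fst sequence.
float sqrt_like_fpu(float num) {
    long double st0 = num;           // fld: widen the 32-bit value
    st0 = std::sqrt(st0);            // fsqrt: square root in the wide type
    return static_cast<float>(st0);  // fst: round back to 32 bits
}

// Usage example: the result should match the 11.1301 seen earlier.
void example_usage() {
    const float root = sqrt_like_fpu(123.88f);
    assert(std::fabs(root - 11.1301f) < 1e-3f);
}
```

Squaring the returned value should land very close to the original 123.88.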

Conclusion

In this post, we've seen the basic usage of registers. We've presented the way registers are shown in OllyDbg, but other debuggers present them in much the same way. We've seen what the EFLAGS register is for, and especially what the ST0 to ST7 registers are for. Those are often left out of assembly tutorials, so reverse engineers don't always know exactly what they are used for. This article should fill that gap; if you're interested in reverse engineering and understanding the registers, it should come in quite handy.

References:

[1]: Adjust flag, http://en.wikipedia.org/wiki/Adjust_flag.

[2]: X86 Assembly/Floating Point, http://en.wikibooks.org/wiki/X86_Assembly/Floating_Point.
