Translating Virtual to Physical Address on Windows: Segmentation

Introduction

In this tutorial, we'll go over the process of translating a virtual address to a physical address the way a processor does it. To begin, let's present a short overview of how segmentation and paging are done on x86 operating systems. First, the virtual (logical) address must be translated to a linear address. The picture taken from [1] presents this further:

On the picture above we can see a 16-bit segment selector (the value held in a segment register) and a virtual address that's 32 bits long. The TI bit in the selector specifies whether we must look for the corresponding descriptor in the GDT or the LDT. The GDT is the global descriptor table, which is available to the system and all programs; its base address is held in the GDTR register. Each program can also have its own local descriptor table (LDT), which is located through the LDTR register. The upper 13 bits of the selector are used as an index into the GDT/LDT to select the appropriate segment descriptor, and the lowest 2 bits hold the requested privilege level.
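The selector layout can be sketched in a few lines of C++. This is only an illustration: the struct and function names are made up for this article, and the layout assumed is the standard x86 one (index in bits 15..3, TI in bit 2, requested privilege level in bits 1..0).

```cpp
#include <cstdint>

// Fields of a 16-bit x86 segment selector (names are illustrative only).
struct Selector {
    std::uint16_t index;  // bits 15..3: descriptor index into the GDT or LDT
    bool          useLdt; // bit 2 (TI): false = GDT, true = LDT
    std::uint8_t  rpl;    // bits 1..0: requested privilege level
};

// Split a raw selector value into its three fields.
constexpr Selector decodeSelector(std::uint16_t raw) {
    return Selector{
        static_cast<std::uint16_t>(raw >> 3),
        (raw & 0x4) != 0,
        static_cast<std::uint8_t>(raw & 0x3)
    };
}

// Byte offset of the selected descriptor inside its table.
// Each descriptor is 8 bytes long, so the offset is index * 8.
constexpr std::uint32_t descriptorOffset(std::uint16_t raw) {
    return static_cast<std::uint32_t>(raw >> 3) * 8u;
}
```

For a sample selector value of 0x1B, decodeSelector yields index 3, the GDT as the table and a requested privilege level of 3, and descriptorOffset gives 24 bytes into the table.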

The segment descriptor contains, among many other fields, a base address that is added to the 32-bit virtual address we're trying to access. This constructs the linear address, which is not necessarily a physical address yet. If paging is not used, the linear address equals the physical address. When paging is used, the linear address is taken as input and a physical address is calculated from it. This can be seen on the picture below, which is again taken from [1]:
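As a rough sketch, the segmentation step boils down to a single addition. The function names below are hypothetical, and the limit check is simplified: it assumes a normal (expand-up) segment and ignores the size of the access.

```cpp
#include <cstdint>

// Linear address = segment base + offset within the segment.
// 32-bit unsigned arithmetic wraps around at 4 GB.
constexpr std::uint32_t toLinear(std::uint32_t segmentBase, std::uint32_t offset) {
    return segmentBase + offset;
}

// True when the offset lies inside the segment.
// The limit is treated as the last valid offset (simplified check).
constexpr bool withinLimit(std::uint32_t offset, std::uint32_t limit) {
    return offset <= limit;
}
```

With a base of 0, toLinear returns the offset unchanged, which is the case where the virtual and linear addresses coincide.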

On the picture above, we can see that the linear address contains three fields. If PAE is enabled, the linear address is separated into four fields instead, but let's not talk about that right now. For this overview we'll assume that PAE is disabled, so the linear address is separated into three fields as shown above.

The lower 12 bits are used directly as part of the physical address. The upper 10 bits (the directory index) are used as an index into the page directory, which contains PDEs (page directory entries), while the middle 10 bits are used as an index into the page table, which contains PTEs (page table entries). The PDE and PTE are used together with the lowest 12 bits to construct the physical address from the linear address. Also notice that the control register CR3 contains the base address of the page directory.
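The 10/10/12 split and the two table lookups can be written down as plain arithmetic. The following is a minimal sketch for the non-PAE case with 4 KB pages and 4-byte entries; all names are invented for illustration, and the CR3, PDE and PTE values are supplied by the caller rather than read from memory.

```cpp
#include <cstdint>

// The three fields of a 32-bit linear address (non-PAE, 4 KB pages).
struct LinearParts {
    std::uint32_t directoryIndex; // bits 31..22: index into the page directory
    std::uint32_t tableIndex;     // bits 21..12: index into the page table
    std::uint32_t offset;         // bits 11..0:  offset inside the page
};

constexpr LinearParts splitLinear(std::uint32_t linear) {
    return LinearParts{ linear >> 22, (linear >> 12) & 0x3FFu, linear & 0xFFFu };
}

// Physical address of the PDE: page directory base (from CR3) + index * 4.
constexpr std::uint32_t pdeAddress(std::uint32_t cr3, std::uint32_t linear) {
    return (cr3 & 0xFFFFF000u) + (linear >> 22) * 4u;
}

// Physical address of the PTE: page table base (from the PDE) + index * 4.
constexpr std::uint32_t pteAddress(std::uint32_t pde, std::uint32_t linear) {
    return (pde & 0xFFFFF000u) + ((linear >> 12) & 0x3FFu) * 4u;
}

// Final physical address: page frame base (from the PTE) + page offset.
constexpr std::uint32_t physicalAddress(std::uint32_t pte, std::uint32_t linear) {
    return (pte & 0xFFFFF000u) + (linear & 0xFFFu);
}
```

For a made-up linear address 0x80123456 the split is directory index 0x200, table index 0x123 and offset 0x456. The low 12 bits of CR3, the PDE and the PTE hold flags, which is why they are masked off before being used as base addresses.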

In this article, we'll use the program provided below to try to discover how the virtual address is translated into the physical address. It's one thing to know the whole process in theory, but it's a whole new level to do something practical to confirm the theory.

The program below is written in C++ and compiled in Visual Studio. It creates two variables, one on the stack and the other on the heap, and assigns the numbers 10 and 20 to them. Then it prints the values to the screen and calls getchar() to pause the program, so it doesn't end before we've had the chance to observe what was written on the screen. There's also an additional assembly instruction "int 3" that triggers a software breakpoint, which is useful when we want to stop the program at some predetermined location so we can inspect the code easily.

An alternative would be loading the program into a debugger and then manually looking for an instruction that interests us and setting the breakpoint in the debugger manually. We can see that the program is quite simple, which makes it perfect for what we're trying to achieve in this article. The source code of the program can be seen below:

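#include "stdafx.h"
#include <stdio.h>
#include <windows.h>

int _tmain(int argc, _TCHAR* argv[]) {
    // Software breakpoint: stops here when a debugger is attached
    // (MSVC inline assembly, 32-bit builds only).
    __asm { int 3 }

    int x;               // allocated on the stack
    int *y = new int();  // allocated on the heap
    x = 10;
    *y = 20;
    printf("Number x: %d.\n", x);
    printf("Number y: %d.\n", *y);

    // Keep the process alive so we can inspect it.
    getchar();

    delete y;
    return 0;
}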

In this article we'll basically be following the scheme presented on the picture below taken from [3]:

We can see that the CPU uses logical or virtual addresses that are translated to linear addresses with a segmentation unit, which are later translated into actual physical addresses.

Checking if PAE Is Enabled

Now we can download the Intel reference manual from [2], where we can read about the internal workings of the Intel processor. Let's first examine some registers to check whether PAE and paging are enabled.

If PAE is enabled, the PAE flag in the CR4 register will be set. Let's use the r command to print the value of the CR4 register, which is displayed in hexadecimal format. To display the number in binary form, so we can observe specific flags more easily, we can use the .formats command and pass the number as an argument. Both commands can be seen on the picture below:

Let's take a look at the format of the CR4 register's flags, which is taken from [2]:

Notice that the PAE flag is bit 5, counting from bit 0 at the far right. Bit 5 in the actual value of the CR4 register is 1, which means that PAE is enabled. Since PAE is enabled, we should also check bit 4, the PSE (Page Size Extensions) flag. It is set as well, which means that large 2MB pages can be used in addition to the regular 4KB pages.
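The same check can be expressed as two bit tests. This is a small sketch with made-up names; the bit positions (PSE is bit 4, PAE is bit 5) follow the CR4 layout in the Intel manual, while the sample values in the note below are arbitrary and are not the values from the debugging session.

```cpp
#include <cstdint>

// CR4 flag masks, counting bits from 0 at the least significant end.
constexpr std::uint32_t kCr4Pse = 1u << 4; // Page Size Extensions
constexpr std::uint32_t kCr4Pae = 1u << 5; // Physical Address Extension

// True when the PAE flag is set in the given CR4 value.
constexpr bool paeEnabled(std::uint32_t cr4) {
    return (cr4 & kCr4Pae) != 0;
}

// True when the PSE flag is set in the given CR4 value.
constexpr bool pseEnabled(std::uint32_t cr4) {
    return (cr4 & kCr4Pse) != 0;
}
```

For example, a CR4 value of 0x6F9 has both flags set, while 0x6D9 has PSE set and PAE clear.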

PAE is usually used when we want to address more than 4GB of physical memory on x86 machines, but the operating system itself must support it. Keep in mind that PAE matters mostly on 32-bit x86 systems, because their 32-bit physical addresses have a limit of 4GB. On x64 systems the addresses don't have this 32-bit limitation: 64-bit mode always runs with a paging format that builds on PAE, so there's no separate decision to make about enabling it.

Also keep in mind that on 32-bit systems, PAE is not used just to allow the system to address more physical memory, but also to provide DEP (Data Execution Prevention). If we would like to check whether PAE is supported on our Windows system, we can look in the C:\WINDOWS\system32 directory for ntkrnlpa.exe (supports PAE) and ntoskrnl.exe (doesn't support PAE). On the picture below, we can see that both files are present, so PAE can be enabled or disabled.

To actually determine whether PAE has been enabled or disabled, we can open the regedit.exe program and take a look at the HKLM\System\CurrentControlSet\Control\Session Manager\Memory Management\PhysicalAddressExtension entry. On the picture below, we can see that the value of that entry is 0, which means that PAE is disabled.

Checking if Segmentation is Enabled

We've already talked about segmentation and so far, we should already know that the system has a data structure that's called Global Descriptor Table (GDT) that holds descriptors. The upper bits of the segment register value are used as index into the GDT to get to the right descriptor, but the base address of the global descriptor table is stored in the GDTR register.

We can use the "r gdtr" command to show the value of the GDTR register, but let's try it another way. First use the "rm ?" command to dump the register masks that control how registers are displayed by the r command. All the register masks can be seen on the picture below:

Let's use several of the register masks to dump quite a lot of registers. This is shown on the picture below, where we've used the masks 0x8, 0x20, 0x80 and 0x100 to dump various register values:

Notice that the GDTR register holds the value 0x8003f000, which is the value we're interested in. Besides the GDTR register, it's also a good idea to keep the GDTL (Global Descriptor Table Length) register in mind, which specifies the length of the GDT. The value of the GDTL register is 0x3ff, which is the hexadecimal representation of the decimal number 1023. This is the limit of the table, that is, the offset of its last valid byte, so the GDT occupies 1024 bytes.
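A short calculation shows what these two values imply. The sketch below uses hypothetical function names and assumes the usual x86 conventions: the limit is the offset of the last valid byte and each descriptor is 8 bytes long.

```cpp
#include <cstdint>

// Size of the GDT in bytes: the limit is the offset of the last valid byte.
constexpr std::uint32_t gdtSizeBytes(std::uint32_t limit) {
    return limit + 1u;
}

// Number of 8-byte descriptors that fit in the table.
constexpr std::uint32_t gdtDescriptorCount(std::uint32_t limit) {
    return (limit + 1u) / 8u;
}

// Linear address of the descriptor with the given index.
constexpr std::uint32_t descriptorAddress(std::uint32_t gdtBase, std::uint32_t index) {
    return gdtBase + index * 8u;
}
```

With the values from the text, a limit of 0x3ff gives a table of 1024 bytes with room for 128 descriptors, and the descriptor at index 1 sits at 0x8003f008.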

We can use the d command to dump the whole contents of the GDT, which contains descriptors that are each 8 bytes long. Let's present the first few descriptors from the GDT:

We used the d command to dump the memory contents starting at address 0x8003f000, and we also specified how much to dump. The amount is given as the letter 'L' followed by a hexadecimal count of units to dump (bytes, in the case of a byte dump). The first three descriptors from the picture above are presented below:

0000000000000000
0000ffff00cf9b00
0000ffff00cf9300

We can see that handling the descriptors in such a way is very hard, because we need to manually extract the value of specific fields from the memory dump. The good thing is that WinDbg provides the dg command, which can be used specifically for dumping descriptors. The dg command takes two parameters: the first is the selector of the first segment descriptor to display and the second is the selector of the last one.
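To get a feeling for what the manual extraction involves, here is a minimal sketch of a descriptor decoder. The struct and function names are invented for this article. It assumes the standard x86 descriptor layout and that each 16-digit value listed above consists of two 32-bit words with the low word first, so 0000ffff00cf9b00 is read as low = 0x0000ffff and high = 0x00cf9b00.

```cpp
#include <cstdint>

// Selected fields of an 8-byte segment descriptor (names are illustrative).
struct Descriptor {
    std::uint32_t base;    // 32-bit segment base address
    std::uint32_t limit;   // last valid offset, after applying the granularity bit
    std::uint8_t  type;    // 4-bit segment type
    std::uint8_t  dpl;     // descriptor privilege level (0 = kernel, 3 = user)
    bool          present; // P flag
};

// Decode a descriptor given as its low and high 32-bit words.
inline Descriptor decodeDescriptor(std::uint32_t low, std::uint32_t high) {
    Descriptor d{};
    // Base is scattered over three places: low[31:16], high[7:0], high[31:24].
    d.base = (low >> 16) | ((high & 0xFFu) << 16) | (high & 0xFF000000u);
    // The 20-bit limit is low[15:0] plus high[19:16].
    const std::uint32_t limit20 = (low & 0xFFFFu) | (high & 0x000F0000u);
    // G flag (bit 23 of the high word): limit counts 4 KB pages instead of bytes.
    const bool pageGranular = ((high >> 23) & 1u) != 0;
    d.limit = pageGranular ? ((limit20 << 12) | 0xFFFu) : limit20;
    d.type = static_cast<std::uint8_t>((high >> 8) & 0xFu);
    d.dpl = static_cast<std::uint8_t>((high >> 13) & 0x3u);
    d.present = ((high >> 15) & 1u) != 0;
    return d;
}
```

Under that assumption, the second descriptor decodes to base 0x00000000, limit 0xffffffff, type 0xb (a code segment) and privilege level 0, and the third one differs only in its type, 0x3 (a data segment).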

Keep in mind that the dg command automatically knows where the GDT is located, so we don't have to specify the address of the table manually. To dump the same descriptors as we've presented in the previous image, we could use the "dg 0 f0" command as shown on the picture below:

Notice that WinDbg was able to parse the descriptors automatically and present their parameters in columns as seen above. The output from the dg command uses the following columns:

Sel: the selector
Base: the base address of the linear address space segment
Limit: the length of the linear address space segment
Type: the type of the segment
Pl: the privilege level of the segment, ring 0 (kernel mode) or ring 3 (user mode)

Notice on the picture above that some of the segments use the same base address and span the entire linear address space, thus using the Limit of ffffffff. Some of the segments are kernel-mode and some are user-mode, but they nevertheless cover the same region of memory. At first glance this suggests that segmentation is not used, because otherwise the segment descriptors wouldn't all point to the same base address. Strictly speaking, segmentation is still in use, because it's not optional on x86 machines, but the operating system can minimize its effect so much that we don't even notice it anymore. Thus, the operating system relies almost entirely on paging to translate virtual addresses to physical addresses and to provide the appropriate protection mechanisms.

On the picture above, we can see a beginning null descriptor at 0x0000, which is the default and must be present in every GDT table. Then we have four segments that start at a base address 0x00000000 and span the entire linear address space up until the 0xFFFFFFFF address. Two of the segments are code segments where one is located in user mode and the other is located in kernel mode; the same is true for the two data segments.

Since all the segments share the entire linear address space, the effect of segmentation has been minimized so much that we can hardly talk about segmentation any more. In this case segmentation has been used only to lay the ground for the paging scheme, which, though optional, is used extensively by the Windows operating system.

Conclusion

In this tutorial, we've seen how to check whether PAE is enabled and how to inspect the GDT to find out how segmentation is configured on the system. Windows doesn't really rely on segmentation, since the virtual addresses are the same as the linear addresses, but it must nevertheless set it up, because segmentation, unlike paging, is not optional.

References:

[1] x86 memory management and Linux kernel, http://manavar.blogspot.com/2011/05/x86-memory-management-and-linux-kernel.html.

[2] http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html.

[3] W4118: segmentation and paging, http://www.cs.columbia.edu/~junfeng/os/Flectures/l05-mem.pdf.

[4] Common WinDbg Commands, http://www.windbg.info/doc/1-common-cmds.html.

[5] Understanding !PTE, Part 1: Let's get physical, http://blogs.msdn.com/b/ntdebugging/archive/2010/02/05/understanding-pte-part-1-let-s-get-physical.aspx.

[6] Understanding !PTE, Part2: Flags and Large Pages, http://blogs.msdn.com/b/ntdebugging/archive/2010/04/14/understanding-pte-part2-flags-and-large-pages.aspx.

[7] Part 3: Understanding !PTE - Non-PAE and X64, http://blogs.msdn.com/b/ntdebugging/archive/2010/06/22/part-3-understanding-pte-non-pae-and-x64.aspx.