The Import Directory: Part 2

You can take a look at the previous article before reading this one. If you already understand the basics of the IAT, you can skip the first article; otherwise, you should read it before continuing below.

Presenting the Example Import Directory

Let's use the !dh command to dump the PE header. Below we can see that we've dumped the PE header that's located at the 0x00400000 virtual address. Note that we presented only the Import Directory entry, because we're interested only in that right now.

0:002> !dh 00400000 -f
….
18000 [ 3C] address [size] of Import Directory
…

We can see that the RVA of the import directory is 0x18000 and that the directory is 0x3C bytes in size. The Import Directory points to an array of IMAGE_IMPORT_DESCRIPTOR structures, each of which is 20 bytes in size. Since the size of the Import Directory is 0x3C (60 bytes) and the size of one IMAGE_IMPORT_DESCRIPTOR structure is 0x14 (20 bytes), there is room for three structures in the Import Directory.
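The slot count and the address of each slot follow directly from the numbers in the !dh output. A minimal Python sketch of that arithmetic (the function name descriptor_addresses is made up for this illustration, and the base address is the one used throughout this article):

```python
# Values taken from the !dh output above.
IMAGE_BASE = 0x00400000
IMPORT_DIR_RVA = 0x18000
IMPORT_DIR_SIZE = 0x3C
DESCRIPTOR_SIZE = 0x14  # size of one IMAGE_IMPORT_DESCRIPTOR (five DWORDs)


def descriptor_addresses(base, rva, size, entry_size=DESCRIPTOR_SIZE):
    """Return the virtual address of every descriptor slot in the directory."""
    count = size // entry_size
    return [base + rva + i * entry_size for i in range(count)]


addresses = descriptor_addresses(IMAGE_BASE, IMPORT_DIR_RVA, IMPORT_DIR_SIZE)
print(len(addresses))               # number of slots
print([hex(a) for a in addresses])  # address of each slot
```

The three addresses are 0x14 bytes apart, which is the step we use when moving from one structure to the next.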

Let's dump the three structures contained in the Import Directory table. In the picture below we've calculated the address of the Import Table, which is 0x00400000+0x18000, and then we also added the relative offsets of the IMAGE_IMPORT_DESCRIPTOR structures. Since the size of each structure in the array is 0x14 bytes, we must add 0x14 bytes to the address to access the next element:

We can see that the array actually contains two real elements and that the last one is set to zero to denote the end of the Import Directory. The library that each element refers to is given by the Name element in the IMAGE_IMPORT_DESCRIPTOR structure. The Name element holds the RVA of the actual name of the library. In the next picture, we've dumped the memory at those RVAs as hex and ASCII representations:

Notice that we've actually printed the names of the loaded libraries, msvcr100d.dll and kernel32.dll. This shows that the executable imports functions from those two libraries.
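The descriptor walk described above can be sketched in Python. This is only an illustration under stated assumptions: the image is a small synthetic buffer in which file offset equals RVA, and every RVA in build_demo_image is made up (the real values are in the debugger output, not reproduced here). The helper names are hypothetical as well.

```python
import struct

# IMAGE_IMPORT_DESCRIPTOR: OriginalFirstThunk, TimeDateStamp,
# ForwarderChain, Name, FirstThunk (five little-endian DWORDs, 0x14 bytes).
DESCRIPTOR = struct.Struct("<5I")


def read_cstring(image, rva):
    """Read a null-terminated ASCII string starting at rva."""
    end = image.index(b"\x00", rva)
    return image[rva:end].decode("ascii")


def imported_libraries(image, import_dir_rva):
    """Walk the descriptor array until the all-zero terminator."""
    names = []
    offset = import_dir_rva
    while True:
        fields = DESCRIPTOR.unpack_from(image, offset)
        if not any(fields):  # zeroed descriptor marks the end
            break
        name_rva = fields[3]
        names.append(read_cstring(image, name_rva))
        offset += DESCRIPTOR.size
    return names


def build_demo_image():
    """Synthetic image: two descriptors plus a zero terminator (made-up RVAs)."""
    image = bytearray(0x100)
    DESCRIPTOR.pack_into(image, 0x00, 0x60, 0, 0, 0x40, 0x80)
    DESCRIPTOR.pack_into(image, 0x14, 0x70, 0, 0, 0x50, 0x90)
    # The third slot at 0x28 stays zero and terminates the array.
    for rva, text in ((0x40, b"msvcr100d.dll\x00"), (0x50, b"kernel32.dll\x00")):
        image[rva:rva + len(text)] = text
    return bytes(image)


print(imported_libraries(build_demo_image(), 0x00))
```

The loop stops at the zeroed third slot, so only the two library names are returned.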

Let's now dump the OriginalFirstThunk array. We've seen that the RVA of the OriginalFirstThunk array of the msvcr100d.dll library is 0x180FC. This is why we can use the command in the picture below to dump the four bytes of the first IMAGE_THUNK_DATA structure.

The IMAGE_THUNK_DATA structure contains the element AddressOfData, which points to the IMAGE_IMPORT_BY_NAME structure. Let's present that structure again for clarity:

If we now dump the IMAGE_IMPORT_BY_NAME structure from the address 0x00400000+000184fe, we'll see the following:

Notice that the name presents only one character '_'. We've already established that the Name element actually contains a null terminated ASCII name, so it's best if we dump the contents of memory on that address to see the actual name. We've dumped the contents of memory with the db command, which can be seen below:

The first two bytes are the Hint, while the rest of the bytes, up to the first NULL byte, are part of the Name element. Because of this, the actual name of this function is _crt_debugger_hook. We can also use the da command to dump only the ASCII characters, but we have to add an additional 0x2 bytes to the address to jump over the Hint element. We can see the same string dumped in the picture below:
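The Hint/Name layout described above can be sketched in Python as follows. The hint value 0x0123 is a placeholder chosen for the example, not the value from the actual binary, and parse_import_by_name is a hypothetical helper name.

```python
import struct


def parse_import_by_name(data, offset=0):
    """Split an IMAGE_IMPORT_BY_NAME record into (hint, name)."""
    (hint,) = struct.unpack_from("<H", data, offset)  # 2-byte Hint
    start = offset + 2                                # Name follows the Hint
    end = data.index(b"\x00", start)                  # Name is null-terminated
    return hint, data[start:end].decode("ascii")


# Placeholder record: made-up hint followed by the name from the article.
record = struct.pack("<H", 0x0123) + b"_crt_debugger_hook\x00"
hint, name = parse_import_by_name(record)
print(hex(hint), name)

# Skipping the Hint is the same "+ 0x2" that we apply before using da.
name_address = 0x00400000 + 0x000184FE + 2
print(hex(name_address))
```

The computed name address, 0x00418500, is the address at which da shows the string.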

We've seen that the OriginalFirstThunk array consists of IMAGE_THUNK_DATA structures, each of which holds the RVA of an IMAGE_IMPORT_BY_NAME structure that contains the name of a function in a certain library. All of the IMAGE_THUNK_DATA entries can be seen in the picture below:

Notice that the last element is the NULL element 0x00000000, which terminates the array. The rest of the dwords are RVAs of IMAGE_IMPORT_BY_NAME structures. We can see that it would take a lot of work to traverse the entries manually, so we'll just write a simple script that will do it for us. Let's first check if the current expression evaluator is set to MASM. We can do this with the .expr command:

In order to write a script, we must first take a look at basic WinDbg scripting instructions. If we want to assign a value to a variable, we must use the "r" command, and the variable must be one of the pseudo-registers $t0-$t19. If we want to obtain the value of the variable, we must use the prefix "@" like this: @$t0. We can use the script parameters as $arg1 - $argN in the script itself.

Whenever we want to execute the script, we need to use the following command:

kd> $$><"script_path"

I've coded a script that traverses the OriginalFirstThunk array and prints all the names from that array. The script can be seen below:

$$
$$ This script reads the whole OriginalFirstThunk array and prints the
$$ function names stored in that array. The RVA of the OriginalFirstThunk
$$ must be passed to the script as the first parameter.
$$

.block
{
    $$ the address of the OriginalFirstThunk array
    r $t0 = ${$arg1}+0x00400000

    .for (r @$t1 = 0; @$t1 < 1000; r @$t1 = @$t1 + 4) {
        $$ Calculate the address of the current element in OriginalFirstThunk.
        r $t2 = @$t0 + @$t1

        $$ Read the element: the RVA of an IMAGE_IMPORT_BY_NAME structure.
        r $t4 = poi(@$t2)

        .if(@$t4 = 0) {
            .break
        }
        .else {
            $$ Calculate the address of the Name element in IMAGE_IMPORT_BY_NAME.
            r $t3 = poi(@$t2)+0x00400000+0x2
            da @$t3
        }
    }
}

Let's explain the script a little further. At first we calculate the actual address of the OriginalFirstThunk array: we add 0x00400000 (base address) to the first input argument. In our case, the input argument should be 0x180fc, so the whole address will be 0x004180fc, which is the address of the OriginalFirstThunk of the msvcr100d.dll library.

Needless to say, the script only works if the base address of the image is 0x00400000; if we wanted a more versatile script, we would only need to make small changes to find the base address dynamically. We didn't do this in our case, since it's not important for this exercise.

After that, we have a for loop which counts from 0 to 1000, increasing the counter by 4 and executing the loop body each time. In the loop body, we calculate the address of each element in the OriginalFirstThunk array and read the value from that address. If the address contains the value 0, then we've come to the end of the array and we terminate the loop. Otherwise, we take that value and add 0x400002 (the base address plus the two bytes of the Hint), which gives the address of the actual null-terminated ASCII name. Then we print the string at that address and repeat the loop.
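For readers who prefer to see the same loop outside the debugger, here is a rough Python equivalent that operates on a byte buffer instead of live memory. It is a sketch under assumptions: the buffer is a synthetic image in which offset equals RVA, the RVAs in the demo are made up, and walk_import_names is a hypothetical name. The ordinal check is an addition that the WinDbg script does not perform.

```python
import struct

IMAGE_ORDINAL_FLAG32 = 0x80000000  # high bit set: import by ordinal, no name


def walk_import_names(image, base, thunk_rva, limit=0x1000):
    """Return (address_of_name, name) for each thunk until the null entry."""
    results = []
    for step in range(0, limit, 4):           # 4 bytes per 32-bit thunk
        (value,) = struct.unpack_from("<I", image, thunk_rva + step)
        if value == 0:                         # null thunk ends the array
            break
        if value & IMAGE_ORDINAL_FLAG32:       # not handled by the script
            results.append((None, "ordinal %d" % (value & 0xFFFF)))
            continue
        name_rva = value + 2                   # skip the 2-byte Hint
        end = image.index(b"\x00", name_rva)
        results.append((base + name_rva, image[name_rva:end].decode("ascii")))
    return results


def build_demo_image():
    """Synthetic image with a two-entry thunk array (made-up RVAs)."""
    image = bytearray(0x80)
    struct.pack_into("<3I", image, 0x10, 0x40, 0x60, 0)
    for rva, name in ((0x40, b"printf"), (0x60, b"getchar")):
        record = struct.pack("<H", 0) + name + b"\x00"
        image[rva:rva + len(record)] = record
    return bytes(image)


for address, name in walk_import_names(build_demo_image(), 0x00400000, 0x10):
    print("%08x %s" % (address, name))
```

Passing the base address as a parameter also shows the small change that would make the approach independent of the hard-coded 0x00400000.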

Let's see what happens when we run the script. We saved the script into the C:\scripts directory as importnames.wds, but the extension can be anything we like, even .txt. We're passing one argument, 0x180fc, to the script, which is the RVA of the OriginalFirstThunk.

0:002> $$>a<C:\scripts\importnames.wds 0x180fc
00418500 "_crt_debugger_hook"
004184f0 "_wsplitpath_s"
004184e4 "wcscpy_s"
004184d4 "_wmakepath_s"
004184ba "_except_handler4_common"
004184b0 "_onexit"
004184a8 "_lock"
0041849a "__dllonexit"
00418490 "_unlock"
0041847e "_invoke_watson"
0041846e "_controlfp_s"
0041845a "?terminate@@YAXXZ"
0041844c "_initterm_e"
00418440 "_initterm"
0041842e "_CrtDbgReportW"
0041841a "_CrtSetCheckCount"
0041840c "__winitenv"
00418404 "exit"
004183fa "_cexit"
004183ec "_XcptFilter"
004183e4 "_exit"
004183d2 "__wgetmainargs"
004183c4 "_amsg_exit"
004183b2 "__set_app_type"
004183a8 "_fmode"
0041839c "_commode"
00418388 "__setusermatherr"
00418372 "_configthreadlocale"
00418360 "_CRT_RTC_INITW"
00418348 "printf"
0041833e "getchar"

Let's also dump the contents of the FirstThunk array of the msvcr100d.dll library, which has an RVA of 0x1827c. In order to do that, we have to change the script a little bit: although OriginalFirstThunk and FirstThunk both refer to arrays of IMAGE_THUNK_DATA, once the loader has resolved the imports the FirstThunk entries hold the actual function addresses rather than RVAs of names. The new script is very similar to the previous one and can be seen below:

$$
$$ This script reads the whole FirstThunk array and prints the
$$ function addresses stored in that array. The RVA of the FirstThunk
$$ must be passed to the script as the first parameter.
$$

.block
{
    $$ the address of the FirstThunk array
    r $t0 = ${$arg1}+0x00400000

    .for (r @$t1 = 0; @$t1 < 1000; r @$t1 = @$t1 + 4) {
        $$ Calculate the address of the current element in FirstThunk.
        r $t2 = @$t0 + @$t1

        $$ Read the element: the resolved address of the imported function.
        r $t4 = poi(@$t2)

        .if(@$t4 = 0) {
            .break
        }
        .else {
            .printf "Addr: %x\n", @$t4
        }
    }
}

We won't explain the script in detail, since it's very similar to the previous one. The only difference is the body of the else branch: here we print the value we read directly, whereas the previous script followed that value through one more level of indirection and printed the string it pointed to.

When we run the script, the following will be printed to the screen:

0:002> $$>a<C:\scripts\importnames2.wds 0x1827c
Addr: 10322e30
Addr: 10327ce0
Addr: 10274390
Addr: 10326190
Addr: 10323040
Addr: 10319d40
Addr: 102496d0
Addr: 10319fa0
Addr: 10249720
Addr: 10316310
Addr: 103329b0
Addr: 102fb0c0
Addr: 10248680
Addr: 10248650
Addr: 103151e0
Addr: 10319ac0
Addr: 10362730
Addr: 10248080
Addr: 102480c0
Addr: 1031d090
Addr: 102480a0
Addr: 10248ce0
Addr: 10248100
Addr: 10245130
Addr: 103635f8
Addr: 103632fc
Addr: 10247580
Addr: 1031ecd0
Addr: 10321270
Addr: 10267ee0
Addr: 1025f660

Notice that we passed the RVA of the FirstThunk, 0x1827c, to the new script. The script printed the same number of elements as before, but this time it printed the function addresses instead of the function names.
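Because the two arrays run in parallel, the name at index i belongs to the address at index i. A small Python sketch of that pairing, using the last three entries from each of the two listings above (pair_imports is a hypothetical helper name):

```python
def pair_imports(names, addresses):
    """Match the i-th name with the i-th resolved address."""
    if len(names) != len(addresses):
        raise ValueError("the two thunk arrays must have the same length")
    return dict(zip(names, addresses))


# Last three entries printed by each script for msvcr100d.dll.
names = ["_CRT_RTC_INITW", "printf", "getchar"]
addresses = [0x10321270, 0x10267EE0, 0x1025F660]

imports = pair_imports(names, addresses)
for name, address in imports.items():
    print("%08x %s" % (address, name))
```

The length check reflects the observation that both scripts printed the same number of elements; a mismatch would mean the arrays were not read from the same descriptor.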

Let's verify that a printed address actually belongs to the function we've identified. The last elements printed by the two scripts are "getchar" and "1025f660", which means that the getchar() function must be located at address 0x1025f660. We can check whether this is true by simply using the u command. The picture below shows us that our script works and that we've correctly identified the address of the getchar() function:

In the beginning of the article we've identified that the executable uses two libraries, the msvcr100d.dll and the kernel32.dll library. Previously, we've dumped the names and addresses of the functions in the msvcr100d.dll library. Now let's dump all the function names of the kernel32.dll library. We can see all the names below:

0:002> $$>a<C:\scripts\importnames.wds 0x1803c
00418516 "CloseHandle"
0041874c "UnhandledExceptionFilter"
00418738 "GetCurrentProcess"
00418724 "TerminateProcess"
00418716 "FreeLibrary"
00418702 "GetModuleHandleW"
004186f2 "VirtualQuery"
004186dc "GetModuleFileNameW"
004186ca "GetProcessHeap"
004186be "HeapAlloc"
004186b2 "HeapFree"
00418698 "GetSystemTimeAsFileTime"
00418682 "GetCurrentProcessId"
0041866c "GetCurrentThreadId"
0041865c "GetTickCount"
00418642 "QueryPerformanceCounter"
00418632 "DecodePointer"
00418614 "SetUnhandledExceptionFilter"
00418604 "LoadLibraryW"
004185f2 "GetProcAddress"
004185e6 "lstrlenA"
004185d4 "RaiseException"
004185be "MultiByteToWideChar"
004185aa "IsDebuggerPresent"
00418594 "WideCharToMultiByte"
0041857e "HeapSetInformation"
00418560 "InterlockedCompareExchange"
00418558 "Sleep"
00418542 "InterlockedExchange"
00418532 "EncodePointer"
00418524 "CreateFileW"

Notice that this time we had to use the RVA of the kernel32.dll's OriginalFirstThunk, which is 0x1803c. To print the appropriate addresses, we must use the RVA of kernel32.dll's FirstThunk, which is 0x181bc. We can see all of the functions' addresses printed below:

0:002> $$>a<C:\scripts\importnames2.wds 0x181bc
Addr: 7c809be7
Addr: 7c864042
Addr: 7c80de95
Addr: 7c801e1a
Addr: 7c80ac7e
Addr: 7c80e4dd
Addr: 7c80ba71
Addr: 7c80b475
Addr: 7c80ac61
Addr: 7c9100c4
Addr: 7c90ff2d
Addr: 7c8017e9
Addr: 7c8099c0
Addr: 7c8097d0
Addr: 7c80934a
Addr: 7c80a4c7
Addr: 7c9132ff
Addr: 7c8449cd
Addr: 7c80aeeb
Addr: 7c80ae40
Addr: 7c80be56
Addr: 7c812f81
Addr: 7c809c98
Addr: 7c81f424
Addr: 7c80a174
Addr: 7c839471
Addr: 7c809842
Addr: 7c802446
Addr: 7c80982e
Addr: 7c9132d9
Addr: 7c810cd9

Let's also verify that the function addresses are correct by checking whether the last element matches.

The address 0x7c810cd9 matches the CreateFileW function, which means that our scripts work as intended.

If we now dump the PE header with the !dh command, we'll see that the RVA to the Import Address Table Directory is 0x181BC, which is exactly the RVA of the kernel32.dll's FirstThunk.

0:002> !dh 00400000 -f

…

181BC [ 180] address [size] of Import Address Table Directory
…

This shows that the FirstThunk arrays we reached through the Import Directory structures are located inside the IAT. If we dumped the contents of the memory at the IAT (RVA 0x181BC), we would see the same function addresses that we already identified.
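We can double-check this with a little arithmetic: each FirstThunk array should fit inside the range described by the Import Address Table Directory entry. A Python sketch using the RVAs from this article, assuming 31 imports per library (the number of lines each script printed) plus one null terminator, with 4 bytes per entry:

```python
IAT_RVA, IAT_SIZE = 0x181BC, 0x180   # from the !dh output

FIRST_THUNKS = {                     # FirstThunk RVAs used in this article
    "kernel32.dll": 0x181BC,
    "msvcr100d.dll": 0x1827C,
}
ARRAY_BYTES = (31 + 1) * 4           # 31 imports + null terminator, 4 bytes each


def inside(rva, length, dir_rva, dir_size):
    """True if [rva, rva + length) lies within the directory range."""
    return dir_rva <= rva and rva + length <= dir_rva + dir_size


results = {
    name: inside(rva, ARRAY_BYTES, IAT_RVA, IAT_SIZE)
    for name, rva in FIRST_THUNKS.items()
}
print(results)
print(hex(IAT_RVA + IAT_SIZE))       # end of the IAT range
```

Both arrays fall within the IAT range, which ends at RVA 0x1833C.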

Conclusion

We've seen the distinction between load-time and run-time dynamic linking. With load-time dynamic linking, we must specify the libraries we'll be using when the program is built, and the functions we use are recorded in the program's IAT. With run-time dynamic linking, the IAT is not used for those functions, because we only learn which function we're referencing at run-time and not at compile-time. We can bring a library into the current process's address space by calling the LoadLibrary() function and then looking up its exported functions, typically with GetProcAddress().

The IAT is used to support load-time dynamic linking, which is performed when the application is started. Since the application uses functions from standard libraries, they must be recorded in the import tables, so that the operating system knows which libraries to load into the process's address space when the process is created. Alternatively, we could use run-time linking, in which case the IAT entries for those functions aren't necessary, because we load the library and resolve its functions ourselves at run-time.
