50. Parsing PE Headers

Parsing PE Headers

Introduction

Early on in a beginner module, the PE file format structure was briefly discussed. The module focused more on the theory rather than a programmatical perspective of accessing each header. This module will explain the process of extracting components of a PE file and provide more insight into the file structure, which will ultimately become a prerequisite for more advanced modules.

Review the introductory PE file structure module if the PE structure is not well understood.

PE Structure

Recall the diagram below from the introductory module which shows a simplified structure of the PE format. Every header shown in the image is defined as a data structure that holds information about the PE file.

Relative Virtual Addresses (RVAs)

Relative Virtual Addresses (RVAs) are addresses that are used to reference locations within a PE file. They are used to specify the location of various data structures and sections within the PE file, such as code, data, and resources.

An RVA is a 32-bit value that specifies the offset of a data structure or section from the beginning of the PE file. It is called a "relative" address because it specifies the offset from the beginning of the file, rather than an absolute address in memory. This allows the same file to be loaded at different addresses in memory without requiring any changes to the RVAs within the file.

RVAs are used extensively in the PE file format to specify the location of various data structures and sections within the file. For example, the PE header contains several RVAs that specify the location of the code and data sections, the import and export tables, and other important data structures.

To convert an RVA to a virtual address (VA), the operating system adds the base address of the module (the location in memory where the module is loaded) to the RVA. This allows the operating system to access the data at the specified location within the module, regardless of where the module is loaded in memory.

DOS Header (IMAGE_DOS_HEADER)

The DOS header is located at the beginning of a PE file and contains information about the file, such as its size, and characteristics. Most importantly, it contains the RVA (offset) to the NT header.

The following snippet demonstrates how to retrieve the DOS header.

// Pointer to the structure
PIMAGE_DOS_HEADER pImgDosHdr = (PIMAGE_DOS_HEADER)pPE;
if (pImgDosHdr->e_magic != IMAGE_DOS_SIGNATURE){
	return -1;
}

Since the DOS header is located at the very beginning of a PE file, retrieving the DOS header is only a matter of typecasting the pPE variable to a PIMAGE_DOS_HEADER. This provides a pointer to the DOS header structure. After that, a DOS signature check is performed to verify that the DOS header is valid.

NT Header (IMAGE_NT_HEADERS)

The e_lfanew member of the DOS header is an RVA to the IMAGE_NT_HEADERS structure. To reach the NT header, simply add the base address of the PE file in memory to the offset (e_lfanew). This is done in the following code snippet.

// Pointer to the structure
PIMAGE_NT_HEADERS pImgNtHdrs = (PIMAGE_NT_HEADERS)(pPE + pImgDosHdr->e_lfanew);
if (pImgNtHdrs->Signature != IMAGE_NT_SIGNATURE) {
	return -1;
}

The if statement is an NT Signature check to confirm the validity of the IMAGE_NT_HEADERS structure.

File Header (IMAGE_FILE_HEADER)

Since the file header is a member of the IMAGE_NT_HEADERS structure, it is can be accessed using the following line of code.

IMAGE_FILE_HEADER		ImgFileHdr	= pImgNtHdrs->FileHeader;

File Header Members

The members of the IMAGE_FILE_HEADER structure are described below.

Optional Header (IMAGE_OPTIONAL_HEADER)

Since the optional header is a member of the IMAGE_NT_HEADERS structure, it is can be accessed using the following code.

IMAGE_OPTIONAL_HEADER	ImgOptHdr = pImgNtHdrs->OptionalHeader;
if (ImgOptHdr.Magic != IMAGE_NT_OPTIONAL_HDR_MAGIC) {
	return -1;
}

The if statement is used to verify the optional header. IMAGE_NT_OPTIONAL_HDR_MAGIC's value depends on whether the application is 32 or 64-bit.

Depending on the compiler architecture, the IMAGE_NT_OPTIONAL_HDR_MAGIC constant will automatically expand to the correct value.

Optional Header Important Members

The most important members of the IMAGE_OPTIONAL_HEADER structure are explained below.

DataDirectory (IMAGE_DATA_DIRECTORY)

The Data Directory can be accessed from the optional's header last member. This is an array of IMAGE_DATA_DIRECTORY meaning each element in the array is an IMAGE_DATA_DIRECTORY structure that references a special data directory. The IMAGE_DATA_DIRECTORY structure is shown below.

typedef struct _IMAGE_DATA_DIRECTORY {
    DWORD   VirtualAddress;
    DWORD   Size;
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;

The fields of the structure contain information such as:

Accessing Data Directories

Some of the predefined data directories in a PE file include:

The data directories can be accessed using the following line of code.

IMAGE_DATA_DIRECTORY DataDir = ImgOptHdr.DataDirectory[#INDEX IN THE ARRAY#];

For example, retrieving the data directory of the export directory is done as follows:

IMAGE_DATA_DIRECTORY ExpDataDir = ImgOptHdr.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT];

Export Table (IMAGE_EXPORT_DIRECTORY)

Unfortunately, this structure is not officially documented by Microsoft at the time of writing this module. Therefore, to understand the structure, unofficial documentation is used which can be found on the internet.

Export Table Structure

The export table is a structure defined as IMAGE_EXPORT_DIRECTORY which is shown below.

typedef struct _IMAGE_EXPORT_DIRECTORY {
    DWORD   Characteristics;
    DWORD   TimeDateStamp;
    WORD    MajorVersion;
    WORD    MinorVersion;
    DWORD   Name;
    DWORD   Base;
    DWORD   NumberOfFunctions;
    DWORD   NumberOfNames;
    DWORD   AddressOfFunctions;     // RVA from base of image
    DWORD   AddressOfNames;         // RVA from base of image
    DWORD   AddressOfNameOrdinals;  // RVA from base of image
} IMAGE_EXPORT_DIRECTORY, *PIMAGE_EXPORT_DIRECTORY;

Retrieving The Export Table

The IMAGE_EXPORT_DIRECTORY structure is used to store information about the functions and data that are exported from a PE file. This information is stored in the data directory array with the index IMAGE_DIRECTORY_ENTRY_EXPORT. To fetch it from the IMAGE_OPTIONAL_HEADER structure:

PIMAGE_EXPORT_DIRECTORY pImgExportDir = (PIMAGE_EXPORT_DIRECTORY)(pPE + ImgOptHdr.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT].VirtualAddress);

Where pPE is the base address of the loaded PE in memory and ImgOptHdr is the IMAGE_OPTIONAL_HEADER structure previously calculated.

Export Table Important Members

The most important members of the export table are the following:

Import Address Table (IMAGE_IMPORT_DESCRIPTOR)

The import address table is an array of IMAGE_IMPORT_DESCRIPTOR structures with each one being for a DLL file that contains the functions that were used from these DLLs.

Import Address Table Structure

The IMAGE_IMPORT_DESCRIPTOR structure is also not officially documented by Microsoft although it is defined in the Winnt.h Header File as follows:

typedef struct _IMAGE_IMPORT_DESCRIPTOR {
    union {
        DWORD   Characteristics;
        DWORD   OriginalFirstThunk;
    } DUMMYUNIONNAME;
    DWORD   TimeDateStamp;
    DWORD   ForwarderChain;
    DWORD   Name;
    DWORD   FirstThunk;
} IMAGE_IMPORT_DESCRIPTOR;

Retrieving The Import Address Table

To fetch the import address table from the IMAGE_OPTIONAL_HEADER structure:

IMAGE_IMPORT_DESCRIPTOR* pImgImpDesc = (PIMAGE_IMPORT_DESCRIPTOR)(pPE + ImgOptHdr.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT].VirtualAddress);

Where pPE is the base address of the loaded PE in memory and ImgOptHdr is the IMAGE_OPTIONAL_HEADER structure previously calculated.

Additional Undocumented Structures

Several undocumented structures can be accessed via the IMAGE_DATA_DIRECTORY array in the optional header but are not documented in the Winnt.h header file. These include the Import Address Table and Export Table discussed earlier, as well as additional structures. Below are a few more examples of undocumented structures.

PIMAGE_TLS_DIRECTORY pImgTlsDir  = (PIMAGE_TLS_DIRECTORY)(pPE + ImgOptHdr.DataDirectory[IMAGE_DIRECTORY_ENTRY_TLS].VirtualAddress);
PIMAGE_RUNTIME_FUNCTION_ENTRY pImgRunFuncEntry = (PIMAGE_RUNTIME_FUNCTION_ENTRY)(pPE + ImgOptHdr.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXCEPTION].VirtualAddress);
PIMAGE_BASE_RELOCATION pImgBaseReloc = (PIMAGE_BASE_RELOCATION)(pPE + ImgOptHdr.DataDirectory[IMAGE_DIRECTORY_ENTRY_BASERELOC].VirtualAddress);

PE Sections

Be aware of the important PE sections such as .text.data.reloc.rsrc. Additionally, there may be more PE sections depending on the compiler and its settings. Each of these sections has a IMAGE_SECTION_HEADER structure that contains information about it. The IMAGE_SECTION_HEADER structure is defined below.

typedef struct _IMAGE_SECTION_HEADER {
  BYTE  Name[IMAGE_SIZEOF_SHORT_NAME];
  union {
    DWORD PhysicalAddress;
    DWORD VirtualSize;
  } Misc;
  DWORD VirtualAddress;
  DWORD SizeOfRawData;
  DWORD PointerToRawData;
  DWORD PointerToRelocations;
  DWORD PointerToLinenumbers;
  WORD  NumberOfRelocations;
  WORD  NumberOfLinenumbers;
  DWORD Characteristics;
} IMAGE_SECTION_HEADER, *PIMAGE_SECTION_HEADER;

IMAGE_SECTION_HEADER Important Members

Some of IMAGE_SECTION_HEADER's most important members;

Retrieving The IMAGE_SECTION_HEADER Structure

The IMAGE_SECTION_HEADER structure is stored in an array within the PE file's headers. To access the first element, skip past the IMAGE_NT_HEADERS since the sections are located immediately after the NT headers. The following snippet shows how to retrieve the IMAGE_SECTION_HEADER structure, where pImgNtHdrs is a pointer to IMAGE_NT_HEADERS structure.

PIMAGE_SECTION_HEADER pImgSectionHdr = (PIMAGE_SECTION_HEADER)(((PBYTE)pImgNtHdrs) + sizeof(IMAGE_NT_HEADERS));

Looping Through The Array

Looping through the array requires the array size which can be retrieved from the IMAGE_FILE_HEADER.NumberOfSections member. The subsequent elements in the array are located at an interval of sizeof(IMAGE_SECTION_HEADER) from the current element.

PIMAGE_SECTION_HEADER pImgSectionHdr = (PIMAGE_SECTION_HEADER)(((PBYTE)pImgNtHdrs) + sizeof(IMAGE_NT_HEADERS));

for (size_t i = 0; i < pImgNtHdrs->FileHeader.NumberOfSections; i++) {
	// pImgSectionHdr is a pointer to section 1
	pImgSectionHdr = (PIMAGE_SECTION_HEADER)((PBYTE)pImgSectionHdr + (DWORD)sizeof(IMAGE_SECTION_HEADER));
	// pImgSectionHdr is a pointer to section 2
}

Demo

This demo shows the PeParser project which is shared in this module. It can be used to parse PE files using the methods discussed throughout the module. Keep in mind, PeParser should be compiled as a 32-bit binary to parse a 32-bit program and 64-bit for a 64-bit program.