Chapter 7

PE Format

Structure of EXE and DLL files in Windows

📦 What is PE?

📖 Definition: PE (Portable Executable)

The executable file format for Windows, used by .exe, .dll, .sys, .ocx, and other file types.
"Portable" because the format was designed to be independent of CPU architecture: the same layout is used for x86, x64, and ARM builds of Windows.

When you open an EXE file in a Hex Editor or RE tool, you see the PE structure. Understanding this format is essential for analyzing malware and suspicious files.

🔨 Who Creates the PE Structure?

The developer only writes source code. The compiler and linker automatically create all the PE headers, signatures, and structure!

┌─────────────────────────────────────────────────────────────────────────┐
│                        FROM CODE TO EXE                                 │
└─────────────────────────────────────────────────────────────────────────┘

     DEVELOPER                    COMPILER                   LINKER
    writes this:                creates this:             creates this:
         │                           │                         │
         ▼                           ▼                         ▼
┌─────────────────┐         ┌─────────────────┐       ┌─────────────────┐
│   main.c        │         │   main.obj      │       │   program.exe   │
│                 │         │                 │       │                 │
│ #include <...>  │         │ Machine code    │       │ ┌─────────────┐ │
│                 │  ───►   │ (just the code, │ ───►  │ │ DOS Header  │ │
│ int main() {    │ compile │  no PE headers) │ link  │ │ DOS Stub    │ │
│   printf("Hi"); │         │                 │       │ │ PE Sig      │ │
│   return 0;     │         │ Symbols:        │       │ │ File Header │ │
│ }               │         │ - main          │       │ │ Opt Header  │ │
│                 │         │ - printf (ref)  │       │ │ Sect Headers│ │
└─────────────────┘         └─────────────────┘       │ │ .text       │ │
                                                      │ │ .data       │ │
     YOU write                COMPILER does           │ │ Imports     │ │
     only this!               the translation         │ └─────────────┘ │
                                                      └─────────────────┘

                                                       LINKER builds
                                                       the complete
                                                       PE structure!
            
💡 Summary: Who Does What?

Role       Creates
Developer  Source code only (what the program does)
Compiler   Machine code (translated instructions)
Linker     Complete PE structure (DOS Header, PE Signature, all headers, sections, imports)

🗂️ General PE Structure

HEADERS
  DOS Header                                      64 bytes
  DOS Stub (optional)
  PE Signature "PE\0\0"                           4 bytes
  File Header (COFF)                              20 bytes
  Optional Header (despite name, it's required!)  typically 224 bytes (PE32) or 240 bytes (PE32+)
  Section Headers (Table of Contents)             40 bytes × n
    .text header
    .data header
    .rsrc header
          ↓ points to ↓
SECTIONS (actual data)
  .text               executable code
  .data               initialized variables
  .rsrc, .rdata, ...  resources, read-only data

Legend
  Section Header: Metadata (40 bytes)
  Section: Actual content (code/data)

💡 Section Headers vs Sections - What's the difference?

Term             What it is                                                            Analogy
Section Headers  Metadata describing each section (name, size, location, permissions)  Table of Contents in a book
Sections         The actual code/data content (.text, .data, .rsrc)                    The actual chapters in a book

Each Section Header (40 bytes) tells you WHERE to find its Section in the file!

📋 DOS Header

📖 Definition: DOS Header

The first 64 bytes of every PE file. It starts with the bytes 4D 5A (ASCII "MZ"), the initials of Mark Zbikowski, a Microsoft programmer. Read as a little-endian 16-bit value, this magic number is 0x5A4D.

Important fields (these are actual field names from the Windows IMAGE_DOS_HEADER structure):

Field     Location             Value            Meaning
e_magic   Offset 0x00 (fixed)  4D 5A = "MZ"     Magic number - confirms this is an executable
e_lfanew  Offset 0x3C (fixed)  Varies per file  Pointer to PE Signature location

💡 How e_lfanew Works

The field e_lfanew is always at offset 0x3C - this never changes.
But the value inside tells you where to find the PE Signature:

Step 1: Go to offset 0x3C
Step 2: Read the 4-byte little-endian value there (e.g., 80 00 00 00 = 0x80)
Step 3: Jump to that offset (0x80)
Step 4: You should find: 50 45 00 00 ("PE\0\0")
                
┌────────────────────────────────────────────────────────────────┐
│ Offset 0x00: 4D 5A ...           ; e_magic = "MZ" (always)     │
│ ...                                                            │
│ Offset 0x3C: 80 00 00 00         ; e_lfanew = 0x80             │
│              ↓                     (this VALUE varies)         │
│              └─────────────────────────────┐                   │
│ ...                                        ↓                   │
│ Offset 0x80: 50 45 00 00         ; PE Signature found here!    │
└────────────────────────────────────────────────────────────────┘
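
The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not a full parser: the function name find_pe_signature is hypothetical, and the buffer is a synthetic 0x84-byte stand-in that mirrors the example offsets rather than a real program.

```python
import struct

def find_pe_signature(data: bytes) -> int:
    """Return the file offset of the PE signature, or raise ValueError."""
    if len(data) < 0x40 or data[0:2] != b"MZ":
        raise ValueError("missing MZ magic at offset 0x00")
    # e_lfanew is a 4-byte little-endian value stored at the fixed offset 0x3C.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError(f"no PE signature at offset {e_lfanew:#x}")
    return e_lfanew

# Synthetic buffer matching the diagram: "MZ" at 0x00, e_lfanew = 0x80,
# and the PE signature at offset 0x80. Everything else is zero.
sample = bytearray(0x84)
sample[0:2] = b"MZ"
struct.pack_into("<I", sample, 0x3C, 0x80)
sample[0x80:0x84] = b"PE\0\0"

print(hex(find_pe_signature(bytes(sample))))  # 0x80
```

To try it on a real file, read the file with open(path, "rb").read() and pass the bytes to the same function.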
            

📋 PE Signature & File Header

📖 Definition: PE Signature

4 bytes that confirm this is a Windows PE file: 50 45 00 00
In ASCII: "P" (0x50) + "E" (0x45) + null (0x00) + null (0x00) = "PE\0\0"

💡 Why Both MZ and PE Signatures?

Signature  Bytes        Purpose
MZ         4D 5A        Says "I might be an executable" (legacy DOS compatibility)
PE\0\0     50 45 00 00  Confirms "Yes, I'm definitely a Windows PE file!"

Historical reason: PE files start with an MZ header so that old DOS systems treat them as DOS programs and run the DOS Stub, which prints "This program cannot be run in DOS mode" instead of crashing. Windows uses e_lfanew to find the real PE data.

File Header (COFF Header) - 20 bytes:

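Field                 Size  Description
Machine               2     CPU type (0x14c=i386, 0x8664=AMD64)
NumberOfSections      2     How many sections are in the file
TimeDateStamp         4     Build time (Unix timestamp)
PointerToSymbolTable  4     File offset of the COFF symbol table (usually 0 in EXE/DLL files)
NumberOfSymbols       4     Number of COFF symbol table entries (usually 0 in EXE/DLL files)
SizeOfOptionalHeader  2     Size of Optional Header
Characteristics       2     Flags (EXE, DLL, etc.)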
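
As a rough sketch of how these fields can be decoded, the Python snippet below unpacks all 20 bytes of the File Header, including the two symbol-table fields (PointerToSymbolTable and NumberOfSymbols) that sit between TimeDateStamp and SizeOfOptionalHeader. The function name parse_file_header and the sample values are made up for illustration; 0x2000 is the Characteristics bit that marks a DLL.

```python
import struct

MACHINE_NAMES = {0x014C: "i386", 0x8664: "AMD64"}
IMAGE_FILE_DLL = 0x2000  # Characteristics bit set for DLLs

def parse_file_header(data: bytes, pe_offset: int) -> dict:
    """Decode the 20-byte File Header that follows the 4-byte PE signature."""
    names = ("Machine", "NumberOfSections", "TimeDateStamp",
             "PointerToSymbolTable", "NumberOfSymbols",
             "SizeOfOptionalHeader", "Characteristics")
    # H = 2-byte field, I = 4-byte field, "<" = little-endian.
    values = struct.unpack_from("<HHIIIHH", data, pe_offset + 4)
    return dict(zip(names, values))

# Synthetic input: a PE signature followed by a hand-built File Header
# (AMD64, 3 sections, 240-byte Optional Header, executable image flags).
blob = b"PE\0\0" + struct.pack("<HHIIIHH", 0x8664, 3, 0, 0, 0, 240, 0x0022)
header = parse_file_header(blob, 0)

print(MACHINE_NAMES.get(header["Machine"], "unknown"))     # AMD64
print(header["NumberOfSections"])                          # 3
print(bool(header["Characteristics"] & IMAGE_FILE_DLL))    # False
```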

📋 Optional Header

Despite the name, it's required for EXE and DLL! Contains critical information:

Field                Description
Magic                0x10b = PE32 (32-bit), 0x20b = PE32+ (64-bit)
AddressOfEntryPoint  RVA of entry point - where code execution starts!
ImageBase            Preferred load address (0x00400000 by default for 32-bit EXEs; with ASLR the real address may differ)
SectionAlignment     Section alignment in memory
FileAlignment        Section alignment in file
SizeOfImage          Total size in memory
Subsystem            GUI (2), Console (3), Driver (1)
DataDirectory        Array of 16 entries - imports, exports, resources...

📖 Definition: RVA (Relative Virtual Address)

Address relative to ImageBase. Actual memory address = ImageBase + RVA.

💡 Entry Point

AddressOfEntryPoint is the RVA of the first instruction to execute.
This is the first place to look in RE!

Actual address = ImageBase + EntryPoint
For example: 0x00400000 + 0x1000 = 0x00401000
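
The same arithmetic can be done directly on the raw Optional Header bytes. The sketch below assumes the standard field offsets (AddressOfEntryPoint at offset 16; ImageBase at offset 28 as 4 bytes in PE32, or at offset 24 as 8 bytes in PE32+). The function name entry_point_va and the zero-filled sample header are illustrative only.

```python
import struct

def entry_point_va(opt: bytes) -> int:
    """Return ImageBase + AddressOfEntryPoint from raw Optional Header bytes."""
    (magic,) = struct.unpack_from("<H", opt, 0)
    (entry_rva,) = struct.unpack_from("<I", opt, 16)
    if magic == 0x10B:    # PE32: 4-byte ImageBase at offset 28
        (image_base,) = struct.unpack_from("<I", opt, 28)
    elif magic == 0x20B:  # PE32+: 8-byte ImageBase at offset 24
        (image_base,) = struct.unpack_from("<Q", opt, 24)
    else:
        raise ValueError(f"unexpected Optional Header magic {magic:#x}")
    return image_base + entry_rva

# Synthetic PE32 header fragment using the values from the example above.
opt = bytearray(32)
struct.pack_into("<H", opt, 0, 0x10B)
struct.pack_into("<I", opt, 16, 0x1000)       # AddressOfEntryPoint
struct.pack_into("<I", opt, 28, 0x00400000)   # ImageBase

print(hex(entry_point_va(bytes(opt))))  # 0x401000
```

The result is the preferred address. If the loader relocates the image, the real address is the actual load base plus the same RVA.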

📦 Sections

📖 Definition: Section

A region in the file with a specific purpose - code, data, resources, etc. Each section has a name, size, and permissions.

Section  Content                                               Typical Permissions
.text    Program code                                          Read + Execute
.data    Initialized global variables                          Read + Write
.rdata   Read-only data, strings, imports                      Read only
.bss     Uninitialized variables                               Read + Write
.rsrc    Resources (icons, dialogs, strings)                   Read only
.reloc   Relocation info (for loading at a different address)  Read only

Section Header Structure

Each Section Header is exactly 40 bytes and contains info about one Section:

💡 Section Header Entry (40 bytes)
┌────────────────────┬─────────┬──────────────────────┐
│ Field              │ Size    │ Example              │
├────────────────────┼─────────┼──────────────────────┤
│ Name               │ 8 bytes │ ".text"              │
│ VirtualSize        │ 4 bytes │ 0x1000               │
│ VirtualAddress     │ 4 bytes │ 0x1000 (RVA)         │
│ SizeOfRawData      │ 4 bytes │ 0x800                │
│ PointerToRawData   │ 4 bytes │ 0x400 ← WHERE in file│
│ ... (other fields) │ 12 bytes│                      │
│ Characteristics    │ 4 bytes │ 0x60000020 (R+X)     │
└────────────────────┴─────────┴──────────────────────┘
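
One way to decode such an entry is sketched below. It assumes the 12 bytes of "other fields" are the usual relocation and line-number fields (two 4-byte pointers and two 2-byte counts), and that the read, write, and execute bits of Characteristics are 0x40000000, 0x80000000, and 0x20000000. The helper names are hypothetical, and the sample header is built from the example values in the table.

```python
import struct

# Name(8) + six 4-byte fields + two 2-byte fields + Characteristics(4) = 40 bytes
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")

def parse_section_header(raw: bytes) -> dict:
    """Decode one 40-byte section header into a dict of the key fields."""
    (name, vsize, vaddr, raw_size, raw_ptr,
     _reloc_ptr, _line_ptr, _n_reloc, _n_line, flags) = SECTION_HEADER.unpack(raw)
    return {
        "Name": name.rstrip(b"\0").decode("ascii", "replace"),
        "VirtualSize": vsize,
        "VirtualAddress": vaddr,
        "SizeOfRawData": raw_size,
        "PointerToRawData": raw_ptr,
        "Characteristics": flags,
    }

def permissions(flags: int) -> str:
    """Render the memory-permission bits of Characteristics, e.g. 'R-X'."""
    return (("R" if flags & 0x40000000 else "-")
            + ("W" if flags & 0x80000000 else "-")
            + ("X" if flags & 0x20000000 else "-"))

# Synthetic header carrying the example values from the table.
raw = SECTION_HEADER.pack(b".text", 0x1000, 0x1000, 0x800, 0x400,
                          0, 0, 0, 0, 0x60000020)
hdr = parse_section_header(raw)

print(hdr["Name"], hex(hdr["PointerToRawData"]), permissions(hdr["Characteristics"]))
# .text 0x400 R-X
```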
                

The key field: PointerToRawData

This tells you the file offset where the actual Section data starts!

Example: 3 sections = 3 headers = 3 × 40 = 120 bytes

Section Headers area in file:
┌──────────────┬──────────────────────────┐
│ .text header │ PointerToRawData = 0x400 │ → Go to offset 0x400 for code
├──────────────┼──────────────────────────┤
│ .data header │ PointerToRawData = 0xC00 │ → Go to offset 0xC00 for data
├──────────────┼──────────────────────────┤
│ .rsrc header │ PointerToRawData = 0xE00 │ → Go to offset 0xE00 for resources
└──────────────┴──────────────────────────┘
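
PointerToRawData is also what lets you translate an RVA into a file offset: find the section whose virtual range contains the RVA, then add the distance from the section's VirtualAddress to its PointerToRawData. The sketch below uses only the .text values from the Section Header Entry example. The function name rva_to_offset and the tuple layout are assumptions made for illustration.

```python
from typing import Optional

# (name, VirtualAddress, VirtualSize, PointerToRawData, SizeOfRawData)
SECTIONS = [
    (".text", 0x1000, 0x1000, 0x400, 0x800),
]

def rva_to_offset(rva: int, sections=SECTIONS) -> Optional[int]:
    """Map an RVA to a file offset, or return None if it has no bytes on disk."""
    for _name, vaddr, vsize, raw_ptr, raw_size in sections:
        if vaddr <= rva < vaddr + vsize:
            delta = rva - vaddr
            # Bytes past SizeOfRawData exist only in memory (zero-filled).
            return raw_ptr + delta if delta < raw_size else None
    return None

print(hex(rva_to_offset(0x1234)))  # 0x634
print(rva_to_offset(0x1900))       # None: beyond the raw data stored on disk
print(rva_to_offset(0x5000))       # None: not inside any listed section
```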
            

📥 Import Table

📖 Definition: Import Table

A list of functions the program imports from other DLLs. Windows fills in the addresses at load time.

Where is the Import Table?

This can be confusing! The Import Table involves two places:

┌─────────────────────────────────────────────────────────────────────────┐
│                  WHERE IS THE IMPORT TABLE?                             │
└─────────────────────────────────────────────────────────────────────────┘

   OPTIONAL HEADER                              SECTIONS
  ┌─────────────────────┐                    ┌─────────────────────┐
  │ ...                 │                    │ .text               │
  │ DataDirectory[0]    │                    │ (code)              │
  │ DataDirectory[1] ───┼───── POINTER ─────►├─────────────────────┤
  │   (Import Table)    │      (RVA)         │ .rdata or .idata    │
  │ DataDirectory[2]    │                    │ ┌─────────────────┐ │
  │ ...                 │                    │ │ IMPORT TABLE    │ │
  └─────────────────────┘                    │ │ (actual data)   │ │
                                             │ │ - kernel32.dll  │ │
   Contains: POINTER                         │ │   - CreateFile  │ │
   (tells you WHERE                          │ │   - ReadFile    │ │
   to find imports)                          │ │ - user32.dll    │ │
                                             │ │   - MessageBox  │ │
                                             │ └─────────────────┘ │
                                             └─────────────────────┘

                                              Contains: ACTUAL DATA
                                              (the real import list)
            
💡 Two Parts to Understand

Location                                     What's there                                              Analogy
DataDirectory[1]                             A pointer (RVA) saying "imports are at address X"         Street address on an envelope
Inside a Section (usually .rdata or .idata)  The actual Import Table data (DLL names, function names)  The actual house at that address

Common imported functions (tells us what the program does):

💡 Important in RE

Scanning the Imports gives an important hint about what the program does!
Malware sometimes tries to hide imports (dynamic loading, obfuscation).
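
To make the "actual data" side concrete, here is a simplified sketch of walking the import data. It assumes the 32-bit (PE32) layout: a list of 20-byte import descriptors ending with an all-zero entry, each pointing to a DLL name and to a zero-terminated array of 4-byte entries that reference a 2-byte hint followed by the function name. To stay short it works on a tiny synthetic buffer in which every RVA equals the buffer offset, which is not true of a file on disk. The helper names are hypothetical.

```python
import struct

def read_cstring(image: bytes, offset: int) -> str:
    """Read a NUL-terminated ASCII string starting at offset."""
    end = image.index(b"\0", offset)
    return image[offset:end].decode("ascii")

def list_imports(image: bytes, import_rva: int) -> dict:
    """Return {dll_name: [function names]} for a buffer where RVA == offset."""
    imports = {}
    desc = import_rva
    while True:
        # Descriptor: lookup table RVA, timestamp, forwarder chain,
        # DLL name RVA, address table (IAT) RVA.
        fields = struct.unpack_from("<5I", image, desc)
        if not any(fields):
            break  # an all-zero descriptor ends the list
        lookup_rva, _stamp, _chain, name_rva, iat_rva = fields
        thunk = lookup_rva or iat_rva
        names = []
        while True:
            (entry,) = struct.unpack_from("<I", image, thunk)
            if entry == 0:
                break
            if entry & 0x80000000:   # high bit set: import by ordinal
                names.append(f"ordinal {entry & 0xFFFF}")
            else:                    # otherwise: RVA of hint (2 bytes) + name
                names.append(read_cstring(image, entry + 2))
            thunk += 4
        imports[read_cstring(image, name_rva)] = names
        desc += 20
    return imports

def put(buf: bytearray, offset: int, data: bytes) -> None:
    buf[offset:offset + len(data)] = data

# Synthetic image: one descriptor at 0x10 for kernel32.dll with two functions.
image = bytearray(0x100)
struct.pack_into("<5I", image, 0x10, 0x40, 0, 0, 0x60, 0x50)
struct.pack_into("<3I", image, 0x40, 0x70, 0x80, 0)
put(image, 0x60, b"kernel32.dll\0")
put(image, 0x70, b"\0\0CreateFileW\0")
put(image, 0x80, b"\0\0ReadFile\0")

print(list_imports(bytes(image), 0x10))
# {'kernel32.dll': ['CreateFileW', 'ReadFile']}
```

On a real file, each RVA would first have to be converted to a file offset through the section table, and PE32+ files use 8-byte entries instead of 4-byte ones.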

📤 Export Table

📖 Definition: Export Table

A list of functions the file exports for use by others. Common in DLLs.

Just like the Import Table, the Export Table has a pointer and actual data:

DataDirectory[0]  →  POINTER (RVA) saying "exports are at address X"
                           ↓
.edata or .rdata  →  ACTUAL EXPORT DATA (function names, addresses)
            

📋 DataDirectory - The Master Pointer Table

The DataDirectory in the Optional Header is an array of 16 entries, each holding an RVA and a size.
Each entry points to different data stored in the sections:

Index  Name               Points to
0      Export Table       Functions this file exports (for DLLs)
1      Import Table       Functions imported from other DLLs
2      Resource Table     Icons, dialogs, strings, etc.
3      Exception Table    Exception handling info
4      Certificate Table  Digital signatures (located by file offset, not RVA)
5      Base Relocation    For loading at a different address
6      Debug              Debug information
12     IAT                Import Address Table
14     CLR Runtime        .NET metadata

💡 Key Concept: Headers Have Pointers, Sections Have Data
┌─────────────────────────────────────────────────────────────────┐
│                     HEADERS vs SECTIONS                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   OPTIONAL HEADER                      SECTIONS                 │
│   (DataDirectory)                      (actual content)         │
│                                                                 │
│   ┌──────────────┐                   ┌──────────────────────┐   │
│   │ [0] Export ──┼────► POINTER ────►│ .edata: Export data  │   │
│   │ [1] Import ──┼────► POINTER ────►│ .rdata: Import data  │   │
│   │ [2] Resource─┼────► POINTER ────►│ .rsrc:  Icons, etc.  │   │
│   │ [5] Reloc  ──┼────► POINTER ────►│ .reloc: Reloc data   │   │
│   └──────────────┘                   └──────────────────────┘   │
│                                                                 │
│   Contains: RVA + Size               Contains: ACTUAL DATA      │
│   (WHERE to find it)                 (the real content)         │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
                

Remember: Headers never contain the actual data - they only contain pointers that tell you where to find the data in the sections!
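
A short sketch of reading these entries from raw Optional Header bytes follows. It assumes the usual layout: the entry count (NumberOfRvaAndSizes) sits at offset 92 in PE32 and 108 in PE32+, and the array of 8-byte (RVA, Size) pairs follows it immediately. The function name and the sample header, which has only the Import Table entry filled in, are illustrative.

```python
import struct

DIRECTORY_NAMES = {0: "Export Table", 1: "Import Table", 2: "Resource Table",
                   3: "Exception Table", 4: "Certificate Table",
                   5: "Base Relocation", 6: "Debug", 12: "IAT",
                   14: "CLR Runtime"}

def read_data_directories(opt: bytes) -> list:
    """Return a list of (index, rva, size) from raw Optional Header bytes."""
    (magic,) = struct.unpack_from("<H", opt, 0)
    count_offsets = {0x10B: 92, 0x20B: 108}  # PE32 vs PE32+
    if magic not in count_offsets:
        raise ValueError(f"unexpected Optional Header magic {magic:#x}")
    count_offset = count_offsets[magic]
    (count,) = struct.unpack_from("<I", opt, count_offset)
    entries = []
    for i in range(min(count, 16)):
        rva, size = struct.unpack_from("<II", opt, count_offset + 4 + 8 * i)
        entries.append((i, rva, size))
    return entries

# Synthetic 224-byte PE32 Optional Header: 16 entries, only index 1 is set.
opt = bytearray(224)
struct.pack_into("<H", opt, 0, 0x10B)
struct.pack_into("<I", opt, 92, 16)
struct.pack_into("<II", opt, 96 + 8 * 1, 0x2000, 0x50)

for index, rva, size in read_data_directories(bytes(opt)):
    if rva:
        print(index, DIRECTORY_NAMES.get(index, "other"), hex(rva), hex(size))
# 1 Import Table 0x2000 0x50
```

An entry whose RVA and size are both zero means the file has no data of that kind.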

🔧 Tools for PE Analysis

📋 Chapter Summary