What is PE?
The executable file format for Windows. Includes .exe, .dll, .sys, .ocx and more.
"Portable" because the format is not tied to a single CPU architecture: the same layout is used across Windows versions and for x86, x64, and ARM builds.
When you open an EXE file in a Hex Editor or RE tool, you see the PE structure. Understanding this format is essential for analyzing malware and suspicious files.
Who Creates the PE Structure?
The developer only writes source code. The compiler and linker automatically create all the PE headers, signatures, and structure!
┌─────────────────────────────────────────────────────────────────────────┐
│                            FROM CODE TO EXE                             │
└─────────────────────────────────────────────────────────────────────────┘
     DEVELOPER                   COMPILER                     LINKER
   writes this:                creates this:               creates this:
         │                           │                           │
         ▼                           ▼                           ▼
┌─────────────────┐         ┌─────────────────┐         ┌─────────────────┐
│ main.c          │         │ main.obj        │         │ program.exe     │
│                 │         │                 │         │                 │
│ #include <...>  │         │ Machine code    │         │ ┌─────────────┐ │
│                 │  ────►  │ (just the code, │  ────►  │ │ DOS Header  │ │
│ int main() {    │ compile │ no PE headers)  │  link   │ │ DOS Stub    │ │
│   printf("Hi"); │         │                 │         │ │ PE Sig      │ │
│   return 0;     │         │ Symbols:        │         │ │ File Header │ │
│ }               │         │ - main          │         │ │ Opt Header  │ │
│                 │         │ - printf (ref)  │         │ │ Sections    │ │
└─────────────────┘         └─────────────────┘         │ │   .text     │ │
     YOU write                 COMPILER does            │ │   .data     │ │
    only this!                the translation           │ │ Imports     │ │
                                                        │ └─────────────┘ │
                                                        └─────────────────┘
                                                           LINKER builds
                                                           the complete
                                                           PE structure!
| Role | Creates |
|---|---|
| Developer | Source code only (what the program does) |
| Compiler | Machine code (translated instructions) |
| Linker | Complete PE structure (DOS Header, PE Signature, all headers, sections, imports) |
General PE Structure
| Term | What is it | Analogy |
|---|---|---|
| Section Headers | Metadata describing each section (name, size, location, permissions) | Table of Contents in a book |
| Sections | The actual code/data content (.text, .data, .rsrc) | The actual chapters in a book |
Each Section Header (40 bytes) tells you WHERE to find its Section in the file!
DOS Header
The first 64 bytes of every PE. Starts with the bytes 4D 5A ("MZ" in ASCII) - the initials of Mark Zbikowski, a Microsoft programmer.
Important fields (these are actual field names from the Windows IMAGE_DOS_HEADER structure):
| Field | Location | Value | Meaning |
|---|---|---|---|
| e_magic | Offset 0x00 (fixed) | 4D 5A = "MZ" | Magic number - confirms this is an executable |
| e_lfanew | Offset 0x3C (fixed) | Varies per file | Pointer to PE Signature location |
The field e_lfanew is always at offset 0x3C - this never changes.
But the value inside tells you where to find the PE Signature:
Step 1: Go to offset 0x3C
Step 2: Read the 4-byte little-endian value there (e.g., 80 00 00 00 = 0x80)
Step 3: Jump to that offset (0x80)
Step 4: You should find: 50 45 00 00 ("PE\0\0")
┌────────────────────────────────────────────────────────────────┐
│ Offset 0x00: 4D 5A ...      ; e_magic = "MZ" (always)          │
│ ...                                                            │
│ Offset 0x3C: 80 00 00 00    ; e_lfanew = 0x80                  │
│              │                (this VALUE varies)              │
│ ...          │                                                 │
│              ▼                                                 │
│ Offset 0x80: 50 45 00 00    ; PE Signature found here!         │
└────────────────────────────────────────────────────────────────┘
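The four steps above can be sketched in Python. This is a minimal illustration, not a full parser: the function name `find_pe_signature` and the synthetic buffer are assumptions made for the example, and the buffer only mimics the layout shown in the diagram rather than being a real executable.

```python
import struct

def find_pe_signature(data: bytes) -> int:
    """Return the file offset of the PE signature, or raise ValueError."""
    # Step 0: the DOS header is 64 bytes and must start with "MZ".
    if len(data) < 0x40 or data[0:2] != b"MZ":
        raise ValueError("missing MZ magic at offset 0x00")
    # Steps 1-2: e_lfanew is a 4-byte little-endian integer at offset 0x3C.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    # Steps 3-4: jump to that offset and expect "PE\0\0".
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("no PE signature at the offset stored in e_lfanew")
    return e_lfanew

# Synthetic buffer mirroring the example above (hypothetical data).
sample = bytearray(0x84)
sample[0:2] = b"MZ"
struct.pack_into("<I", sample, 0x3C, 0x80)
sample[0x80:0x84] = b"PE\x00\x00"

print(hex(find_pe_signature(bytes(sample))))  # 0x80
```

To try it on a real file, read the file with `open(path, "rb").read()` and pass the bytes in.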
PE Signature & File Header
4 bytes that confirm this is a Windows PE file: 50 45 00 00
In ASCII: "P" (0x50) + "E" (0x45) + null (0x00) + null (0x00) = "PE\0\0"
| Signature | Bytes | Purpose |
|---|---|---|
| MZ | 4D 5A | Says "I might be an executable" (legacy DOS compatibility) |
| PE\0\0 | 50 45 00 00 | Confirms "Yes, I'm definitely a Windows PE file!" |
Historical reason: PE files start with MZ so old DOS systems show "This program cannot be run in DOS mode" instead of crashing. Windows uses e_lfanew to find the real PE data.
File Header (COFF Header) - 20 bytes:
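| Field | Size (bytes) | Description |
|---|---|---|
| Machine | 2 | CPU type (0x14c=i386, 0x8664=AMD64) |
| NumberOfSections | 2 | How many sections are in the file |
| TimeDateStamp | 4 | Build (link) time as a Unix timestamp |
| PointerToSymbolTable | 4 | File offset of the COFF symbol table (normally 0 in executables) |
| NumberOfSymbols | 4 | Number of COFF symbols (normally 0 in executables) |
| SizeOfOptionalHeader | 2 | Size of Optional Header |
| Characteristics | 2 | Flags (EXE, DLL, etc.) |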
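As a rough sketch, the File Header can be decoded with Python's `struct` module. The helper name `parse_file_header`, the dictionary keys, and the sample values are made up for illustration. The format string also steps over the two COFF symbol-table fields (PointerToSymbolTable and NumberOfSymbols, 4 bytes each) that complete the 20 bytes, and the DLL flag value 0x2000 is `IMAGE_FILE_DLL` from the PE/COFF specification.

```python
import struct
from datetime import datetime, timezone

MACHINE_NAMES = {0x014C: "i386", 0x8664: "AMD64"}
IMAGE_FILE_DLL = 0x2000  # Characteristics flag that marks a DLL

def parse_file_header(data: bytes, pe_offset: int) -> dict:
    """Decode the 20-byte File Header that follows the 4-byte PE signature."""
    (machine, num_sections, timestamp,
     _symbol_table_ptr, _num_symbols,
     optional_size, characteristics) = struct.unpack_from("<HHIIIHH", data, pe_offset + 4)
    return {
        "machine": MACHINE_NAMES.get(machine, hex(machine)),
        "sections": num_sections,
        "built": datetime.fromtimestamp(timestamp, tz=timezone.utc),
        "optional_header_size": optional_size,
        "is_dll": bool(characteristics & IMAGE_FILE_DLL),
    }

# Synthetic header for illustration only (all values are made up).
sample = b"PE\x00\x00" + struct.pack("<HHIIIHH", 0x8664, 3, 0, 0, 0, 0xF0, 0x2022)
info = parse_file_header(sample, 0)
print(info["machine"], info["sections"], info["is_dll"])  # AMD64 3 True
```

The timestamp is only as trustworthy as the tool that wrote it, so treat the decoded date as a hint rather than proof.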
Optional Header
Despite the name, it's required for EXE and DLL! Contains critical information:
| Field | Description |
|---|---|
| Magic | 0x10b = PE32, 0x20b = PE32+ (64-bit) |
| AddressOfEntryPoint | RVA of entry point - where code execution starts! |
| ImageBase | Preferred load address (0x00400000 by default for 32-bit EXEs) |
| SectionAlignment | Section alignment in memory |
| FileAlignment | Section alignment in file |
| SizeOfImage | Total size in memory |
| Subsystem | GUI (2), Console (3), Driver (1) |
| DataDirectory | Array of 16 entries - imports, exports, resources... |
RVA (Relative Virtual Address): an address relative to ImageBase. Actual memory address = ImageBase + RVA.
AddressOfEntryPoint is the RVA of the first instruction to execute.
This is usually the first place to look when reverse engineering a file!
Actual address = ImageBase + EntryPoint
For example: 0x00400000 + 0x1000 = 0x00401000
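The calculation above can be sketched as a small function over the raw Optional Header bytes. The field offsets used here (AddressOfEntryPoint at 16, ImageBase at 28 for PE32 and at 24 for PE32+) come from the PE/COFF layout rather than from the table above, and the function name `entry_point_va` is hypothetical. The result is the preferred address only: if the loader relocates the image (for example under ASLR), the real base replaces ImageBase.

```python
import struct

def entry_point_va(optional_header: bytes) -> int:
    """Return ImageBase + AddressOfEntryPoint for a PE32 or PE32+ header."""
    (magic,) = struct.unpack_from("<H", optional_header, 0)
    (entry_rva,) = struct.unpack_from("<I", optional_header, 16)
    if magic == 0x10B:    # PE32: ImageBase is 4 bytes at offset 28
        (image_base,) = struct.unpack_from("<I", optional_header, 28)
    elif magic == 0x20B:  # PE32+: ImageBase is 8 bytes at offset 24
        (image_base,) = struct.unpack_from("<Q", optional_header, 24)
    else:
        raise ValueError(f"unexpected Optional Header magic: {magic:#x}")
    return image_base + entry_rva

# Synthetic PE32 header fragment matching the worked example (made-up data).
opt = bytearray(32)
struct.pack_into("<H", opt, 0, 0x10B)
struct.pack_into("<I", opt, 16, 0x1000)
struct.pack_into("<I", opt, 28, 0x00400000)

print(hex(entry_point_va(bytes(opt))))  # 0x401000
```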
Sections
A region in the file with a specific purpose - code, data, resources, etc. Each section has a name, size, and permissions.
| Section | Content | Typical Permissions |
|---|---|---|
| .text | Program code | Read + Execute |
| .data | Initialized global variables | Read + Write |
| .rdata | Read-only data, strings, imports | Read only |
| .bss | Uninitialized variables | Read + Write |
| .rsrc | Resources (icons, dialogs, strings) | Read only |
| .reloc | Relocation info (for loading at different address) | Read only |
Section Header Structure
Each Section Header is exactly 40 bytes and contains info about one Section:
┌────────────────────┬──────────┬───────────────────────┐
│ Field              │ Size     │ Example               │
├────────────────────┼──────────┼───────────────────────┤
│ Name               │ 8 bytes  │ ".text"               │
│ VirtualSize        │ 4 bytes  │ 0x1000                │
│ VirtualAddress     │ 4 bytes  │ 0x1000 (RVA)          │
│ SizeOfRawData      │ 4 bytes  │ 0x800                 │
│ PointerToRawData   │ 4 bytes  │ 0x400 ← WHERE in file │
│ ... (other fields) │ 12 bytes │                       │
│ Characteristics    │ 4 bytes  │ 0x60000020 (R+X)      │
└────────────────────┴──────────┴───────────────────────┘
The key field: PointerToRawData
This tells you the file offset where the actual Section data starts!
Example: 3 sections = 3 headers = 3 × 40 = 120 bytes
Section Headers area in file:
┌──────────────┬──────────────────────────┐
│ .text header │ PointerToRawData = 0x400 │ → Go to offset 0x400 for code
├──────────────┼──────────────────────────┤
│ .data header │ PointerToRawData = 0x800 │ → Go to offset 0x800 for data
├──────────────┼──────────────────────────┤
│ .rsrc header │ PointerToRawData = 0xC00 │ → Go to offset 0xC00 for resources
└──────────────┴──────────────────────────┘
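A minimal sketch of reading the 40-byte section headers, plus one common use of them: translating an RVA into a file offset. The names `Section`, `parse_sections`, and `rva_to_file_offset` are hypothetical, and the sample header reuses the example values from the field table. The RVA translation is an addition that follows from VirtualAddress and PointerToRawData; it only maps addresses backed by raw data in the file.

```python
import struct
from typing import NamedTuple

class Section(NamedTuple):
    name: str
    virtual_size: int
    virtual_address: int
    size_of_raw_data: int
    pointer_to_raw_data: int
    characteristics: int

# Name (8), four 4-byte fields, 12 bytes of "other fields", Characteristics (4).
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")  # 40 bytes in total

def parse_sections(table: bytes, count: int) -> list:
    """Decode `count` consecutive section headers from `table`."""
    sections = []
    for i in range(count):
        (name, vsize, vaddr, raw_size, raw_ptr,
         _rel_ptr, _line_ptr, _n_rel, _n_line, chars) = SECTION_HEADER.unpack_from(
            table, i * SECTION_HEADER.size)
        text = name.rstrip(b"\x00").decode("ascii", errors="replace")
        sections.append(Section(text, vsize, vaddr, raw_size, raw_ptr, chars))
    return sections

def rva_to_file_offset(rva: int, sections: list) -> int:
    """Map an RVA to a file offset using the section that contains it."""
    for s in sections:
        delta = rva - s.virtual_address
        if 0 <= delta < s.size_of_raw_data:
            return s.pointer_to_raw_data + delta
    raise ValueError(f"RVA {rva:#x} is not backed by any section's raw data")

# One synthetic header using the example values from the table above.
table = SECTION_HEADER.pack(b".text", 0x1000, 0x1000, 0x800, 0x400, 0, 0, 0, 0, 0x60000020)
sections = parse_sections(table, 1)
print(sections[0].name, hex(rva_to_file_offset(0x1010, sections)))  # .text 0x410
```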
Import Table
A list of functions the program imports from other DLLs. Windows fills in the addresses at load time.
Where is the Import Table?
This can be confusing! The Import Table involves two places:
┌──────────────────────────────────────────────────────────────┐
│                  WHERE IS THE IMPORT TABLE?                  │
└──────────────────────────────────────────────────────────────┘
    OPTIONAL HEADER                              SECTIONS
┌─────────────────────┐                  ┌─────────────────────┐
│ ...                 │                  │ .text               │
│ DataDirectory[0]    │                  │ (code)              │
│ DataDirectory[1] ───┼──── POINTER ────►├─────────────────────┤
│ (Import Table)      │      (RVA)       │ .rdata or .idata    │
│ DataDirectory[2]    │                  │ ┌─────────────────┐ │
│ ...                 │                  │ │ IMPORT TABLE    │ │
└─────────────────────┘                  │ │ (actual data)   │ │
                                         │ │ - kernel32.dll  │ │
  Contains: POINTER                      │ │   - CreateFile  │ │
  (tells you WHERE                       │ │   - ReadFile    │ │
  to find imports)                       │ │ - user32.dll    │ │
                                         │ │   - MessageBox  │ │
                                         │ └─────────────────┘ │
                                         └─────────────────────┘
                                          Contains: ACTUAL DATA
                                          (the real import list)
| Location | What's there | Analogy |
|---|---|---|
| DataDirectory[1] | A pointer (RVA) saying "imports are at address X" | Street address on an envelope |
| Inside a Section (usually .rdata or .idata) | The actual Import Table data (DLL names, function names) | The actual house at that address |
Common imported functions (tells us what the program does):
- CreateFile, ReadFile - file access
- RegOpenKey - Registry access
- socket, connect - network communication
- VirtualAlloc - memory allocation
- CreateProcess - running programs
Scanning the Imports gives an important hint about what the program does!
Malware sometimes tries to hide imports (dynamic loading, obfuscation).
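To show how an import list turns into a quick behavioral hint, here is a deliberately crude sketch that groups function names using the categories listed above. The `HINTS` table and `summarize_imports` are hypothetical helpers, and prefix matching is an assumption made so that suffixed variants such as CreateFileW or RegOpenKeyExA are caught. It will miss anything resolved dynamically at run time.

```python
# Capability hints taken from the list above; matching is by name prefix.
HINTS = {
    "CreateFile": "file access",
    "ReadFile": "file access",
    "RegOpenKey": "Registry access",
    "socket": "network communication",
    "connect": "network communication",
    "VirtualAlloc": "memory allocation",
    "CreateProcess": "running programs",
}

def summarize_imports(names):
    """Group imported function names by the capability they hint at."""
    found = {}
    for name in names:
        for prefix, hint in HINTS.items():
            if name.startswith(prefix):
                found.setdefault(hint, []).append(name)
                break  # one category per import is enough for a first pass
    return found

# Hypothetical import list, as a tool such as pefile might report it.
imports = ["CreateFileW", "RegOpenKeyExA", "connect", "GetTickCount"]
for hint, functions in summarize_imports(imports).items():
    print(f"{hint}: {', '.join(functions)}")
```

Names that match no prefix (GetTickCount here) are simply left out, so an empty result does not mean a file is harmless.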
Export Table
A list of functions the file exports for use by others. Common in DLLs.
Just like Import Table, the Export Table has a pointer and actual data:
DataDirectory[0] → POINTER (RVA) saying "exports are at address X"
        ↓
.edata section   → ACTUAL EXPORT DATA (function names, addresses)
DataDirectory - The Master Pointer Table
The DataDirectory in the Optional Header is an array of 16 entries, each holding an RVA and a size.
Each entry points to different data stored in the sections:
| Index | Name | Points to |
|---|---|---|
| 0 | Export Table | Functions this file exports (for DLLs) |
| 1 | Import Table | Functions imported from other DLLs |
| 2 | Resource Table | Icons, dialogs, strings, etc. |
| 3 | Exception Table | Exception handling info |
| 4 | Certificate Table | Digital signatures |
| 5 | Base Relocation | For loading at different address |
| 6 | Debug | Debug information |
| 12 | IAT | Import Address Table |
| 14 | CLR Runtime | .NET metadata |
┌─────────────────────────────────────────────────────────────────┐
│                       HEADERS vs SECTIONS                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   OPTIONAL HEADER                     SECTIONS                  │
│   (DataDirectory)                     (actual content)          │
│                                                                 │
│   ┌──────────────┐                    ┌──────────────────────┐  │
│   │ [0] Export ──┼───► POINTER ──────►│ .edata: Export data  │  │
│   │ [1] Import ──┼───► POINTER ──────►│ .rdata: Import data  │  │
│   │ [2] Resource─┼───► POINTER ──────►│ .rsrc:  Icons, etc.  │  │
│   │ [5] Reloc  ──┼───► POINTER ──────►│ .reloc: Reloc data   │  │
│   └──────────────┘                    └──────────────────────┘  │
│                                                                 │
│   Contains: RVA + Size                Contains: ACTUAL DATA     │
│   (WHERE to find it)                  (the real content)        │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
Remember: Headers never contain the actual data - they only contain pointers that tell you where to find the data in the sections!
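The pointer table can be read with a few lines of Python. This sketch assumes each entry is 8 bytes (a 4-byte RVA followed by a 4-byte size, both little-endian) and that `raw` already holds just the DataDirectory bytes. The function name `parse_data_directory` and the sample values are made up for illustration.

```python
import struct

DIRECTORY_NAMES = {
    0: "Export Table", 1: "Import Table", 2: "Resource Table",
    3: "Exception Table", 4: "Certificate Table", 5: "Base Relocation",
    6: "Debug", 12: "IAT", 14: "CLR Runtime",
}

def parse_data_directory(raw: bytes, count: int = 16) -> dict:
    """Return {index: (rva, size)} for every entry that is not all zeros."""
    entries = {}
    for index in range(count):
        rva, size = struct.unpack_from("<II", raw, index * 8)
        if rva or size:
            entries[index] = (rva, size)
    return entries

# Synthetic directory: only the import and resource entries are filled in.
raw = bytearray(16 * 8)
struct.pack_into("<II", raw, 1 * 8, 0x2000, 0x3C)
struct.pack_into("<II", raw, 2 * 8, 0x4000, 0x1A0)

for index, (rva, size) in parse_data_directory(bytes(raw)).items():
    print(f"[{index}] {DIRECTORY_NAMES.get(index, 'other')}: RVA={rva:#x} size={size:#x}")
```

An all-zero entry means the file has no data of that kind, which is why an EXE often shows nothing at index 0.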
Tools for PE Analysis
- PE-bear - excellent graphical PE analysis tool
- CFF Explorer - advanced PE editor
- pestudio - PE analysis for malware identification
- Detect It Easy (DIE) - identify packers and compilers
- pefile (Python) - library for PE analysis in code
Chapter Summary
- PE - Windows executable file format
- DOS Header - starts with MZ, points to PE Header
- File Header - CPU type, number of sections
- Optional Header - Entry Point, ImageBase, DataDirectory (pointers to imports, exports, resources)
- Sections - .text (code), .data (data), .rsrc (resources)
- Import Table - imported functions (very important for analysis!)
- RVA - relative address; add ImageBase to get the actual memory address