UPDATE (November 2022)
The latest version of this utility, v1.30, is attached to the post
https://forum.lazarus.freepascal.org/index.php/topic,46617.msg459635.html#msg459635
This version supersedes all previous versions and is the preferred version. More information about it is provided in the above link.
END UPDATE
UPDATE (July 2022)
Use the UPDATE (November 2022) above instead.
The latest version of this utility, v1.20, is attached to the post
https://forum.lazarus.freepascal.org/index.php/topic,46617.msg448906.html#msg448906
This version supersedes all previous versions and is the preferred version. More information about it is provided in the above link.
END UPDATE
UPDATE (May, 2020):
Use the UPDATE (November 2022) above instead.
The latest version of this utility, v1.10, is attached to the post
https://forum.lazarus.freepascal.org/index.php/topic,46617.msg361962.html#msg361962
END UPDATE
UPDATE (March, 2020):
Use the UPDATE (November 2022) above instead.
This is version 1.00.
If you want to dump PE files for the ARM32 and ARM64 architectures, use version 1.02 found in a later post (not 1.01). Note: the ARM32/64 support has a few minor bugs that have _not_ been corrected.
Version 1.02 is attached to the post
https://forum.lazarus.freepascal.org/index.php/topic,46617.msg354933.html#msg354933
END UPDATE
Hello,
I decided to share one of my personal utilities. This utility is a PE dump program that outputs the contents of a PE file (.exe, .dll, and various other extensions).
I'll disappoint you upfront: source is _not_ included, nor do I intend to make it available in the near future.
I originally wrote this utility in C because no existing PE file dump program provided the level of detail I needed. After getting somewhat familiar with FPC, I thought it would be a good and simple exercise to port it to Pascal; this is the result.
What follows is _not_ an explanation of the PE file structure, only a short explanation, for those interested, of how to use the program and interpret its output.
1. To execute the program, simply type
PeBytesF <executable>
The program always dumps the entire PE file. There are _no_ switches to control what is dumped; it always dumps _everything_. Note that in some cases, though rare, the resulting dump can be as large as half a gigabyte.
The output is _very_ detailed. The program inspects and identifies _every_ byte that is part of the PE specification.
In spite of that, it is reasonably quick, outputting about 110,000 lines per second on an average machine. In most cases it executes in a few seconds but in some cases, Chrome_child.dll being one of them, it can take between 20 and 30 seconds depending on how fast the machine is.
What follows are a few examples of its output and what it means.
40.0000 - DOS HEADER -
40.0000 0 [ 2] DOS signature : 5a4d (MZ)
40.0002 2 [ 2] bytes on last page of file : 90 ( 144)
40.0004 4 [ 2] page count : 3
40.0006 6 [ 2] relocations count : 0
40.0008 8 [ 2] size of header in paragraphs : 4
40.000a a [ 2] minimum extra paragraphs needed : 0
40.000c c [ 2] maximum memory in paragraphs : ffff (65,535)
40.000e e [ 2] initial relative SS : 0
40.0010 10 [ 2] initial SP : b8
40.0012 12 [ 2] checksum : 0
40.0014 14 [ 2] initial IP : 0
40.0016 16 [ 2] initial relative CS : 0
In the above,
1. The first column is the virtual address where the field will be located in memory if the exe/dll is loaded at its preferred load address.
2. The second column is the field's file offset. This file offset can be used to locate and edit the value of the field in a hex editor (one of the reasons I wrote this utility.)
3. The third column is the field size in bytes. All fields that are "additional" information, that is, information _not present_ in the PE file are in brackets.
4. Columns 4 (label) and 5 (field value) are self explanatory.
5. Values in parentheses are alternative interpretations of the previous column's value. Usually they simply show a hex value in decimal but, as shown for the DOS signature, the alternative interpretation can also be a string.
Note that the "DOS HEADER" title has a virtual address but _no_ file offset nor size. This is to make it even more obvious that it is just a header and the text shown on that line is _not_ present in the PE file.
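As a rough illustration of how the file offset and size columns can be used outside a hex editor, here is a minimal Python sketch. The function name read_field and the 4-byte sample buffer are made up for this example; the buffer only reproduces the first two DOS header fields shown above, and it assumes the fields are stored little-endian, as they are in PE files.

```python
def read_field(data: bytes, offset: int, size: int) -> int:
    """Return the little-endian unsigned integer stored at a file offset.

    offset and size correspond to the second and third columns of the dump.
    """
    if offset < 0 or offset + size > len(data):
        raise ValueError("field lies outside the supplied data")
    return int.from_bytes(data[offset:offset + size], "little")


# Hypothetical stand-in for the first 4 bytes of a PE file: "MZ", then 0x0090.
sample = bytes.fromhex("4d5a9000")

signature = read_field(sample, 0x0, 2)   # DOS signature
last_page = read_field(sample, 0x2, 2)   # bytes on last page of file
print(f"{signature:x} {last_page:x}")
```

With a real file, the same function could be applied to the bytes returned by open(path, "rb").read().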
Other formatting conventions
40.0084 - IMAGE_FILE_HEADER -
40.0084 84 [2] Machine : 8664
40.0084 + IMAGE_FILE_MACHINE_AMD64
40.0086 86 [2] Number of sections : 11 ( 17)
40.0088 88 [4] Time Date Stamp : 5ad8.467e (2018/04/19 7:34:22)
40.008c 8c [4] Pointer to symbol table : 744.cc00
40.0090 90 [4] Number of symbols : c19e (49,566)
40.0094 94 [2] Size of optional header : f0 ( 240)
40.0096 96 [2] Characteristics : 27
40.0096 + IMAGE_FILE_RELOCS_STRIPPED 1
40.0096 + IMAGE_FILE_EXECUTABLE_IMAGE 2
40.0096 + IMAGE_FILE_LINE_NUMS_STRIPPED 4
40.0096 + IMAGE_FILE_LARGE_ADDRESS_AWARE 20
In the above,
1. If a field's value has a mnemonic, as is the case for the value of the "Machine" field, the mnemonic text _always_ appears _below_ it.
2. The "Characteristics" field is broken into its constituent bit fields, and the mnemonic corresponding to each bit is _always_ on a separate line.
Note also that the virtual address of each mnemonic is the same as that of the complete field. To make it more evident that they are a "piece" of the field above them, each virtual address is followed by a "+" sign. That indicates those lines are simply interpretations of the line above.
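For those who want to post-process the dump, the following Python sketch shows one way the values above could be decoded. It assumes that the dot in a value such as 5ad8.467e is only a digit-group separator and that the date shown in parentheses is UTC; both are inferences from the sample output, not documented behaviour. The helper names are made up for this example, and the flag table contains only the four mnemonics that appear in the dump.

```python
from datetime import datetime, timezone


def parse_dotted_hex(text: str) -> int:
    """Convert a dump value such as '5ad8.467e' to an integer.

    Assumption: the dot is purely a visual separator between hex digit groups.
    """
    return int(text.replace(".", ""), 16)


# Only the bits that appear in the sample dump; values are hexadecimal.
FILE_FLAGS = {
    0x01: "IMAGE_FILE_RELOCS_STRIPPED",
    0x02: "IMAGE_FILE_EXECUTABLE_IMAGE",
    0x04: "IMAGE_FILE_LINE_NUMS_STRIPPED",
    0x20: "IMAGE_FILE_LARGE_ADDRESS_AWARE",
}


def split_flags(value: int, names: dict) -> list:
    """Return the mnemonic of every known bit that is set in value."""
    return [name for bit, name in names.items() if value & bit]


stamp = parse_dotted_hex("5ad8.467e")
when = datetime.fromtimestamp(stamp, tz=timezone.utc)
print(when.strftime("%Y/%m/%d %H:%M:%S"))

for mnemonic in split_flags(0x27, FILE_FLAGS):
    print("+", mnemonic)
```

Decoding the time date stamp this way gives the same date and time as the one shown in parentheses in the dump, and 0x27 splits into the same four mnemonics.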
More formatting conventions,
40.0098 - IMAGE_OPTIONAL_HEADER -
40.0098 98 [2] Magic : 20b
40.0098 + IMAGE_NT_OPTIONAL_HDR64_MAGIC
...
...
40.00a8 a8 [4] Entry point rva : 14e0 [Va: 40.14e0] [FO: ae0] [ .text]
40.00ac ac [4] Base of code (rva) : 1000 [Va: 40.1000] [FO: 600] [ .text]
In the above, note the following:
1. Every RVA (relative virtual address) is followed by 3 bracketed fields: the first is its equivalent virtual address, the second is the file offset that corresponds to that virtual address and the third is the name of the section that contains that virtual address.
In the above example, the hex value 14e0 corresponds to a virtual address of 40.14e0 and a file offset of ae0, which is found in the .text section. This information is shown for every RVA found in the PE file and it is one of the reasons this program's output can occasionally be very large.
Similarly, some PE fields are VAs (virtual addresses) instead of RVAs (as in the case of TLS callbacks). In those cases, the first bracketed field is the VA's equivalent RVA.
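The three bracketed fields can be reproduced with simple arithmetic. The Python sketch below is only an illustration of the usual RVA conversion, not the code the utility uses. The image base (40.0000) and the .text section's RVA and raw data pointer (1000 and 600) are taken from the dump above; the virtual size of 2000 is a made-up value, since the real one is not shown. The Section type and function names are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Section(NamedTuple):
    name: str
    virtual_address: int   # RVA of the start of the section
    virtual_size: int      # size of the section in memory
    raw_pointer: int       # file offset of the section's raw data


def rva_to_va(rva: int, image_base: int) -> int:
    """A virtual address is the preferred load address plus the RVA."""
    return image_base + rva


def rva_to_file_offset(rva: int,
                       sections: List[Section]) -> Optional[Tuple[int, str]]:
    """Return (file offset, section name) for an RVA, or None if unmapped.

    Simplification: only section data is considered; RVAs that fall in the
    headers or beyond a section's raw data are not handled here.
    """
    for s in sections:
        if s.virtual_address <= rva < s.virtual_address + s.virtual_size:
            return rva - s.virtual_address + s.raw_pointer, s.name
    return None


IMAGE_BASE = 0x400000
# virtual_size is an assumed value; the other numbers come from the dump.
sections = [Section(".text", 0x1000, 0x2000, 0x600)]

entry_rva = 0x14e0
va = rva_to_va(entry_rva, IMAGE_BASE)
offset, section_name = rva_to_file_offset(entry_rva, sections)
print(f"[Va: {va:x}] [FO: {offset:x}] [{section_name}]")
```

For a field that holds a VA instead of an RVA, the equivalent RVA is obtained the other way around, by subtracting the image base.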
and more....
40.02f0 SECTION: 10 of 17
40.02f0 2f0 [8] Name : /4
752.6920 [15] .debug_aranges
40.02f8 2f8 [4] Virtual size : 85a0 (34,208)
40.02fc 2fc [4] (Relative) Virtual address : 74.a000
40.0300 300 [4] Size of raw data : 8600 (34,304)
40.0304 304 [4] File offset/pointer to raw data : 72.5000
40.0308 308 [4] File offset/pointer to relocations : 0
40.030c 30c [4] File offset/pointer to linenumbers : 0
40.0310 310 [2] Number of relocations : 0
40.0312 312 [2] Number of linenumbers : 0
40.0314 314 [4] Characteristics : 4250.0040
40.0314 + SCN_CNT_INITIALIZED_DATA 40
40.0314 + SCN_ALIGN_16BYTES 5
40.0314 + SCN_MEM_DISCARDABLE 200.0000
40.0314 + SCN_MEM_READ 4000.0000
In the above,
Note that the DWARF debug section name is shown under the "Name" field _but_ no virtual address is shown, only a file offset. This indicates a value (in this case the name of the DWARF debug section) that is _not_ mapped by the Windows loader. IOW, if a program wants to read that value, it will have to load that part of the file _itself_, which means it will be loaded somewhere/anywhere in the virtual address space, thus making the value of the virtual address impossible to predict by just inspecting the PE file.
This occurs with debugging information and also with security certificates.
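For the curious, the file offset 752.6920 shown for the section name can be derived from values in the IMAGE_FILE_HEADER dump earlier in this post. In the PE/COFF format, a section name of the form "/n" is a decimal offset into the COFF string table, which immediately follows the symbol table, and each symbol table entry is 18 bytes. The Python sketch below works through that arithmetic; the function name is made up for this example and this is not necessarily how the utility computes it.

```python
from typing import Optional

COFF_SYMBOL_SIZE = 18  # size in bytes of one COFF symbol table entry


def long_section_name_offset(name_field: str,
                             symbol_table_offset: int,
                             symbol_count: int) -> Optional[int]:
    """Return the file offset of a '/n' section name, or None for short names.

    The string table starts right after the symbol table, and n is a decimal
    offset from the start of the string table.
    """
    if not name_field.startswith("/"):
        return None
    string_table = symbol_table_offset + symbol_count * COFF_SYMBOL_SIZE
    return string_table + int(name_field[1:])


# "Pointer to symbol table" (744.cc00) and "Number of symbols" (c19e)
# are the values shown in the IMAGE_FILE_HEADER dump.
name_offset = long_section_name_offset("/4", 0x744cc00, 0xc19e)
print(f"{name_offset:x}")
```

The result corresponds to the 752.6920 shown in the dump for .debug_aranges.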
Other important considerations:
Most PE files have a hint name table and an import address table. Those tables, in the PE file, are usually but not always identical. When the tables are identical, the program outputs the Hint-Name fields twice: once for the Hint Name Table and a second time for the Import Address Table.
There are some exceptions, among them:
1. The Borland linker does not populate the hint name table. As a result, for Borland executables, only an Import Address Table is shown.
2. If the executable is bound then the Import Address Table consists of virtual addresses instead of references to hint names.
The second case is very easy to recognize; it appears as follows:
78ca.bc94 - IMPORTS DIRECTORY - number of import descriptors : 4
78ca.bc94 IMPORT descriptor : 1 - Library : ntdll.dll
78ca.bc94 8.a694 [ 4] Hint name table : 8.bd10 [Va: 78ca.bd10] [FO: 8.a710] [ .rdata]
78ca.bc98 8.a698 [ 4] Time Date Stamp : ffff.ffff
78ca.bc9c 8.a69c [ 4] Forwarder chain : ffff.ffff
78ca.bca0 8.a6a0 [ 4] Library name rva : 8.bd00 [Va: 78ca.bd00] [FO: 8.a700] [ .rdata]
78ca.bd00 8.a700 [10] Imports library name : ntdll.dll
78ca.bca4 8.a6a4 [ 4] Import address table : 8.2000 [Va: 78ca.2000] [FO: 8.0a00] [ .rdata]
78ca.bd10 HINT NAME TABLE : 91 entries
78ca.bd10 8.a710 [8] 8.c740 [Va: 78ca.c740] [FO: 8.b140] [ .rdata]
78ca.c740 8.b140 [ 2] Hint: 34b
78ca.c742 8.b142 [12] Name: RtlFreeHeap
78ca.bd18 8.a718 [8] 8.c74e [Va: 78ca.c74e] [FO: 8.b14e] [ .rdata]
78ca.c74e 8.b14e [ 2] Hint: 79f
78ca.c750 8.b150 [11] Name: swprintf_s
78ca.bd20 8.a720 [8] 8.c75c [Va: 78ca.c75c] [FO: 8.b15c] [ .rdata]
78ca.c75c 8.b15c [ 2] Hint: 3a3
78ca.c75e 8.b15e [21] Name: RtlInitUnicodeString
...
...
more hint name table entries
...
...
78ca.2000 IMPORT ADDRESS TABLE : 91 entries
78ca.2000 8.0a00 [8] 78ea.3200
78ca.2008 8.0a08 [8] 78e9.7350
78ca.2010 8.0a10 [8] 78ea.5280
78ca.2018 8.0a18 [8] 78e9.15b0
78ca.2020 8.0a20 [8] 78e7.7f70
78ca.2028 8.0a28 [8] 78ea.1430
78ca.2030 8.0a30 [8] 78ea.1400
78ca.2038 8.0a38 [8] 78ea.1630
78ca.2040 8.0a40 [8] 78ea.1480
78ca.2048 8.0a48 [8] 78e7.6eac
78ca.2050 8.0a50 [8] 78ed.f370
78ca.2058 8.0a58 [8] 78ea.3000
78ca.2060 8.0a60 [8] 78ea.2fc0
78ca.2068 8.0a68 [8] 78ea.4d40
In the above, when both the Hint Name Table and the Import Address Table are present and the executable is not bound, the tables will have the same information. When the executable is bound, the Hint Name Table will point to Hint-Name pairs while the Import Address Table will contain pointers to where the functions they import are supposed to be found.
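The comparison described above can be expressed as a simple check. The Python sketch below is a simplified heuristic based only on that description; the function name is made up, and the table entries are the first three values from each table in the dump above.

```python
from typing import List


def tables_suggest_bound(hint_name_table: List[int],
                         import_address_table: List[int]) -> bool:
    """Return True when the two on-disk tables differ.

    Heuristic: in an unbound executable both tables hold the same entries.
    An empty hint name table (the Borland case) is treated as inconclusive
    and reported as False.
    """
    if not hint_name_table:
        return False
    return hint_name_table != import_address_table


# First three entries of each table from the dump above.
hint_name_entries = [0x8c740, 0x8c74e, 0x8c75c]
address_entries = [0x78ea3200, 0x78e97350, 0x78ea5280]

print(tables_suggest_bound(hint_name_entries, address_entries))
print(tables_suggest_bound(hint_name_entries, hint_name_entries))
```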
Since this post may be reaching the limit of what the forum software allows, additional information will continue in posts after this one.
Download and enjoy.
PS: I would like this program/utility to be available ONLY as an attachment to this post. IOW, don't re-post it anywhere else. Thank you.